Generative artificial intelligence: progress and the future

by time news

2023-09-13 08:30:00

Over the last decade, significant advances have been made in the field of Artificial Intelligence (AI), and AI has become ubiquitous in our daily lives. The widespread adoption of AI can be attributed to multiple factors, including deep learning (DL), built on modern artificial neural networks, the availability of large volumes of data, and the computing power needed to train DL models. More recently, generative AI has caught the attention of the general public, thanks to OpenAI and its high-performance, scalable large language models (LLMs). Generative AI is now used to produce text, images, videos, programming code, and music. There are also multimodal models that generate images from text descriptions (for example, DALL·E) and vice versa, and such innovations will continue to grow rapidly.

Advances in generative AI

In 2012, significant progress was demonstrated in applying a DL model [1] to classify images into 1,000 object classes (ImageNet Large Scale Visual Recognition Challenge 2010). This was followed by the use of DL for similar classification tasks in text and speech, where DL models significantly improved on previously established benchmarks. These models were trained for specialized tasks and offered state-of-the-art performance. The possibility of using DL to generate content, rather than merely classify it, then attracted AI researchers. Generative Adversarial Networks (GANs) [2], the landmark work in this direction, appeared in 2014 and generated realistic-looking images of human faces and handwritten digits. This led to further research on generative AI techniques in other domains.
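To make the adversarial idea concrete, here is a minimal training-step sketch in PyTorch: a generator learns to produce samples that a discriminator cannot tell apart from real data. The network sizes, data shape, and learning rates are illustrative assumptions, not the configuration of the original paper.

```python
# Minimal GAN training step (sketch). Hyperparameters are illustrative.
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # e.g. flattened 28x28 images (assumption)

# Generator: maps random noise to fake samples.
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())
# Discriminator: scores how "real" a sample looks.
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_batch):
    n = real_batch.size(0)
    fake = G(torch.randn(n, latent_dim))

    # Discriminator: label real data 1, generated data 0.
    d_loss = bce(D(real_batch), torch.ones(n, 1)) + \
             bce(D(fake.detach()), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to make the discriminator output 1 on fakes.
    g_loss = bce(D(fake), torch.ones(n, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```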

Language modeling has long been a challenging task for AI. The goal of a language model is to predict the next word given a sequence of words. The use of DL for large-scale language model pre-training was demonstrated in 2019 [3]. Generative Pre-trained Transformers (GPT) are the underlying technology that powers ChatGPT. These models are trained on large volumes of text data, consuming enormous computing power on graphics processing units (GPUs). The results of GPT-3/GPT-4 on tasks such as text summarization, question answering, and code generation have been impressive.
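Next-word prediction can be seen directly with an open model. The sketch below queries the freely available GPT-2 model via Hugging Face's transformers library for its most likely next tokens; the choice of model and library is an assumption for illustration, not something the article prescribes.

```python
# Sketch: inspect a language model's next-token distribution (GPT-2).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Probability distribution over the vocabulary for the *next* token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id):>10s}  p={prob:.3f}")
```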

Challenges for generative AI models

DL models learn from training data, setting the parameters of artificial neural networks to represent the worldview encoded in that data. These models are generally many orders of magnitude larger than traditional machine learning (ML) models. The size of these networks can become a challenge when the amount of data available for training is small. Most real-world data sets also have class imbalance and may carry inherent (non-obvious) bias. Techniques for training DL models have been developed to overcome these challenges; otherwise, models are prone to memorizing the training data (known as overfitting), may fail to generalize to unseen data, or may produce biased results.
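Two of the mitigations alluded to above can be shown in a few lines: re-weighting classes on an imbalanced data set, and comparing training accuracy against held-out accuracy to detect memorization. The sketch uses scikit-learn with a synthetic data set and a simple model, both illustrative assumptions.

```python
# Sketch: class re-weighting and an overfitting check (scikit-learn).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data set with a 9:1 class imbalance (assumption).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# class_weight="balanced" up-weights the rare class in the loss.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_train, y_train)

# A large gap between train and held-out accuracy suggests the
# model is memorizing rather than generalizing.
print("train:", clf.score(X_train, y_train))
print("test: ", clf.score(X_test, y_test))
```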

Generative AI models inherit the challenges of DL techniques. Additionally, the generative nature of the models can introduce artifacts into the generated data. For example, AI image generators have difficulty rendering human hands and can produce strange-looking images that are hard to explain. Various approaches have been proposed to overcome these challenges [4]. The same applies to LLMs, whose job is to predict the next word: they may confidently produce errors or give incorrect answers, depending on the data they were trained on. Care must therefore be taken to ensure that guardrails are in place, particularly when responding to human queries.
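As a toy illustration of such a guardrail, a model's draft answer can be screened before it is returned to the user. The blocklist and the generate() callable below are hypothetical placeholders, not a real API; production systems use far more sophisticated moderation.

```python
# Toy sketch of an output-side guardrail around an LLM.
BLOCKED_TOPICS = ("weapon", "self-harm")  # illustrative placeholder list

def guarded_answer(query: str, generate) -> str:
    """Call the model, then screen its draft output before replying."""
    draft = generate(query)  # generate() is a hypothetical callable
    if any(topic in draft.lower() for topic in BLOCKED_TOPICS):
        return "Sorry, I can't help with that request."
    return draft

# Usage with any callable that maps a prompt to text, e.g.:
# print(guarded_answer("How do I bake bread?", my_llm_generate))
```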

Paving the way to innovative applications

The initial success of DL was demonstrated on specific tasks such as classification, where models were trained to be deep and narrow. In contrast, generative AI models tend to be broad and shallow. Early applications of DL were designed to deliver the high accuracy demanded by business requirements, and AI researchers focused on improving those metrics. Generative AI has opened up possibilities in creative fields such as fashion design, creative writing, and art generation. This will lead to wider use of AI in skill-intensive areas that have so far been untouched by it. Future research will be guided by how these communities adapt to the use of AI, which may in turn stimulate the growth of innovative applications.

Disclaimer: The views reflected in this article are those of the author and do not necessarily reflect the views of the global organization EY or its member firms.

References

[1] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton: ImageNet classification with deep convolutional neural networks. NIPS 2012: 1106-1114.

[2] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron C. Courville, Yoshua Bengio: Generative adversarial networks. NIPS 2014: 2672-2680.

[3] Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova: BERT: Pre-training of deep bidirectional transformers for language understanding. NAACL-HLT (1) 2019: 4171-4186.

[4] Makkapati, V., Patro, A. (2017): Improving symmetry in GAN-generated fashion images. In: Bramer, M., Petridis, M. (eds) Artificial Intelligence XXXIV. SGAI 2017. Lecture Notes in Computer Science, vol 10630. Springer, Cham.

