Generative AI Models: Overview

In the last few months, Artificial Intelligence (AI) has been booming across the technology world. In this context, Generative AI seems to be securing its place as the “new best thing” in data-driven innovation.

So what is Generative AI? It’s a technology that uses Machine Learning (ML) to enable software to create new images, text, video, and audio, and it can also be used to produce new ideas and plans. Generative AI identifies patterns in the data it is given and uses them to generate similar outputs on request. These outputs aren’t copies but new and unique responses.
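
To make the idea of “learning patterns and generating similar output” concrete, here is a minimal sketch in Python. It is a toy word-level Markov chain, far simpler than the neural networks behind real Generative AI products, and the training sentence and function names are invented for this example.

```python
import random
from collections import defaultdict


def learn_patterns(text):
    """Record which words follow each word in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, length, seed=0):
    """Build a new word sequence by repeatedly sampling a learned follower."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        options = followers.get(output[-1])
        if not options:  # dead end: this word never had a follower
            break
        output.append(rng.choice(options))
    return " ".join(output)


# Hypothetical training text, invented for this example.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = learn_patterns(corpus)
print(generate(model, start="the", length=8))
```

The generated sentence reuses word pairs seen in the training text but can combine them in an order that never appeared there, which is the same principle at a very small scale.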

Generative AI has a large number of use cases in research, business, and even government. For this reason, it’s being adopted by organizations of all sizes and sectors around the globe. Because it can produce large amounts of useful data in a short time, it also helps to train new ML models. You could say Generative AI is leveraging its own power to become an even more relevant technology in the future.

Not the Same as Analytical Models

Machines are better than us at analyzing large sets of data thanks to their enormous processing power. Initially, Artificial Intelligence was used mainly to find patterns in data for tasks like detecting spam and making predictions, among other analytical and cognitive work.

For example, when you ask Google Maps how to get from one point to another in a city, AI uses its analytical power to show you different routes to your destination, using real-time traffic information to pick the best available one.

It’s a big jump from these kinds of tasks to creating a poem in the style of your favorite poet, for example. So what’s the difference? Creative work requires much more intensive training and far more computing resources. Analytical AI also had to be developed first, before Generative AI could take its first steps.

Generative AI Took Time to Become an Important Tool

After years of developing “small” AI models that could classify information or understand language, the compute power used by these models started growing exponentially in the second half of the 2010s. At that point, AI’s ability to process information expanded too, and the first “Generative” models appeared for tasks like writing a joke or a few lines of code.

Cheaper processing and scaling made Artificial Intelligence better and faster to use. Last year (2022), things really started picking up with Generative AI, and the future looks very promising. As you can see in the image above, AI generation of text, images, video, and code is being developed and used right now across multiple lines of business and research. Maybe this text was generated by AI too? 😉

What Else is in Store for the Future?

Billions of dollars have been poured into the research and development of Generative AI solutions in the last few years. According to Gartner, applications for Generative AI will spread quickly across many industries. Here are some predictions and use cases being developed right now:

Drug design: Generative AI is being used to design new drugs, which should reduce both the cost and the time it takes pharmaceutical companies to produce them. Experts at Gartner expect 30% of drugs to be “discovered” using Generative AI techniques by 2025.
New materials design: new materials with specific physical properties are likely to be created using Generative AI, speeding up processes in industries such as aerospace, medicine, defense, electronics, and energy, among others.
New parts design: these same industries could use AI to help design new parts that are more efficient or perform better.
Semiconductor chip design: Generative AI can also be used to optimize the placement of components on a chip. This would shorten the product development life cycle.
Creation of synthetic data: Generative AI can create data instead of gathering it from the real world, for example healthcare data that does not expose patient identities. This data would help to train other ML models.
Outbound marketing content: in marketing, outbound messages are expected to be increasingly synthetically generated. Gartner expects the share to rise to around 30% by 2025, from only 2% in 2022.
Blockbuster films will use AI-generated video: this is one of the boldest predictions. It’s expected that by 2030 AI will generate at least 90% of the length of a blockbuster movie. This would require considerable progress in this area in the next few years, but it’s certainly possible.
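
The synthetic data item in the list above can be illustrated with a minimal Python sketch. It assumes a deliberately simple approach: summarize a small set of made-up “real” values by their mean and standard deviation, then sample new values from a normal distribution. Real synthetic data generators use much richer models, and this toy version makes no privacy guarantees; the ages and function names are invented for this example.

```python
import random
import statistics

# Hypothetical "real" measurements (e.g. patient ages); invented for this example.
real_ages = [34, 45, 29, 61, 52, 38, 47, 55, 41, 36]


def fit(values):
    """Summarize the real data by its mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def sample_synthetic(mean, stdev, n, seed=0):
    """Draw n new values from a normal distribution with the fitted parameters."""
    rng = random.Random(seed)
    return [round(rng.gauss(mean, stdev)) for _ in range(n)]


mean, stdev = fit(real_ages)
synthetic_ages = sample_synthetic(mean, stdev, n=1000)
print(f"real mean={mean:.1f}, synthetic mean={statistics.mean(synthetic_ages):.1f}")
```

None of the synthetic values is tied to a specific original record, yet the set as a whole follows roughly the same distribution, which is what makes this kind of data useful for training other ML models.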

ChatGPT and Other Main Players

ChatGPT

ChatGPT is an AI language model developed by OpenAI. It’s a chatbot in essence, but it’s powered by a model with a gigantic number of parameters, trained on a huge amount of data. Thanks to that, it can answer questions, write stories, review text or code, and more, all in a conversational context.

It can’t be denied that ChatGPT sparked the AI fever across the technology world. Large amounts of money have been pledged to AI development following its launch. So, even if it’s eventually overtaken by other AI systems that do the same kinds of tasks better, it’ll still be remembered as one of the pioneers of the industry.

Dall-E

OpenAI is also behind Dall-E and Dall-E 2. Both are Deep Learning models used for image creation. The idea behind them is to create images that accurately reflect a prompt given in natural language. Dall-E creates original, realistic images that combine concepts, attributes, and styles. It can also make realistic edits to existing images when asked, by adding or removing elements or creating variations of the original. This can be used to create a collection of unique images that are similar in concept. Below you can see an original picture (left) and an image created by Dall-E 2 (right), expanding the original one.

LaMDA

Google’s LaMDA is a family of conversational neural language models. Google has been working on these models in recent years, and LaMDA 2 drew widespread attention in 2022 when a senior software engineer at Google claimed that LaMDA had become sentient. This claim was rejected by a large part of the community.

Additionally, Google announced Bard in February 2023. Bard is a chatbot powered by LaMDA and a direct response to the popularity of OpenAI’s ChatGPT. It’s expected that Bard will be integrated into Google Search in the future.

Stable Diffusion

Stable Diffusion is another text-to-image model, released in 2022. It was developed by the CompVis Group at Ludwig Maximilian University in Munich. It uses a diffusion model and has been trained with data from a collection of more than 5 billion image-text pairs. Its code has been released publicly, and it does not need highly specialized hardware: a Graphics Processing Unit (GPU) with a modest amount of VRAM is enough.
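
In broad terms, a diffusion model is trained to reverse a process that gradually adds noise to data. The Python sketch below shows only that noising step, on a toy list of numbers standing in for pixel values. It is an illustration under simplifying assumptions, not Stable Diffusion’s code: the learned denoising network and the text conditioning are left out, and the function and variable names are invented for this example.

```python
import math
import random


def add_noise(pixels, noise_level, rng):
    """Blend each value with Gaussian noise.

    noise_level=0.0 returns the input unchanged; 1.0 returns pure noise.
    """
    keep = math.sqrt(1.0 - noise_level)
    mix = math.sqrt(noise_level)
    return [keep * p + mix * rng.gauss(0.0, 1.0) for p in pixels]


# Hypothetical "image": a few pixel values scaled to the range [-1, 1].
image = [0.9, 0.5, 0.0, -0.5, -0.9]
rng = random.Random(0)

for level in (0.0, 0.25, 0.5, 0.75, 1.0):
    noisy = add_noise(image, level, rng)
    print(level, [round(v, 2) for v in noisy])
```

As the noise level rises, the original values become harder to recognize. A trained diffusion model learns to run this in reverse, starting from noise and removing it step by step until an image emerges.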