AI is a rapidly developing technology that introduces new and often unfamiliar terminology. This post aims to clarify some of the key concepts in AI.
1. What is generative AI?
- AI (Artificial Intelligence): the branch of computer science concerned with building systems that can reason, learn and act autonomously.
- GenAI: generative AI, a type of AI technology that creates new content (text, audio, images or synthetic data) based on what it has learned from existing content.
- AI agent: a computer system that can reason, learn and act autonomously, much as a human would.
- Machine Learning (ML): a subfield of AI in which a model is trained from input data. The trained model can then make predictions on new data it never saw during training. This means a machine can learn and take action based on a model rather than on a traditional, explicitly written program.
1.1. How many types of ML are there? This post groups them into 4: Supervised, Unsupervised, Reinforcement (not covered here) and Deep Learning.
- Supervised (labeled data): each data point carries a tag such as a name or a type. (Example: the behavior of pizza customers, where those who tip in the store are plotted as points “o” and those who tip on delivery as points “x”.) In supervised learning, the model learns from past examples in order to predict future values.
- Unsupervised (unlabeled data): the data points carry no tags. (Example: the employees of a company, plotted as income vs. job tenure.) The data is grouped into clusters, for instance to see whether someone is on a fast track. Unsupervised learning is all about discovery.
- Deep learning: uses artificial neural networks, which are inspired by the human brain and consist of many layers of interconnected nodes (neurons). Neural networks can be trained on both labeled and unlabeled data, which is called semi-supervised learning: a small amount of labeled data teaches the network the basic concepts of the task, while a large amount of unlabeled data helps it generalize so that it can make predictions on new examples.
GenAI and Large Language Models (LLMs) are subsets of Deep Learning, which uses artificial neural networks.
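To make the difference between labeled and unlabeled data concrete, here is a minimal sketch in plain Python. It is only an illustration: the numbers are made up, and the two techniques used (a nearest-centroid classifier for the supervised case, a tiny k-means for the unsupervised case) are my own choice of simple stand-ins, not something prescribed by the source material.

```python
# Toy illustration of supervised vs. unsupervised learning.
# All data and function names below are hypothetical.

def centroid(points):
    """Average of a list of equally sized tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance between two tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# --- Supervised: every point comes with a label -------------------------
# (bill amount, tip amount) for pizza orders, tagged by where the tip was given.
labeled = [
    ((10.0, 2.0), "store"),
    ((12.0, 2.5), "store"),
    ((11.0, 2.2), "store"),
    ((30.0, 3.0), "delivery"),
    ((28.0, 2.5), "delivery"),
    ((32.0, 3.5), "delivery"),
]

def fit_supervised(samples):
    """Learn one centroid per label."""
    by_label = {}
    for point, label in samples:
        by_label.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(model, point):
    """Predict the label whose centroid is closest to the point."""
    return min(model, key=lambda label: sq_dist(model[label], point))

model = fit_supervised(labeled)
print(predict(model, (29.0, 3.0)))  # a new, unseen order

# --- Unsupervised: no labels, only discovery of groups ------------------
# (income in thousands, job tenure in years) for employees.
employees = [(40, 5), (42, 6), (45, 7), (90, 2), (95, 3), (100, 2)]

def kmeans(points, k=2, steps=10):
    """Very small k-means; the first k points serve as initial centers."""
    centers = list(points[:k])
    clusters = []
    for _ in range(steps):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(centers[i], p))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters

clusters = kmeans(employees)
print(clusters)  # two groups found without any labels
```

In the supervised half the model is told which group each order belongs to and learns to predict that tag for new orders. In the unsupervised half nobody says which employees are on a fast track; the algorithm only discovers that the data falls into two groups, and interpreting those groups is left to us.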
1.2. How many types of Deep Learning models are there? 2 types: Discriminative and Generative models.
- Discriminative model: used to classify data points or predict labels for them, that is, to discriminate between different kinds of data instances. (For example: predict whether an input image shows a dog or a cat.)
- Generative model (Gen AI model): generates new data instances or new content based on a learned probability distribution of the existing data. (For example: generate an image of a dog, a piece of text or a piece of audio.)
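The contrast between the two model types can be sketched with one-dimensional toy data. In this hedged example the feature (animal weight in kg), the numbers, and both techniques are assumptions made for illustration: the discriminative model is a simple learned threshold, and the generative model is a Gaussian fitted to one class and then sampled. Real deep learning models are far larger, but the division of labor is the same.

```python
import random
import statistics

# Hypothetical training data: animal weights in kg.
cats = [3.5, 4.0, 4.2, 5.0, 3.8]
dogs = [18.0, 22.0, 25.0, 20.0, 30.0]
labeled = [(w, "cat") for w in cats] + [(w, "dog") for w in dogs]

# --- Discriminative: learn only the boundary between the classes --------
def fit_threshold(samples):
    """Pick the midpoint between neighboring weights that misclassifies least."""
    weights = sorted(w for w, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(weights, weights[1:])]

    def errors(t):
        return sum(("dog" if w > t else "cat") != label for w, label in samples)

    return min(candidates, key=errors)

threshold = fit_threshold(labeled)

def classify(weight):
    """Answer the question 'cat or dog?' for a new data point."""
    return "dog" if weight > threshold else "cat"

# --- Generative: learn the distribution of the data, then sample from it -
def fit_gaussian(values):
    """Estimate mean and standard deviation of one class."""
    return statistics.mean(values), statistics.stdev(values)

def generate(params, n, seed=0):
    """Draw n new, never-seen data points from the learned distribution."""
    mu, sigma = params
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

dog_model = fit_gaussian(dogs)
new_dogs = generate(dog_model, 3)

print(classify(4.5), classify(21.0))
print(new_dogs)  # three made-up dog weights that were not in the training data
```

The classifier can only say which label a given weight belongs to, while the fitted distribution can produce new instances that resemble the training data.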
2. How does generative AI work?
- Traditionally, to distinguish a cat from a dog we had to hard-code the rules: type of animal, number of legs, ears, kind of fur, what it likes, what it dislikes…
- With the arrival of neural networks, we can instead give the network pictures of cats and dogs, then ask “Is it a cat?”, and it predicts “Yes, it is” or “No, it’s not”. Better still, with generative language models “we” can be anyone, not only a programmer: any person who can speak or type a prompt in everyday language. Some examples of such models: PaLM (Pathways Language Model) and LaMDA (Language Model for Dialogue Applications).
- Mathematically, a model looks like this: $y = f(x)$, where $x$ is the input data, $y$ is the output data and $f$ is the model.
- Training: the process of learning from existing content, which results in a statistical model. Specifically, the model learns the patterns in the existing data so that, given some text, it can predict the next words. For example, given “I usually do morning exercise at ___”, the model could predict “7am” as the next word because it has the highest probability.
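The idea of "learning the pattern and predicting the most probable next word" can be sketched with a word-pair (bigram) counter. This is a deliberately tiny stand-in, assuming a made-up four-sentence corpus; a real LLM is a neural network that looks at the whole preceding text rather than only the previous word.

```python
from collections import Counter, defaultdict

# Hypothetical training corpus.
corpus = [
    "i usually do morning exercise at 7am",
    "i usually do morning exercise at 7am",
    "i usually do morning exercise at home",
    "we do morning exercise at 7am",
]

def train(sentences):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(model, prev):
    """Turn the follower counts of `prev` into probabilities."""
    followers = model.get(prev, Counter())
    total = sum(followers.values())
    return {word: count / total for word, count in followers.items()}

def predict_next(model, prev):
    """Return the most probable next word, or None if `prev` was never seen."""
    followers = model.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

model = train(corpus)
print(next_word_probs(model, "at"))  # {'7am': 0.75, 'home': 0.25}
print(predict_next(model, "at"))     # 7am
```

In this corpus “7am” follows “at” three times out of four, so it is predicted as the next word.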
- When given a prompt (input content, such as a question), Gen AI uses the statistical model to predict, that is, to generate, new content that is similar to the data it was trained on.
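Generation from a prompt can be pictured as repeatedly asking the statistical model for the next word and appending it. The sketch below assumes a hand-written table of next-word probabilities (the table `NEXT` and the function `generate` are hypothetical names) in place of a trained model.

```python
import random

# Hypothetical "trained" model: probability of each next word given the previous one.
NEXT = {
    "i": {"usually": 1.0},
    "usually": {"do": 1.0},
    "do": {"morning": 1.0},
    "morning": {"exercise": 1.0},
    "exercise": {"at": 1.0},
    "at": {"7am": 0.75, "home": 0.25},
}

def generate(prompt, max_new_words=10, seed=0):
    """Extend the prompt word by word by sampling from the model."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        options = NEXT.get(words[-1])
        if not options:  # the model knows no continuation: stop
            break
        choices, weights = zip(*options.items())
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)

result = generate("I usually")
print(result)
```

Because the last step is sampled rather than always taking the top choice, different seeds can end the sentence with “7am” or with “home”, which is one reason the same prompt can produce different answers.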
- The power of Gen AI comes from the use of transformers, which produced a 2018 revolution in NLP (natural language processing).
- Sometimes transformers produce “hallucinations”: output that is nonsensical, grammatically incorrect or factually wrong. Possible causes:
  - The model was not trained on enough data.
  - The training data is noisy or dirty.
  - The model is not given enough context.
  - The model is not given enough constraints.
3. What types of generative AI models are there?
- Generative image models: the input is an image; the output can be text, an image or a video.
- Generative language models: the input is text; the output can be text, an image, audio or decisions.
- Text-to-text models: take a text input and produce a text output. These models are trained to learn the mapping between pairs of texts, for example when translating from one language to another.
- Text-to-image models: are trained on a large set of images, each captioned with a short text description. Diffusion is one method used to achieve this. There are also text-to-video and text-to-3D models.
- Text-to-task models: perform a task or action based on a text request. This covers a wide range of actions, such as answering a question, performing a search or making a prediction.
- Foundation models: large AI models pre-trained on a vast quantity of data and designed to be adapted or fine-tuned to a wide range of downstream tasks, such as sentiment analysis, image captioning and object recognition. Foundation models have the potential to revolutionize many industries, including healthcare, finance and customer service.
- Vertex AI offers a Model Garden of foundation models. The language foundation models include “PaLM API for chat”, “PaLM API for text” and “BERT”.
- The vision foundation models include “Stable Diffusion”, which has been shown to be effective at generating high-quality images from text input.
4. What are the applications of generative AI?
4.1. Text: Marketing content, Sales emails, Support (chat/email), General writing, Note taking, …
4.2. Code: Code generation, Code documentation, Text to SQL, Web builders.
4.3. Image: Image generation, Advertising, Design.
4.4. Speech: Voice synthesis.
4.5. Video: Video editing/generation.
4.6. 3D: 3D models/scenes.
4.7. Other: Gaming, Robotic automation, Music, Audio, Biology & Chemistry.
For example, the Google Gemini code generation model can help you:
- Debug your code.
- Explain your code line by line.
- Craft SQL queries for your database.
- Translate code from one language to another.
- Generate documentation and tutorials for your source code.
5. Vertex AI Studio for developers:
- Lets you quickly explore and customize Gen AI models that you can leverage in your applications on Google Cloud.
- Helps developers create and deploy AI models by providing:
  - A library of pre-trained models.
  - A tool for fine-tuning models.
  - A tool for deploying models to production.
  - A community forum for developers to share ideas and collaborate.
-
Remark: This post is based solely on Google documentation, but I hope it helps highlight key terms used in the wide and magical world of Gen AI.