Introduction to Generative AI

Qwiklabs-Courses
15 Feb 2024 · 21:35

Summary

TLDR: This video course introduces the fundamentals of generative AI, including its definition, how it works, model types, and application scenarios. The course explains the relationship between AI, machine learning, and deep learning, and explores the difference between generative and discriminative models. Through examples, it shows how different types of generative AI models are used to generate text, images, audio, and other content, and how prompt design can guide model output. It also introduces the resources and tools offered by Google Cloud, such as Vertex AI Studio, Vertex AI Search and Conversation, and the PaLM API, which help developers use generative AI to solve real-world problems.

Takeaways

  • 📚 Generative AI is an AI technology that can create text, images, audio, and synthetic data.
  • 🤖 Artificial intelligence is a branch of computer science concerned with building intelligent agents that can reason, learn, and act autonomously.
  • 🧠 Machine learning is a subfield of AI that trains a model from input data so it can make useful predictions on new data.
  • 🔍 Supervised learning models use labeled data, while unsupervised learning models work with unlabeled data.
  • 🌟 Deep learning is a type of machine learning that uses artificial neural networks to process more complex patterns.
  • 🛠️ Generative models create new data instances based on a learned probability distribution of the data, while discriminative models classify data points or predict their labels.
  • 📈 Generative AI is a subset of deep learning: it uses artificial neural networks, can process both labeled and unlabeled data, and uses supervised, unsupervised, and semi-supervised methods.
  • 📝 Generative AI learns from existing content during training, builds a statistical model, and generates new content when given a prompt.
  • 🔑 A prompt is a short piece of text given to an LLM as input, and it can be used to control the model's output in a variety of ways.
  • 🏗️ Foundation models are large AI models pre-trained on vast amounts of data and designed to be adapted (or fine-tuned) to a wide range of downstream tasks.
  • 🌐 Vertex AI offers a Model Garden that includes the PaLM API foundation models for chat and text and the Stable Diffusion model for image generation.

Q & A

  • What is generative AI?

    -Generative AI is a type of artificial intelligence technology that can produce various types of content, including text, imagery, audio, and synthetic data.

  • What is the difference between artificial intelligence and machine learning?

    -Artificial intelligence is a branch of computer science that deals with the creation of intelligent agents that can reason, learn, and act autonomously. Machine learning is a subfield of AI: a program or system that trains a model from input data.

  • What is the main difference between supervised and unsupervised learning?

    -Supervised learning models use labeled data, while unsupervised learning models use unlabeled data. Supervised learning learns from past examples to predict future values, while unsupervised learning focuses on whether the data naturally falls into groups.

  • What role does deep learning play among machine learning methods?

    -Deep learning is a type of machine learning that uses artificial neural networks to process more complex patterns than traditional machine learning. Artificial neural networks are inspired by the human brain and are made up of many interconnected nodes, or neurons, that learn to perform tasks by processing data and making predictions.

  • How do generative and discriminative models differ?

    -Discriminative models are used to classify or predict labels for data points, while generative models generate new data instances based on a learned probability distribution of existing data.

  • How can you tell whether an output is the product of generative AI?

    -If the output is natural language, audio, or an image, it is generative AI. If the output is a number, a class, or a probability, it is not.

  • How does the generative AI process differ from the traditional machine learning process?

    -The traditional machine learning process uses training code and labeled data to build a model, while the generative AI process can use training code, labeled data, and unlabeled data to build a foundation model that can generate new content such as text, code, images, audio, and video.

  • What is a Transformer model?

    -A Transformer model consists of an encoder and a decoder: the encoder encodes the input sequence and passes it to the decoder, which learns how to decode the representation for a relevant task. Transformers produced the 2018 revolution in natural language processing.

  • What is prompt design?

    -Prompt design is the process of creating a prompt that will generate the desired output from a large language model (LLM).

  • What are foundation models, and how do they help industries?

    -Foundation models are large AI models pre-trained on vast quantities of data and designed to be adapted (or fine-tuned) to a wide range of downstream tasks, such as sentiment analysis, image captioning, and object recognition. Foundation models have the potential to revolutionize many industries, including healthcare, finance, and customer service.

  • How does Google Cloud help users get more out of generative AI?

    -Through tools such as Vertex AI Studio, Vertex AI Search and Conversation, and the PaLM API, Google Cloud helps users explore, customize, and deploy generative AI models, so they can build chatbots, digital assistants, custom search engines, and more, even without coding or machine learning experience.

  • What are the characteristics of the Gemini model?

    -Gemini is a multimodal AI model: it is not limited to understanding text, but can also analyze images, understand the nuances of audio, and even interpret programming code. This allows Gemini to perform complex tasks that were previously impossible for AI.

Outlines

00:00

📚 Introducing Generative AI

This segment introduces the basic concepts and application areas of generative AI. It begins by defining generative AI as a type of artificial intelligence technology that can produce various kinds of content, including text, images, audio, and synthetic data. It then gives brief background on artificial intelligence (AI), distinguishes AI from machine learning (ML), and explains that AI is a branch of computer science concerned with creating intelligent agents that can reason, learn, and act autonomously. It also covers the two main classes of machine learning, supervised and unsupervised learning, with concrete examples of how each applies to real problems. Finally, it introduces deep learning as a subset of machine learning methods and highlights its importance for generative AI.

05:03

🧠 Generative vs. Discriminative Models

This part examines where generative AI sits within deep learning and explains the difference between generative and discriminative models. Generative models can create new data instances based on a learned probability distribution, while discriminative models classify data points or predict labels. A concrete example, classifying images of dogs and cats, illustrates how the two kinds of models work. It also discusses how to recognize generative AI: when the output is natural language, audio, or an image, it is generative AI. Finally, it takes a mathematical view to further contrast generative AI with the traditional machine learning process.

10:05

🌐 Language Models and Content Generation

This segment focuses on generative AI for language models and content generation. It first describes how large language models ingest vast amounts of data from across the Internet to build foundation language models that can be used simply by asking questions. It then gives the formal definition of generative AI, emphasizing its ability to create new content based on what it has learned from existing content. It also discusses different types of generative AI models, such as generative image models and generative language models, and their outputs across text, images, audio, and decisions. Finally, it explains how generative language models work as pattern-matching systems, with concrete examples of their text-generation capabilities.

15:08

🛠️ Applications and Model Types

This part details the various applications and model types of generative AI. It introduces text-to-text, text-to-image, text-to-video, and text-to-3D models and explains how each generates output from text input. It then discusses text-to-task models, which perform a defined task or action based on text input. It introduces the concept of foundation models, large AI models pre-trained on vast quantities of data that can be adapted to a wide range of downstream tasks. It also mentions Google Cloud tools such as Vertex AI and the PaLM API and how they help users make better use of generative AI. Finally, a code generation example shows generative AI's potential for solving programming problems.

20:11

🌟 Google Cloud and Generative AI

This segment covers how Google Cloud helps users get more out of generative AI. It first introduces Vertex AI Studio, a tool that lets users quickly explore and customize generative AI models and provides resources and tools that make it easy for developers to get started. It then describes Vertex AI Search and Conversation, which lets users create chatbots and digital assistants without coding experience. Finally, it introduces the PaLM API, which lets developers test and experiment with Google's large language models and generative AI tools, and Model Garden, a library of foundation models that can be used for tasks such as sentiment analysis, image captioning, and object recognition.

Keywords

💡Generative AI

Generative AI is a type of artificial intelligence that can create new content, including text, images, audio, and synthetic data, based on what it has learned from existing content. It is the core concept of the video: by learning the patterns and structure of existing data, it generates new samples in a similar style. For example, a generative language model can produce natural-language text based on its training data.

💡Artificial Intelligence

Artificial intelligence (AI) is a branch of computer science that deals with the creation of intelligent agents, systems that can reason, learn, and act autonomously. The video notes that the goal of AI is to build machines that think and act like humans, and that it encompasses the theory and methods that let machines simulate human intelligence.

💡Machine Learning

Machine learning is a subfield of AI in which a model is trained from input data so that it can make useful predictions on new, never-before-seen data. In the video, machine learning models are divided into supervised and unsupervised types, which handle data differently to carry out prediction and classification tasks.

💡Deep Learning

Deep learning is a type of machine learning that uses artificial neural networks to process more complex patterns. The video explains that artificial neural networks are inspired by the human brain and are made up of many interconnected nodes, or neurons, that learn to perform tasks by processing data and making predictions. Deep learning models typically have many layers of neurons, which lets them learn more complex patterns than traditional machine learning models.

💡Generative Models

Generative models are models that generate new data instances based on a learned probability distribution. The video explains that they differ from discriminative models, which classify data points or predict labels, whereas generative models produce new content. For example, a generative model can learn the joint probability distribution of the data and then produce a new data instance, such as a picture of a dog.

💡Discriminative Models

Discriminative models are used to classify data points or predict their labels. They are typically trained on a labeled dataset and learn the relationship between the features of the data points and the labels. The video notes that once a discriminative model is trained, it can be used to predict the label for new data points.

💡Semi-supervised Learning

Semi-supervised learning is a way of training a neural network that combines a small amount of labeled data with a large amount of unlabeled data. In the video, the labeled data helps the neural network learn the basic concepts of the task, while the large amount of unlabeled data helps the network generalize to new examples. This lets a model learn and predict effectively even when labeled data is limited.

💡Transformers

Transformers are deep learning models that revolutionized the field of natural language processing. The video explains that a Transformer model consists of an encoder and a decoder: the encoder encodes the input sequence and passes it to the decoder, which learns how to decode the representation for a relevant task. Transformers are very effective at processing language data, but they can also produce hallucinations, meaning nonsensical or grammatically incorrect words or phrases.

💡Hallucinations

Hallucinations are nonsensical or grammatically incorrect words or phrases generated by a model. In the video, hallucinations can be caused by several factors, such as the model not being trained on enough data, being trained on noisy or dirty data, or not being given enough context or constraints. Hallucinations can make the output text difficult to understand and can lead the model to generate incorrect or misleading information.

💡Prompts

A prompt is a short piece of text given to a large language model (LLM) as input, used to control the model's output in a variety of ways. In the video, prompt design is the process of creating a prompt that will generate the desired output from a large language model. By designing effective prompts, users can steer generative AI toward specific kinds of responses or content.

💡Foundation Models

Foundation models are large AI models pre-trained on vast quantities of data and designed to be adapted (or fine-tuned) to a wide range of downstream tasks, such as sentiment analysis, image captioning, and object recognition. The video notes that foundation models have the potential to revolutionize many industries, including healthcare, finance, and customer service.

💡Code Generation

Code generation is the use of generative AI to create programming code automatically. The video shows how a code file conversion problem is given as input and how the Gemini model returns the required steps and a code snippet. This approach can help developers debug source code, explain code line by line, write SQL queries, translate code from one language to another, and generate documentation and tutorials for source code.

Highlights

Generative AI is a type of artificial intelligence technology that can produce various types of content, including text, imagery, audio, and synthetic data.

AI is a branch of computer science that deals with the creation of intelligent agents that can reason, learn, and act autonomously.

Machine learning is a subfield of AI: a program or system that trains a model from input data.

Supervised learning models use labeled data, while unsupervised learning models use unlabeled data.

Deep learning is a type of machine learning that uses artificial neural networks and can process more complex patterns than traditional machine learning.

Generative AI is a subset of deep learning: it uses artificial neural networks and can process both labeled and unlabeled data using supervised, unsupervised, and semi-supervised methods.

Generative models generate new data instances based on a learned probability distribution, while discriminative models classify data points or predict labels.

Traditional programming requires hard-coded rules, whereas generative AI learns patterns from training data and generates new content.

Generative AI is a type of artificial intelligence that creates new content based on what it has learned from existing content.

Generative language models learn patterns in language and, given text input, can generate more text, images, audio, or decisions.

Transformer models consist of an encoder and a decoder and produced the 2018 revolution in natural language processing.

A prompt is a short piece of text given to an LLM as input, and it can be used to control the model's output in a variety of ways.

Generative AI model types include text-to-text, text-to-image, text-to-video, and text-to-task.

Foundation models have the potential to revolutionize many industries, including healthcare, finance, and customer service.

Google Cloud's generative AI tools and services include Vertex AI, the PaLM API, and Gemini, which help developers and businesses put generative AI to work.

Transcripts

play00:00

Hello, and welcome to “Introduction to Generative AI”.

play00:04

Don't know what that is?

play00:05

Then you're in the perfect place.

play00:08

In this course, I'll teach you 4 things.

play00:10

How to...

play00:11

Define Generative AI

play00:13

Explain how Generative AI works

play00:15

Describe Generative AI Model Types

play00:18

Describe Generative AI Applications

play00:21

But let's not get swept away with all that yet.

play00:24

Let's start by defining what Generative AI is first.

play00:28

Generative AI has become a buzzword but what is it?

play00:32

Generative AI is a type of artificial intelligence technology that can produce various types

play00:37

of content, including text, imagery, audio, and synthetic data.

play00:43

But what is artificial intelligence?

play00:46

Since we are going to explore Generative Artificial Intelligence, let’s provide a bit of context.

play00:51

Two very common questions asked are:

play00:54

What is artificial intelligence and what is the difference between AI and machine learning?

play00:59

Let's get into it.

play01:01

So one way to think about it is that AI is a discipline, like how physics is a discipline

play01:07

of science.

play01:08

AI is a branch of computer science that deals with the creation of intelligent agents, which

play01:13

are systems that can reason, learn, and act autonomously.

play01:17

Are you with me so far?

play01:19

Essentially, AI has to do with the theory and methods to build machines that think and

play01:24

act like humans.

play01:26

Pretty simple right?

play01:28

Now let's talk about machine learning.

play01:30

Machine learning, is a subfield of AI.

play01:33

It is a program or system that trains a model from input data.

play01:37

The trained model can make useful predictions from new (never-before-seen) data drawn from

play01:43

the same distribution as the data used to train the model.

play01:45

This means that Machine Learning gives the computer the ability to learn without explicit

play01:50

programming.

play01:52

So what do these Machine Learning models look like?

play01:55

Two of the most common classes of machine learning models are unsupervised and supervised

play02:01

ML models.

play02:03

The key difference between the two is that with supervised models, we have labels.

play02:08

Labeled data is data that comes with a tag, like a name, a type, or a number.

play02:14

Unlabeled data is data that comes with no tag.

play02:17

So what can you do with Supervised and unsupervised models?

play02:22

This graph is an example of the sort of problem that a supervised model might try to solve.

play02:27

For example, let’s say you are the owner of a restaurant. What type of food do you

play02:32

serve?

play02:33

Let's say pizza... wait, dumplings... no, pizza. I love pizza.

play02:37

Anyway....

play02:39

You have historical data of the bill amount and how much different people tipped based

play02:42

on order type - pick-up or delivery.

play02:45

In Supervised Learning, the model learns from past examples to predict future values.

play02:50

Here, the model uses the total bill amount data to predict the future tip amount (based

play02:56

on whether an order was picked-up or delivered).

play02:59

Also guys... tip your delivery drivers, they work hard.
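
To make the tip example concrete, here is a minimal sketch of that kind of supervised model. The bill amounts, tips, and the 0/1 encoding of pick-up versus delivery are made-up stand-ins for the restaurant's historical data, not figures from the course.

```python
from sklearn.linear_model import LinearRegression

# Hypothetical labeled examples: [bill amount, order type (0 = pick-up, 1 = delivery)] -> tip.
X = [[20.0, 0], [35.5, 1], [12.0, 0], [50.0, 1], [27.5, 1]]
y = [3.0, 6.5, 2.0, 9.0, 5.0]

model = LinearRegression()
model.fit(X, y)  # learn from past examples

# Predict the tip for a new $40 delivery order the model has never seen.
print(model.predict([[40.0, 1]]))
```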

play03:03

This is an example of the sort of problem that an unsupervised model might try to solve.

play03:08

Here, you want to look at tenure and income, and then group or cluster employees, to see

play03:13

whether someone is on the fast track.

play03:15

Nice work blue shirt.

play03:17

Unsupervised problems are all about discovery, about looking at the raw data, and seeing

play03:21

if it naturally falls into groups.
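
A matching sketch for the unsupervised case might cluster employees by tenure and income; the numbers and the choice of two clusters are illustrative assumptions, not data from the course.

```python
from sklearn.cluster import KMeans

# Hypothetical unlabeled data: [tenure in years, income in thousands].
employees = [[1, 50], [2, 55], [3, 60], [8, 120], [9, 130], [10, 125]]

# No labels here -- we only ask whether the data naturally falls into groups.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
groups = kmeans.fit_predict(employees)
print(groups)  # e.g. [0 0 0 1 1 1]: two clusters the model discovered on its own
```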

play03:24

This is a good start but let's go a little deeper to show this difference graphically

play03:28

because understanding these concepts is the foundation for your understanding Generative

play03:33

AI.

play03:35

In supervised learning, testing data values (“x”) are input into the model.

play03:39

The model outputs a prediction and compares it to the training data used to train the

play03:40

model.

play03:41

If the predicted test data values and actual training data values are far apart, that is

play03:44

called error.

play03:46

The model tries to reduce this error until the predicted and actual values are closer

play03:51

together.

play03:52

This is a classic optimization problem.
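
As a rough sketch of that optimization loop, here is plain gradient descent on a squared-error loss; the loss and learning rate are one common choice, not necessarily what any particular model uses.

```python
# Fit y ≈ w * x by repeatedly shrinking the error between predictions and labels.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # labels, roughly y = 2x

w, lr = 0.0, 0.01
for step in range(200):
    # Gradient of the mean squared error (predicted minus actual) with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad  # nudge w so predictions move closer to the actual values

print(round(w, 2))  # close to 2.0 once the error is small
```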

play03:55

So let's check in.

play03:56

So far we've explored the differences between artificial intelligence and machine learning

play04:01

and supervised and unsupervised learning.

play04:04

That's a good start!

play04:05

But what's next?

play04:07

Let's briefly explore where deep learning fits as a subset of ML methods.

play04:11

And then I promise we'll start talking about GenAI.

play04:15

While machine learning is a broad field that encompasses many different techniques, deep

play04:19

learning is a type of machine learning that uses artificial neural networks, allowing

play04:23

them to process more complex patterns than traditional machine learning.

play04:28

Artificial neural networks are inspired by the human brain.

play04:30

Pretty cool huh?

play04:32

Like your brain, they are made up of many interconnected nodes, or neurons, that can

play04:37

learn to perform tasks by processing data and making predictions.

play04:41

Deep learning models typically have many layers of neurons, which allows them to learn more

play04:46

complex patterns than traditional machine learning models.

play04:50

Neural networks can use both labeled and unlabeled data.

play04:55

This is called semi-supervised learning.

play04:57

In semi-supervised learning, a neural network is trained on a small amount of labeled data

play05:02

and a large amount of unlabeled data.

play05:05

The labeled data helps the neural network to learn the basic concepts of the task, while

play05:10

the unlabeled data helps the neural network to generalize to new examples.
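
The course shows no code here, but the idea can be sketched with scikit-learn's label spreading as a stand-in, where -1 marks the unlabeled points.

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading

# A small amount of labeled data plus a larger pool of unlabeled data (label -1).
X = np.array([[0.0], [0.2], [0.1], [2.9], [3.0], [3.2], [0.15], [3.1], [0.05], [2.95]])
y = np.array([0, -1, -1, 1, -1, -1, -1, -1, -1, -1])

model = LabelSpreading()
model.fit(X, y)  # the two labeled points anchor the concepts;
                 # the unlabeled points help the model generalize
print(model.predict([[0.12], [3.05]]))  # expected: [0 1]
```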

play05:15

Now we finally get to where Generative AI fits into this AI Discipline!

play05:20

Gen AI is a subset of deep learning, which means it uses Artificial Neural Networks,

play05:25

can process both labeled and unlabeled data, using supervised, unsupervised, and semi-supervised

play05:31

methods.

play05:32

LLMs are also a subset of Deep Learning.

play05:35

See?

play05:36

I told you I'd bring it all back to GenAI.

play05:39

Good job me.

play05:40

Deep learning models (or machine learning models in general) can be divided into two

play05:44

types – generative and discriminative.

play05:48

A discriminative model is a type of model that is used to classify or predict labels

play05:52

for data points.

play05:54

Discriminative models are typically trained on a dataset of labeled data points, and they

play05:59

learn the relationship between the features of the data points and the labels.

play06:03

Once a discriminative model is trained, it can be used to predict the label for new data

play06:08

points.

play06:09

A generative model generates new data instances based on a learned probability distribution

play06:14

of existing data.

play06:16

Generative models generate new content.

play06:19

Take this example.

play06:20

Here, the discriminative model learns the conditional probability distribution, or the

play06:25

probability of “y” (our output) given “x” (our input), that this is a dog and

play06:30

classifies it as a dog and not a cat, which is great because I'm allergic to cats.

play06:35

The generative model learns the joint probability distribution (or the probability of x and

play06:41

y) p(x,y) and predicts the conditional probability that this is a dog and can then generate a

play06:47

picture of a dog.

play06:48

Good boy, I'm going to name him Fred.

play06:51

To summarize:

play06:53

Generative models can generate new data instances and

play06:55

Discriminative models discriminate between different kinds of data instances.
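
Here is a minimal numerical sketch of that contrast, using a single made-up feature in place of real cat and dog images, and a per-class Gaussian as the "learned distribution" purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy data: one feature per example, labels 0 = cat, 1 = dog.
x_cats = rng.normal(loc=2.0, scale=0.5, size=50)
x_dogs = rng.normal(loc=5.0, scale=0.5, size=50)
X = np.concatenate([x_cats, x_dogs]).reshape(-1, 1)
y = np.array([0] * 50 + [1] * 50)

# Discriminative: learn p(y | x) and classify a new point.
clf = LogisticRegression().fit(X, y)
print(clf.predict([[4.8]]))  # -> [1], "dog"

# Generative: learn a distribution of the data for each class (here a Gaussian),
# then sample from it to create a brand-new "dog" instance.
new_dog = rng.normal(x_dogs.mean(), x_dogs.std())
print(new_dog)  # a new data instance, not copied from the training data
```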

play07:01

One more quick example:

play07:03

The top image shows a traditional machine learning model which attempts to learn the

play07:07

relationship between the data and the label (or what you want to predict).

play07:11

The bottom image shows a Generative AI Model which attempts to learn patterns on content

play07:17

so that it can generate new content.

play07:19

So what if someone challenges you to a game of "Is it GenAI or not?"

play07:23

I've got your back.

play07:25

This illustration shows a good way to distinguish between what is GenAI and what is not.

play07:31

It is NOT GenAI when the output (or “y”, or the label) is a number, or a class (for example

play07:38

- Spam or not spam), or a probability.

play07:41

It IS GenAI when the output is natural language (like speech or text), audio, or an image

play07:48

like Fred from before, for example.

play07:51

Let's get a little mathy to really show the difference.

play07:54

Visualizing this mathematically would look like this.

play07:57

If you haven't seen this for a while, the Y=f(x) equation calculates the dependent output of

play08:03

a process given different inputs.

play08:05

The “Y” stands for the model output, the “f” embodies the function used in

play08:09

the calculation (or model), and the “X” represents the input or inputs used for the

play08:15

formula.

play08:16

As a reminder, inputs are the data like comma separated value files, text files, audio files

play08:22

or image files like Fred. So, the model output is a function of all the inputs.

play08:29

If the “y” is a number - like predicted sales - it is not Generative AI.

play08:34

If “y” is a sentence like “Define sales”, it is generative, as the question would elicit

play08:40

a text response.

play08:41

The response would be based on all the massive large data the model was already trained on.

play08:47

So, the traditional ML Supervised Learning process takes training code and labeled data

play08:52

to build a model.

play08:54

Depending on the use case or problem, the model can give you a prediction, classify

play08:58

something, or cluster something.

play09:01

Now let's check out how much more robust the Generative AI process is in comparison.

play09:06

The Generative AI process can take training code, labeled data, and unlabeled data of

play09:12

all data types and build a “foundation model”.

play09:15

The foundation model can then generate new content.

play09:18

It can generate text, code, images, audio, video, and more.

play09:24

We've come a long way from traditional programming, to neural networks, to generative models!

play09:30

In traditional programming, we used to have to hard code the rules for distinguishing

play09:34

a cat - type: animal,

play09:36

legs: 4, ears: 2,

play09:39

fur: yes, likes: yarn, catnip.

play09:42

dislikes: Fred

play09:45

In the wave of neural networks, we could give the network pictures of cats and dogs and

play09:49

ask - “Is this a cat” - and it would predict a cat.... or not a cat.

play09:54

What's really cool is that in the generative wave, we - as users - can generate our own

play09:59

content - whether it be text, images, audio, video, or more.

play10:04

For example, models like PaLM (or Pathways Language Model) or LaMDA (or Language Model

play10:11

for Dialogue Applications) ingest very, very large data from multiple sources across the

play10:17

Internet and build foundation language models we can use simply by asking a question - whether

play10:22

typing it into a prompt or verbally talking into the prompt itself.

play10:26

So, when you ask it “what’s a cat”, it can give you everything it has learned

play10:30

about a cat.

play10:31

Now let's make things a little more formal with an official definition.

play10:35

What is Generative AI?

play10:37

GenAI is a type of Artificial Intelligence that creates new content based on what it

play10:42

has learned from existing content.

play10:44

The process of learning from existing content is called training and results in the creation

play10:50

of a statistical model.

play10:52

When given a prompt, GenAI uses this statistical model to predict what an expected response

play10:57

might be–and this generates new content.

play11:00

It learns the underlying structure of the data and can then generate new samples that

play11:04

are similar to the data it was trained on.

play11:07

Like I mentioned earlier, a generative language model can take what it has learned from the

play11:11

examples it’s been shown and create something entirely new based on that information.

play11:17

That's why we use the word “generative!”

play11:19

But, large language models, which generate novel combinations of text in the form of

play11:24

natural-sounding language, are only one type of generative AI.

play11:28

A generative image model takes an image as input and can output text, another image,

play11:33

or video.

play11:34

For example, under the output text, you can get visual question answering, while under

play11:39

output image, an image completion is generated, and under output video, animation is generated.

play11:46

A generative language model takes text as input and can output more text, an image,

play11:51

audio, or decisions.

play11:53

For example, under the output text, question answering is generated, and under output

play11:58

image a video is generated.

play12:01

As I mentioned, generative language models learn about patterns in language through

play12:05

training data.

play12:06

Then, given some text, they predict what comes next.

play12:09

Thus, Generative language models are pattern-matching systems.

play12:13

They learn about patterns based on the data you provide.

play12:17

Check out this example:

play12:18

Based on things it’s learned from its training data, it offers predictions of how to complete

play12:23

this sentence.

play12:24

“I'm making a sandwich with peanut butter and

play12:26

__.” Jelly.

play12:28

Pretty simple right?
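
The course shows no code for this, but a toy version of such a pattern-matching system can be sketched as word-pair counts, where the continuation seen most often in training wins; the tiny corpus below is invented.

```python
from collections import Counter, defaultdict

corpus = (
    "i am making a sandwich with peanut butter and jelly . "
    "she likes peanut butter and jelly . "
    "he ordered toast with butter and honey ."
).split()

# Count which word tends to follow each word in the training text.
following = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    following[word][nxt] += 1

prompt = "peanut butter and"
last_word = prompt.split()[-1]
print(following[last_word].most_common(1)[0][0])  # "jelly", the most frequent continuation
```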

play12:29

Here is the same example using Gemini, which is trained on a massive amount of text data,

play12:34

and is able to communicate and generate human-like text in response to a wide range of prompts

play12:39

and questions.

play12:40

See how detailed the response can be?

play12:42

Here is another example that's just a little more complicated than PB&Js.

play12:47

The meaning of life is:

play12:49

And even with a more ambiguous question Gemini gives you a contextual answer and then shows

play12:53

the highest probability response.

play12:56

The power of Generative AI comes from the use of Transformers.

play13:01

Transformers produced the 2018 revolution in Natural Language Processing.

play13:05

At a high-level, a Transformer model consists of an encoder and decoder.

play13:10

The encoder encodes the input sequence and passes it to the decoder, which learns how

play13:15

to decode the representations for a relevant task.
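
A minimal PyTorch sketch of that encoder-decoder shape is below; the dimensions are arbitrary, and this is only meant to show the two halves, not a trained language model.

```python
import torch
import torch.nn as nn

# Encoder-decoder Transformer: the encoder reads the input sequence,
# and the decoder learns to produce the output sequence from that representation.
model = nn.Transformer(d_model=64, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)

src = torch.rand(1, 10, 64)  # input sequence: (batch, tokens, features)
tgt = torch.rand(1, 7, 64)   # target sequence fed to the decoder
out = model(src, tgt)
print(out.shape)             # torch.Size([1, 7, 64])
```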

play13:18

Sometimes Transformers run into issues, though.

play13:21

In Transformers, hallucinations are words or phrases that are generated by the model

play13:25

that are often nonsensical or grammatically incorrect.

play13:28

See, not great?

play13:31

Hallucinations can be caused by a number of factors, like when:

play13:34

The model is not trained on enough data,

play13:36

The model is trained on noisy or dirty data,

play13:39

The model is not given enough context, or

play13:42

The model is not given enough constraints.

play13:45

Hallucinations can be a problem for Transformers because they can make the output text difficult

play13:49

to understand.

play13:51

They can also make the model more likely to generate incorrect or misleading information.

play13:56

So put simply...

play13:57

hallucinations are bad.

play13:59

Let's pivot slightly and talk about prompts.

play14:02

A prompt is a short piece of text that is given to the LLM as input, and it can be used

play14:07

to control the output of the model in a variety of ways.

play14:11

Prompt design is the process of creating a prompt that will generate the desired output

play14:15

from a large language model (LLM).
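
As a small illustration of prompt design, here are a zero-shot and a few-shot prompt for the same task; the prompt text is invented, and either string would be sent as input to whatever LLM interface you are using, such as Vertex AI Studio.

```python
# Zero-shot: just ask.
zero_shot = ("Classify the sentiment of this review as positive or negative: "
             "'The delivery was fast and the pizza was great.'")

# Few-shot: show the model the pattern you want it to follow.
few_shot = """Review: 'Cold food and a rude driver.'  Sentiment: negative
Review: 'Arrived hot and on time!'  Sentiment: positive
Review: 'The delivery was fast and the pizza was great.'  Sentiment:"""

for prompt in (zero_shot, few_shot):
    print(prompt, end="\n---\n")  # each string is what the LLM would receive as input
```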

play14:18

Like I mentioned earlier, Generative AI depends a lot on the training data that you have fed

play14:23

into it.

play14:24

It analyzes the patterns and structures of the input data, and thus “learns.”

play14:29

But with access to a browser-based prompt, you, the user, can generate your own content.

play14:34

So let's talk a little bit about the model types available to us when text is our input,

play14:39

and how they can be helpful in solving problems, like never being able to understand my friends

play14:44

when they talk about soccer.

play14:46

The first is...

play14:47

Text-to-Text

play14:48

Text-to-text models take a natural language input and produce text output.

play14:53

These models are trained to learn the mapping between a pair of texts. For example, translating

play14:58

from one language to another.
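
For instance, a text-to-text translation model can be tried with the Hugging Face transformers library; this is a generic stand-in for the model type, not a tool shown in the course.

```python
from transformers import pipeline

# A text-to-text model: natural-language text in, translated text out.
translator = pipeline("translation_en_to_fr", model="t5-small")
result = translator("Generative AI creates new content from what it has learned.")
print(result[0]["translation_text"])
```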

play15:00

Next we have Text-to-image

play15:03

Text-to-image models are trained on a large set of images, each captioned with a short

play15:08

text description.

play15:09

Diffusion is one method used to achieve this.

play15:13

There's also Text-to-video and Text-to-3D

play15:17

Text-to-video models aim to generate a video representation from text input.

play15:22

The input text can be anything from a single sentence to a full script, and the output

play15:26

is a video that corresponds to the input text.

play15:30

Similarly Text-to-3D models generate three-dimensional objects that correspond to a user’s text

play15:36

description (for use in games or other 3D worlds).

play15:40

and finally there's Text-to-task

play15:44

Text-to-task models are trained to perform a defined task or action based on text input.

play15:49

This task can be a wide range of actions such as answering a question, performing a search,

play15:55

making a prediction, or taking some sort of action.

play15:58

For example, a text-to-task model could be trained to navigate web UI or make changes

play16:03

to a doc through the GUI.

play16:06

See, with these models I can actually understand what my friends are talking about when a game

play16:10

is on.

play16:11

Another model that's larger than those I mentioned is a foundation model, which is a large AI

play16:17

model pre-trained on a vast quantity of data "designed to be adapted” (or fine-tuned)

play16:21

to a wide range of downstream tasks, such as sentiment analysis, image captioning, and

play16:27

object recognition.

play16:29

Foundation models have the potential to revolutionize many industries, including healthcare, finance,

play16:35

and customer service.

play16:37

They can even be used to detect fraud and provide personalized customer support.

play16:42

If you're looking for foundation models, Vertex AI offers a Model Garden that includes Foundation

play16:48

Models.

play16:49

The language Foundation Models include PaLM API for Chat and Text.

play16:53

The vision foundation models include Stable Diffusion, which has been shown to be effective

play16:58

at generating high-quality images from text descriptions.

play17:01

Let’s say you have a use case where you need to gather sentiments about how your customers

play17:06

are feeling about your product or service. You can use the sentiment

play17:11

analysis task model, which is a classification task.

play17:13

Same for vision tasks - if you need to perform occupancy analytics, there is a task-specific

play17:18

model for your use case.
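
To make the sentiment use case concrete, here is a minimal sketch with an off-the-shelf sentiment classifier from the transformers library; it is a generic stand-in, not the specific Vertex AI task model.

```python
from transformers import pipeline

# Classify how customers feel about the product from free-text feedback.
sentiment = pipeline("sentiment-analysis")
reviews = [
    "The new release is fantastic, setup took five minutes.",
    "Support never answered my ticket and the app keeps crashing.",
]
for review, result in zip(reviews, sentiment(reviews)):
    print(result["label"], round(result["score"], 2), "-", review)
```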

play17:22

So those are some examples of foundation models we can use, but can GenAI help with code for

play17:28

your apps?

play17:29

Absolutely.

play17:31

Shown here are Generative AI Applications.

play17:34

You can see there's quite a lot.

play17:36

Let’s look at an example of code generation shown in the second block under code at the

play17:40

top.

play17:42

In this example, I’ve input a code file conversion problem - converting from Python

play17:47

to JSON.

play17:48

I use Gemini and insert into the prompt box “I have a Pandas DataFrame with two columns

play17:55

– one with the filename and one with the hour in which it is generated.

play17:58

I am trying to convert it into a JSON file in the format shown on screen.”

play18:03

Gemini returns the steps I need to do this – and the code snippet!

play18:07

And here my output is in a JSON format.
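
The exact target format is not reproduced in the transcript, but a snippet along the lines of what Gemini returns might look like this, assuming the goal is a simple filename-to-hour mapping.

```python
import json
import pandas as pd

# The DataFrame described in the prompt: one column of filenames, one of hours.
df = pd.DataFrame({
    "filename": ["report_a.csv", "report_b.csv", "report_c.csv"],
    "hour": [9, 13, 17],
})

# Convert it into a JSON file, here as a {filename: hour} mapping.
mapping = dict(zip(df["filename"], df["hour"].tolist()))
with open("files_by_hour.json", "w") as f:
    json.dump(mapping, f, indent=2)

print(json.dumps(mapping, indent=2))
```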

play18:10

Pretty cool huh?

play18:12

Well, get ready, it gets even better.

play18:13

It gets better.

play18:14

I happen to be using Google’s free, browser-based Jupyter Notebook and can simply export

play18:19

the Python code to Google’s Colab.

play18:22

So to summarize - Gemini code generation can help you:

play18:26

Debug your lines of source code.

play18:28

Explain your code to you line by line.

play18:30

Craft SQL queries for your database.

play18:33

Translate code from one language to another.

play18:36

Generate documentation and tutorials for source code.

play18:40

I'm going to tell you about three other ways Google Cloud can help you get more out of

play18:43

generative AI.

play18:45

The first is Vertex AI Studio. Vertex AI Studio lets you quickly explore

play18:51

and customize generative AI models that you can leverage in your applications on Google

play18:56

Cloud.

play18:57

Vertex AI Studio helps developers create and deploy generative AI models by providing a

play19:03

variety of tools and resources that make it easy to get started.

play19:07

For example, there is a:
Library of pre-trained models

play19:10

Tool for fine-tuning models
Tool for deploying models to production

play19:15

Community forum for developers to share ideas and collaborate

play19:19

Next we have Vertex AI which is particularly helpful for all of you who don't have much

play19:24

coding experience...

play19:26

You can build generative AI search and conversations for customers and employees with Vertex AI

play19:31

Search and Conversation (formerly Gen AI App Builder).

play19:36

Build with little or no coding and no prior machine learning experience.

play19:40

Vertex AI can help you create your own:

play19:43

Chatbots
Digital assistants

play19:45

Custom search engines
Knowledge bases

play19:48

Training applications
And more

play19:51

And then we have PaLM API.

play19:54

PaLM API lets you test and experiment with Google’s Large Language Models and Gen AI

play19:59

tools.
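
A minimal sketch of calling the PaLM text model from Python through the Vertex AI SDK is below; this is an assumption for illustration (the course only shows the graphical tools), and the project ID and region are placeholders.

```python
import vertexai
from vertexai.language_models import TextGenerationModel

# Placeholders: substitute your own Google Cloud project ID and region.
vertexai.init(project="my-project-id", location="us-central1")

model = TextGenerationModel.from_pretrained("text-bison")
response = model.predict(
    "Suggest three names for a generative AI study group.",
    temperature=0.2,
    max_output_tokens=128,
)
print(response.text)
```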

play20:00

To make prototyping quick and more accessible, developers can integrate PaLM API with MakerSuite

play20:06

and use it to access the API using a graphical user interface.

play20:11

The suite includes a number of different tools, such as a model training tool, a model deployment

play20:16

tool, and a model monitoring tool.

play20:18

... and what do these tools do?

play20:20

I'm so glad you asked.

play20:22

The model training tool helps developers train ML models on their data using different algorithms.

play20:27

The model deployment tool helps developers deploy ML models to production with a number

play20:32

of different deployment options.

play20:35

The model monitoring tool helps developers monitor the performance of their ML models

play20:39

in production using a dashboard and a number of different metrics.

play20:43

Lastly, there is Gemini, a multimodal AI model.

play20:49

Unlike traditional language models, it's not limited to understanding text alone.

play20:53

It can analyze images, understand the nuances of audio, and even interpret programming code.

play20:59

This allows Gemini to perform complex tasks that were previously impossible for AI.

play21:05

Due to its advanced architecture, Gemini is incredibly adaptable and scalable, making

play21:10

it suitable for diverse applications.

play21:13

Model Garden is continuously updated to include new models.

play21:17

And now you know absolutely everything about Generative AI... okay maybe you don't know

play21:22

everything, but you definitely know the basics!

play21:25

Thank you for watching our course and make sure to check out our other videos if you

play21:29

want to learn more about how you can use AI!
