What is generative AI and how does it work? – The Turing Lectures with Mirella Lapata

The Royal Institution
12 Oct 2023 · 46:02

Summary

TLDR: The lecturer gives a witty tour of the history of generative AI, explains how language models work, analyses the trend toward ever-larger GPT models, discusses the importance of fine-tuning, and offers her views on the risks AI may pose. She argues that we cannot stop these systems from existing, but that regulation can minimise the harm they might cause.

Takeaways

  • 📈 Language-model technology is not new; ChatGPT and its peers are fresh progress on an existing foundation
  • 🔢 Scaling up model size has been the key to progress
  • 💰 Training large language models requires enormous investment
  • ⚙️ Network architectures like the transformer are what make models at this scale feasible
  • 📚 Large-scale pre-training plus fine-tuning is the key to versatility
  • ✅ Helpfulness, honesty, and harmlessness are the goals a language model should pursue
  • ❓ Human-computer interaction still has plenty of shortcomings to iron out
  • 🔋 Deploying and using language models both consume a great deal of energy
  • 💼 Language models risk displacing certain jobs
  • 🤝 Rather than an outright ban, the focus should be on managing the risks

Q & A

  • What exactly is generative AI?

    -Generative AI refers to computer programs that can generate new content they have not seen before, such as new text, images, audio, or video. It produces creative content rather than merely predicting an outcome for a given input.

  • What is the core technology behind ChatGPT?

    -ChatGPT's core technology is a large-scale pre-trained language model based on the Transformer. Pre-training on vast amounts of text teaches it the statistical regularities of language, and it is then fine-tuned to generate text that reads the way humans intend. The larger the model, the stronger ChatGPT's generative ability.

  • What is the key to language modelling?

    -The key to language modelling is predicting what comes next in a text. Given part of a sentence as input, the model predicts the most likely continuation. By training on a large corpus, the model learns grammar, semantics, and context, letting it generate new text that is fluent and coherent.

  • How does ChatGPT improve on earlier systems like Google Translate?

    -Unlike earlier single-purpose systems, ChatGPT can handle more complex, open-ended natural-language tasks. It has stronger language understanding and generation abilities and can creatively produce all kinds of text from different prompts.

  • What does training ChatGPT require?

    -ChatGPT needs a huge text corpus for pre-training, spanning Wikipedia, social media, code, and other kinds of text. It also needs a great deal of training time and compute, plus human preference feedback for fine-tuning. The larger the model, the more resources it needs.

  • What risks does ChatGPT carry?

    -ChatGPT may generate inaccurate or biased information, emit large amounts of carbon, and disrupt certain jobs. But the potential harms can be reduced through regulation, higher-quality training data, and continuous improvement.

  • How does ChatGPT tell good behaviour from bad?

    -Through fine-tuning on human feedback. Humans judge whether ChatGPT's outputs are good or bad, and that feedback teaches the model the difference, steering it toward helpful, honest, and harmless replies.

  • What are ChatGPT's typical commercial usage scenarios?

    -ChatGPT can automatically generate all kinds of text; Tencent's WeChat, for example, reportedly uses it to write fiction automatically. It can assist people with writing, coding, and similar work, and can power customer-service chatbots and other applications that raise productivity.

  • Training a ChatGPT-class model costs on the order of $100 million; what commercial value will it bring?

    -ChatGPT can greatly cut a company's content-creation costs and boost productivity. It can also improve the user experience through personalised recommendation, customer service, and more. Beyond that, the technologies and applications around ChatGPT carry enormous commercial value.

  • How do you see ChatGPT developing from here?

    -I expect large language models like ChatGPT to keep improving and to find uses in more and more domains. But we need to watch their impact on the environment, employment, and so on, and put regulation in place. AI should always serve humans, not replace them.

Outlines

00:00

👩‍💻 Introducing AI and setting out the lecture's content and format

The lecturer explains what generative AI is, giving examples such as Google Translate and Siri. The lecture covers the past, present, and future of AI, and it is run interactively, with questions put to the audience.

05:03

📈 ChatGPT's rapid rise and the core technology behind it

The lecturer explains why ChatGPT attracted 100 million users in just two months: it is more sophisticated and versatile than earlier AI systems. Its core technology is the language model, which predicts the likelihood of the next word or sentence given the context so far. The key steps in building one are collecting a large text corpus, building a neural network, and training it with self-supervised learning.

10:06

👷 The concrete steps for building a language model

The lecturer walks through the steps of building a language model: collect a large text corpus, build a neural-network model, randomly remove part of each input sentence, and have the model compute the probability of the missing words. By repeatedly adjusting the model and comparing its predictions against the ground truth, you eventually obtain a language model that performs well. The state-of-the-art architecture today is the Transformer.

15:08

📈 The runaway growth in language-model size

The lecturer's data show that the number of parameters in language models has grown exponentially in recent years, up to a reported one trillion in GPT-4. The amount of text the models process has also risen steadily, and a model's size determines the kinds and number of tasks it can perform. Even the largest models remain far from the scale of the human brain.

20:09

💰 The steep cost of training

Training a large language model takes an enormous investment; GPT-4 cost $100 million. Few companies can afford this; only organisations with big-company backing, such as OpenAI with Microsoft behind it, can. Training also consumes a great deal of energy and produces carbon emissions.

25:09

🎯 Fine-tuning with further techniques

After pre-training, a language model still needs fine-tuning for specific domains, to suit concrete use cases and requirements. This takes large numbers of human-curated examples and a further round of training. The process can make the model more "helpful, honest, and harmless".

30:10

❓ The model's limitations and risks

The lecturer lists some current limitations of language models, such as knowledge that is not updated in real time, historical biases, and the potential to produce inappropriate content. She also covers the environmental and social risks these huge models bring as they run, such as job losses and fakery.

35:12

😄 Interactive Q&A session

The lecturer runs some entertaining question-and-answer exchanges with the audience. She has ChatGPT answer questions and compose poems on topics the audience suggests, showing that today's language models have made genuine progress in creativity.

40:13

🔮 Predictions and views on AI's future

The lecturer argues that ChatGPT is currently incapable of autonomously replicating itself or acquiring resources. We cannot predict what a superintelligent AI would look like, but many useful AI tools will appear, and society as a whole will push AI development in a more beneficial direction.

45:14

🙏 Summary and thanks

The lecturer sums up the main points of the talk and thanks the audience for taking part. She stresses once more that we cannot entirely prevent new technology from arriving, but we can work together to reduce the potential social risks.

Keywords

💡Generative AI

Generative AI refers to AI systems that can generate new content such as text, images, audio, or video. The video mentions early generative AI systems like Google Translate and Siri; newer systems such as ChatGPT are far more complex and powerful.

💡Natural language processing

The speaker's research field. Natural language processing is the technology that lets computers parse, understand, and process human language. The video focuses on text-based generative AI.

💡Language model

The language model is the core technology behind generative AI. It predicts how a sentence continues. The system is trained on huge amounts of text to learn how likely different word combinations are.

💡Fine-tuning

Fine-tuning a pre-trained language model produces a generative AI system suited to a specific task, such as a system that writes diagnoses from medical reports.

💡Instruction tuning

Showing the model many examples of how people want to use ChatGPT, so that it understands human intent and produces more useful output. This is called instruction tuning.

💡Helpfulness

Generative AI systems should be helpful rather than harmful. Helpfulness means following instructions, asking clarifying questions, giving accurate answers, and so on. Training can improve helpfulness.

💡Regulation

Every technology with serious risks, nuclear energy for example, has been strictly regulated. Generative AI is no exception, and the relevant regulation is gradually being put in place.

💡Environmental impact

Running large language models takes a great deal of energy and produces substantial carbon emissions, prompting calls for green computing. The video gives a carbon-emission figure for one model as an example.

💡Job-displacement risk

Generative AI is expected to automate certain jobs, such as repetitive text writing, which are therefore at risk of being replaced.

💡Deepfakes

Fake news, music, and other media synthesised with generative AI. The video gives documented examples of deepfakes.

Highlights

Generative AI is not a new concept; Google Translate, for example, has been around for 17 years

The key to a language model is predicting the next word from the context

ChatGPT and its peers are all built on language modelling and the transformer architecture

The bigger the model, the better it performs; parameter counts have grown from GPT-1's roughly 100 million to a reported one trillion in GPT-4

Training GPT-4 cost as much as $100 million

Fine-tuning adapts a model to specific tasks

Models should be Helpful, Honest, and Harmless

Fine-tuning on human feedback improves model quality

Models still produce biased and inaccurate output

Model inference takes a great deal of compute and energy

Models may cause some jobs to disappear

Models can be used to generate misinformation

Models need regulation but are very hard to ban outright

The risk may be smaller than the threat climate change poses to humanity

History shows that risky technologies have always ended up strongly regulated

Transcripts

play00:00

(gentle music jingle)

play00:03

(audience applauding)

play00:12

- Whoa, so many of you.

play00:14

Good, okay, thank you for that lovely introduction.

play00:19

Right, so, what is generative artificial intelligence?

play00:24

So I'm gonna explain what artificial intelligence is

play00:27

and I want this to be a bit interactive

play00:30

so there will be some audience participation.

play00:33

The people here who hold this lecture said to me,

play00:36

"Oh, you are very low-tech for somebody working on AI."

play00:40

I don't have any explosions or any experiments,

play00:42

so I'm afraid you'll have to participate,

play00:45

I hope that's okay.

play00:46

All right, so, what is generative artificial intelligence?

play00:50

So the term is made up by two things,

play00:55

artificial intelligence and generative.

play00:57

So artificial intelligence is a fancy term for saying

play01:02

we get a computer programme to do the job

play01:05

that a human would otherwise do.

play01:07

And generative, this is the fun bit,

play01:09

we are creating new content

play01:12

that the computer has not necessarily seen,

play01:15

it has seen parts of it,

play01:17

and it's able to synthesise it and give us new things.

play01:21

So what would this new content be?

play01:23

It could be audio,

play01:25

it could be computer code

play01:27

so that it writes a programme for us,

play01:29

it could be a new image,

play01:31

it could be a text,

play01:32

like an email or an essay you've heard, or video.

play01:36

Now in this lecture

play01:37

I'm only gonna be mostly focusing on text

play01:41

because I do natural language processing

play01:42

and this is what I know about,

play01:44

and we'll see how the technology works

play01:48

and hopefully leaving the lecture you'll know how,

play01:53

like there's a lot of myth around it and it's not,

play01:57

you'll see what it does and it's just a tool, okay?

play02:02

Right, so the outline of the talk,

play02:03

there's three parts and it's kind of boring.

play02:05

This is Alice Morse Earle.

play02:08

I do not expect that you know the lady.

play02:11

She was an American writer

play02:13

and she writes about memorabilia and customs,

play02:18

but she's famous for her quotes.

play02:21

So she's given us this quote here that says,

play02:23

"Yesterday's history, tomorrow is a mystery,

play02:25

today is a gift, and that's why it's called the present."

play02:28

It's a very optimistic quote.

play02:29

And the lecture is basically

play02:32

the past, the present, and the future of AI.

play02:37

Okay, so what I want to say right at the front

play02:41

is that generative AI is not a new concept.

play02:46

It's been around for a while.

play02:49

So how many of you have used or are familiar

play02:54

with Google Translate?

play02:56

Can I see a show of hands?

play02:58

Right, who can tell me when Google Translate launched

play03:02

for the first time?

play03:05

- 1995? - Oh, that would've been good.

play03:08

2006, so it's been around for 17 years

play03:14

and we've all been using it.

play03:16

And this is an example of generative AI.

play03:18

Greek text comes in, I'm Greek, so you know,

play03:21

pay some juice to the... (laughs)

play03:24

Right, so Greek text comes in,

play03:27

English text comes out.

play03:29

And Google Translate has served us very well

play03:31

for all these years

play03:32

and nobody was making a fuss.

play03:35

Another example is Siri on the phone.

play03:40

Again, Siri launched 2011,

play03:46

12 years ago,

play03:48

and it was a sensation back then.

play03:51

It is another example of generative AI.

play03:53

We can ask Siri to set alarms and Siri talks back

play03:58

and oh how great it is

play04:00

and then you can ask about your alarms and whatnot.

play04:02

This is generative AI.

play04:03

Again, it's not as sophisticated as ChatGPT,

play04:06

but it was there.

play04:07

And I don't know how many have an iPhone?

play04:11

See, iPhones are quite popular, I don't know why.

play04:15

Okay, so, we are all familiar with that.

play04:19

And of course later on there was Amazon Alexa and so on.

play04:23

Okay, again, generative AI is not a new concept,

play04:27

it is everywhere, it is part of your phone.

play04:31

The completion when you're sending an email

play04:34

or when you're sending a text.

play04:36

The phone attempts to complete your sentences,

play04:40

attempts to think like you and it saves you time, right?

play04:44

Because some of the completions are there.

play04:46

The same with Google,

play04:47

when you're trying to type it tries to guess

play04:49

what your search term is.

play04:51

This is an example of language modelling,

play04:53

we'll hear a lot about language modelling in this talk.

play04:56

So basically we're making predictions

play04:59

of what the continuations are going to be.

play05:02

So what I'm telling you

play05:04

is that generative AI is not that new.

play05:07

So the question is, what is the fuss, what happened?

play05:12

So in 2023, OpenAI,

play05:15

which is a company in California,

play05:18

in fact, in San Francisco.

play05:19

If you go to San Francisco,

play05:20

you can even see the lights at night of their building.

play05:24

It announced GPT-4

play05:27

and it claimed that it can beat 90% of humans on the SAT.

play05:33

For those of you who don't know,

play05:34

SAT is a standardised test

play05:37

that American school children have to take

play05:40

to enter university,

play05:41

it's an admissions test,

play05:42

and it's multiple choice and it's considered not so easy.

play05:46

So GPT-4 can do it.

play05:49

They also claimed that it can get top marks in law,

play05:53

medical exams and other exams,

play05:55

they have a whole suite of things that they claim,

play05:59

well, not they claim, they show that GPT-4 can do it.

play06:03

Okay, aside from that, it can pass exams,

play06:07

we can ask it to do other things.

play06:09

So you can ask it to write text for you.

play06:14

For example, you can have a prompt,

play06:17

this little thing that you see up there, it's a prompt.

play06:20

It's what the human wants the tool to do for them.

play06:23

And a potential prompt could be,

play06:25

"I'm writing an essay

play06:27

about the use of mobile phones during driving.

play06:29

Can you gimme three arguments in favour?"

play06:32

This is quite sophisticated.

play06:34

If you asked me,

play06:35

I'm not sure I can come up with three arguments.

play06:38

You can also do,

play06:38

and these are real prompts that actually the tool can do.

play06:42

You tell ChatGPT or GPT in general,

play06:45

"Act as a JavaScript developer.

play06:47

Write a programme that checks the information on a form.

play06:50

Name and email are required, but address and age are not."

play06:53

So I'm just writing this

play06:55

and the tool will spit out a programme.

play06:58

And this is the best one.

play07:00

"Create an About Me page for a website.

play07:03

I like rock climbing, outdoor sports, and I like to programme.

play07:07

I started my career as a quality engineer in the industry,

play07:10

blah, blah, blah."

play07:11

So I give this version of what I want the website to be

play07:16

and it will create it for me.

play07:19

So, you see, we've gone from Google Translate and Siri

play07:24

and the auto-completion

play07:25

to something which is a lot more sophisticated

play07:27

and can do a lot more things.

play07:31

Another fun fact.

play07:33

So this is a graph that shows

play07:36

the time it took for ChatGPT

play07:40

to reach 100 million users

play07:43

compared with other tools

play07:45

that have been launched in the past.

play07:47

And you see our beloved Google Translate,

play07:50

it took 78 months

play07:53

to reach 100 million users,

play07:56

a long time.

play07:58

TikTok took nine months and ChatGPT, two.

play08:03

So within two months they had 100 million users

play08:08

and these users pay a little bit to use the system,

play08:13

so you can do the multiplication

play08:15

and figure out how much money they make.

play08:17

Okay, so this is the history part.

play08:22

So how did we make ChatGPT?

play08:28

What is the technology behind this?

play08:30

The technology it turns out is not extremely new

play08:33

or extremely innovative

play08:35

or extremely difficult to comprehend.

play08:39

So we'll talk about that today now.

play08:42

So we'll address three questions.

play08:45

First of all, how did we get from the single-purpose systems

play08:48

like Google Translate to ChatGPT,

play08:51

which is more sophisticated and does a lot more things?

play08:54

And in particular,

play08:55

what is the core technology behind ChatGPT

play08:58

and what are the risks, if there are any?

play09:01

And finally, I will just show you

play09:03

a little glimpse of the future and how it's gonna look like

play09:07

and whether we should be worried or not

play09:09

and you know, I won't leave you hanging,

play09:13

please don't worry, okay?

play09:17

Right, so, all this GPT model variants,

play09:22

and there is a cottage industry out there,

play09:24

I'm just using GPT as an example because the public knows

play09:29

and there have been a lot of, you know,

play09:32

news articles about it,

play09:33

but there's other models,

play09:34

other variants of models that we use in academia.

play09:38

And they all work on the same principle,

play09:40

and this principle is called language modelling.

play09:43

What does language modelling do?

play09:45

It assumes we have a sequence of words.

play09:49

The context so far.

play09:51

And we saw this context in the completion,

play09:53

and I have an example here.

play09:55

Assuming my context is the phrase "I want to,"

play10:01

the language modelling tool will predict what comes next.

play10:05

So if I tell you "I want to,"

play10:07

there is several predictions.

play10:09

I want to shovel, I want to play,

play10:11

I want to swim, I want to eat.

play10:13

And depending on what we choose,

play10:15

whether it's shovel or play or swim,

play10:18

there is more continuations.

play10:20

So for shovel, it will be snow,

play10:24

for play, it can be tennis or video,

play10:26

swim doesn't have a continuation,

play10:29

and for eat, it will be lots and fruit.

play10:31

Now this is a toy example,

play10:33

but imagine now that the computer has seen a lot of text

play10:37

and it knows what words follow which other words.

play10:43

We used to count these things.

play10:46

So I would go, I would download a lot of data

play10:49

and I would count, "I want to shovel,"

play10:52

how many times does it appear

play10:53

and what are the continuations?

play10:55

And we would have counts of these things.
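
A minimal sketch of that counting approach in Python (the toy corpus here is illustrative, not the lecture's data):

```python
from collections import Counter, defaultdict

# A toy corpus standing in for "a lot of data" downloaded from the web.
corpus = [
    "i want to shovel snow",
    "i want to play tennis",
    "i want to play video games",
    "i want to eat fruit",
]

# For every word, count how often each continuation follows it.
continuations = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for context, nxt in zip(words, words[1:]):
        continuations[context][nxt] += 1

# The context "I want to" ends in "to"; rank what tends to come next.
print(continuations["to"].most_common())
# [('play', 2), ('shovel', 1), ('eat', 1)]
```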

play10:57

And all of this has gone out of the window right now

play11:00

and we use neural networks that don't exactly count things,

play11:04

but predict, learn things in a more sophisticated way,

play11:09

and I'll show you in a moment how it's done.

play11:12

So ChatGPT and GPT variants

play11:17

are based on this principle

play11:19

of I have some context, I will predict what comes next.

play11:23

And that's the prompt,

play11:25

the prompt that I gave you, these things here,

play11:28

these are prompts,

play11:29

this is the context,

play11:31

and then it needs to do the task.

play11:33

What would come next?

play11:35

In some cases it would be the three arguments.

play11:37

In the case of the web developer, it would be a webpage.

play11:42

Okay, the task of language modelling is we have the context,

play11:47

and this changed the example now.

play11:48

It says "The colour of the sky is."

play11:51

And we have a neural language model,

play11:54

this is just an algorithm,

play11:57

that will predict what is the most likely continuation,

play12:03

and likelihood matters.

play12:05

These are all predicated on actually making guesses

play12:09

about what's gonna come next.

play12:11

And that's why sometimes they fail,

play12:13

because they predict the most likely answer

play12:15

whereas you want a less likely one.

play12:18

But this is how they're trained,

play12:19

they're trained to come up with what is most likely.

play12:22

Okay, so we don't count these things,

play12:25

we try to predict them using this language model.

play12:29

So how would you build your own language model?

play12:34

This is a recipe, this is how everybody does this.

play12:37

So, step one, we need a lot of data.

play12:41

We need to collect a ginormous corpus.

play12:45

So these are words.

play12:47

And where will we find such a ginormous corpus?

play12:50

I mean, we go to the web, right?

play12:52

And we download the whole of Wikipedia,

play12:56

Stack Overflow pages,

play12:58

Quora, social media, GitHub, Reddit,

play13:01

whatever you can find out there.

play13:03

I mean, work out the permissions, it has to be legal.

play13:06

You download all this corpus.

play13:09

And then what do you do?

play13:10

Then you have this language model.

play13:11

I haven't told you what exactly this language model is,

play13:14

there is an example,

play13:15

and I haven't told you what the neural network

play13:17

that does the prediction is,

play13:18

but assuming you have it.

play13:20

So you have this machinery

play13:22

that will do the learning for you

play13:24

and the task now is to predict the next word,

play13:28

but how do we do it?

play13:30

And this is the genius part.

play13:33

We have the sentences in the corpus.

play13:36

We can remove some of them

play13:38

and we can have the language model

play13:40

predict the sentences we have removed.

play13:43

This is dead cheap.

play13:46

I just remove things,

play13:47

I pretend they're not there,

play13:49

and I get the language model to predict them.

play13:52

So I will randomly truncate,

play13:55

truncate means remove,

play13:56

the last part of the input sentence.

play13:59

I will calculate with this neural network

play14:01

the probability of the missing words.

play14:04

If I get it right, I'm good.

play14:05

If I'm not right,

play14:06

I have to go back and re-estimate some things

play14:09

because obviously I made a mistake,

play14:11

and I keep going.

play14:12

I will adjust and feedback to the model

play14:14

and then I will compare what the model predicted

play14:16

to the ground truth

play14:17

because I've removed the words in the first place

play14:19

so I actually know what the real truth is.

play14:22

And we keep going

play14:24

for some months or maybe years.

play14:28

No, months, let's say.

play14:30

So it will take some time to do this process

play14:32

because as you can appreciate

play14:33

I have a very large corpus and I have many sentences

play14:36

and I have to do the prediction

play14:38

and then go back and correct my mistake and so on.

play14:42

But in the end,

play14:43

the thing will converge and I will get my answer.
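
A minimal sketch of that recipe, assuming the tiniest possible neural language model: one linear layer that predicts the next word from the previous word (corpus and sizes are illustrative):

```python
import numpy as np

corpus = "the colour of the sky is blue the colour of the sea is blue".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, V))  # the trainable parameters (weights)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for epoch in range(200):
    for context, target in zip(corpus, corpus[1:]):
        # "Truncate": hide the next word and ask the model for it.
        x = np.zeros(V); x[idx[context]] = 1.0
        probs = softmax(W @ x)                    # predicted distribution over words
        # Compare with the ground truth (we removed the word, so we know it).
        grad = probs.copy(); grad[idx[target]] -= 1.0
        W -= 0.1 * np.outer(grad, x)              # adjust the weights and keep going

# After training, the model prefers the continuations it has seen most often.
x = np.zeros(V); x[idx["is"]] = 1.0
print(vocab[int(np.argmax(softmax(W @ x)))])      # -> 'blue'
```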

play14:46

So the tool in the middle that I've shown,

play14:50

this tool here, this language model,

play14:54

a very simple language model looks a bit like this.

play14:58

And maybe the audience has seen these,

play15:01

this is a very naive graph,

play15:04

but it helps to illustrate the point of what it does.

play15:07

So this neural network language model will have some input

play15:12

which is these nodes in the, as we look at it,

play15:16

well, my right and your right, okay.

play15:18

So the nodes here on the right are the input

play15:23

and the nodes at the very left are the output.

play15:27

So we will present this neural network with five inputs,

play15:34

the five circles,

play15:36

and we have three outputs,

play15:38

the three circles.

play15:39

And there is stuff in the middle

play15:41

that I didn't say anything about.

play15:43

These are layers.

play15:45

These are more nodes

play15:47

that are supposed to be abstractions of my input.

play15:51

So they generalise.

play15:52

The idea is if I put more layers on top of layers,

play15:57

the middle layers will generalise the input

play16:00

and will be able to see patterns that are not there.

play16:04

So you have these nodes

play16:05

and the input to the nodes are not exactly words,

play16:08

they're vectors, so series of numbers,

play16:11

but forget that for now.

play16:13

So we have some input, we have some layers in the middle,

play16:16

we have some output.

play16:17

And this now has these connections, these edges,

play16:20

which are the weights,

play16:22

this is what the network will learn.

play16:25

And these weights are basically numbers,

play16:27

and here it's all fully connected,

play16:30

so I have very many connections.

play16:32

Why am I going through this process

play16:35

of actually telling you all of that?

play16:37

You will see in a minute.

play16:38

So you can work out

play16:42

how big or how small this neural network is

play16:46

depending on the numbers of connections it has.

play16:51

So for this toy neural network we have here,

play16:54

I have worked out the number of weights,

play16:58

we call them also parameters,

play17:01

that this neural network has

play17:02

and that the model needs to learn.

play17:05

So the parameters are the number of units as input,

play17:09

in this case it's 5,

play17:12

times the units in the next layer, 8.

play17:16

Plus 8, this plus 8 is a bias,

play17:19

it's a cheating thing that these neural networks have.

play17:23

Again, you need to learn it

play17:25

and it sort of corrects a little bit the neural network

play17:28

if it's off.

play17:29

It's actually genius.

play17:30

If the prediction is not right,

play17:32

it tries to correct it a little bit.

play17:33

So for the purposes of this talk,

play17:35

I'm not going to go into the details,

play17:38

all I want you to see

play17:39

is that there is a way of working out the parameters,

play17:41

which is basically the number of input units

play17:45

times the units my input is going to,

play17:49

and for this fully connected network,

play17:51

if we add up everything,

play17:53

we come up with 99 trainable parameters, 99.

play17:58

This is a small network for all purposes, right?

play18:02

But I want you to remember this,

play18:03

this small network is 99 parameters.

play18:05

When you hear this network is a billion parameters,

play18:10

I want you to imagine how big this will be, okay?

play18:14

So 99 only for this toy neural network.
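
That arithmetic is easy to reproduce. A quick sketch, assuming the toy network's two middle layers have six units each (the slide's hidden sizes aren't spelled out in the transcript, but those are the ones that make the total come out at 99):

```python
# Fully connected network: each layer contributes
# (units in) x (units out) weights, plus one bias per output unit.
def count_parameters(layer_sizes):
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(count_parameters([5, 6, 6, 3]))              # the toy network -> 99
print(count_parameters([1000, 4096, 4096, 1000]))  # ~25 million, still tiny next to a billion
```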

play18:17

And this is how we judge how big the model is,

play18:21

how long it took and how much it cost,

play18:24

it's the number of parameters.

play18:27

In reality, in reality, though,

play18:29

no one is using this network.

play18:31

Maybe in my class,

play18:33

if I have a first year undergraduate class

play18:36

and I introduce neural networks,

play18:37

I will use this as an example.

play18:39

In reality, what people use is these monsters

play18:42

that are made of blocks,

play18:47

and what block means they're made of other neural networks.

play18:52

So I don't know how many people have heard of transformers.

play18:57

I hope no one.

play18:57

Oh wow, okay.

play18:59

So transformers are these neural networks

play19:03

that we use to build ChatGPT.

play19:06

And in fact GPT stands for

play19:09

generative pre-trained transformers.

play19:12

So transformer is even in the title.

play19:15

So this is a sketch of a transformer.

play19:19

So you have your input

play19:21

and the input is not words, like I said,

play19:24

here it says embeddings,

play19:25

embeddings is another word for vectors.

play19:28

And then you will have this,

play19:32

a bigger version of this network,

play19:34

multiplied into these blocks.

play19:38

And each block is this complicated system

play19:42

that has some neural networks inside it.

play19:46

We're not gonna go into the detail, I don't want,

play19:48

I please don't go,

play19:50

all I'm trying, (audience laughs)

play19:51

all I'm trying to say is that, you know,

play19:55

we have these blocks stacked on top of each other,

play20:00

the transformer has eight of those,

play20:02

which are mini neural networks,

play20:04

and this task remains the same.

play20:06

That's what I want you to take out of this.

play20:08

Input goes in the context, "the chicken walked,"

play20:12

we're doing some processing,

play20:13

and our task is to predict the continuation,

play20:17

which is "across the road."

play20:18

And this EOS means end of sentence

play20:21

because we need to tell the neural network

play20:23

that our sentence finished.

play20:24

I mean they're kind of dumb, right?

play20:26

We need to tell them everything.

play20:27

When I hear like AI will take over the world, I go like,

play20:30

Really? We have to actually spell it out.

play20:33

Okay, so, this is the transformer,

play20:37

the king of architectures,

play20:38

the transformers came in 2017.

play20:42

Nobody's working on new architectures right now.

play20:45

It is a bit sad, like everybody's using these things.

play20:48

They used to be like some pluralism but now no,

play20:50

everybody's using transformers, we've decided they're great.
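
For the curious, the heart of each of those blocks looks roughly like this in code: a single self-attention head, in a minimal numpy sketch (the dimensions are illustrative, and a real transformer stacks many of these with further layers around them):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 4, 8                    # 4 tokens ("the chicken walked ..."), 8-dim embeddings
X = rng.normal(size=(T, d))    # token embeddings: vectors, not words

# Each block learns three projections of its input.
Wq, Wk, Wv = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

scores = Q @ K.T / np.sqrt(d)           # how strongly each token attends to each other
mask = np.triu(np.ones((T, T)), k=1)    # causal mask: a word can't look at the future
scores[mask == 1] = -np.inf

weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row-wise softmax
out = weights @ V                       # each position: a weighted mix of the values
print(out.shape)                        # (4, 8): same shape out as in
```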

play20:54

Okay, so, what we're gonna do with this,

play20:58

and this is kind of important and the amazing thing,

play21:01

is we're gonna do self-supervised learning.

play21:03

And this is what I said,

play21:04

we have the sentence, we truncate, we predict,

play21:08

and we keep going till we learn these probabilities.

play21:12

Okay? You're with me so far?

play21:15

Good, okay, so,

play21:18

once we have our transformer

play21:21

and we've given it all this data that there is in the world,

play21:26

then we have a pre-trained model.

play21:28

That's why GPT is called

play21:30

the generative pre-trained transformer.

play21:32

This is a baseline model that we have

play21:35

and has seen a lot of things about the world

play21:39

in the form of text.

play21:40

And then what we normally do,

play21:42

we have this general purpose model

play21:44

and we need to specialise it somehow

play21:46

for a specific task.

play21:48

And this is what is called fine-tuning.

play21:50

So that means that the network has some weights

play21:54

and we have to specialise the network.

play21:57

We'll take, initialise the weights

play21:59

with what we know from the pre-training,

play22:01

and then in the specific task we will learn

play22:03

a new set of weights.

play22:05

So for example, if I have medical data,

play22:09

I will take my pre-trained model,

play22:11

I will specialise it to this medical data,

play22:14

and then I can do something that is specific for this task,

play22:18

which is, for example, write a diagnosis from a report.

play22:22

Okay, so this notion of fine-tuning is very important

play22:27

because it allows us to do special-purpose applications

play22:31

for these generic pre-trained models.
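
A minimal sketch of that two-step idea, reusing the toy one-layer next-word model from earlier (the "medical" corpus is a made-up stand-in):

```python
import numpy as np

def train(W, corpus, idx, lr, epochs):
    """Truncate, predict, compare with the ground truth, adjust -- as before."""
    V = W.shape[0]
    for _ in range(epochs):
        for context, target in zip(corpus, corpus[1:]):
            x = np.zeros(V); x[idx[context]] = 1.0
            z = W @ x
            p = np.exp(z - z.max()); p /= p.sum()
            p[idx[target]] -= 1.0
            W -= lr * np.outer(p, x)
    return W

vocab = ["report", "shows", "patient", "has", "flu", "the", "sky", "is", "blue"]
idx = {w: i for i, w in enumerate(vocab)}

# Step 1: pre-train on generic text (one toy sentence here).
W = train(np.zeros((len(vocab), len(vocab))), "the sky is blue".split(), idx,
          lr=0.1, epochs=100)

# Step 2: fine-tune -- initialise from the pre-trained weights and keep
# training, gently, on the specialised data.
W = train(W, "report shows patient has flu".split(), idx, lr=0.01, epochs=100)
```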

play22:35

Now, and people think that GPT and all of these things

play22:37

are general purpose,

play22:39

but they are fine-tuned to be general purpose

play22:42

and we'll see how.

play22:45

Okay, so, here's the question now.

play22:49

We have this basic technology to do this pre-training

play22:52

and I told you how to do it, if you download all of the web.

play22:56

How good can a language model become, right?

play22:59

How does it become great?

play23:01

Because when GPT came out in GPT-1 and GPT-2,

play23:06

they were not amazing.

play23:09

So the bigger, the better.

play23:13

Size is all that matters, I'm afraid.

play23:15

This is very bad because we used to, you know,

play23:18

people didn't believe in scale

play23:19

and now we see that scale is very important.

play23:22

So, since 2018,

play23:25

we've witnessed an absolutely extreme increase

play23:32

in model sizes.

play23:34

And I have some graphs to show this.

play23:36

Okay, I hope people at the back can see this graph.

play23:39

Yeah, you should be all right.

play23:40

So this graph shows

play23:45

the number of parameters.

play23:47

Remember, the toy neural network had 99.

play23:50

The number of parameters that these models have.

play23:54

And we start with a normal amount.

play23:57

Well, normal for GPT-1.

play23:58

And we go up to GPT-4,

play24:01

which has one trillion parameters.

play24:07

Huge, one trillion.

play24:10

This is a very, very, very big model.

play24:12

And you can see here the ant brain and the rat brain

play24:16

and we go up to the human brain.

play24:19

The human brain has,

play24:23

not a trillion,

play24:24

100 trillion parameters.

play24:27

So we are a bit off,

play24:30

we're not at the human brain level yet

play24:32

and maybe we'll never get there

play24:34

and we can't compare GPT to the human brain

play24:37

but I'm just giving you an idea of how big this model is.

play24:42

Now what about the words it's seen?

play24:46

So this graph shows us the number of words

play24:48

processed by these language models during their training

play24:52

and you will see that there has been an increase,

play24:55

but the increase has not been as big as the parameters.

play25:00

So the community started focusing

play25:04

on the parameter size of these models,

play25:06

whereas in fact we now know

play25:08

that it needs to see a lot of text as well.

play25:11

So GPT-4 has seen approximately,

play25:16

I don't know, a few billion words.

play25:19

All the human written text is I think 100 billion,

play25:24

so it's sort of approaching this.

play25:28

You can also see what a human reads in their lifetime,

play25:32

it's a lot less.

play25:34

Even if they read, you know,

play25:35

because people nowadays, you know,

play25:37

they read but they don't read fiction,

play25:39

they read the phone, anyway.

play25:41

You see the English Wikipedia,

play25:42

so we are approaching the level of

play25:46

the text that is out there that we can get.

play25:49

And in fact, one may say, well, GPT is great,

play25:52

you can actually use it to generate more text

play25:54

and then use this text that GPT has generated

play25:56

and then retrain the model.

play25:58

But we know this text is not exactly right

play26:00

and in fact it's diminished returns,

play26:03

so we're gonna plateau at some point.

play26:06

Okay, how much does it cost?

play26:10

Now, okay, so GPT-4 cost

play26:16

$100 million, okay?

play26:21

So when should they start doing it again?

play26:25

So obviously this is not a process you have to do

play26:28

over and over again.

play26:29

You have to think very well

play26:31

and you make a mistake and you lost like $50 million.

play26:38

You can't start again so you have to be very sophisticated

play26:41

as to how you engineer the training

play26:43

because a mistake costs money.

play26:47

And of course not everybody can do this,

play26:48

not everybody has $100 million.

play26:51

They can do it because they have Microsoft backing them,

play26:54

not everybody, okay.

play26:58

Now this is a video that is supposed to play and illustrate,

play27:01

let's see if it will work,

play27:03

the effects of scaling, okay.

play27:06

So I will play it one more.

play27:09

So these are tasks that you can do

play27:12

and it's the number of tasks

play27:15

against the number of parameters.

play27:18

So we start with 8 billion parameters

play27:20

and we can do a few tasks.

play27:23

And then the tasks increase, so summarization,

play27:27

question answering, translation.

play27:30

And once we move to 540 billion parameters,

play27:35

we have more tasks.

play27:36

We start with very simple ones,

play27:39

like code completion.

play27:42

And then we can do reading comprehension

play27:45

and language understanding and translation.

play27:47

So you get the picture, the tree flourishes.

play27:51

So this is what people discovered with scaling.

play27:54

If you scale the language model, you can do more tasks.

play27:58

Okay, so now.

play28:04

Maybe we are done.

play28:07

But what people discovered is if you actually take GPT

play28:12

and you put it out there,

play28:14

it actually doesn't behave like people want it to behave

play28:18

because this is a language model trained to predict

play28:21

and complete sentences

play28:22

and humans want to use GPT for other things

play28:26

because they have their own tasks

play28:29

that the developers hadn't thought of.

play28:31

So then the notion of fine-tuning comes in,

play28:35

it never left us.

play28:37

So now what we're gonna do

play28:39

is we're gonna collect a lot of instructions.

play28:42

So instructions are examples

play28:44

of what people want ChatGPT to do for them,

play28:47

such as answer the following question,

play28:50

or answer the question step by step.

play28:54

And so we're gonna give these demonstrations to the model,

play28:58

and in fact, almost 2,000 of such examples,

play29:03

and we're gonna fine-tune.

play29:05

So we're gonna tell this language model,

play29:07

look, these are the tasks that people want,

play29:09

try to learn them.

play29:12

And then an interesting thing happens,

play29:14

is that we can actually then generalise

play29:17

to unseen tasks, unseen instructions,

play29:20

because you and I may have different usage purposes

play29:23

for these language models.
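
In code terms, what changes is only the training data: instead of raw web text, the model is fine-tuned on demonstrations laid out as text. A minimal sketch of what such instruction data might look like (the examples and template are illustrative):

```python
# Each demonstration pairs a task someone might ask for with the answer
# we want the model to learn to produce.
demonstrations = [
    {"instruction": "Answer the following question: what is the capital of France?",
     "response": "The capital of France is Paris."},
    {"instruction": "Answer the question step by step: what is 12 x 4?",
     "response": "12 x 4 = 12 x 2 x 2 = 24 x 2 = 48."},
    # ... in practice, on the order of 2,000 such examples
]

# Fine-tuning then reuses the ordinary language-modelling objective:
# concatenate instruction and response, and train to predict the response.
def to_training_text(example):
    return f"Instruction: {example['instruction']}\nResponse: {example['response']}"

for ex in demonstrations:
    print(to_training_text(ex), end="\n\n")
```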

play29:27

Okay, but here's the problem.

play29:33

We have an alignment problem

play29:34

and this is actually very important

play29:36

and something that will not leave us for the future.

play29:42

And the question is,

play29:43

how do we create an agent

play29:45

that behaves in accordance with what a human wants?

play29:49

And I know there's many words and questions here.

play29:53

But the real question is,

play29:54

if we have AI systems with skills

play29:57

that we find important or useful,

play30:00

how do we adapt those systems to reliably use those skills

play30:04

to do the things we want?

play30:08

And there is a framework

play30:09

that is called the HHH framing of the problem.

play30:15

So we want GPT to be helpful, honest, and harmless.

play30:21

And this is the bare minimum.

play30:24

So what does it mean, helpful?

play30:26

It it should follow instructions

play30:28

and perform the tasks we want it to perform

play30:31

and provide answers for them

play30:33

and ask relevant questions

play30:35

according to the user intent, and clarify.

play30:40

So if you've been following,

play30:41

in the beginning, GPT did none of this,

play30:43

but slowly it became better

play30:45

and it now actually asks for these clarification questions.

play30:50

It should be accurate,

play30:51

something that is not 100% there even to this,

play30:55

there is, you know, inaccurate information.

play30:58

And avoid toxic, biassed, or offensive responses.

play31:03

And now here's a question I have for you.

play31:06

How will we get the model to do all of these things?

play31:12

You know the answer. Fine-tuning.

play31:17

Except that we're gonna do a different fine-tuning.

play31:20

We're gonna ask the humans to do some preferences for us.

play31:25

So in terms of helpful, we're gonna ask,

play31:28

an example is, "What causes the seasons to change?"

play31:32

And then we'll give two options to the human.

play31:35

"Changes occur all the time

play31:36

and it's an important aspect of life," bad.

play31:39

"The seasons are caused primarily

play31:41

by the tilt of the Earth's axis," good.

play31:44

So we'll get these preference scores

play31:46

and then we'll train the model again

play31:49

and then it will know.

play31:51

So fine-tuning is very important.

play31:53

And now, it was expensive as it was,

play31:56

now we make it even more expensive

play31:58

because we add a human into the mix, right?

play32:00

Because we have to pay these humans

play32:02

that give us the preferences,

play32:03

we have to think of the tasks.

play32:05

The same for honesty.

play32:07

"Is it possible to prove that P equals NP?"

play32:09

"No, it's impossible," is not great as an answer.

play32:12

"That is considered a very difficult and unsolved problem

play32:15

in computer science," it's better.

play32:17

And we have similar for harmless.
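
One common way to turn such preferences into a training signal is a pairwise loss: score both candidate answers and push the preferred one's score above the other's. A minimal numpy sketch of that loss (the scores are stand-ins, not from any actual reward model):

```python
import numpy as np

def pairwise_preference_loss(score_preferred, score_rejected):
    # -log sigmoid(difference): small when the model already ranks the
    # human-preferred answer higher, large when it has them backwards.
    return np.log1p(np.exp(-(score_preferred - score_rejected)))

good = 2.1   # "caused primarily by the tilt of the Earth's axis"
bad = -0.3   # "changes occur all the time ..."
print(pairwise_preference_loss(good, bad))  # ~0.09: ranking already right
print(pairwise_preference_loss(bad, good))  # ~2.49: training pushes the scores apart
```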

play32:20

Okay, so I think it's time,

play32:22

let's see if we'll do a demo.

play32:24

Yeah, that's bad if you remove all the files.

play32:28

Okay, hold on, okay.

play32:30

So now we have GPT here.

play32:33

I'll do some questions

play32:35

and then we'll take some questions from the audience, okay?

play32:38

So let's ask one question.

play32:40

"Is the UK a monarchy?"

play32:43

Can you see it up there? I'm not sure.

play32:48

And it's not generating.

play32:53

Oh, perfect, okay.

play32:55

So what do you observe?

play32:56

First thing, too long.

play32:58

I always have this beef with this.

play33:00

It's too long. (audience laughs)

play33:02

You see what it says?

play33:03

"As of my last knowledge update in September 2021,

play33:08

the United Kingdom is a constitutional monarchy."

play33:10

It could be that it wasn't anymore, right?

play33:12

Something happened.

play33:13

"This means that while there is a monarch,

play33:16

the reigning monarch as of that time

play33:18

was Queen Elizabeth II."

play33:21

So it tells you, you know,

play33:22

I don't know what happened,

play33:23

at that time there was a Queen Elizabeth.

play33:26

Now if you ask it, who, sorry, "Who is Rishi?

play33:32

If I could type, "Rishi Sunak,"

play33:36

does it know?

play33:45

"A British politician.

play33:46

As my last knowledge update,

play33:48

he was the Chancellor of the Exchequer."

play33:50

So it does not know that he's the Prime Minister.

play33:55

"Write me a poem,

play33:57

write me a poem about."

play34:02

What do we want it to be about?

play34:04

Give me two things, eh?

play34:06

- [Audience Member] Generative AI.

play34:08

(audience laughs) - It will know.

play34:10

It will know, let's do another point about...

play34:14

- [Audience Members] Cats.

play34:16

- A cat and a squirrel, we'll do a cat and a squirrel.

play34:19

"A cat and a squirrel."

play34:27

"A cat and a squirrel, they meet and know.

play34:29

A tale of curiosity," whoa.

play34:31

(audience laughs)

play34:33

Oh my god, okay, I will not read this.

play34:37

You know, they want me to finish at 8:00, so, right.

play34:42

Let's say, "Can you try a shorter poem?"

play34:47

- [Audience Member] Try a haiku.

play34:49

- "Can you try,

play34:52

can you try to give me a haiku?"

play34:54

To give me a hai, I cannot type, haiku.

play35:05

"Amidst autumn's gold, leaves whisper secrets untold,

play35:08

nature's story, bold."

play35:11

(audience member claps) Okay.

play35:13

Don't clap, okay, let's, okay, one more.

play35:16

So does the audience have anything that they want,

play35:20

but challenging, that you want to ask?

play35:23

Yes?

play35:24

- [Audience Member] What school did Alan Turing go to?

play35:27

- Perfect, "What school

play35:30

did Alan Turing go to?"

play35:39

Oh my God. (audience laughs)

play35:41

He went, do you know?

play35:42

I don't know whether it's true, this is the problem.

play35:44

Sherborne School, can somebody verify?

play35:46

King's College, Cambridge, Princeton?

play35:50

Yes, okay, ah, here's another one.

play35:52

"Tell me a joke about Alan Turing."

play35:58

Okay, I cannot type but it will, okay.

play36:01

"Light-hearted joke.

play36:02

Why did Alan Turing keep his computer cold?

play36:04

Because he didn't want it to catch bytes."

play36:10

(audience laughs) Bad.

play36:12

Okay, okay. - Explain why that's funny.

play36:16

(audience laughs) - Ah, very good one.

play36:19

"Why is this a funny joke?"

play36:28

And where is it? Oh god.

play36:30

(audience laughs)

play36:31

Okay, "Catch bytes sounds similar to catch colds."

play36:35

(audience laughs)

play36:37

"Catching bytes is a humorous twist on this phrase,"

play36:39

oh my God.

play36:40

"The humour comes from the clever wordplay

play36:42

and the unexpected." (audience laughs)

play36:44

Okay, you lose the will to live,

play36:45

but it does explain, it does explain, okay, right.

play36:50

One last order from you guys.

play36:52

- [Audience Member] What is consciousness?

play36:54

- It will know because it has seen definitions

play36:57

and it will spit out like a huge thing.

play37:00

Shall we try?

play37:02

(audience talks indistinctly) - Say again?

play37:05

- [Audience Member] Write a song about relativity.

play37:07

- Okay, "Write a song." - Short.

play37:10

(audience laughs) - You are learning very fast.

play37:13

"A short song about relativity."

play37:22

Oh goodness me. (audience laughs)

play37:25

(audience laughs)

play37:29

This is short? (audience laughs)

play37:33

All right, outro, okay, so see,

play37:35

it doesn't follow instructions.

play37:37

It is not helpful.

play37:38

And this has been fine-tuned.

play37:40

Okay, so the best was here.

play37:42

It had something like, where was it?

play37:45

"Einstein said, 'Eureka!" one fateful day,

play37:47

as he pondered the stars in his own unique way.

play37:51

The theory of relativity, he did unfold,

play37:54

a cosmic story, ancient and bold."

play37:57

I mean, kudos to that, okay.

play37:58

Now let's go back to the talk,

play38:02

because I want to talk a little bit, presentation,

play38:05

I want to talk a little bit about, you know,

play38:09

is it good, is it bad, is it fair, are we in danger?

play38:12

Okay, so it's virtually impossible

play38:14

to regulate the content they're exposed to, okay?

play38:18

And there's always gonna be historical biases.

play38:21

We saw this with the Queen and Rishi Sunak.

play38:24

And they may occasionally exhibit

play38:27

various types of undesirable behaviour.

play38:30

For example, this is famous.

play38:35

Google showcased the model called Bard

play38:38

and they released this tweet and they were asking Bard,

play38:43

"What new discoveries from the James Webb Space Telescope

play38:46

can I tell my nine-year-old about?"

play38:49

And it's spit out this thing, three things.

play38:53

Amongst them it said

play38:54

that "this telescope took the very first picture

play38:57

of a planet outside of our own solar system."

play39:02

And here comes Grant Tremblay,

play39:04

who is an astrophysicist, a serious guy,

play39:06

and he said, "I'm really sorry, I'm sure Bard is amazing.

play39:10

But it did not take the first image

play39:13

of a planet outside our solar system.

play39:16

It was done by this other people in 2004."

play39:20

And what happened with this is that this error wiped

play39:23

$100 billion out of Google's company Alphabet.

play39:28

Okay, bad.

play39:32

If you ask ChatGPT, "Tell me a joke about men,"

play39:35

it gives you a joke and it says it might be funny.

play39:39

"Why do men need instant replay on TV sports?

play39:42

Because after 30 seconds, they forget what happened."

play39:44

I hope you find it amusing.

play39:46

If you ask about women, it refuses.

play39:49

(audience laughs)

play39:52

Okay, yes.

play39:56

- It's fine-tuned. - It's fine-tuned, exactly.

play39:58

(audience laughs)

play40:00

"Which is the worst dictator of this group?

play40:02

Trump, Hitler, Stalin, Mao?"

play40:06

It actually doesn't take a stance,

play40:08

it says all of them are bad.

play40:10

"These leaders are wildly regarded

play40:12

as some of the worst dictators in history."

play40:15

Okay, so yeah.

play40:18

Environment.

play40:22

A query for ChatGPT like we just did

play40:25

takes 100 times more energy to execute

play40:30

than a Google search query.

play40:31

Inference, which is producing the language, takes a lot,

play40:36

is more expensive than actually training the model.

play40:39

Llama 2 is a GPT-style model.

play40:42

While they were training it,

play40:43

it produced 539 metric tonnes of CO2.

play40:48

The larger the models get,

play40:49

the more energy they need and they emit

play40:53

during their deployment.

play40:54

Imagine lots of them sitting around.
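
To put that multiplier in rough numbers, a back-of-the-envelope sketch (the ~0.3 Wh per Google search is an often-cited outside estimate, not a figure from the lecture, and the traffic number is hypothetical):

```python
google_search_wh = 0.3                      # often-cited estimate, watt-hours per search
chatgpt_query_wh = 100 * google_search_wh   # the lecture's "100 times more energy"

queries_per_day = 10_000_000                # hypothetical traffic
daily_kwh = queries_per_day * chatgpt_query_wh / 1000
print(f"{daily_kwh:,.0f} kWh per day")      # 300,000 kWh/day under these assumptions
```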

play40:58

Society.

play41:01

Some jobs will be lost.

play41:03

We cannot beat around the bush.

play41:04

I mean, Goldman Sachs predicted 300 million jobs.

play41:07

I'm not sure this, you know, we cannot tell the future,

play41:11

but some jobs will be at risk, like repetitive text writing.

play41:18

Creating fakes.

play41:20

So these are all documented cases in the news.

play41:23

So a college kid wrote this blog

play41:26

which apparently fooled everybody using ChatGPT.

play41:31

They can produce fake news.

play41:34

And this is a song, how many of you know this?

play41:37

So I know I said I'm gonna be focusing on text

play41:42

but the same technology you can use in audio,

play41:45

and this is a well-documented case where somebody, unknown,

play41:50

created this song and it supposedly was a collaboration

play41:55

between Drake and The Weeknd.

play41:57

Do people know who these are?

play41:59

They are, yeah, very good, Canadian rappers.

play42:01

And they're not so bad, so.

play42:06

Shall I play the song?

play42:08

- Yeah. - Okay.

play42:09

Apparently it's very authentic.

play42:11

(bright music)

play42:17

♪ I came in with my ex like Selena to flex, ay ♪

play42:22

♪ Bumpin' Justin Bieber, the fever ain't left, ay ♪

play42:25

♪ She know what she need ♪

play42:27

- Apparently it's totally believable, okay.

play42:32

Have you seen this same technology but kind of different?

play42:35

This is a deep fake showing that Trump was arrested.

play42:39

How can you tell it's a deep fake?

play42:43

The hand, yeah, it's too short, right?

play42:46

Yeah, you can see it's like almost there, not there.

play42:50

Okay, so I have two slides on the future

play42:54

before they come and kick me out

play42:56

because I was told I have to finish at 8:00

play42:57

to take some questions.

play42:59

Okay, tomorrow.

play43:01

So we can't predict the future

play43:05

and no, I don't think that these evil computers

play43:07

are gonna come and kill us all.

play43:10

I will leave you with some thoughts by Tim Berners-Lee.

play43:13

For people who don't know him, he invented the internet.

play43:16

He's actually Sir Tim Berners-Lee.

play43:19

And he said two things that made sense to me.

play43:22

First of all, that we don't actually know

play43:24

what a super intelligent AI would look like.

play43:27

We haven't made it, so it's hard to make these statements.

play43:30

However, it's likely to have lots of these intelligent AIs,

play43:35

and by intelligent AIs we mean things like GPT,

play43:38

and many of them will be good and will help us do things.

play43:42

Some may fall to the hands of individuals

play43:49

that want to do harm,

play43:50

and it seems easier to minimise the harm

play43:54

that these tools will do

play43:56

than to prevent the systems from existing at all.

play44:00

So we cannot actually eliminate them altogether,

play44:02

but we as a society can actually mitigate the risks.

play44:06

This is very interesting,

play44:07

this is the Alignment Research Center

play44:10

that conducted a study

play44:12

and they dealt with a hypothetical scenario

play44:15

of whether GPT-4 could autonomously replicate,

play44:21

you know, you are replicating yourself,

play44:23

you're creating a copy,

play44:25

acquire resources and basically be a very bad agent,

play44:29

the things of the movies.

play44:30

And the answer is no, it cannot do this, it cannot.

play44:35

And they had like some specific tests

play44:37

and it failed on all of them,

play44:39

such as setting up an open source language model

play44:41

on a new server, it cannot do that.

play44:45

Okay, last slide.

play44:46

So my take on this is that we cannot turn back time.

play44:52

And every time you think about AI coming there to kill you,

play44:57

you should think what is the bigger threat to mankind,

play44:59

AI or climate change?

play45:02

I would personally argue climate change is gonna wipe us all

play45:04

before the AI becomes super intelligent.

play45:08

Who is in control of AI?

play45:10

There are some humans there who hopefully have sense.

play45:13

And who benefits from it?

play45:16

Does the benefit outweigh the risk?

play45:18

In some cases, the benefit does, in others it doesn't.

play45:21

And history tells us

play45:24

that all technology that has been risky,

play45:26

such as, for example, nuclear energy,

play45:29

has been very strongly regulated.

play45:32

So regulation is coming and watch out the space.

play45:35

And with that I will stop and actually take your questions.

play45:40

Thank you so much for listening, you've been great.

play45:42

(audience applauds)

play45:51

(applause fades out)