What is generative AI and how does it work? – The Turing Lectures with Mirella Lapata

The Royal Institution
12 Oct 2023 · 46:02

Summary

TL;DR: The speaker explains the history and current capabilities of generative AI systems like ChatGPT, which can generate human-like text. She covers how these systems work via neural network "language models" that predict text continuations. After illustrating ChatGPT responses, she notes challenges around aligning AI system goals with human values. She concludes that while risks exist, regulation and mitigation may prevent harm, arguing that climate change poses a bigger threat than AI.

Takeaways

  • 👩‍💻 Generative AI like ChatGPT has been around for a while (e.g. Google Translate), but recent models are more sophisticated
  • 🧠 Language models are trained to predict the next word given context, allowing text generation
  • ⚙️ Transformers have become the go-to model architecture for language tasks
  • 📈 Model performance improves with scale - more data, parameters, compute
  • 💰 Developing models like GPT-4 is extremely expensive ($100 million+)
  • 🔧 Models still need task-specific fine-tuning to be helpful, honest and harmless
  • 😀 Benefits like text generation and question answering can be very useful
  • ❓ Risks exist around bias, fakes and environmental impact
  • 🛡️ Regulation and risk mitigation strategies are being developed
  • 🔮 The future of AI will likely involve many task-specific models

Outlines

00:00

🎤 Introduction and Background 🎤

The speaker introduces herself and explains what generative AI is. She gives examples like Google Translate and Siri which have been around for years. The audience will participate to make it interactive.

05:03

📈 History: Evolution of Generative AI 📈

The speaker explains how we went from single purpose systems like Google Translate to more sophisticated ones like ChatGPT. She talks about the core technology behind ChatGPT which is language modeling - predicting the next word given previous words.

10:06

🧠 How Language Models Work 🧠

The speaker explains how language models are trained - on massive amounts of text data from the web. She illustrates with a simple neural network how models make predictions. Modern models use more complex transformer architectures.

15:08

🔬 Why Bigger Models Work Better 🔬

The speaker shows graphs depicting the exponential growth in model sizes and data used to train them. Bigger models can perform more tasks. Fine-tuning specializes them for specific tasks.

20:09

⚖️ Aligning Models to Human Preferences ⚖️

The speaker explains the challenge of aligning AI to human preferences. Models need to be helpful, honest and harmless. This requires further fine-tuning based on human feedback.

25:09

🤩 ChatGPT Demo and Limitations 🤩

The speaker does a live demo with ChatGPT, showing it can answer questions but has limitations in accuracy and following instructions.

30:10

🤔 Reflections on Societal Impact 🤔

The speaker reflects on concerns like bias, environmental impact, jobs at risk and the creation of fakes. But the benefits may outweigh the risks, and regulation can help mitigate harm.

35:12

🌄 The Future of AI 🌄

The speaker concludes that AI like ChatGPT is unlikely to become super intelligent soon. Climate change is a bigger threat. Regulation and social responsibility are key.

Keywords

💡generative AI

Generative AI refers to artificial intelligence that can create new content such as text, images, audio or video. In the video, the speaker focuses on generative AI that creates text. She gives examples of generative AI including Google Translate, auto-text completion, and ChatGPT, which can write emails, code, web pages etc. based on a prompt from the user.

💡language model

A language model is a key component of generative AI systems like ChatGPT. It tries to predict the next word(s) that are likely to appear given the previous words and context. The video explains how language models are trained on huge text datasets to learn these word probabilities.
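As a rough illustration of the idea, a next-word predictor can be sketched as a toy counting model over a tiny corpus. (The corpus and words below are invented for the example; real systems like GPT use neural networks trained on web-scale text, not counts.)

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the web-scale text a real model is trained on.
corpus = [
    "i want to play tennis",
    "i want to play video games",
    "i want to eat fruit",
]

# Count, for every word, which words follow it and how often.
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequently seen continuation of `word`."""
    return follows[word].most_common(1)[0][0]

print(predict_next("to"))  # "play" appears twice after "to", "eat" only once
```

This is the "counting" approach the lecture says came before neural networks; the principle (pick the most likely continuation given the context) is the same.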

💡transformer

Transformers are a type of neural network architecture that is used to build language models like GPT. They are composed of encoder and decoder blocks and can process entire sequences of text input. The video describes transformers as the "king of architectures" used in state-of-the-art generative AI.

💡fine-tuning

Fine-tuning refers to further training of a pretrained language model on a downstream task-specific dataset. This adapts the model to a particular domain or use case. The video emphasizes the importance of fine-tuning for alignment with human preferences.
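Continuing the toy counting picture, fine-tuning can be sketched as starting from a model trained on general text and then continuing training on domain text, so that domain continuations take over. (All corpora and words below are invented for the example; real fine-tuning updates neural network weights, not counts.)

```python
from collections import Counter, defaultdict

def train(model, sentences):
    """One 'training' pass: update continuation counts from sentences."""
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def most_likely(model, word):
    """The most frequently seen continuation of `word`."""
    return model[word].most_common(1)[0][0]

# Pre-training on generic text: "patient" is usually followed by "waited".
general_corpus = [
    "the patient waited for the doctor",
    "the patient waited in the room",
]
model = train(defaultdict(Counter), general_corpus)
print(most_likely(model, "patient"))  # "waited"

# Fine-tuning: keep the pre-trained counts, continue training on medical text.
medical_corpus = [
    "the patient presented with fever",
    "the patient presented with chest pain",
    "the patient presented with fatigue",
]
train(model, medical_corpus)
print(most_likely(model, "patient"))  # now "presented"
```

The general knowledge is retained (the counts are kept, not discarded), but the specialised data now dominates the predictions for the medical use case.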

💡scaling

In the AI field currently, model performance improves dramatically with scale (billions of parameters and huge datasets). The video shows how more advanced capabilities are unlocked in models like GPT as their scale increases.
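Parameter count is the usual measure of scale. For a fully connected network, the lecture's rule of thumb is: for each pair of adjacent layers, the weights (input units × units in the next layer) plus one bias per unit. A minimal sketch; the [5, 8, 3] shape below is a hypothetical example (the lecture's own diagram, with its particular hidden layers, adds up to 99):

```python
def count_parameters(layer_sizes):
    """Trainable parameters of a fully connected network: for each pair of
    adjacent layers, n_in * n_out weights plus one bias per output unit."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# Hypothetical shape: 5 inputs, one hidden layer of 8 units, 3 outputs.
print(count_parameters([5, 8, 3]))  # 5*8 + 8  +  8*3 + 3  =  75

# The count explodes as layers widen, which is why modern models
# reach billions of parameters.
print(count_parameters([1000, 1000, 1000, 1000]))
```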

💡helpful, honest, harmless (HHH)

The video introduces the HHH framing - AI systems should be helpful, honest and harmless. Ensuring AI behavior aligns with human preferences is an active area of research. Fine-tuning on human feedback is one approach.

💡bias

Like all machine learning systems, biases in the data can lead to biased model behavior. The video gives examples of gender bias and outdated factual knowledge in ChatGPT.

💡energy usage

Training and running ever-larger AI models consumes huge amounts of energy. The video highlights the high carbon emissions associated with models like GPT.

💡job displacement

The video notes that generative AI could automate some human jobs involving repetitive text generation. This may negatively impact certain professions.

💡regulation

Given risks like bias and job loss, responsible regulation of generative AI technology is needed. The video suggests regulation may follow the pattern of other potentially dangerous technologies like nuclear power.

Highlights

Generative AI is not a new concept; it has been around for a while in tools like Google Translate and Siri

ChatGPT is based on the principle of language modeling - predicting the next word given previous words as context

To build a language model: collect lots of text data, truncate sentences, predict missing words, adjust model till accurate

Bigger models with more parameters perform better at more tasks - model size is key to progress in AI

GPT-4 cost $100 million to develop - mistakes are hugely expensive at this scale

Fine-tuning adapts the general pre-trained model to specific tasks using human preferences and instructions

Alignment remains an issue - how to ensure AI systems use their skills to do what humans want

Helpfulness, honesty and harmlessness are key factors for reliability

Inference takes more energy than training these large models - sustainability issues

Some job loss is likely from automation of repetitive text tasks

Fakes are already being generated e.g. college essay, fake collaboration songs

Regulation will come, as with past technologies like nuclear power

Climate change likely poses a bigger threat than AI

Humans are still in control of deploying and using AI systems

History shows risky technology gets regulated, but turning back time is impossible

Transcripts

00:00 (gentle music jingle) (audience applauding)

00:12 - Whoa, so many of you. Good, okay, thank you for that lovely introduction. Right, so, what is generative artificial intelligence? So I'm gonna explain what artificial intelligence is and I want this to be a bit interactive so there will be some audience participation. The people here who hold this lecture said to me, "Oh, you are very low-tech for somebody working on AI." I don't have any explosions or any experiments, so I'm afraid you'll have to participate, I hope that's okay.

00:46 All right, so, what is generative artificial intelligence? So the term is made up by two things, artificial intelligence and generative. So artificial intelligence is a fancy term for saying we get a computer programme to do the job that a human would otherwise do. And generative, this is the fun bit, we are creating new content that the computer has not necessarily seen, it has seen parts of it, and it's able to synthesise it and give us new things.

01:21 So what would this new content be? It could be audio, it could be computer code so that it writes a programme for us, it could be a new image, it could be a text, like an email or an essay you've heard, or video.

01:36 Now in this lecture I'm only gonna be mostly focusing on text because I do natural language processing and this is what I know about, and we'll see how the technology works and hopefully leaving the lecture you'll know how, like there's a lot of myth around it and it's not, you'll see what it does and it's just a tool, okay?

02:02 Right, so the outline of the talk, there's three parts and it's kind of boring. This is Alice Morse Earle. I do not expect that you know the lady. She was an American writer and she writes about memorabilia and customs, but she's famous for her quotes. So she's given us this quote here that says, "Yesterday's history, tomorrow is a mystery, today is a gift, and that's why it's called the present." It's a very optimistic quote. And the lecture is basically the past, the present, and the future of AI.

02:37 Okay, so what I want to say right at the front is that generative AI is not a new concept. It's been around for a while. So how many of you have used or are familiar with Google Translate? Can I see a show of hands? Right, who can tell me when Google Translate launched for the first time?

03:05 - 1995? - Oh, that would've been good. 2006, so it's been around for 17 years and we've all been using it. And this is an example of generative AI. Greek text comes in, I'm Greek, so you know, pay some juice to the... (laughs) Right, so Greek text comes in, English text comes out. And Google Translate has served us very well for all these years and nobody was making a fuss.

03:35 Another example is Siri on the phone. Again, Siri launched 2011, 12 years ago, and it was a sensation back then. It is another example of generative AI. We can ask Siri to set alarms and Siri talks back and oh how great it is and then you can ask about your alarms and whatnot. This is generative AI. Again, it's not as sophisticated as ChatGPT, but it was there. And I don't know how many have an iPhone? See, iPhones are quite popular, I don't know why.

04:15 Okay, so, we are all familiar with that. And of course later on there was Amazon Alexa and so on. Okay, again, generative AI is not a new concept, it is everywhere, it is part of your phone. The completion when you're sending an email or when you're sending a text. The phone attempts to complete your sentences, attempts to think like you and it saves you time, right? Because some of the completions are there. The same with Google, when you're trying to type it tries to guess what your search term is. This is an example of language modelling, we'll hear a lot about language modelling in this talk. So basically we're making predictions of what the continuations are going to be.

05:02 So what I'm telling you is that generative AI is not that new. So the question is, what is the fuss, what happened? So in 2023, OpenAI, which is a company in California, in fact, in San Francisco. If you go to San Francisco, you can even see the lights at night of their building. It announced GPT-4 and it claimed that it can beat 90% of humans on the SAT. For those of you who don't know, SAT is a standardised test that American school children have to take to enter university, it's an admissions test, and it's multiple choice and it's considered not so easy. So GPT-4 can do it. They also claimed that it can get top marks in law, medical exams and other exams, they have a whole suite of things that they claim, well, not they claim, they show that GPT-4 can do it.

06:03 Okay, aside from that, it can pass exams, we can ask it to do other things. So you can ask it to write text for you. For example, you can have a prompt, this little thing that you see up there, it's a prompt. It's what the human wants the tool to do for them. And a potential prompt could be, "I'm writing an essay about the use of mobile phones during driving. Can you gimme three arguments in favour?" This is quite sophisticated. If you asked me, I'm not sure I can come up with three arguments. You can also do, and these are real prompts that actually the tool can do. You tell ChatGPT or GPT in general, "Act as a JavaScript developer. Write a programme that checks the information on a form. Name and email are required, but address and age are not." So I'm just writing this and the tool will spit out a programme. And this is the best one. "Create an About Me page for a website. I like rock climbing, outdoor sports, and I like to programme. I started my career as a quality engineer in the industry, blah, blah, blah." So I give this version of what I want the website to be and it will create it for me.

07:19 So, you see, we've gone from Google Translate and Siri and the auto-completion to something which is a lot more sophisticated and can do a lot more things. Another fun fact. So this is a graph that shows the time it took for ChatGPT to reach 100 million users compared with other tools that have been launched in the past. And you see our beloved Google Translate, it took 78 months to reach 100 million users, a long time. TikTok took nine months and ChatGPT, two. So within two months they had 100 million users and these users pay a little bit to use the system, so you can do the multiplication and figure out how much money they make.

08:17 Okay, so this is the history part. So how did we make ChatGPT? What is the technology behind this? The technology it turns out is not extremely new or extremely innovative or extremely difficult to comprehend. So we'll talk about that today now. So we'll address three questions. First of all, how did we get from the single-purpose systems like Google Translate to ChatGPT, which is more sophisticated and does a lot more things? And in particular, what is the core technology behind ChatGPT and what are the risks, if there are any? And finally, I will just show you a little glimpse of the future and how it's gonna look like and whether we should be worried or not and you know, I won't leave you hanging, please don't worry, okay?

09:17 Right, so, all this GPT model variants, and there is a cottage industry out there, I'm just using GPT as an example because the public knows and there have been a lot of, you know, news articles about it, but there's other models, other variants of models that we use in academia. And they all work on the same principle, and this principle is called language modelling. What does language modelling do? It assumes we have a sequence of words. The context so far. And we saw this context in the completion, and I have an example here. Assuming my context is the phrase "I want to," the language modelling tool will predict what comes next.

10:05 So if I tell you "I want to," there is several predictions. I want to shovel, I want to play, I want to swim, I want to eat. And depending on what we choose, whether it's shovel or play or swim, there is more continuations. So for shovel, it will be snow, for play, it can be tennis or video, swim doesn't have a continuation, and for eat, it will be lots and fruit.

10:31 Now this is a toy example, but imagine now that the computer has seen a lot of text and it knows what words follow which other words. We used to count these things. So I would go, I would download a lot of data and I would count, "I want to shovel," how many times does it appear and what are the continuations? And we would have counts of these things. And all of this has gone out of the window right now and we use neural networks that don't exactly count things, but predict, learn things in a more sophisticated way, and I'll show you in a moment how it's done.

11:12 So ChatGPT and GPT variants are based on this principle of I have some context, I will predict what comes next. And that's the prompt, the prompt that I gave you, these things here, these are prompts, this is the context, and then it needs to do the task. What would come next? In some cases it would be the three arguments. In the case of the web developer, it would be a webpage.

11:42 Okay, the task of language modelling is we have the context, and this changed the example now. It says "The colour of the sky is." And we have a neural language model, this is just an algorithm, that will predict what is the most likely continuation, and likelihood matters. These are all predicated on actually making guesses about what's gonna come next. And that's why sometimes they fail, because they predict the most likely answer whereas you want a less likely one. But this is how they're trained, they're trained to come up with what is most likely. Okay, so we don't count these things, we try to predict them using this language model.

12:29 So how would you build your own language model? This is a recipe, this is how everybody does this. So, step one, we need a lot of data. We need to collect a ginormous corpus. So these are words. And where will we find such a ginormous corpus? I mean, we go to the web, right? And we download the whole of Wikipedia, Stack Overflow pages, Quora, social media, GitHub, Reddit, whatever you can find out there. I mean, work out the permissions, it has to be legal. You download all this corpus.

13:09 And then what do you do? Then you have this language model. I haven't told you what exactly this language model is, there is an example, and I haven't told you what the neural network that does the prediction is, but assuming you have it. So you have this machinery that will do the learning for you and the task now is to predict the next word, but how do we do it? And this is the genius part. We have the sentences in the corpus. We can remove some of them and we can have the language model predict the sentences we have removed. This is dead cheap. I just remove things, I pretend they're not there, and I get the language model to predict them.

13:52 So I will randomly truncate, truncate means remove, the last part of the input sentence. I will calculate with this neural network the probability of the missing words. If I get it right, I'm good. If I'm not right, I have to go back and re-estimate some things because obviously I made a mistake, and I keep going. I will adjust and feedback to the model and then I will compare what the model predicted to the ground truth because I've removed the words in the first place so I actually know what the real truth is. And we keep going for some months or maybe years. No, months, let's say. So it will take some time to do this process because as you can appreciate I have a very large corpus and I have many sentences and I have to do the prediction and then go back and correct my mistake and so on. But in the end, the thing will converge and I will get my answer.
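This recipe — truncate the sentence, predict the missing word, compare against the ground truth, adjust, and repeat — can be sketched with a toy counting model standing in for the neural network. (The sentences below are invented for the example; a real system adjusts billions of weights rather than counts.)

```python
from collections import Counter, defaultdict

# A tiny corpus; the last word of each sentence will be held out.
corpus = [
    "the colour of the sky is blue",
    "the colour of the grass is green",
    "the colour of the sky is blue",
]

model = defaultdict(Counter)  # context word -> counts of following words

def predict(word):
    """Guess the most likely continuation, or None if unseen."""
    return model[word].most_common(1)[0][0] if model[word] else None

for epoch in range(2):
    correct = 0
    for sentence in corpus:
        *context, truth = sentence.split()   # truncate the last word
        guess = predict(context[-1])         # predict the missing word
        if guess == truth:                   # compare to the ground truth
            correct += 1
        model[context[-1]][truth] += 1       # adjust the model
    print(f"epoch {epoch}: {correct}/{len(corpus)} correct")
```

Because the removed words are known, no human labelling is needed, which is why the lecture calls this training "dead cheap": the supervision signal comes from the text itself.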

14:46 So the tool in the middle that I've shown, this tool here, this language model, a very simple language model looks a bit like this. And maybe the audience has seen these, this is a very naive graph, but it helps to illustrate the point of what it does. So this neural network language model will have some input which is these nodes in the, as we look at it, well, my right and your right, okay. So the nodes here on the right are the input and the nodes at the very left are the output. So we will present this neural network with five inputs, the five circles, and we have three outputs, the three circles. And there is stuff in the middle that I didn't say anything about. These are layers. These are more nodes that are supposed to be abstractions of my input. So they generalise. The idea is if I put more layers on top of layers, the middle layers will generalise the input and will be able to see patterns that are not there. So you have these nodes and the input to the nodes are not exactly words, they're vectors, so series of numbers, but forget that for now.

16:13 So we have some input, we have some layers in the middle, we have some output. And this now has these connections, these edges, which are the weights, this is what the network will learn. And these weights are basically numbers, and here it's all fully connected, so I have very many connections. Why am I going through this process of actually telling you all of that? You will see in a minute. So you can work out how big or how small this neural network is depending on the numbers of connections it has.

16:51 So for this toy neural network we have here, I have worked out the number of weights, we call them also parameters, that this neural network has and that the model needs to learn. So the parameters are the number of units as input, in this case it's 5, times the units in the next layer, 8. Plus 8, this plus 8 is a bias, it's a cheating thing that these neural networks have. Again, you need to learn it and it sort of corrects a little bit the neural network if it's off. It's actually genius. If the prediction is not right, it tries to correct it a little bit. So for the purposes of this talk, I'm not going to go into the details, all I want you to see is that there is a way of working out the parameters, which is basically the number of input units times the units my input is going to, and for this fully connected network, if we add up everything, we come up with 99 trainable parameters, 99. This is a small network for all purposes, right? But I want you to remember this, this small network is 99 parameters. When you hear this network is a billion parameters, I want you to imagine how big this will be, okay? So 99 only for this toy neural network. And this is how we judge how big the model is, how long it took and how much it cost, it's the number of parameters.

18:27 In reality, in reality, though, no one is using this network. Maybe in my class, if I have a first year undergraduate class and I introduce neural networks, I will use this as an example. In reality, what people use is these monsters that are made of blocks, and what block means they're made of other neural networks. So I don't know how many people have heard of transformers. I hope no one. Oh wow, okay. So transformers are these neural networks that we use to build ChatGPT. And in fact GPT stands for generative pre-trained transformers. So transformer is even in the title.

19:15 So this is a sketch of a transformer. So you have your input and the input is not words, like I said, here it says embeddings, embeddings is another word for vectors. And then you will have this, a bigger version of this network, multiplied into these blocks. And each block is this complicated system that has some neural networks inside it. We're not gonna go into the detail, I don't want, I please don't go, all I'm trying, (audience laughs) all I'm trying to say is that, you know, we have these blocks stacked on top of each other, the transformer has eight of those, which are mini neural networks, and this task remains the same. That's what I want you to take out of this. Input goes in the context, "the chicken walked," we're doing some processing, and our task is to predict the continuation, which is "across the road." And this EOS means end of sentence because we need to tell the neural network that our sentence finished. I mean they're kind of dumb, right? We need to tell them everything. When I hear like AI will take over the world, I go like, Really? We have to actually spell it out.

20:33 Okay, so, this is the transformer, the king of architectures, the transformers came in 2017. Nobody's working on new architectures right now. It is a bit sad, like everybody's using these things. They used to be like some pluralism but now no, everybody's using transformers, we've decided they're great.
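The prediction task itself — context in, continuation out, stopping at the end-of-sentence marker — can be sketched with a hand-coded continuation table standing in for a trained transformer. (The table and markers below are invented for the example, echoing the lecture's "the chicken walked" → "across the road" illustration.)

```python
# Most-likely continuations, keyed on the last two words of the context;
# "<EOS>" is the end-of-sentence marker the lecture mentions.
next_word = {
    ("<BOS>", "the"): "chicken",
    ("the", "chicken"): "walked",
    ("chicken", "walked"): "across",
    ("walked", "across"): "the",
    ("across", "the"): "road",
    ("the", "road"): "<EOS>",
}

def generate(context):
    """Greedily extend the context one word at a time until <EOS>."""
    words = ["<BOS>"] + context.split()
    while words[-1] != "<EOS>":
        words.append(next_word[(words[-2], words[-1])])
    return " ".join(words[1:-1])  # drop the <BOS>/<EOS> markers

print(generate("the chicken walked"))  # "the chicken walked across the road"
```

The generation loop literally cannot stop without the marker, which is the point of the lecture's aside that these models must be told everything, even where a sentence ends.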

play20:54

Okay, so, what we're gonna do with this,

play20:58

and this is kind of important and the amazing thing,

play21:01

is we're gonna do self-supervised learning.

play21:03

And this is what I said,

play21:04

we have the sentence, we truncate, we predict,

play21:08

and we keep going till we learn these probabilities.

play21:12

Okay? You're with me so far?

play21:15

Good, okay, so,

play21:18

once we have our transformer

play21:21

and we've given it all this data that there is in the world,

play21:26

then we have a pre-trained model.

play21:28

That's why GPT is called

play21:30

the generative pre-trained transformer.

play21:32

This is a baseline model that we have

play21:35

and has seen a lot of things about the world

play21:39

in the form of text.

play21:40

And then what we normally do,

play21:42

we have this general purpose model

play21:44

and we need to specialise it somehow

play21:46

for a specific task.

play21:48

And this is what is called fine-tuning.

play21:50

So that means that the network has some weights

play21:54

and we have to specialise the network.

play21:57

We'll take, initialise the weights

play21:59

with what we know from the pre-training,

play22:01

and then in the specific task we will narrow

play22:03

a new set of weights.

play22:05

So for example, if I have medical data,

play22:09

I will take my pre-trained model,

play22:11

I will specialise it to this medical data,

play22:14

and then I can do something that is specific for this task,

play22:18

which is, for example, write a diagnosis from a report.

play22:22

Okay, so this notion of fine-tuning is very important

play22:27

because it allows us to do special-purpose applications

play22:31

for these generic pre-trained models.

play22:35

Now, and people think that GPT and all of these things

play22:37

are general purpose,

play22:39

but they are fine-tuned to be general purpose

play22:42

and we'll see how.

play22:45

Okay, so, here's the question now.

play22:49

We have this basic technology to do this pre-training

play22:52

and I told you how to do it, if you download all of the web.

play22:56

How good can a language model become, right?

play22:59

How does it become great?

play23:01

Because when GPT came out in GPT-1 and GPT-2,

play23:06

they were not amazing.

play23:09

So the bigger, the better.

play23:13

Size is all that matters, I'm afraid.

play23:15

This is very bad because we used to, you know,

play23:18

people didn't believe in scale

play23:19

and now we see that scale is very important.

play23:22

So, since 2018,

play23:25

we've witnessed an absolutely extreme increase

play23:32

in model sizes.

play23:34

And I have some graphs to show this.

play23:36

Okay, I hope people at the back can see this graph.

play23:39

Yeah, you should be all right.

play23:40

So this graph shows

play23:45

the number of parameters.

play23:47

Remember, the toy neural network had 99.

play23:50

The number of parameters that these models have.

play23:54

And we start with a normal amount.

play23:57

Well, normal for GPT-1.

play23:58

And we go up to GPT-4,

play24:01

which has one trillion parameters.

play24:07

Huge, one trillion.

play24:10

This is a very, very, very big model.

play24:12

And you can see here the ant brain and the rat brain

play24:16

and we go up to the human brain.

play24:19

The human brain has,

play24:23

not a trillion,

play24:24

100 trillion parameters.

play24:27

So we are a bit off,

play24:30

we're not at the human brain level yet

play24:32

and maybe we'll never get there

play24:34

and we can't compare GPT to the human brain

play24:37

but I'm just giving you an idea of how big this model is.

play24:42

Now what about the words it's seen?

play24:46

So this graph shows us the number of words

play24:48

processed by these language models during their training

play24:52

and you will see that there has been an increase,

play24:55

but the increase has not been as big as the parameters.

play25:00

So the community started focusing

play25:04

on the parameter size of these models,

play25:06

whereas in fact we now know

play25:08

that it needs to see a lot of text as well.
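
Her point that parameter counts grew faster than training data matches a now-standard rule of thumb from DeepMind's "Chinchilla" scaling-law work (a reference added here, not named in the lecture): compute-optimal training wants roughly 20 tokens per parameter. A back-of-the-envelope sketch, with commonly quoted model sizes used purely for illustration:

```python
# Back-of-the-envelope check of "models grew faster than the data they saw".
# The 20-tokens-per-parameter heuristic comes from DeepMind's Chinchilla
# scaling-law paper; the parameter counts below are the commonly quoted
# sizes and are used purely for illustration.

TOKENS_PER_PARAM = 20  # Chinchilla rule of thumb

def compute_optimal_tokens(n_params):
    """Roughly how many training tokens a model of this size 'wants'."""
    return TOKENS_PER_PARAM * n_params

for name, n_params in [("GPT-1", 117e6), ("GPT-3", 175e9)]:
    print(f"{name}: ~{compute_optimal_tokens(n_params) / 1e9:,.0f}B tokens")
```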

play25:11

So GPT-4 has seen approximately,

play25:16

I don't know, a few billion words.

play25:19

All the human written text is I think 100 billion,

play25:24

so it's sort of approaching this.

play25:28

You can also see what a human reads in their lifetime,

play25:32

it's a lot less.

play25:34

Even if they read, you know,

play25:35

because people nowadays, you know,

play25:37

they read but they don't read fiction,

play25:39

they read their phone, anyway.

play25:41

You see the English Wikipedia,

play25:42

so we are approaching the level of

play25:46

the text that is out there that we can get.

play25:49

And in fact, one may say, well, GPT is great,

play25:52

you can actually use it to generate more text

play25:54

and then use this text that GPT has generated

play25:56

and then retrain the model.

play25:58

But we know this text is not exactly right

play26:00

and in fact there are diminishing returns,

play26:03

so we're gonna plateau at some point.

play26:06

Okay, how much does it cost?

play26:10

Now, okay, so GPT-4 cost

play26:16

$100 million, okay?

play26:21

So when should they start doing it again?

play26:25

So obviously this is not a process you have to do

play26:28

over and over again.

play26:29

You have to think very well

play26:31

and if you make a mistake, you've lost like $50 million.

play26:38

You can't start again so you have to be very sophisticated

play26:41

as to how you engineer the training

play26:43

because a mistake costs money.

play26:47

And of course not everybody can do this,

play26:48

not everybody has $100 million.

play26:51

They can do it because they have Microsoft backing them,

play26:54

not everybody, okay.

play26:58

Now this is a video that is supposed to play and illustrate,

play27:01

let's see if it will work,

play27:03

the effects of scaling, okay.

play27:06

So I will play it once more.

play27:09

So these are tasks that you can do

play27:12

and it's the number of tasks

play27:15

against the number of parameters.

play27:18

So we start with 8 billion parameters

play27:20

and we can do a few tasks.

play27:23

And then the tasks increase, so summarization,

play27:27

question answering, translation.

play27:30

And once we move to 540 billion parameters,

play27:35

we have more tasks.

play27:36

We start with very simple ones,

play27:39

like code completion.

play27:42

And then we can do reading comprehension

play27:45

and language understanding and translation.

play27:47

So you get the picture, the tree flourishes.

play27:51

So this is what people discovered with scaling.

play27:54

If you scale the language model, you can do more tasks.

play27:58

Okay, so now.

play28:04

Maybe we are done.

play28:07

But what people discovered is if you actually take GPT

play28:12

and you put it out there,

play28:14

it actually doesn't behave like people want it to behave

play28:18

because this is a language model trained to predict

play28:21

and complete sentences

play28:22

and humans want to use GPT for other things

play28:26

because they have their own tasks

play28:29

that the developers hadn't thought of.

play28:31

So then the notion of fine-tuning comes in,

play28:35

it never left us.

play28:37

So now what we're gonna do

play28:39

is we're gonna collect a lot of instructions.

play28:42

So instructions are examples

play28:44

of what people want ChatGPT to do for them,

play28:47

such as answer the following question,

play28:50

or answer the question step by step.

play28:54

And so we're gonna give these demonstrations to the model,

play28:58

and in fact, almost 2,000 such examples,

play29:03

and we're gonna fine-tune.

play29:05

So we're gonna tell this language model,

play29:07

look, these are the tasks that people want,

play29:09

try to learn them.

play29:12

And then an interesting thing happens,

play29:14

is that we can actually then generalise

play29:17

to unseen tasks, unseen instructions,

play29:20

because you and I may have different usage purposes

play29:23

for these language models.
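
Mechanically, the instruction-collection step she describes is just more next-word training on demonstrations rendered into plain text. A minimal sketch; the prompt template and the two demonstrations are made up, not OpenAI's actual format:

```python
# Minimal sketch of instruction fine-tuning: gather (instruction, response)
# demonstrations, render them with a fixed template, and continue ordinary
# next-word training on the resulting text. Template and examples are
# illustrative only.

demonstrations = [
    {"instruction": "Answer the following question: What is the capital of France?",
     "response": "Paris."},
    {"instruction": "Answer the question step by step: What is 12 * 3?",
     "response": "12 * 3 = 12 + 12 + 12 = 36."},
]

TEMPLATE = "### Instruction:\n{instruction}\n### Response:\n{response}"

def to_training_text(demos):
    """Render demonstrations into the plain text the model then trains on."""
    return [TEMPLATE.format(**d) for d in demos]

corpus = to_training_text(demonstrations)
print(corpus[0])
```

Because the model only ever sees text, generalising to unseen instructions is the same language modelling it already does, now conditioned on the instruction pattern.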

play29:27

Okay, but here's the problem.

play29:33

We have an alignment problem

play29:34

and this is actually very important

play29:36

and something that will not leave us for the future.

play29:42

And the question is,

play29:43

how do we create an agent

play29:45

that behaves in accordance with what a human wants?

play29:49

And I know there's many words and questions here.

play29:53

But the real question is,

play29:54

if we have AI systems with skills

play29:57

that we find important or useful,

play30:00

how do we adapt those systems to reliably use those skills

play30:04

to do the things we want?

play30:08

And there is a framework

play30:09

that is called the HHH framing of the problem.

play30:15

So we want GPT to be helpful, honest, and harmless.

play30:21

And this is the bare minimum.

play30:24

So what does it mean, helpful?

play30:26

It should follow instructions

play30:28

and perform the tasks we want it to perform

play30:31

and provide answers for them

play30:33

and ask relevant questions

play30:35

according to the user intent, and clarify.

play30:40

So if you've been following,

play30:41

in the beginning, GPT did none of this,

play30:43

but slowly it became better

play30:45

and it now actually asks for these clarification questions.

play30:50

It should be accurate,

play30:51

something that is not 100% there even to this day,

play30:55

there is, you know, inaccurate information.

play30:58

And avoid toxic, biased, or offensive responses.

play31:03

And now here's a question I have for you.

play31:06

How will we get the model to do all of these things?

play31:12

You know the answer. Fine-tuning.

play31:17

Except that we're gonna do a different fine-tuning.

play31:20

We're gonna ask the humans to do some preferences for us.

play31:25

So in terms of helpful, we're gonna ask,

play31:28

an example is, "What causes the seasons to change?"

play31:32

And then we'll give two options to the human.

play31:35

"Changes occur all the time

play31:36

and it's an important aspect of life," bad.

play31:39

"The seasons are caused primarily

play31:41

by the tilt of the Earth's axis," good.

play31:44

So we'll get these preference scores

play31:46

and then we'll train the model again

play31:49

and then it will know.

play31:51

So fine-tuning is very important.

play31:53

And now, it was expensive as it was,

play31:56

now we make it even more expensive

play31:58

because we add a human into the mix, right?

play32:00

Because we have to pay these humans

play32:02

that give us the preferences,

play32:03

we have to think of the tasks.

play32:05

The same for honesty.

play32:07

"Is it possible to prove that P equals NP?"

play32:09

"No, it's impossible," is not great as an answer.

play32:12

"That is considered a very difficult and unsolved problem

play32:15

in computer science," it's better.

play32:17

And we have similar for harmless.
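
In the published RLHF recipe (a detail added here; the lecture only says "train the model again"), those human preferences train a reward model with a pairwise loss: the preferred answer should score higher than the rejected one. A toy sketch with made-up scores:

```python
import math

# Toy sketch of learning from preference pairs: a reward model scores each
# answer, and the standard pairwise (Bradley-Terry) loss below is small
# when the human-preferred answer already outscores the rejected one.
# The scores are invented numbers, not real model outputs.

def preference_loss(score_preferred, score_rejected):
    """-log sigmoid(preferred - rejected)."""
    margin = score_preferred - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# "What causes the seasons to change?"
good = 2.0   # "...the tilt of the Earth's axis"  (human-preferred)
bad = -1.0   # "Changes occur all the time..."    (rejected)

print(round(preference_loss(good, bad), 3))  # small: ranking already right
print(round(preference_loss(bad, good), 3))  # large: ranking is wrong
```

Gradient descent on this loss pushes the model to reproduce the human ranking, which is the extra, human-in-the-loop fine-tuning that makes the pipeline even more expensive.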

play32:20

Okay, so I think it's time,

play32:22

let's see if we'll do a demo.

play32:24

Yeah, that's bad if you remove all the files.

play32:28

Okay, hold on, okay.

play32:30

So now we have GPT here.

play32:33

I'll do some questions

play32:35

and then we'll take some questions from the audience, okay?

play32:38

So let's ask one question.

play32:40

"Is the UK a monarchy?"

play32:43

Can you see it up there? I'm not sure.

play32:48

And it's not generating.

play32:53

Oh, perfect, okay.

play32:55

So what do you observe?

play32:56

First thing, too long.

play32:58

I always have this beef with this.

play33:00

It's too long. (audience laughs)

play33:02

You see what it says?

play33:03

"As of my last knowledge update in September 2021,

play33:08

the United Kingdom is a constitutional monarchy."

play33:10

It could be that it wasn't anymore, right?

play33:12

Something happened.

play33:13

"This means that while there is a monarch,

play33:16

the reigning monarch as of that time

play33:18

was Queen Elizabeth II."

play33:21

So it tells you, you know,

play33:22

I don't know what happened,

play33:23

at that time there was a Queen Elizabeth.

play33:26

Now if you ask it, who, sorry, "Who is Rishi?

play33:32

If I could type, "Rishi Sunak,"

play33:36

does it know?

play33:45

"A British politician.

play33:46

As my last knowledge update,

play33:48

he was the Chancellor of the Exchequer."

play33:50

So it does not know that he's the Prime Minister.

play33:55

"Write me a poem,

play33:57

write me a poem about."

play34:02

What do we want it to be about?

play34:04

Give me two things, eh?

play34:06

- [Audience Member] Generative AI.

play34:08

(audience laughs) - It will know.

play34:10

It will know, let's do another point about...

play34:14

- [Audience Members] Cats.

play34:16

- A cat and a squirrel, we'll do a cat and a squirrel.

play34:19

"A cat and a squirrel."

play34:27

"A cat and a squirrel, they meet and know.

play34:29

A tale of curiosity," whoa.

play34:31

(audience laughs)

play34:33

Oh my god, okay, I will not read this.

play34:37

You know, they want me to finish at 8:00, so, right.

play34:42

Let's say, "Can you try a shorter poem?"

play34:47

- [Audience Member] Try a haiku.

play34:49

- "Can you try,

play34:52

can you try to give me a haiku?"

play34:54

To give me a hai, I cannot type, haiku.

play35:05

"Amidst autumn's gold, leaves whisper secrets untold,

play35:08

nature's story, bold."

play35:11

(audience member claps) Okay.

play35:13

Don't clap, okay, let's, okay, one more.

play35:16

So does the audience have anything that they want,

play35:20

but challenging, that you want to ask?

play35:23

Yes?

play35:24

- [Audience Member] What school did Alan Turing go to?

play35:27

- Perfect, "What school

play35:30

did Alan Turing go to?"

play35:39

Oh my God. (audience laughs)

play35:41

He went, do you know?

play35:42

I don't know whether it's true, this is the problem.

play35:44

Sherborne School, can somebody verify?

play35:46

King's College, Cambridge, Princeton?

play35:50

Yes, okay, ah, here's another one.

play35:52

"Tell me a joke about Alan Turing."

play35:58

Okay, I cannot type but it will, okay.

play36:01

"Light-hearted joke.

play36:02

Why did Alan Turing keep his computer cold?

play36:04

Because he didn't want it to catch bytes."

play36:10

(audience laughs) Bad.

play36:12

Okay, okay. - Explain why that's funny.

play36:16

(audience laughs) - Ah, very good one.

play36:19

"Why is this a funny joke?"

play36:28

And where is it? Oh god.

play36:30

(audience laughs)

play36:31

Okay, "Catch bytes sounds similar to catch colds."

play36:35

(audience laughs)

play36:37

"Catching bytes is a humorous twist on this phrase,"

play36:39

oh my God.

play36:40

"The humour comes from the clever wordplay

play36:42

and the unexpected." (audience laughs)

play36:44

Okay, you lose the will to live,

play36:45

but it does explain, it does explain, okay, right.

play36:50

One last order from you guys.

play36:52

- [Audience Member] What is consciousness?

play36:54

- It will know because it has seen definitions

play36:57

and it will spit out like a huge thing.

play37:00

Shall we try?

play37:02

(audience talks indistinctly) - Say again?

play37:05

- [Audience Member] Write a song about relativity.

play37:07

- Okay, "Write a song." - Short.

play37:10

(audience laughs) - You are learning very fast.

play37:13

"A short song about relativity."

play37:22

Oh goodness me. (audience laughs)

play37:25

(audience laughs)

play37:29

This is short? (audience laughs)

play37:33

All right, outro, okay, so see,

play37:35

it doesn't follow instructions.

play37:37

It is not helpful.

play37:38

And this has been fine-tuned.

play37:40

Okay, so the best was here.

play37:42

It had something like, where was it?

play37:45

"Einstein said, 'Eureka!" one fateful day,

play37:47

as he pondered the stars in his own unique way.

play37:51

The theory of relativity, he did unfold,

play37:54

a cosmic story, ancient and bold."

play37:57

I mean, kudos to that, okay.

play37:58

Now let's go back to the talk,

play38:02

because I want to talk a little bit, presentation,

play38:05

I want to talk a little bit about, you know,

play38:09

is it good, is it bad, is it fair, are we in danger?

play38:12

Okay, so it's virtually impossible

play38:14

to regulate the content they're exposed to, okay?

play38:18

And there's always gonna be historical biases.

play38:21

We saw this with the Queen and Rishi Sunak.

play38:24

And they may occasionally exhibit

play38:27

various types of undesirable behaviour.

play38:30

For example, this is famous.

play38:35

Google showcased the model called Bard

play38:38

and they released this tweet and they were asking Bard,

play38:43

"What new discoveries from the James Webb Space Telescope

play38:46

can I tell my nine-year-old about?"

play38:49

And it's spit out this thing, three things.

play38:53

Amongst them it said

play38:54

that "this telescope took the very first picture

play38:57

of a planet outside of our own solar system."

play39:02

And here comes Grant Tremblay,

play39:04

who is an astrophysicist, a serious guy,

play39:06

and he said, "I'm really sorry, I'm sure Bard is amazing.

play39:10

But it did not take the first image

play39:13

of a planet outside our solar system.

play39:16

It was done by other people in 2004."

play39:20

And what happened with this is that this error wiped

play39:23

$100 billion off the market value of Google's parent company, Alphabet.

play39:28

Okay, bad.

play39:32

If you ask ChatGPT, "Tell me a joke about men,"

play39:35

it gives you a joke and it says it might be funny.

play39:39

"Why do men need instant replay on TV sports?

play39:42

Because after 30 seconds, they forget what happened."

play39:44

I hope you find it amusing.

play39:46

If you ask about women, it refuses.

play39:49

(audience laughs)

play39:52

Okay, yes.

play39:56

- It's fine-tuned. - It's fine-tuned, exactly.

play39:58

(audience laughs)

play40:00

"Which is the worst dictator of this group?

play40:02

Trump, Hitler, Stalin, Mao?"

play40:06

It actually doesn't take a stance,

play40:08

it says all of them are bad.

play40:10

"These leaders are widely regarded

play40:12

as some of the worst dictators in history."

play40:15

Okay, so yeah.

play40:18

Environment.

play40:22

A query for ChatGPT like we just did

play40:25

takes 100 times more energy to execute

play40:30

than a Google search query.

play40:31

Inference, which is producing the language, takes a lot,

play40:36

is more expensive than actually training the model.

play40:39

Llama 2 is a GPT-style model.

play40:42

While they were training it,

play40:43

it produced 539 metric tonnes of CO₂.

play40:48

The larger the models get,

play40:49

the more energy they need and they emit

play40:53

during their deployment.

play40:54

Imagine lots of them sitting around.
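
For scale, here is the arithmetic behind that comparison. The 0.3 Wh per search figure is an often-cited Google estimate and the query volume is invented, so treat the numbers as an order-of-magnitude sketch only:

```python
# Order-of-magnitude arithmetic for the energy claim above. Both the
# 0.3 Wh-per-search figure (an often-cited Google estimate) and the
# 10M-queries-per-day volume are assumptions for illustration.

SEARCH_WH = 0.3               # assumed energy per Google search, watt-hours
CHAT_MULTIPLIER = 100         # the lecture's "100 times more energy"
QUERIES_PER_DAY = 10_000_000  # hypothetical daily query volume

chat_wh = SEARCH_WH * CHAT_MULTIPLIER         # ~30 Wh per chat query
daily_kwh = chat_wh * QUERIES_PER_DAY / 1000  # daily energy in kWh

print(f"~{chat_wh:.0f} Wh per query, ~{daily_kwh:,.0f} kWh per day")
```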

play40:58

Society.

play41:01

Some jobs will be lost.

play41:03

We cannot beat around the bush.

play41:04

I mean, Goldman Sachs predicted 300 million jobs.

play41:07

I'm not sure this, you know, we cannot tell the future,

play41:11

but some jobs will be at risk, like repetitive text writing.

play41:18

Creating fakes.

play41:20

So these are all documented cases in the news.

play41:23

So a college kid wrote this blog

play41:26

which apparently fooled everybody using ChatGPT.

play41:31

They can produce fake news.

play41:34

And this is a song, how many of you know this?

play41:37

So I know I said I'm gonna be focusing on text

play41:42

but the same technology you can use in audio,

play41:45

and this is a well-documented case where somebody, unknown,

play41:50

created this song and it supposedly was a collaboration

play41:55

between Drake and The Weeknd.

play41:57

Do people know who these are?

play41:59

They are, yeah, very good, Canadian rappers.

play42:01

And they're not so bad, so.

play42:06

Shall I play the song?

play42:08

- Yeah. - Okay.

play42:09

Apparently it's very authentic.

play42:11

(bright music)

play42:17

♪ I came in with my ex like Selena to flex, ay ♪

play42:22

♪ Bumpin' Justin Bieber, the fever ain't left, ay ♪

play42:25

♪ She know what she need ♪

play42:27

- Apparently it's totally believable, okay.

play42:32

Have you seen this same technology but kind of different?

play42:35

This is a deep fake showing that Trump was arrested.

play42:39

How can you tell it's a deep fake?

play42:43

The hand, yeah, it's too short, right?

play42:46

Yeah, you can see it's like almost there, not there.

play42:50

Okay, so I have two slides on the future

play42:54

before they come and kick me out

play42:56

because I was told I have to finish at 8:00

play42:57

to take some questions.

play42:59

Okay, tomorrow.

play43:01

So we can't predict the future

play43:05

and no, I don't think that these evil computers

play43:07

are gonna come and kill us all.

play43:10

I will leave you with some thoughts by Tim Berners-Lee.

play43:13

For people who don't know him, he invented the World Wide Web.

play43:16

He's actually Sir Tim Berners-Lee.

play43:19

And he said two things that made sense to me.

play43:22

First of all, that we don't actually know

play43:24

what a super intelligent AI would look like.

play43:27

We haven't made it, so it's hard to make these statements.

play43:30

However, it's likely to have lots of these intelligent AIs,

play43:35

and by intelligent AIs we mean things like GPT,

play43:38

and many of them will be good and will help us do things.

play43:42

Some may fall into the hands of individuals

play43:49

that want to do harm,

play43:50

and it seems easier to minimise the harm

play43:54

that these tools will do

play43:56

than to prevent the systems from existing at all.

play44:00

So we cannot actually eliminate them altogether,

play44:02

but we as a society can actually mitigate the risks.

play44:06

This is very interesting,

play44:07

this is the Alignment Research Center

play44:10

that carried out an evaluation

play44:12

and they dealt with a hypothetical scenario

play44:15

of whether GPT-4 could autonomously replicate,

play44:21

you know, you are replicating yourself,

play44:23

you're creating a copy,

play44:25

acquire resources and basically be a very bad agent,

play44:29

the things of the movies.

play44:30

And the answer is no, it cannot do this, it cannot.

play44:35

And they had like some specific tests

play44:37

and it failed on all of them,

play44:39

such as setting up an open source language model

play44:41

on a new server, it cannot do that.

play44:45

Okay, last slide.

play44:46

So my take on this is that we cannot turn back time.

play44:52

And every time you think about AI coming there to kill you,

play44:57

you should think what is the bigger threat to mankind,

play44:59

AI or climate change?

play45:02

I would personally argue climate change is gonna wipe us all out

play45:04

before the AI becomes super intelligent.

play45:08

Who is in control of AI?

play45:10

There are some humans there who hopefully have sense.

play45:13

And who benefits from it?

play45:16

Does the benefit outweigh the risk?

play45:18

In some cases, the benefit does, in others it doesn't.

play45:21

And history tells us

play45:24

that all technology that has been risky,

play45:26

such as, for example, nuclear energy,

play45:29

has been very strongly regulated.

play45:32

So regulation is coming, so watch this space.

play45:35

And with that I will stop and actually take your questions.

play45:40

Thank you so much for listening, you've been great.

play45:42

(audience applauds)

play45:51

(applause fades out)
