Generative Adversarial Networks (GANs) - Computerphile

Computerphile
25 Oct 2017 · 21:21

Summary

TLDR: This video delves into the fascinating world of Generative Adversarial Networks (GANs), exploring their potential for creating realistic images from random noise. It explains the adversarial training process, in which two networks, a discriminator and a generator, compete and thereby improve each other. It also touches on the concept of latent space, where the generator's internal mapping from input to output supports meaningful manipulations of images, such as adding sunglasses to a portrait, showcasing GANs' remarkable ability to capture and generate complex visual data.

Takeaways

  • 🧠 Generative Adversarial Networks (GANs) are powerful models used for creating new, realistic images by learning from existing data.
  • 🎨 GANs have been applied to generate images of objects like shoes or handbags from simple sketches, although the resolution is currently limited.
  • 🔍 The process involves training a neural network to classify images, such as distinguishing between pictures of cats and dogs, by adjusting internal models based on feedback.
  • 🤖 The challenge with generative models is to not only classify but also create new samples from a given distribution, which requires understanding the underlying structure of the data.
  • 📈 GANs use a two-part system: a generator that creates new images from random noise and a discriminator that evaluates the authenticity of the images.
  • 🤝 The adversarial training process involves the generator and discriminator competing against each other, with the generator trying to fool the discriminator and the discriminator improving to detect fakes; a minimal training-loop sketch follows this list.
  • 🏋️‍♂️ Adversarial training focuses on the system's weaknesses, similar to how a teacher might focus on a student's difficulties to improve their understanding.
  • 🎲 The generator in GANs is rewarded inversely to the discriminator's judgment, encouraging it to produce images that are increasingly indistinguishable from real ones.
  • 📊 The training involves a cycle where the discriminator is given real and fake images to learn the difference, while the generator uses this feedback to improve its outputs.
  • 🛰️ The latent space of the generator, where random noise is input, is structured in a way that reflects meaningful characteristics of the images it produces, such as size, color, or features like wearing sunglasses.
  • 🔑 The structured latent space allows for intuitive manipulation, where basic arithmetic operations on the latent vectors can result in meaningful changes in the generated images, reflecting an understanding of the image features.
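
The alternating scheme described in these takeaways can be made concrete with a minimal sketch. Everything below is an illustrative assumption rather than code from the video: the choice of PyTorch, the tiny network sizes, the learning rates, and the toy "points scattered around a line" data (echoing the dots example discussed later in the transcript).

```python
# Minimal GAN training loop (illustrative sketch; all sizes are assumed).
import torch
import torch.nn as nn

latent_dim = 10

# Generator: random noise in, a 2-D "data point" out.
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 2))
# Discriminator: a data point in, estimated P(real) out.
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

def real_batch(n=64):
    # Toy "real" distribution: points scattered around the line y = 2x + 1.
    x = torch.rand(n, 1)
    return torch.cat([x, 2 * x + 1 + 0.1 * torch.randn(n, 1)], dim=1)

for step in range(5000):
    # Train the discriminator: real -> 1, fake -> 0.
    real = real_batch()
    fake = G(torch.randn(64, latent_dim)).detach()  # no gradient into G here
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Train the generator: try to make D call its fakes real.
    fake = G(torch.randn(64, latent_dim))
    g_loss = bce(D(fake), torch.ones(64, 1))  # gradient flows through D into G
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

The one structural subtlety is the `detach()`: while the discriminator trains, gradients must not leak back into the generator; while the generator trains, they must flow straight through the discriminator.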

Q & A

  • What are generative adversarial networks (GANs)?

    -Generative adversarial networks are a class of artificial intelligence algorithms used in unsupervised machine learning, consisting of two parts: the generator, which creates new data instances, and the discriminator, which evaluates them as real or fake. They are known for their ability to generate realistic images and have various applications.

  • How do GANs generate new images?

    -GANs generate new images by using a generator network that takes random noise as input and produces an image that should resemble the data it was trained on, such as a cat or a handbag.
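
As a hypothetical illustration of "random noise in, image out", a DCGAN-style convolutional generator might look like the following sketch; the layer sizes and the 64x64 output are assumptions for the example, not details given in the video.

```python
# Sketch of a DCGAN-style generator: a 100-D noise vector is upsampled
# into a 64x64 RGB image. All layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, 256, kernel_size=8),                # 1x1 -> 8x8
            nn.BatchNorm2d(256), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # -> 16x16
            nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),   # -> 32x32
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),     # -> 64x64
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z):
        # Reshape the flat noise vector into a 1x1 "image" with latent_dim channels.
        return self.net(z.view(z.size(0), -1, 1, 1))

z = torch.randn(1, 100)  # random noise: one point in the latent space
image = Generator()(z)   # untrained here; output shape (1, 3, 64, 64)
```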

  • What is the purpose of the discriminator in a GAN?

    -The discriminator's role is to classify images as real or fake. It is trained to correctly identify images from the original dataset as real and those generated by the generator as fake.

  • How does adversarial training differ from traditional machine learning training?

    -Adversarial training focuses on the system's weaknesses, similar to how a teacher might focus on a student's areas of difficulty. It involves an adversarial process where one part of the system tries to fool another, forcing both to improve.

  • What is the significance of the latent space in GANs?

    -The latent space is a multi-dimensional space from which the generator draws random noise to create new images. It has the property that nearby points in this space produce similar images, which means it captures some of the structure of the images it can generate.
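
One way to probe the "nearby points produce similar images" property is to interpolate between two latent vectors and decode every intermediate point. A minimal sketch; the stand-in `generator` below is untrained and only fixes the shapes, whereas in practice it would be a trained GAN generator.

```python
# Walk a straight line through latent space and decode each point.
# With a trained generator the frames morph smoothly between two images.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(100, 64 * 64 * 3))  # stand-in for a trained model

z_a, z_b = torch.randn(100), torch.randn(100)  # two random points in latent space

frames = []
for t in torch.linspace(0.0, 1.0, steps=9):
    z = (1 - t) * z_a + t * z_b                # point on the line from z_a to z_b
    frames.append(generator(z.unsqueeze(0)))   # nearby z -> similar image
```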

  • How does the generator improve over time in a GAN setup?

    -The generator improves by receiving feedback from the discriminator. If the discriminator identifies a generated image as fake, the generator adjusts its parameters to produce more realistic images that can fool the discriminator.

  • What is the concept of a min/max game in the context of GANs?

    -The min/max game refers to the competitive dynamic between the generator and discriminator. The generator aims to maximize the discriminator's error rate (by creating convincing fakes), while the discriminator aims to minimize its error rate (by correctly identifying real and fake images).
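
This competitive dynamic has a standard formulation, the value function from the original GAN paper (Goodfellow et al., 2014), which the video describes informally rather than writing out:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$

The discriminator D pushes V up by scoring real images near 1 and generated images near 0; the generator G pushes V down by making D(G(z)) large, i.e. by producing convincing fakes.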

  • How can the training of a GAN be compared to teaching a child?

    -In GANs, the training process can be compared to teaching a child by focusing on the areas where the learner is struggling. Just as a teacher might focus on a child's difficulty in distinguishing between certain numbers, the GAN focuses training on the generator's weaknesses as identified by the discriminator.

  • What is the role of randomness in the generator's process?

    -Randomness is crucial in the generator's process as it provides the source of variability needed to create diverse and unique images. The generator uses this randomness to produce images that should ideally be indistinguishable from real images in the dataset.

  • Can the discriminator's feedback be used to directly improve the generator in a GAN?

    -Yes, the discriminator's feedback can be used to directly improve the generator by employing gradient descent. The generator can use the gradient of the discriminator's error to adjust its weights and produce images that are more likely to be classified as real.
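
A minimal sketch of how that works mechanically, with tiny stand-in networks `G` and `D` (assumed here, not the video's): the fake batch stays inside one computation graph, so backpropagating the generator's loss differentiates through the discriminator and into the generator's weights.

```python
import torch
import torch.nn as nn

# Tiny stand-ins for the generator/discriminator pair (shapes assumed).
G = nn.Sequential(nn.Linear(100, 2))               # noise -> "image"
D = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())   # "image" -> P(real)

z = torch.randn(64, 100)
p_real = D(G(z))             # D's verdict on G's fakes, kept in one graph

# Non-saturating generator loss: push D's verdict on fakes toward "real".
g_loss = -torch.log(p_real + 1e-8).mean()
g_loss.backward()            # gradients flow *through* D back into G

# Step only G's parameters; D's weights are deliberately left alone here.
torch.optim.SGD(G.parameters(), lr=1e-3).step()
```

The `-log D(G(z))` form is the commonly used non-saturating variant of this objective; the video describes the same idea informally as rewarding the generator with the inverse of the discriminator's judgment.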

  • What is the potential outcome when a GAN has been trained for a sufficient amount of time?

    -The ideal outcome of a well-trained GAN is that the generator produces images that are indistinguishable from real images, and the discriminator is unable to differentiate between real and fake images, outputting a 0.5 probability for both.
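
The 0.5 figure has a tidy justification from the same analysis (Goodfellow et al., 2014), not derived in the video: for a fixed generator whose samples follow a distribution p_g, the error-minimizing discriminator is

$$D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}$$

so once training drives p_g to equal p_data, this reduces to D*(x) = 1/2 for every input: the discriminator can do no better than a coin flip.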

Outlines

00:00

🤖 Generative Adversarial Networks (GANs) Overview

The script introduces generative adversarial networks (GANs), highlighting their ability to render realistic, though currently low-resolution, images from simple sketches. It explains the concept of a neural network classifier and why such models cannot, by themselves, generate new samples. It also touches on the challenge of building models that learn the underlying structure of data well enough to generate new, plausible samples, using the example of a simple generative model fitted to data points along a line to illustrate how such models tend to produce average, unvarying outputs that are not representative of the original data distribution.
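
A one-line way to see the regression-to-the-mean effect described above: a model trained with squared error, forced to give a single output y for targets t drawn from a distribution, minimizes its expected loss by predicting the mean,

$$\hat{y} = \arg\min_{y}\ \mathbb{E}_{t}\big[(y - t)^2\big] = \mathbb{E}[t]$$

so when asked to "generate a sample" it lands exactly on the fitted line, an average point that no individual real sample actually resembles.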

05:01

🎯 Adversarial Training and Focusing on Weaknesses

This paragraph delves into the concept of adversarial training, where a machine learning system is trained by focusing on its weaknesses. The analogy of teaching a child to recognize numbers is used to explain how focusing on areas of difficulty can improve performance. The idea of creating an 'adversary' that seeks to maximize the error rate of the network is introduced, which in turn forces the network to improve and adapt. The paragraph sets the stage for the introduction of generative adversarial networks by discussing the competitive nature of training and the benefits of this approach in improving machine learning models.

10:04

🎲 The Dynamics of Generative Adversarial Networks

The script explains the architecture of generative adversarial networks, consisting of two competing networks: the discriminator and the generator. The discriminator is trained to distinguish between real and generated images, while the generator's goal is to produce images that can fool the discriminator. The process is described as a min/max game, where the discriminator aims for a low error rate, and the generator seeks a high error rate. The training process involves alternating between presenting real and fake images to the discriminator, with the generator learning from the discriminator's feedback to improve its outputs. The use of gradient descent in training the networks is also discussed, emphasizing the importance of adjusting weights to move in the direction that confuses the discriminator.

15:05

🖼️ The Evolution of Image Quality in GANs

This paragraph discusses the iterative improvement process of GANs, where both the generator and discriminator evolve to become better at their respective tasks. Initially, the generator produces poor-quality images, but as training progresses, it learns to create images that are increasingly difficult for the discriminator to distinguish from real images. The discriminator, in turn, becomes more adept at identifying fake images, leading to a cycle of mutual improvement. The theoretical endpoint of this process is a generator that produces images indistinguishable from real ones, at which point the discriminator's role is no longer necessary.

20:07

🔍 Exploring the Latent Space of GANs

The script explores the concept of the latent space in GANs, where the generator maps random noise to produce images. It describes how the latent space is structured in a way that nearby points in this space result in similar images, suggesting that the generator has learned meaningful attributes of the images it produces. The paragraph also discusses the ability to perform vector arithmetic in the latent space to generate new images with specific characteristics, such as combining features of different images to create novel ones, like a woman wearing sunglasses. This demonstrates the generator's understanding of the image attributes and its ability to generate a wide variety of images based on learned patterns.
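
The sunglasses example comes down to three averages and two vector operations in latent space. A sketch with hypothetical placeholder tensors; in reality each stack would hold latent vectors whose decoded images had been inspected by hand.

```python
import torch

# Hypothetical stacks of 100-D latent vectors, shape (N, 100) each,
# standing in for "decodes to a man with sunglasses", "a man", "a woman".
z_men_sunglasses = torch.randn(50, 100)
z_men = torch.randn(50, 100)
z_women = torch.randn(50, 100)

# "man with sunglasses" - "man" + "woman" = "woman with sunglasses"
z_new = z_men_sunglasses.mean(0) - z_men.mean(0) + z_women.mean(0)
# image = generator(z_new.unsqueeze(0))  # with a trained generator
```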

🔐 The Implications of Non-Randomness in GANs

The final paragraph touches on the implications of the structured latent space in GANs, drawing a parallel to cryptography and the need for non-randomness to ensure reproducibility. It suggests that the ability to generate the same image repeatedly is a desirable feature, contrasting with the need for true randomness in cryptographic systems where reproducibility is not a requirement. The paragraph concludes by reinforcing the idea that GANs have developed an understanding of the features and attributes of the images they generate, as evidenced by the structured latent space and the meaningful results of vector arithmetic within it.
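
The reproducibility point maps directly onto seeded pseudo-randomness: fix the seed and the "random" latent vector, and therefore the generated image, comes out identical every run. A small sketch (PyTorch assumed):

```python
import torch

# Same seed -> same pseudo-random latent vector -> same generated image.
torch.manual_seed(42)
z1 = torch.randn(1, 100)

torch.manual_seed(42)
z2 = torch.randn(1, 100)

assert torch.equal(z1, z2)  # identical noise, so a generator would
                            # reproduce exactly the same output image
```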

Keywords

💡Generative Adversarial Networks (GANs)

Generative Adversarial Networks are a class of artificial intelligence algorithms used in unsupervised machine learning, where two neural networks are trained simultaneously in a zero-sum game scenario. In the context of the video, GANs are highlighted for their ability to generate new, synthetic data similar to the training data. They are used to create images of objects like shoes or handbags from simple sketches, showcasing their capability to produce convincing images, even though the outputs are still fairly low-resolution.

💡Classifier

A classifier is a type of machine learning model that separates items into different categories. The video script describes how a neural network can be trained as a classifier to distinguish between images of cats and dogs. The classifier outputs a number between zero and one, where zero represents one class (cats) and one represents the other (dogs). The script uses this concept to illustrate the difference between a classifier, which can identify existing data, and a generative model, which can create new data.

💡Neural Network

A neural network is a set of algorithms modeled loosely after the human brain that are designed to recognize patterns. In the video, the neural network is described as forming an internal model to differentiate between images of cats and dogs. It is a foundational concept for understanding how GANs work, as both the generator and discriminator in a GAN are types of neural networks.

💡Training Samples

Training samples are the input data used to train a machine learning model. The script mentions that a neural network is trained by giving it lots of pictures of cats and dogs, which serve as training samples. These samples are crucial for the network to learn the distinguishing features of each class and improve its accuracy in classification tasks.

💡Latent Space

Latent space refers to a multi-dimensional, continuous space in which each point corresponds to a unique instance of data. In the context of GANs, the latent space is where the generator network maps random noise to create new data points, such as images. The video script explains that moving around in the latent space results in smooth variations in the generated images, suggesting that the generator has captured meaningful structures of the data.

💡Adversarial Training

Adversarial training is a machine learning paradigm where a model is trained by an adversary who is trying to make the model fail. The video script describes how this method can be used to improve the model's performance by focusing on its weaknesses. In the case of GANs, the generator and discriminator are in an adversarial relationship, with the generator trying to fool the discriminator, and the discriminator trying to correctly identify real and fake data.

💡Discriminator

In the context of GANs, the discriminator is a neural network that evaluates the authenticity of images, determining whether they are real (from the training set) or fake (generated by the generator). The script explains that the discriminator's role is to provide feedback to the generator on how to improve its output to make it more realistic.

💡Generator

The generator is the other neural network in a GAN that creates new data samples from random noise. The video script describes how the generator's goal is to produce images that the discriminator cannot distinguish from real images, thus improving its ability to generate realistic data.

💡Random Noise

Random noise in the context of GANs refers to the input data that the generator uses to create new images. The script mentions that the generator takes random noise and transforms it into images that are supposed to resemble the training data, such as images of cats. This process is essential for the generator to learn the underlying structure of the data distribution.

💡Gradient Descent

Gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of the steepest descent as defined by the negative of the gradient. In the video, gradient descent is used to train the neural networks in a GAN by adjusting the weights to minimize the error rate. The script explains that the generator uses the gradient of the discriminator's error to improve its own performance.
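
A toy sketch of one-dimensional gradient descent, minimizing f(w) = (w - 3)^2 by repeatedly stepping against the derivative (illustrative only, not from the video):

```python
def grad(w):
    # Derivative of f(w) = (w - 3)^2.
    return 2 * (w - 3)

w, lr = 0.0, 0.1          # starting point and learning rate
for _ in range(100):
    w -= lr * grad(w)     # step downhill, against the gradient
print(w)                  # converges to ~3.0, the minimum of f
```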

💡Feature Extraction

Feature extraction is the process of identifying and extracting useful information or features from data. The video script implies that through training, the GAN's generator learns to extract the underlying features of the training data, such as the characteristics of cat images, and uses this knowledge to generate new, realistic images.

Highlights

Generative Adversarial Networks (GANs) are capable of creating realistic images from sketches, such as shoes or handbags.

GANs can produce high-quality images despite current limitations in resolution.

The discriminator in GANs is trained to classify images as real or fake, outputting a number between zero and one.

The generator in GANs uses random noise to create new images that mimic the training data distribution.

GANs learn the underlying structure of the data to generate new samples from the same distribution.

The challenge of GANs lies in generating samples that are not only statistically similar but also visually plausible.

Adversarial training focuses on the system's weaknesses to improve performance.

In adversarial training, the system is continually challenged to overcome its errors and improve.

GANs consist of two networks, the discriminator and the generator, playing a competitive game against each other.

The discriminator's goal is to minimize its error rate, while the generator aims to maximize the discriminator's error.

Gradient descent is used in training GANs, with the generator being guided by the discriminator's error gradient.

The training process of GANs involves an inner cycle in which the discriminator is shown real and generated images in alternation.

GANs can theoretically produce images that are indistinguishable from the real dataset after sufficient training.

The latent space of the generator in GANs has a structured mapping that corresponds to meaningful variations in the images.

Basic arithmetic in the latent space can result in meaningful changes in the generated images, such as combining features.

GANs can extract and represent structural information about the data, such as cat images, in a meaningful way.

The generator's understanding of the data allows for intuitive manipulation of image features, like wearing sunglasses or gender.

GANs have practical applications in creating new data samples that are statistically and visually consistent with the training set.

Transcripts

play00:00

So today, I thought we'd talk about generative adversarial networks, because they're really cool, and they've...

play00:07

They can do a lot of really cool things people have used them for all kinds of things

play00:11

Things like you know you draw a sketch of a shoe

play00:13

And it will render you an actual picture of a shoe or a handbag

play00:16

They're fairly low-resolution right now, but it's very impressive the way that they can produce

play00:22

real quite good-looking images

play00:27

You could make a neural network

play00:29

That's a classifier right you give it lots and lots of pictures of cats and lots and lots of pictures of dogs

play00:35

and you say you know you present it with a picture of a cat and

play00:39

It says it outputs a number. Let's say between zero and one

play00:44

and

play00:45

Zero represents cats and one represents dogs and so you give it a cat and it puts out one and you say no

play00:50

That's not right should be zero and you keep training it until eventually it can tell the difference right?

play00:55

so

play00:57

somewhere inside that

play01:01

Network

play01:02

It's... it must have formed some model of what cats are and what dogs are, at least as far as images of

play01:08

images of them are concerned

play01:11

But

play01:12

That model really... you can only really use it to classify things

play01:15

You can't say "ok draw me a new cat picture", "draw me a cat picture I haven't seen before"

play01:21

It doesn't know how to do that so quite often you want a model that can generate new

play01:28

samples. So you give it a bunch of samples from a particular distribution, and you want it to

play01:35

Give you more samples which are also from that same distribution, so it has to learn the underlying

play01:40

Structure of what you've given it. And that's kind of tricky, actually.

play01:46

There's a lot of...

play01:49

Well there's a lot of challenges involved in that.

play01:52

Well, let's be honest

play01:53

I think even as a human you can find that tricky

play01:55

You know if... if I know what a cat looks like but, uh, being not the greatest artist in the world

play02:00

I'm not sure that I could draw you a decent cat. So, you know, this is not confined to just

play02:06

Computing is it? This...

play02:07

Yeah, that's true. That's really true.

play02:10

but if you take

play02:14

Let's do like a really simple example of a generative model

play02:17

say you you give your network one thing

play02:21

It looks like this.

play02:21

And then you give it another one you're like these are your training samples looks like this

play02:26

You give it another one that looks like this, and then...

play02:29

What are those dots in the system?

play02:33

Instances of something on two dimensions?

play02:35

Yeah, I mean right now, it's literally just data. We just... it doesn't matter what it is

play02:38

Just some... yeah, these are these are data points

play02:41

And so these are the things you're giving it, and then it will learn

play02:46

You can train it. It will learn a model, and the model it might learn is something like this, right?

play02:52

It's figured out that these dots all lie along a path, and if its model was always to draw a line

play02:58

Then it could learn by adjusting the parameters of that line

play03:01

It would move the line around until it found a line that was a good fit, and generally gave you a good prediction.

play03:06

But then if you were to ask this model:

play03:10

"Okay, now make me a new one"

play03:11

unless you did something clever, what you get is probably this, because that is on average

play03:18

The closest to any of these, because any of these dots you don't know if they're going to be above or below

play03:22

or, you know, to the left or the right. There's no pattern there. It's kind of random.

play03:26

So the best place you can go that will minimize your error, is to go just right on the line every time.

play03:33

But anybody looking at this will say: "well, that's fake"

play03:36

That's not a plausible example of something from this distribution, even though for a lot of the

play03:41

like, error functions, that people use when training networks this would perform best, so it's this interesting situation where

play03:48

There's not just one right answer.

play03:51

you know, generally speaking the way that neural networks work is:

play03:53

you're training them towards a specific... you have a label, or you have a...

play03:57

you have an output, a target output, and

play04:00

You get penalized the further away you are from that output, whereas in an application like this

play04:06

there's, in effect... there's basically an infinite number of perfectly valid

play04:12

Outputs here

play04:13

But, so, to generate this what you actually need is to take this model and then apply some randomness, you say: "they're all

play04:21

Within, you know,

play04:25

They occur randomly and they're normally distributed around this line with this standard deviation" or whatever.

play04:30

But a lot of models would have a hard time actually

play04:33

picking one of all of the possibilities

play04:36

And they would have this tendency to kind of smooth things out and go for the average, whereas we actually just want

play04:41

"Just pick me one doesn't matter". So that's part of the problem of generating.

play04:46

Adversarial training is a way of

play04:49

training

play04:52

Not just networks, actually, a way of training machine learning systems.

play04:57

Which

play04:58

involves focusing on

play05:00

the system's weaknesses.

play05:02

So, if you are learning... let's say you're teaching your

play05:08

Network to recognize handwritten digits.

play05:11

The normal way you would do that you have your big training sample of labeled samples

play05:15

You've got an array of pixels that looks like a three and then it's labeled with three and so on.

play05:19

And the normal way

play05:20

that you would train a network with this is you would just

play05:24

Present all of them pretty much at random. You'd present as many ones as twos as threes, and just keep throwing examples at it:

play05:30

"What's this?", you know, "Yes, you got that right", "no. You've got that wrong, It should really be this".

play05:35

And keep doing that and the system will eventually learn

play05:39

but

play05:40

If you were actually teaching a person to recognize the numbers, if you were teaching a child

play05:45

you wouldn't do that, like, if you'd been teaching them for a while, presenting them and

play05:50

You know, getting the response and correcting them and so on, and you noticed that they can do...

play05:55

you know... with 2, 3, 4, 5, 6, 8 & 9 they're getting like 70, 80 percent

play06:02

You know, accuracy recognition rate.

play06:04

But 1 & 7 it's like 50/50, because any time they get a 1 or a 7 they just guess because they can't

play06:10

Tell the difference between them.

play06:11

If you noticed that you wouldn't keep training those other numbers, right? You would stop and say:

play06:16

"Well, You know what? we're just gonna focus on 1 & 7 because this is an issue for you".

play06:19

"I'm gonna keep showing you Ones and 7s and correcting you until

play06:23

The error rate on ones and 7s comes down to the error rate that you're getting on your other numbers".

play06:29

You're focusing the training on the area where the student is failing and

play06:34

there's kind of a balance there when you're teaching humans,

play06:37

because if you keep relentlessly focusing on their weaknesses and making them do stuff they can't do all the time

play06:43

They will just become super discouraged and give up. But neural networks don't have feelings yet, so that's really not an issue.

play06:49

You can just

play06:51

continually hammer on the weak points

play06:53

Find whatever they're having trouble with and focus on that. And so, that behavior,

play06:58

and I think some people have had teachers where it feels like this,

play07:01

It feels like an adversary, right? it feels like they want you to fail.

play07:07

So in fact

play07:08

you can make them an actual adversary. If you have some process which is genuinely

play07:14

Doing its best to make the network give as high an error as possible

play07:19

that will produce this effect where if it spots any weakness it will focus on that and

play07:24

Thereby force the learner

play07:27

To learn to not have that weakness anymore. Like one form of adversarial training people sometimes

play07:33

Do is if you have a game playing program you make it play itself a lot of times

play07:38

Because all the time they are trying to look for weaknesses in their opponent and exploit those weaknesses, and when they do that

play07:45

They're forced to then improve or fix those weaknesses in themselves because their opponent is exploiting those weaknesses, so

play07:55

Every time

play07:56

the

play07:58

Every time the system finds a strategy that is extremely good against this opponent

play08:03

The opponent, who's also them, has to learn a way of dealing with that strategy. And so on and so on.

play08:11

So, as the system gets better it forces itself to get better

play08:18

Because it's continuously having to learn how to play a better and better opponent

play08:22

It's quite elegant, you know.

play08:25

This is where we get to generative adversarial networks. Let's say

play08:28

You've got a network you want to...

play08:33

Let's say you want cat pictures

play08:34

You know, you want to be able to give it a bunch of pictures of cats and have it

play08:40

Spit out a new picture of a cat that you've never seen before that looks exactly like a cat

play08:44

the way that the generative

play08:46

adversarial network works is it's this architecture where you actually have two networks. One of the networks is the discriminator.

play08:53

How's my spelling?

play08:55

Yeah, like that

play08:56

The discriminator network is a classifier, right? It's a straightforward classifier.

play09:02

You give it an image

play09:03

And it outputs a number between 0 & 1, and you're training that in the standard supervised learning way.

play09:11

Then you have a generator and the generator

play09:15

Is...

play09:17

Usually a convolutional neural network, although actually both of these can be other processes

play09:21

But people tend to use neural networks for this.

play09:23

And the generator, you

play09:25

give it some random noise, and that's the random,

play09:29

that's where it gets its source of randomness, so

play09:31

That it can give multiple answers to the same question effectively.

play09:35

You give it some random noise and it generates an image

play09:38

From that noise and the idea is it's supposed to look like a cat

play09:43

So the way that we do this with a generative adversarial Network is it's this architecture whereby you have two networks

play09:51

Playing a game

play09:53

Effectively it's a competitive game. It's adversarial between them and in fact

play09:57

It's very similar to the games we talked about in the AlphaGo video.

play10:03

it's a min/max game

play10:05

Because these two networks are fighting over one number

play10:09

one of them wants the number to be high one of them wants the number to be low.

play10:14

And what that number actually is, is the error rate of the discriminator.

play10:19

so

play10:22

The discriminator

play10:23

wants a low error rate; the generator wants a high error rate. The discriminator's job is to look at an image,

play10:30

which could have come from the original data set or

play10:34

it could have come from the generator, and its job is to say yes, this is a real image, or no, this is a fake.

play10:42

And it outputs a number between 0 & 1, like 1 for it's real and 0 for it's fake, for example. And

play10:48

the generator

play10:50

gets fed, as its input, just some random noise, and it then generates an image from that, and

play10:57

its

play10:58

reward, you know, its training, is

play11:01

Pretty much the inverse of what the discriminator says

play11:04

for that image so if it produces an image

play11:08

which the discriminator can immediately tell is fake, it gets a negative reward; you know, it's a...

play11:14

that's... it's trained not to do that. If it manages to produce an image that the discriminator

play11:18

Can't tell is fake

play11:21

then that's really good. So you train them in an inner cycle, effectively: you give the discriminator a real image,

play11:29

get its output, then you generate a fake image and give the discriminator that, and

play11:36

then you give it a real, so the discriminator gets alternating real image, fake image, real image, fake image,

play11:41

usually. I mean, there are things you can do where you

play11:43

train them at different rates and whatever, but by default they just alternate. Does the generator get any help with this at all, or is it purely...?

play11:50

Yes. So... this is like part of what makes this especially clever, actually:

play11:56

the generator does get help because

play11:59

if

play12:00

You set up the networks right you can use the gradient of the discriminator

play12:06

to train the generator

play12:09

So, now, I

play12:11

know you've done backpropagation before, about how neural networks are trained. It's gradient descent, right? And in fact we talked about this in, like,

play12:17

2014. Sure. If you were a...

play12:21

you're a blind person climbing a mountain, or it's really foggy and you're climbing a mountain; you can only see directly

play12:26

what's underneath your own feet.

play12:29

You can still climb that mountain if you just follow the gradient: you just look directly under you, which way is the...

play12:35

you know, which way is the ground sloping? This is what we did with the hill climbing algorithm. Exactly.

play12:40

Yeah, sometimes people call it hill climbing sometimes people call it gradient descent

play12:45

It's the same

play12:46

metaphor

play12:48

Upside down effectively if we're climbing up or we're climbing down you're training them by gradient descent, which means that

play12:55

You're not just you're not just able to say

play12:59

Yes, that's good. No. That's bad

play13:01

You're actually able to say: you should adjust your weights in this direction, so that you'll move down the gradient,

play13:08

right

play13:10

So generally you're trying to move down the gradient of error for the network

play13:16

If you're like if you're training if your training the thing to just recognize cats and dogs you're just moving it

play13:23

You're moving it down the gradient towards the correct label whereas in this case

play13:30

The generator is being moved

play13:33

sort of up the gradient of the discriminator's error,

play13:38

So it can find out not just you did well you did badly

play13:42

but here's how to tweak your weights so that the discriminator would have been more wrong,

play13:48

so that you can confuse the discriminator more. So you can think of this whole thing...

play13:54

An analogy people sometimes use is like a forger and

play14:00

An expert investigator person right at the beginning, you know let's assume

play14:06

there's one forger and there's one investigator, and all of the art

play14:10

buyers of the world are idiots. At the beginning, the

play14:18

level... the quality of the forgeries is going to be quite low, right? The guy

play14:21

just goes and gets some paint, and then he just writes, you know, Picasso on it,

play14:27

And he can sell it for a lot of money and the investigator comes along and says yeah

play14:31

I... I don't know if that's right. Or maybe it is. I'm not sure; I haven't really figured it out.

play14:36

And then as time goes on, the investigator, who's the discriminator, will

play14:42

Start to spot certain things that are different between the things that the forger produces and real paintings

play14:47

And then they'll start to be able to reliably spot: oh, this is a fake,

play14:51

You know this uses the wrong type of paint or whatever

play14:53

so it's fake. And once that happens, the forger is forced to get better, right? He can't sell his fakes anymore;

play15:01

He has to find that kind of paint

play15:03

So he goes and you know

play15:04

Digs up Egyptian mummies or whatever to get the legit paint and now he can forge again

play15:09

and now the discriminator, the investigator, is fooled, and

play15:12

They have to find a new thing

play15:15

That distinguishes the real from the fakes and so on and so on in a cycle they force each other to improve

play15:19

And it's the same thing here

play15:23

So at the beginning the generator is making just random noise, basically, because it's getting random noise

play15:29

And it's doing something to it who knows what and it spits out an image and the discriminator goes that looks nothing like a cat

play15:36

you know

play15:37

and

play15:39

then

play15:40

eventually because the discriminator is also not very smart at the beginning right and

play15:45

And they just they both get better and better

play15:48

The generator gets better at producing cat looking things and the discriminator gets better and better at identifying them

play15:55

until eventually in principle if you run this for long enough theoretically you end up with a situation where the

play16:02

Generator is creating images that look exactly

play16:07

Indistinguishable from

play16:10

Images from the real data set and the discriminator if it's given a real image, or a fake image always outputs 0.5

play16:19

50/50, I

play16:20

don't know, could be either. These things are literally indistinguishable. Then you pretty much can throw away the discriminator,

play16:26

And you've got a generator, which you give random noise to and it outputs

play16:31

brand-new

play16:32

indistinguishable images of cats. There's another cool thing about this,

play16:36

which is: every time we ask the generator to generate a new image,

play16:41

We're giving it some random data, right we give it just this vector of random numbers

play16:47

Which you can think of as being a randomly selected point in a space because you know if you give it

play16:56

If you give it ten random numbers you know between zero and one or whatever that is effectively a point in a 10 dimensional space

play17:04

and

play17:05

the thing that's cool is that as

play17:08

the generator learns

play17:10

It's forced to

play17:13

The generator is effectively making a mapping from that space into cat pictures.

play17:19

This is called the latent space, by the way. Generally,

play17:23

Any two nearby points in that latent space will when you put them through the generator produce similar

play17:29

cat images, you know, similar pictures in general.

play17:33

Which means sort of as you move

play17:36

around... if you sort of take that point and smoothly move it around the latent space, you get a smoothly varying

play17:44

picture of a cat and so the directions you can move in the space

play17:48

Actually end up

play17:50

corresponding to

play17:52

Something that we as humans might consider meaningful about cats

play17:56

so there's one you know there's one direction, and it's not necessarily one dimension of the space or whatever but

play18:02

And it's not necessarily linear or a straight line or anything

play18:05

But there will be a direction in that space which corresponds to

play18:09

how big the cat is in the frame, for example, or another dimension will be the color of the cat, or

play18:15

whatever

play18:17

so

play18:18

That's really cool, because it means that

play18:22

by

play18:24

Intuitively you think

play18:26

The fact that the generator can reliably produce a very large number of images of cats

play18:31

means it must have some, like, understanding of

play18:36

What cats are right or at least what images of cats are

play18:41

And it's nice to see that it has actually

play18:45

Structured its latent space in this way that it's by looking at a huge number of pictures of cats it has actually extracted

play18:52

some of the structure of

play18:56

cat pictures in general

play18:58

in a way which is meaningful when you look at it.

play19:02

So, and that means you can do some really cool things. So one example was, they trained a network, one of these systems,

play19:08

on a really large

play19:12

Database of just face photographs and so it could generate

play19:17

an arbitrarily large number of... well, as large as the input space... a number of different faces, and

play19:23

So they found that actually by doing

play19:27

basic

play19:28

arithmetic like just adding and subtracting vectors on the

play19:33

Latent space would actually produce meaningful changes in the image if you took a bunch of latent

play19:41

vectors, which when you give them to the generator produce pictures of men and

play19:45

a bunch of them that produce pictures of women and average those you get a point in your

play19:53

latent space which

play19:54

corresponds to

play19:56

a picture of a man or a picture of a woman which is not one of your input points, but it's sort of representative and

play20:03

Then you could do the same thing and say oh, I only want

play20:06

Give me the average point of all of the things that correspond to pictures of men wearing sunglasses

play20:13

right and

play20:15

then if you take your sunglasses vector, your men-wearing-sunglasses vector,

play20:22

Subtract the man vector and add the woman vector you get a point in your space

play20:28

And if you run that through the generator you get a woman wearing sunglasses

play20:33

right

play20:34

So doing basic vector arithmetic in your input space actually is

play20:40

meaningful in terms of images, in a way that humans would recognize, which means that

play20:45

there's a sense in which the generator really does

play20:49

Have an understanding of wearing sunglasses or not or being a man or being a woman

play20:55

Which is kind of an impressive result

play20:59

All the way along

play21:00

But it's not a truly random thing, because if I know the key, I can... if I want to generate the same...

play21:05

Yeah

play21:05

I'm so I mean that's about

play21:07

Unfortunately, the problem with cryptography is that we couldn't ever use something truly random, because we wouldn't be able to decrypt it again.

play21:11

We have our message bits, which are, you know, naught 1 1 naught, something different,

play21:15

And we XOR these together one bit at a time, and that's how we encrypt
