Tutorial 23 - Operation of CNN (CNN vs ANN)

Krish Naik
9 Nov 2019 · 11:10

Summary

TL;DR: In this 23rd tutorial of the deep learning playlist, Krishna apologizes for the delay in uploading due to technical issues and promises continuous uploads to complete the series within two weeks. The video focuses on the operations of CNNs, contrasting them with ANNs. It explains the convolution operation, the role of filters (kernels) in edge detection, and how weights are updated during backpropagation. Krishna also hints at discussing max pooling and the concept of 'location invariance' in future videos. For those interested in transitioning to data science, Krishna recommends a resource from Springboard India.

Takeaways

  • 😀 The speaker, Krishna, apologizes for the delay in uploading the 23rd tutorial in his deep learning playlist.
  • 📈 The tutorial focuses on the operations of CNNs (Convolutional Neural Networks) and their differences from ANNs (Artificial Neural Networks).
  • 🔍 Krishna explains the convolution operation, detailing how filters or kernels are used for tasks like edge detection in images.
  • 👓 The script mentions that the speaker has previously discussed padding and convolution operations in earlier tutorials.
  • 💡 The tutorial aims to clarify how weights in CNNs are updated, particularly after the convolution operation.
  • 📊 Krishna uses diagrams to contrast the operations of ANNs and CNNs, highlighting the multiplication of weights and inputs followed by an activation function in both.
  • 📚 The speaker plans to cover max pooling in an upcoming video, which is another important operation in CNNs.
  • 🧠 The script touches on the concept of 'location invariance' in CNNs, inspired by how human brains recognize features in images regardless of their location.
  • 🔗 Krishna provides a link to a resource for those interested in transitioning to data science or seeking advice from real-world data scientists.
  • 🎓 The tutorial concludes with an invitation to subscribe to the channel and a promise of more in-depth discussion on max pooling in the next video.

Q & A

  • What is the main topic of the 23rd tutorial in Krishna's deep learning playlist?

    -The main topic of the 23rd tutorial is the operations of CNN (Convolutional Neural Networks), focusing on the basic differences between ANN (Artificial Neural Networks) and CNN operations.

  • Why did Krishna skip many days before uploading this tutorial?

    -Krishna skipped many days due to some problems he had to resolve, which involved setting up his recording environment properly.

  • What is the purpose of the filters or kernels in a CNN?

    -Filters or kernels in a CNN are used for tasks such as horizontal and vertical edge detection, and they help in feature extraction from images.

  • How does the convolution operation update the filters in a CNN?

    -The convolution operation updates the filters through the backpropagation process, where the weights inside the filter are learned and updated based on the output and loss calculated.
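
The learning step described in this answer can be sketched as a single gradient-descent update per filter value. This is a minimal illustration; the gradient and learning rate below are made-up numbers, not values from the video:

```python
# One gradient-descent step on a single filter weight, as used during
# backpropagation: subtract the loss gradient scaled by a learning rate.
# The gradient value here is invented purely for illustration.
def update_weight(w, grad, lr=0.1):
    return w - lr * grad

w = 0.5
w_new = update_weight(w, grad=0.2)
print(w_new)  # roughly 0.48
```

In a real CNN the same rule is applied to every value in every filter, with the gradients computed by backpropagating the loss.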

  • What is the role of the activation function in both ANN and CNN?

    -The activation function in both ANN and CNN is used to introduce non-linearity into the model. It is applied to the output of each neuron after the weighted sum and bias addition, allowing the network to learn complex patterns.
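
The ReLU activation mentioned in the video can be sketched in a few lines; the sample values are illustrative:

```python
# ReLU (Rectified Linear Unit): negatives become 0, positives pass
# through unchanged. This is the non-linearity applied after the
# weighted sum (ANN) or after the convolution output (CNN).
def relu(x):
    return max(0.0, x)

feature_map_row = [-2.0, 0.5, -1.0, 3.0]
print([relu(v) for v in feature_map_row])  # [0.0, 0.5, 0.0, 3.0]
```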

  • What is the significance of the term 'location invariant' mentioned in the script?

    -The term 'location invariant' refers to the ability of a CNN to detect features at different locations in an image, making the model robust to the position of the object within the image.

  • Why is max pooling important in CNNs according to the script?

    -Max pooling is important in CNNs because it helps in reducing the spatial dimensions of the output from the convolution layer, which in turn reduces the number of parameters and computation in the network, thus controlling overfitting.
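
A minimal sketch of the max pooling operation the answer refers to, using a 2x2 window with stride 2 on an illustrative 4x4 feature map:

```python
# 2x2 max pooling with stride 2: keep only the largest value in each
# non-overlapping 2x2 window, halving each spatial dimension.
def max_pool_2x2(fm):
    n = len(fm)
    return [[max(fm[i][j], fm[i][j + 1], fm[i + 1][j], fm[i + 1][j + 1])
             for j in range(0, n, 2)]
            for i in range(0, n, 2)]

fm = [[1, 3, 2, 1],
      [4, 6, 5, 0],
      [7, 2, 9, 8],
      [1, 0, 3, 4]]
print(max_pool_2x2(fm))  # [[6, 5], [7, 9]]
```

The 4x4 map shrinks to 2x2, which is how pooling reduces parameters and computation downstream.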

  • What is the difference between the operations in ANN and CNN as described in the script?

    -The main difference lies in how the networks process data. While ANN uses a full connection between layers, CNN uses filters that slide over the input to perform convolution operations, which is more efficient for image data.

  • How does the script explain the process of a convolution operation with an example?

    -The script explains the convolution operation by taking an example of a 3x3 filter applied to a part of an image, demonstrating how the filter values are multiplied with the image pixels and summed up to produce a single output value in the feature map.
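
The single-output-value step described above can be sketched directly; the patch and filter values below are illustrative, not the numbers drawn in the video:

```python
# One convolution step: a 3x3 filter is laid over a 3x3 image patch,
# the overlapping values are multiplied elementwise, and the products
# are summed into a single entry of the output feature map.
patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
vertical_filter = [[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]]
value = sum(patch[i][j] * vertical_filter[i][j]
            for i in range(3) for j in range(3))
print(value)  # (1-3) + (4-6) + (7-9) = -6
```

Sliding the filter across the whole image and repeating this step fills in the rest of the feature map.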

  • What advice does Krishna offer for those looking to transition into data science?

    -Krishna suggests visiting a channel called 'Springboard India' for discussions and advice related to transitioning into data science and becoming a real-world data scientist.

  • What is the significance of stacking multiple convolutional layers in a CNN?

    -Stacking multiple convolutional layers allows the network to learn hierarchical features, where each layer can detect increasingly complex patterns, starting from edges to more abstract features like shapes and textures.
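
One way to see why stacking helps: each extra 3x3 layer lets a single output value "see" a larger region of the original image. This receptive-field formula (stride 1, no dilation) is a standard result, sketched here as an illustration:

```python
# Receptive field of stacked 3x3 convolutions with stride 1:
# each layer adds (f - 1) pixels of context in each direction.
def receptive_field(num_layers, f=3):
    return 1 + num_layers * (f - 1)

print(receptive_field(1))  # 3: one layer sees 3x3 patches (edges)
print(receptive_field(2))  # 5: two layers see 5x5 (larger shapes)
```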

Outlines

00:00

😀 Introduction to Deep Learning and CNN Operations

Krishna introduces himself and apologizes for the delay in uploading the 23rd tutorial in his deep learning playlist. He explains the reason for the delay was to improve the setup for the tutorials. Krishna promises to upload videos continuously to complete the deep learning series within two weeks. The video focuses on the operations of Convolutional Neural Networks (CNNs) and aims to highlight the differences between CNNs and Artificial Neural Networks (ANNs). Krishna provides a visual comparison between an ANN and a CNN, explaining the convolution operation, weight updates, and the role of filters (kernels) in detecting features like edges in images. He also mentions previous videos on padding and convolution operations and encourages viewers to seek his advice on transitioning into data science, promising more information in the video.

05:00

🔍 Deep Dive into CNN's Convolution and Backpropagation

This paragraph delves deeper into the convolution operation within CNNs, explaining how filters are used to detect features like edges in images. Krishna describes the process of applying filters to an image and how the output image is generated through convolution. He emphasizes the importance of understanding what happens after the convolution operation, particularly during the backpropagation stage where weights are updated based on the loss calculated from the output. Krishna also touches on the concept of stacking convolutional layers horizontally, like in the human visual cortex, to detect more complex features in images. He assures viewers that the basic operations in CNNs, such as weight multiplication and activation function application, are similar to those in ANNs, but adapted for image data. The paragraph concludes with a teaser for the next video, which will discuss max pooling and its significance in CNNs.

10:03

👨‍🏫 Practical Advice for Aspiring Data Scientists

In the final paragraph, Krishna shifts focus to provide practical advice for those looking to transition into data science. He recommends visiting a channel called Springboard India for insightful discussions with real-world data scientists. Krishna encourages viewers to subscribe to his channel and expresses his hope that they found the video helpful. He signs off, promising to cover the max pooling layer in his next video, and wishes viewers a great day.

Keywords

💡Deep Learning

Deep Learning is a subset of machine learning that focuses on artificial neural networks with multiple layers, or 'deep' architectures. In the context of the video, the speaker is discussing a tutorial series on deep learning, indicating that the video's theme revolves around teaching viewers about the intricacies of deep learning models. The script mentions that the tutorial is part of a 'complete deep learning playlist,' suggesting a comprehensive exploration of the topic.

💡Convolutional Neural Network (CNN)

A Convolutional Neural Network is a type of deep learning model predominantly used for image and video recognition tasks. The video script highlights CNN as the main subject, contrasting it with traditional artificial neural networks. The speaker discusses the operations of CNNs, emphasizing their use in detecting features such as edges in images, which is a key aspect of image processing and computer vision.

💡Artificial Neural Network (ANN)

An Artificial Neural Network is a computational model inspired by the human brain that is designed to recognize patterns. The script uses ANN as a comparative term to CNN, illustrating the fundamental differences in how they process data. The speaker explains that while ANNs multiply weights with inputs and apply activation functions, CNNs use convolution operations and filters to detect features in images.

💡Convolution Operation

Convolution Operation is a mathematical process used in CNNs to apply filters to an image. The video script describes this process in detail, explaining how it is used to detect features like horizontal or vertical edges. The speaker uses the example of a 4x4 image and a 2x2 filter to demonstrate how the convolution operation reduces the image size to 3x3, which is a fundamental concept in understanding how CNNs process images.
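
The 4x4-image, 2x2-filter example follows the standard output-size rule for stride 1 and no padding, sketched here:

```python
# Output size of a convolution with stride 1 and no padding:
# an n x n image convolved with an f x f filter gives (n - f + 1).
def conv_output_size(n, f):
    return n - f + 1

print(conv_output_size(4, 2))  # 3, matching the 4x4 / 2x2 example
print(conv_output_size(6, 3))  # 4
```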

💡Filters or Kernels

In the context of CNNs, filters or kernels are small matrices used to apply convolution operations on an image. The script mentions that these filters are initialized to detect features like edges in images. The speaker explains that these filters are updated through backpropagation, which is crucial for the learning process in CNNs.

💡Backpropagation

Backpropagation is a method used to update the weights of a neural network by calculating the gradient of the loss function with respect to each weight. The video script refers to backpropagation as the mechanism by which CNNs learn and update their filters. The speaker emphasizes that after the convolution operation and activation function are applied, backpropagation is used to adjust the filter values to improve the model's performance.

💡Activation Function

An Activation Function is a mathematical function used in neural networks to introduce non-linear properties to the model, allowing it to learn complex patterns. The script mentions ReLU (Rectified Linear Unit) as an example of an activation function applied after the convolution operation in CNNs. The speaker explains that applying ReLU to each field in the output of the convolution operation helps introduce non-linearity, which is essential for the network to learn from the data.

💡Max Pooling

Max Pooling is a technique used in CNNs to reduce the spatial dimensions (width and height) of the input volume, making the representation more manageable and less computationally intensive. The video script hints at a future tutorial on max pooling, suggesting that it plays a significant role in CNNs by reducing the dimensionality of the output from the convolution layers, which helps in making the model less prone to overfitting and more translation invariant.

💡Padding

Padding is a technique used in CNNs where extra pixels are added around the border of an image to control the spatial size of the output feature map. The script refers to padding in the context of convolution operations, explaining that it can affect the dimensions of the output. The speaker mentions that without padding, the output size is reduced, but with padding, the original size can be maintained, which is important for preserving the spatial information in the image.
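
The size-preserving effect of padding can be sketched with the standard formula (stride 1); the sizes below are illustrative:

```python
# Output size with padding p (stride 1): n + 2p - f + 1.
# With an odd filter, choosing p = (f - 1) // 2 ("same" padding)
# keeps the output the same size as the input.
def padded_output_size(n, f, p):
    return n + 2 * p - f + 1

print(padded_output_size(4, 3, 0))  # 2: no padding shrinks the image
print(padded_output_size(4, 3, 1))  # 4: padding preserves the size
```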

💡Data Science Transition

Data Science Transition refers to the process of moving into a career in data science, which often involves learning new skills and gaining experience in the field. The video script ends with a mention of resources for those looking to transition into data science, suggesting that the speaker is providing guidance and advice for viewers who are interested in pursuing a career in this area. The reference to 'Springboard India' implies that there are educational resources and discussions available to help viewers make this transition.

Highlights

Introduction to the 23rd tutorial in the deep learning playlist by Krishna.

Apology for the delay in uploading the tutorial due to setup issues.

Promise to upload videos continuously and complete the playlist within two weeks.

Discussion on the operations of CNNs (Convolutional Neural Networks).

Explaining the basic differences between ANN (Artificial Neural Networks) and CNN operations.

Illustration of ANN operation with an example of inputs and weights.

Explanation of the convolution operation in CNNs, including edge detection.

Description of the convolution process using a filter on an image.

Differentiation between grayscale and RGB images in terms of pixel representation.

Importance of understanding the output image after the convolution stage.

Introduction to the concept of filters and their role in feature detection.

Explanation of how filters are updated during the backpropagation stage.

Mention of applying ReLU activation function after convolution.

Discussion on stacking convolutional layers horizontally.

Explanation of the concept of 'location invariant' in CNNs.

Promotion of a resource for data science transition and advice.

Invitation to subscribe to the channel for more tutorials.

Transcripts

play00:00

hello my name is Krishna and welcome to

play00:01

my youtube channel now this is my 23rd

play00:04

tutorial on the complete deep learning

play00:06

playlist I know guys like this was like

play00:08

it I had actually skipped many days in

play00:11

order to upload this particular tutorial

play00:12

there were some problems I went I had to

play00:15

actually make up this particular setup

play00:16

and the previous setup was a little bit

play00:18

crude so now I've made this setup

play00:20

properly I'm extremely sorry for the

play00:22

delay now I'll be continuously uploading

play00:24

videos regularly with respect to deep

play00:26

learning and will try to complete within

play00:27

two weeks so today in this particular

play00:29

video we'll be discussing about the

play00:31

operations of CNN that is convolutional

play00:33

neural network and we try to see the

play00:35

basic differences between an artificial

play00:37

neural network operation and a

play00:39

convolution neural network operation so

play00:41

in this left hand side this is basically

play00:42

my ANN operation and in the right hand

play00:45

side I'll basically draw my CNN

play00:47

operation and we will try to discuss

play00:49

this we'll try to find out the basic

play00:51

difference then how the operation

play00:52

usually takes place in this case how the

play00:55

weights are getting updated that I have

play00:57

already made a video regarding that in

play00:58

my previous videos, in my

play01:01

past twenty second and twenty first

play01:03

tutorial I have actually created a video

play01:05

on padding and actually shown you what

play01:06

is convolution operation so let us go

play01:09

ahead and try to see because after

play01:10

convolution operation the filters

play01:13

that we are choosing in convolutional

play01:14

neural network how this is getting

play01:16

updated will try to understand that and

play01:18

one more thing guys if you are looking

play01:20

for some suggestion towards the

play01:22

transition of data science or if you

play01:24

want some advice with related to the

play01:26

real-life data scientists I'll be

play01:28

sharing some information so make sure

play01:30

that you watch this particular video

play01:31

till the end so let us go ahead now over

play01:34

here let me just consider an artificial

play01:36

neural network and suppose this is are

play01:38

my inputs X 1 X 2 X 3 and these are my

play01:41

weights that gets assigned W 1 W 2 W 3

play01:44

you know if you have not seen this

play01:46

videos you can completely go and check

play01:47

my playlist again the link will be given

play01:49

in the description now we know that in

play01:53

the first iteration what we do when we go

play01:55

to the first hidden neuron in the first

play01:57

layer we will multiply all the weights

play01:59

and the inputs right and I am assigning

play02:01

it as z and after this we will also

play02:04

add a bias

play02:05

after adding bias we apply a ReLU or

play02:07

some activation functions like sigmoid

play02:09

ReLU and

play02:10

and we have already discussed about this

play02:12

now similarly if I take the same

play02:14

operation with respect to a CNN you know

play02:16

that in CNN we have some image it may be

play02:19

of any size like it may be current in

play02:21

this particular example I have taken 4

play02:23

cross 4 now when I take this image I

play02:26

can initialize some filters you know

play02:29

filters or kernels these are called as

play02:30

kernels now why this filter is basically

play02:32

used it is basically used to you know to

play02:34

find out the horizontal edge

play02:36

detection vertical edge detection many

play02:38

things like that right and we have also

play02:40

discussed about this now the convolution

play02:43

operation basically says that okay we

play02:46

have understood how the convolution

play02:47

operation takes place we will take this

play02:49

value suppose over here I have some

play02:50

values like this 5 6 7 8 9 10 11 - and

play02:55

these are our pixels right if you know

play02:57

and if I'm considering this as a you

play02:59

know grayscale black and white images

play03:01

I'll have one pixel value right if I

play03:03

consider this as RGB image then it will

play03:06

have three channels one is our channel G

play03:08

Channel and B Channel and each and every

play03:10

field will be represented with the help

play03:12

of three values you know the R value

play03:14

suppose this R value is 128 this G value

play03:17

is something like 100 this blue value is

play03:20

basically like 250 right so when

play03:23

this all the colors have combined we

play03:25

basically get a color image but in the

play03:27

case of grayscale image will just have 1

play03:28

pixel values now what will happen

play03:30

suppose this is my this is my filter

play03:34

okay so suppose this is my values of the

play03:37

filter and consider that this is the

play03:38

vertical filter vertical filter

play03:40

basically means if I apply this

play03:41

particular filter on to this image I'll

play03:43

be able to get all the vertical edges in

play03:45

the output image right and after that

play03:48

I've also concerned right well I have

play03:49

also told you if I take an image of 4

play03:51

cross 4 and 2 cross 2 filter is applied

play03:53

I will be getting 3 cross 3 if I am NOT

play03:57

adding any padding or strides strides

play03:59

basically means first of all I place

play04:01

this filter on top of it over here okay

play04:03

so this is my 3 cross 3 so this in the

play04:06

first operation I'll do this I'll get

play04:08

the value what kind of operation is

play04:10

happening convolution will try to

play04:11

multiply each and every value of this

play04:13

suppose 0 into 1 is 0 okay then in the

play04:17

second

play04:17

1 into 2 is 2 right in the 3rd field 0

play04:20

into 3 is 0 and like this all the

play04:23

addition will happen and finally we'll

play04:24

be getting the value for this particular

play04:25

field and similarly this operation will

play04:27

happen and that usually happens in

play04:29

convolution and I have discussed this in

play04:31

my previous videos also now the most

play04:33

important thing to understand over here

play04:35

is that when we are getting the output

play04:37

image right what happens after this

play04:39

particular stage that is pretty much

play04:41

important for us to understand

play04:43

and always remember guys like this kind

play04:45

of filters you know suppose this is my

play04:47

filter 1 we may also have another filter

play04:49

like f2 we may have another filter like

play04:51

f3 this filter may be an horizontal edge

play04:54

detection this may be some like shape

play04:56

detection something like this kind of

play04:58

filters will be present now you know

play05:00

that in an ANN we will try to update

play05:02

this weight will try to learn this

play05:04

weight with respect to the output in the

play05:06

backpropagation stage you know that

play05:08

after we get the output we calculate the

play05:10

loss and after we get the loss what we

play05:12

do we again back propagate we find out

play05:14

all the derivatives subtract from these

play05:16

weights and we update those weights

play05:18

right so similarly in this particular

play05:20

case we will try to learn these

play05:23

weights we will try to update the

play05:26

values inside this particular filter

play05:28

with the help of back propagation and

play05:30

that is the trick over here all the

play05:32

concepts are same everything is same

play05:34

right there is one more like suppose I

play05:36

am getting the output after one

play05:38

convolution operation there are still

play05:40

some more operations called max

play05:41

pooling I'll discuss about max pooling in

play05:43

my next video

play05:44

but you need to understand that this

play05:46

filters will have to be learned by my

play05:49

convolution neural network this values

play05:51

needs to be updated inside my filter now

play05:54

once I am getting this particular output

play05:56

after this I will again go and apply

play05:58

relu activation function on each and

play06:01

every field on each and every field I

play06:05

have to apply the ReLU activation

play06:07

function like how I did it over here you

play06:09

know so this was my operation right I I

play06:12

multiplied rates with my inputs in the

play06:14

NL after I got this value I applied it

play06:16

I applied an activation function so

play06:18

similarly over here first of all I

play06:20

applied filter I did my convolution

play06:22

operation I got the output after getting

play06:24

the output for each and every field I

play06:26

applied my ReLU

play06:28

activation function right now after when

play06:31

I do the ReLU activation function you

play06:33

need to understand in the

play06:34

backpropagation and again I've still not

play06:36

discussed about max pooling but

play06:38

max pooling will also play a very

play06:40

important role over here so in short

play06:42

when I just go ahead and finally

play06:44

when my back propagation is done this

play06:46

all values are getting updated a similar

play06:48

case like how we did the updation over

play06:50

here considering the loss function will

play06:53

be considering some optimizer same thing

play06:55

will get applied to this convolution

play06:56

neural network also so this is pretty

play06:59

much important to understand guys the

play07:01

basic difference why I am showing you an

play07:02

ANN because the operation is almost

play07:04

same you know will multiply the weights

play07:07

with the inputs and then we'll try to

play07:09

apply an activation function so

play07:10

similarly over here in the convolution

play07:12

operation what will happen we will take

play07:14

this filter will apply convolution

play07:16

operation whatever output will go we

play07:18

will apply an activation function

play07:20

each and every value over here and

play07:22

then we'll finally get the output so

play07:25

this is the basic difference in the

play07:26

operation of CNN and an ANN right

play07:29

after this you know you may also

play07:30

vertically stack any number of

play07:33

convolution so this is one convolution

play07:35

operation right considering the ReLU

play07:37

activation function is one convolution

play07:39

operation and I can stack this

play07:41

horizontally you know one after the

play07:43

other now why why do I have to

play07:45

horizontally stack also you should

play07:47

understand that suppose let me take an

play07:49

example we know that in our brain right

play07:51

we have various regions to

play07:54

detect some images right suppose I am

play07:56

seeing an image of a cat suppose and I

play07:58

have explained in my previous video I

play08:00

may have layers like v1 v2 v3 v4 v5 v6

play08:05

and this is also horizontally stacked

play08:07

you know after this particular output

play08:09

will be given to v2 then it will be

play08:11

given to v3 v4 v5 and finally we will be

play08:13

able to see the image right so similarly

play08:17

in this case we can also stack these

play08:19

convolutional layers horizontally one

play08:22

after the other suppose in the v1

play08:25

suppose I am seeing a cat over here okay

play08:27

so this is my cat sorry for my bad

play08:30

diagram but consider that this is the

play08:32

cat okay suppose v1 is able to detect

play08:35

the face okay then

play08:38

since I have horizontally stacked it

play08:40

with another convolution over here now

play08:42

this this this face will go to the next

play08:45

stage here the more clear phase can be

play08:47

visible then inside this particular

play08:49

phase what are the features that can

play08:51

also be visible right and they may be

play08:53

different different filters for

play08:55

detecting eyes they may be different

play08:58

different filters so we are we can use

play08:59

maximum number of filters over here and

play09:02

you know that initially all this values

play09:05

inside this filter will be randomly

play09:07

selected you know some values will get

play09:09

selected and later on with the help of

play09:11

backpropagation all these values will

play09:12

get updated right so similarly we can

play09:15

stack this and this one whole operation

play09:17

is called as one convolution layer and we

play09:19

can stack this horizontally after this

play09:22

I'll also be discussing about what is

play09:24

max pooling then you learn more

play09:26

understand about why max pooling is also

play09:28

used there is a very very good term in

play09:32

one of the research paper called as

play09:33

location invariant location invariant

play09:39

right so very important term in one of

play09:42

the research paper what it says is that

play09:45

usually when our human brain see some

play09:47

faces right some of the neurons get

play09:49

automatically triggered right and

play09:52

similarly we should try to make our

play09:54

convolutional neural network also do

play09:56

like that so what I'm saying is that

play09:58

suppose in one of the image I have

play10:00

multiple cats multiple cats faces then

play10:03

automatically this convolutional neural

play10:05

network or the kernels that I'm

play10:07

basically using should get automatically

play10:10

triggered you know where it will be able

play10:12

to detect those multiple faces so that

play10:14

step will be done by the max pooling

play10:16

layer okay I will be trying to explain

play10:19

you that in my next video but just

play10:21

understand this was the basic difference

play10:22

between artificial neural network and

play10:24

the CNN I hope that diagram has got

play10:27

completely mixed up but I hope I was

play10:29

going in the right path I have explained

play10:30

in the right part just keep this in mind

play10:32

in my next video I'll be discussing

play10:34

about the max pooling layer now I'll go

play10:37

back to what I said in the starting if you

play10:40

are looking for transition towards data

play10:42

science if you're looking for such

play10:43

career advice if you are looking for

play10:45

some help

play10:46

with respect to some real-world data

play10:47

scientists talk you can basically go and

play10:49

visit this channel called Springboard

play10:51

India the link is basically given in the

play10:53

description it has wonderful discussion

play10:56

with respect to different different data

play10:57

scientists you can go and have a look

play10:59

onto that the link will be

play11:01

given in the description guys so I

play11:03

hope you like this particular video

play11:04

please do subscribe on channel if you

play11:06

are not already subscribed I'll see you

play11:07

in the next video have a great day and

play11:09

given it up


Related Tags
Deep Learning, CNN vs ANN, Convolutional Networks, Neural Networks, Edge Detection, Machine Learning, Data Science, Backpropagation, Image Processing, Tutorial Series