Deep Learning (CS7015): Lec 2.3 Perceptrons

NPTEL-NOC IITM
23 Oct 2018 · 10:59

Summary

TL;DR: The video script delves into the Perceptron, an improvement over the McCulloch-Pitts neuron that addresses limitations such as the inability to handle real-valued inputs or to learn weights automatically. It introduces weights to prioritize certain inputs and a bias term to absorb the threshold. The Perceptron, proposed by Frank Rosenblatt in 1958 and refined by Minsky and Papert, outputs 1 if the weighted sum of inputs exceeds a threshold and 0 otherwise. The script also discusses the Perceptron's ability to implement Boolean functions and its method for learning weights and biases through a learning algorithm, emphasizing its significance in decision-making tasks such as resource management in the oil industry.

Takeaways

  • 📚 The Perceptron model extends the McCulloch-Pitts neuron by handling real-valued inputs, which allows for more complex decision-making processes.
  • 🔍 Perceptrons are designed to deal with non-Boolean inputs such as salinity, pressure, and other real-world factors that influence decision-making in various industries.
  • 🧠 The introduction of weights in the Perceptron allows for the differentiation of importance among inputs, enabling the model to prioritize certain factors over others.
  • 🎓 The Perceptron learning algorithm is capable of adjusting weights and the threshold (bias) automatically based on past data or viewing experiences.
  • 📈 The Perceptron model can represent linearly separable functions, similar to the McCulloch-Pitts neuron, but with the added ability to learn and adjust its parameters.
  • 🔧 The bias (w0) in the Perceptron is often referred to as such because it represents a prior assumption or preference that influences the decision-making process.
  • ⚖️ Weights in the Perceptron model are analogous to assigning different levels of importance to various inputs, which can significantly affect the output.
  • 🤔 The Perceptron model raises questions about how to handle non-linearly separable functions and introduces the concept of a learning algorithm to address these challenges.
  • 📉 The Perceptron's decision boundary is a hyperplane that separates the input space into two regions based on weighted sums and a threshold.
  • 🔗 The Perceptron's operation can be represented by a system of linear inequalities, which can be solved to find weights and a bias that satisfy a given problem (generally with many valid solutions).
  • 🚀 The Perceptron model was proposed by Frank Rosenblatt in 1958 and further refined by Minsky and Papert, making it a foundational concept in neural networks and machine learning.

Q & A

  • What is the main limitation of the McCulloch-Pitts neuron model discussed in the script?

    -The main limitation of the McCulloch-Pitts neuron model is that it only deals with Boolean inputs and does not have a mechanism to learn thresholds or weights for inputs.

  • What is the perceptron and how does it differ from the McCulloch-Pitts neuron?

    -The perceptron is an improvement over the McCulloch-Pitts neuron model, allowing for real-valued inputs and introducing weights for each input. It also includes a learning algorithm to determine these weights and a threshold (bias), which are not present in the McCulloch-Pitts model.

  • Why are weights important in the perceptron model?

    -Weights are important in the perceptron model because they allow the model to assign different importance to different inputs, enabling it to make decisions based on the significance of each input feature.

  • What is the role of the bias (w0) in the perceptron model?

    -The bias (w0) in the perceptron model represents the prior knowledge or assumption about the inputs. It is often called the bias because it adjusts the decision boundary, allowing the perceptron to fit the data more accurately.

  • How does the perceptron decide whether to output 1 or 0?

    -The perceptron outputs 1 if the weighted sum of the inputs is greater than a threshold (including the bias), and outputs 0 if the weighted sum is less than or equal to the threshold.

  • What is the significance of the example given about the oil mining company?

    -The example of the oil mining company illustrates the need for real-valued inputs in decision-making problems. It shows that not all problems can be solved with Boolean inputs, and that real-world applications often require considering multiple factors with continuous values.

  • What is the perceptron learning algorithm and why is it necessary?

    -The perceptron learning algorithm is a method used to adjust the weights and bias of a perceptron model. It is necessary because it allows the model to learn from data, improving its ability to make accurate predictions or classifications without manual parameter setting.

  • How does the script relate the perceptron model to the problem of predicting movie preferences?

    -The script uses the perceptron model to illustrate how one might predict movie preferences based on factors like the actor, director, and genre. It explains how weights can be assigned to these factors based on their importance to the viewer, and how the bias can adjust the decision threshold.

  • What is the significance of the term 'linearly separable' in the context of the perceptron?

    -Linearly separable refers to the ability of the perceptron to divide the input space into two distinct regions using a linear boundary. It highlights the perceptron's capability to solve problems where a clear linear decision boundary can separate different classes of inputs.

  • What are some of the challenges the script mentions when dealing with non-linearly separable functions?

    -The script mentions that even within the restricted Boolean case, there can be functions that are not linearly separable. This presents a challenge because the perceptron model, as it stands, can only solve linearly separable problems. It implies the need for more advanced models to handle complex, non-linear problems.
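The decision rule described in the answers above can be sketched in a few lines of code. This is a minimal illustrative sketch, not code from the lecture; the function name and the use of a `bias` argument (playing the role of w0 = -theta) are my own conventions.

```python
def perceptron(inputs, weights, bias):
    """Fire (output 1) iff the weighted sum plus the bias is >= 0.

    `bias` plays the role of w0 = -theta from the lecture, and the
    inputs may be real-valued, unlike in the McCulloch-Pitts neuron.
    """
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= 0 else 0

# OR of two Boolean inputs, with the weights quoted later in the
# lecture (w0 = -1, w1 = w2 = 1.1):
print(perceptron([0, 0], [1.1, 1.1], -1.0))  # 0
print(perceptron([1, 0], [1.1, 1.1], -1.0))  # 1
```

The same function accepts real-valued inputs (salinity, pressure, and so on) unchanged, which is the key generalization over the McCulloch-Pitts neuron.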

Outlines

00:00

🤖 Introduction to Perceptron and its Limitations

The video script begins with an introduction to the Perceptron, a computational model that extends the McCulloch-Pitts neuron by incorporating weights for each input. The speaker discusses the limitations of the McCulloch-Pitts neuron, which only handles Boolean inputs, and how the Perceptron addresses this by allowing real-valued inputs. The Perceptron also introduces the concept of learning thresholds and weights, which is a significant advancement over the previous model. The historical context is provided, mentioning that the Perceptron was proposed by Frank Rosenblatt in 1958. The script also raises questions about the capability of Perceptrons to handle non-linearly separable functions and the importance of weighing inputs differently, setting the stage for a deeper exploration of the Perceptron model.

05:03

📚 Perceptron Model and Learning Algorithm

This paragraph delves into the specifics of the Perceptron model, highlighting its operation where it outputs 1 if the weighted sum of inputs exceeds a threshold, otherwise it outputs 0. The speaker introduces a mathematical notation for the Perceptron equation and explains the concept of bias, represented by w0, which is essentially the negative of the threshold theta. The importance of weights in the Perceptron is emphasized, as they allow the model to prioritize certain inputs over others, an essential feature for making decisions based on various factors. The speaker uses the example of movie preferences to illustrate how weights can be assigned based on past viewing experiences. The paragraph also touches on the perceptron's ability to implement Boolean functions and sets the stage for discussing the Perceptron learning algorithm.
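The notational trick described in this segment, folding the threshold theta into a bias weight w0 = -theta paired with a constant input x0 = 1, can be checked numerically. The function names and example values below are illustrative, not from the lecture:

```python
def fires_threshold_form(x, w, theta):
    # Original form: fire iff sum_{i=1..n} w_i x_i >= theta
    return sum(wi * xi for wi, xi in zip(w, x)) >= theta

def fires_bias_form(x, w, theta):
    # Rewritten form: prepend x0 = 1 with weight w0 = -theta,
    # then fire iff sum_{i=0..n} w_i x_i >= 0
    xa, wa = [1] + list(x), [-theta] + list(w)
    return sum(wi * xi for wi, xi in zip(wa, xa)) >= 0

# The two forms agree on every Boolean input for this example:
w, theta = [0.5, 2.0], 1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    assert fires_threshold_form(x, w, theta) == fires_bias_form(x, w, theta)
```

The equivalence is exact for any inputs, not just Boolean ones, since moving theta across the inequality is pure algebra.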

10:08

🔍 Perceptron's Function Implementation and Threshold Determination

The final paragraph of the script discusses the Perceptron's ability to implement Boolean functions and how it differs from the McCulloch-Pitts neuron. The speaker explains that the Perceptron, like the McCulloch-Pitts neuron, divides the input space into two halves, but it does so with the added capability of learning weights and a threshold. The script presents a system of linear inequalities that can be solved to determine the weights and threshold, providing an example with specific weight values that satisfy certain conditions. The speaker also points out that the McCulloch-Pitts neuron could theoretically have a similar set of conditions and inequalities, suggesting that the main distinction between the two models lies in the Perceptron's ability to learn these parameters automatically.
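The system of linear inequalities mentioned here can be verified directly for the solution quoted in the lecture (w0 = -1, w1 = w2 = 1.1). This check assumes the target function is OR, which matches the four conditions the lecture lists; it verifies the example rather than solving the system in general:

```python
# Conditions for the OR function: output 0 on (0,0), output 1 otherwise.
w0, w1, w2 = -1.0, 1.1, 1.1

assert w0 < 0             # (0,0) -> 0 requires w0 < 0
assert w0 + w2 >= 0       # (0,1) -> 1
assert w0 + w1 >= 0       # (1,0) -> 1
assert w0 + w1 + w2 >= 0  # (1,1) -> 1

# The corresponding decision boundary is 1.1*x1 + 1.1*x2 = 1,
# i.e. the line the lecturer draws on the slide.
```

Any other (w0, w1, w2) satisfying the four inequalities is an equally valid solution, which is why the lecturer can draw several different separating lines.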


Keywords

💡Perceptron

A perceptron is a type of artificial neural network unit that processes inputs to produce a single output. It is used in machine learning for binary classifiers, where it decides between two classes by adjusting weights assigned to input features. The video discusses the perceptron's ability to handle real-valued inputs, unlike the McCulloch Pitts neuron which only deals with Boolean inputs.

💡Boolean Inputs

Boolean inputs refer to binary values, typically represented as 0 and 1, used to simplify decision-making processes. In the context of the video, the McCulloch Pitts neuron operates solely on Boolean inputs, while the perceptron can handle a wider range of inputs, such as real numbers, for more complex decision-making tasks.

💡Weights

Weights are coefficients assigned to input features in a perceptron to signify their importance. They adjust how much each input influences the output. In the video, weights are introduced to prioritize certain inputs over others, such as giving a higher weight to a director's influence in deciding whether to watch a movie.

💡Threshold (Theta)

The threshold (theta) is a critical value in a perceptron that determines whether the output will be 1 or 0 based on the weighted sum of the inputs. The video explains that unlike the McCulloch Pitts neuron where the threshold is set manually, the perceptron includes a learning algorithm to adjust the threshold and weights automatically.

💡Learning Algorithm

A learning algorithm is a method used by the perceptron to adjust weights and thresholds based on input data to improve decision-making. The video highlights how the perceptron uses past data, such as viewer history for movies, to learn and assign appropriate weights and thresholds, enhancing its predictive accuracy.
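The lecture refers to the perceptron learning algorithm without deriving it (that comes later in the course), so the following is only an illustrative sketch of the standard mistake-driven update, under the x0 = 1 bias convention; the function name and epoch count are my own:

```python
def train_perceptron(samples, n_features, epochs=100):
    """Classic perceptron update: on a misclassified sample, move the
    weight vector toward the input (true label 1) or away from it
    (true label 0). w[0] is the bias, paired with a constant x0 = 1.
    """
    w = [0.0] * (n_features + 1)
    for _ in range(epochs):
        for x, y in samples:
            xa = [1.0] + list(x)  # prepend the always-on input x0 = 1
            pred = 1 if sum(wi * xi for wi, xi in zip(w, xa)) >= 0 else 0
            if pred != y:         # update only on mistakes
                sign = 1 if y == 1 else -1
                w = [wi + sign * xi for wi, xi in zip(w, xa)]
    return w

# Learns OR, a linearly separable Boolean function:
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train_perceptron(data, 2)
```

For linearly separable data such as OR, this update is known to converge to a separating weight vector; for non-separable functions it never settles, which is exactly the limitation the video raises.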

💡Real-Valued Inputs

Real-valued inputs are numerical values that are not limited to binary states and can represent a range of quantities. In the video, real-valued inputs like salinity, pressure, and density are used in more complex decision-making scenarios, such as deciding whether to drill at an oil mining station, which are beyond the capabilities of Boolean-only systems.

💡Linearly Separable

A problem is linearly separable if a single straight line (in higher dimensions, a hyperplane) can divide the input space into two distinct classes. The video discusses the limitation of perceptrons in handling only linearly separable problems and how weights and thresholds are adjusted within such scenarios.

💡Bias

Bias in a perceptron is the term added to the weighted sum of inputs, allowing the model to shift the decision boundary. It's often represented as the weight (w0) associated with a constant input of 1. The video explains how bias, being negative of theta, impacts the perceptron's output and helps in fine-tuning decision criteria.

💡McCulloch Pitts Neuron

The McCulloch Pitts neuron is a simple model of a neuron that processes Boolean inputs to produce an output based on a threshold. The video contrasts this with the perceptron, which introduces weights and can handle real-valued inputs, offering a more flexible and general computational model.

💡Threshold Logic Unit

A threshold logic unit is a fundamental component in neural networks that outputs a binary decision based on whether a weighted sum of inputs exceeds a threshold. The video explains how perceptrons operate as threshold logic units, extending the capabilities of simple neurons by learning weights and biases.

Highlights

Introduction to the Perceptron model, which extends beyond Boolean inputs to handle real numbers.

The Perceptron model's ability to make decisions based on various real-world factors such as pressure and salinity, unlike the McCulloch-Pitts neuron.

The limitation of McCulloch-Pitts neurons in handling only Boolean inputs and the need for a model that can process real numbers.

The concept of learning the threshold automatically in Perceptrons, as opposed to manually setting it in McCulloch-Pitts neurons.

The introduction of weights in the Perceptron model to weigh the importance of different inputs.

The question of how to handle functions that are not linearly separable, which arises even in the restricted Boolean case and motivates models beyond the single Perceptron.

The historical context of the Perceptron, proposed by Frank Rosenblatt in 1958, and its evolution from the McCulloch-Pitts neuron.

The perceptron's learning algorithm that automatically adjusts weights and the threshold (bias) based on input data.

The practical example of using the Perceptron to decide whether to mine oil based on various environmental factors.

The explanation of how weights in the Perceptron can reflect the importance of different factors in decision-making.

The role of the bias (w0) in the Perceptron, representing a prior or baseline that influences the decision threshold.

The Perceptron's operation principle, which outputs 1 if the weighted sum of inputs exceeds the threshold, otherwise outputs 0.

The mathematical notation and formulation of the Perceptron equation, including the introduction of x0 and w0.

The discussion on why weights are necessary for implementing Boolean functions and the role of the bias in this context.

The exploration of different possible solutions for weights and biases in the Perceptron model, emphasizing the flexibility in decision boundaries.

The comparison between the Perceptron and McCulloch-Pitts neuron in terms of their ability to implement functions and the introduction of weights.

The conclusion that the Perceptron model can implement the same functions as the McCulloch-Pitts neuron but with the added capability of learning weights and biases.

Transcripts

[00:08] Now, let us go to the next module, which is the perceptron. So far the story has been about Boolean inputs, but are all the problems we deal with like that? Do we always deal only with Boolean inputs? What we spoke about so far is Boolean functions.

[00:31] Now, consider this example. The Boolean setup worked fine for the movie example, where the inputs were things like the actor and the director. But now suppose you are in an oil mining company and you are trying to decide whether you should mine or drill at a particular station. This could depend on various factors: the pressure at that point on the ocean surface, the salinity of the water, the aquatic life at that point, and so on. These are not really Boolean inputs; the salinity is a real number, density would be a real number, pressure would be a real number, and so on. And this is a very valid decision problem that companies would be interested in. So in such cases our inputs are going to be real-valued, but so far the McCulloch Pitts neuron only deals with Boolean inputs, so we still need to take care of that limitation.

[01:21] Now, how did we decide the threshold in all these cases? I just asked you, you computed it, and you told me. But that is not going to work out; it does not scale to larger problems where you have many more dimensions and the inputs are not Boolean. So we need a way of learning this threshold.

[01:38] Now, returning to the movie example: maybe for me the actor is the only thing that matters and all the other inputs are not so important. Then what do I actually need? I need some way of weighing these inputs; I should be able to say that this input is more important than the others. Right now I am treating all of them as equal and just taking a simple sum: if that sum crosses a threshold, I am fine, otherwise I am not. But maybe I want to raise the weight for some of these inputs and lower it for others. Whether it is raining outside maybe does not matter: I have a car, or I could wear a jacket or carry an umbrella, so that input is probably not so important.

[02:15] And what about functions which are not linearly separable? We have just been dealing with the nice cases, which are all linearly separable, but we will see that even in the restricted Boolean case there can be functions which are not linearly separable, and if that is the case, how do we deal with it? So these are some questions that we need to answer.

[02:34] So, first we will start with the perceptron, which tries to fix some of these things, and then we will move forward from there. As we discussed in the history lecture, this was proposed in 1958 by Frank Rosenblatt, and this is what the perceptron looks like. Do you see any difference from the McCulloch Pitts neuron? Weights: you have a weight associated with each of the inputs; otherwise everything looks the same. So this is a more general computational model than the McCulloch Pitts neuron.

[03:08] The other interesting thing is that, besides introducing these weights, we also have a mechanism for learning them. Remember that in the earlier case our only parameter was theta, which we were setting by hand. Now, with the perceptron, we will have a learning algorithm which helps us learn not just theta but also the weights for the inputs. How do I know whether it is the actor or the director that matters? Given a lot of past viewing experience, that is, a lot of data about the movies I have watched, how do I know which weights to assign? We will see an algorithm which helps us do that. And the inputs are no longer limited to Boolean values; they can be real values as well.

[03:50] So that is the classical perceptron, but what I am talking about here, and in the rest of the lecture, is the refined version proposed by Minsky and Papert, which is known as the perceptron model. So when I say perceptron, I am referring to this model, and this diagram also corresponds to it.

[04:05] So, now let us see what the perceptron does. This is how it operates: it gives an output of 1 if the weighted sum of the inputs is greater than a threshold. Remember that in the MP neuron we did not have these weights, but now we take this weighted sum of the inputs, and the output is going to be 0 if the weighted sum is less than the threshold. Not very different from the MP neuron.

[04:34] Now I am just going to do a little rearranging to get this into a better notation. Is this ok? I have just taken the theta to the other side. Now, is this ok? Notice that here the indices were 1 to n; now I have made them 0 to n and the theta has suddenly disappeared. So what has happened? w0 is minus theta, and x0 is 1. If I had started from 1 to n, it would be the summation from i = 1 to n of wi xi, plus w0 x0; but I am just saying that w0 equals minus theta and x0 equals 1, which gives me back exactly the same expression. So, very simply: x0 = 1 and w0 = -theta.

[05:33] In effect, what I am assuming is that instead of having the threshold as a separate quantity, I treat it as one of my inputs which is always on, and the weight of that input is minus theta. So now the job of all the other inputs and their weights is to make sure that their sum is greater than this quantity. Does that make sense? Ok. So this is the more accepted convention for writing the perceptron equation: it fires when this summation is greater than or equal to 0, and otherwise it does not fire.

[06:06] Now, let me ask a few questions. Why are we trying to implement Boolean functions? I have already answered this, but I will keep repeating the question so that it really gets drilled in. Why do we need weights? Again, we briefly touched upon that. And why is w0, which is the negative of theta, often called the bias?

[06:24] So, let us return to the task of predicting whether you would like to watch a movie or not, and suppose we base our decision on three simple inputs: actor, genre and director. Now, based on our past viewing experience, we may give a high weight to Nolan as compared to the other inputs. What does that mean? It means that as long as the director is Christopher Nolan, I am going to watch the movie, irrespective of who the actor is or what the genre of the movie is. That is exactly what we want, and that is the reason why we want these weights. And w0 is often called the bias because it represents the prior.

[07:01] So now let me ask a very simple question. Suppose you are a movie buff: what would theta be? Zero, right: you will watch any movie irrespective of the actor, director and genre. Now suppose you are a very niche movie watcher who only watches movies where the genre is thriller, the director is Christopher Nolan and the actor is Damon. Then what would your threshold be? 3: high, in this case. I always ask whether you know of any such movie; it always takes a while. Interstellar. So the weights and the bias will depend on the data, which in this case is the viewer history. That is the whole setup: that is why you want these weights, that is why you want these biases, and that is why we want to learn them.
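The movie-buff and niche-watcher cases above can be written out concretely in the threshold form of the perceptron. The particular weight values below are illustrative choices, not numbers from the lecture:

```python
def watch(x, w, theta):
    # x = (actor_match, genre_match, director_match), each 0 or 1:
    # watch iff the weighted sum of matching inputs reaches theta.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0

# A movie buff: theta = 0, so any movie passes.
assert watch((0, 0, 0), [1, 1, 1], 0) == 1

# A niche watcher with equal weights needs all three inputs on: theta = 3.
assert watch((1, 1, 0), [1, 1, 1], 3) == 0
assert watch((1, 1, 1), [1, 1, 1], 3) == 1

# Giving the director a dominant weight: Nolan alone clears the threshold.
w = [1, 1, 3]
assert watch((0, 0, 1), w, 3) == 1  # director matches -> watch
assert watch((1, 1, 0), w, 3) == 0  # actor + genre alone are not enough
```

The last two assertions capture the lecturer's point: a large enough director weight makes that single input decisive, regardless of the others.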

[07:49] Now, before we see whether, or how, we can learn these weights and biases, one question we need to ask is: what kind of functions can be implemented using the perceptron, and are these functions any different from those of the McCulloch Pitts neuron? Before I go to the next slide, any guesses? I am hearing some interesting answers which are at least partly correct.

[08:12] So, this is what a McCulloch Pitts neuron looks like, and this is what a perceptron looks like. The only difference is the red part, which is the weights that have been added. So it is again clear that the perceptron also divides the input space into two halves, where all the points for which the output has to be 1 lie on one side of the plane, and all the points for which the output should be 0 lie on the other side. So it is not doing anything different from what the McCulloch Pitts neuron was doing. Then what is the difference? You have these weights, and you have a mechanism for learning these weights as well as the threshold; we are not going to hand-code them. So we will first revisit some Boolean functions and then see the perceptron learning algorithm.

[08:55] So, now let us see what the first condition says. If I actually expand this condition out, this is what it turns out to be, and what that condition is telling me is that w0 should be less than 0. So, based on these, what do we have here? What is this, actually? A system of linear inequalities, and you could solve it: you have algorithms for solving such systems (not always, but you could find some solution). One possible solution, which I have given here, is w0 = -1, w1 = 1.1 and w2 = 1.1.

[09:32] So let us just draw that line. What is the line? It is 1.1 x1 + 1.1 x2 = 1. That is the line, and you can see it satisfies the conditions. Is this the only solution possible? No: I could have drawn another valid line as well; all of these are valid solutions, resulting in different values of w0, w1 and w2.

[10:01] In fact, I have been telling you that you had to set the threshold by hand for the McCulloch Pitts neuron, but that is not quite true, because you could have written similar equations there and then decided what the value of theta should be. So you could try this out for the McCulloch Pitts neuron as well: you will get a similar set of inequalities, and you can solve them for the value of theta. Ok, so that ends this module.


Related Tags
Perceptron, Neural Networks, Boolean Functions, Learning Algorithms, McCulloch Pitts, Threshold Decision, Real-Valued Inputs, Weights Adjustment, Data Analysis, Artificial Intelligence