Deep Learning (CS7015): Lec 2.3 Perceptrons
Summary
TLDR: The video script delves into the Perceptron, an improvement over the McCulloch-Pitts neuron that addresses limitations such as handling real-valued inputs and learning weights automatically. It introduces weights to prioritize certain inputs and a bias term that absorbs the threshold. The Perceptron, proposed by Frank Rosenblatt and later refined by Minsky and Papert, outputs 1 if the weighted sum of its inputs meets or exceeds a threshold and 0 otherwise. The script also discusses the Perceptron's ability to implement Boolean functions and its method for learning weights and biases through a learning algorithm, emphasizing its significance in decision-making tasks such as drilling decisions in the oil industry.
Takeaways
- 📚 The Perceptron model extends the concept of Boolean input by handling real-number inputs, which allows for more complex decision-making processes.
- 🔍 Perceptrons are designed to deal with non-Boolean inputs such as salinity, pressure, and other real-world factors that influence decision-making in various industries.
- 🧠 The introduction of weights in the Perceptron allows for the differentiation of importance among inputs, enabling the model to prioritize certain factors over others.
- 🎓 The Perceptron learning algorithm is capable of adjusting weights and the threshold (bias) automatically based on past data or viewing experiences.
- 📈 The Perceptron model can represent linearly separable functions, similar to the McCulloch-Pitts neuron, but with the added ability to learn and adjust its parameters.
- 🔧 The bias (w0) in the Perceptron is often referred to as such because it represents a prior assumption or preference that influences the decision-making process.
- ⚖️ Weights in the Perceptron model are analogous to assigning different levels of importance to various inputs, which can significantly affect the output.
- 🤔 The Perceptron model raises the question of how to handle functions that are not linearly separable, a problem the basic Perceptron cannot solve and which motivates more advanced models.
- 📉 The Perceptron's decision boundary is a hyperplane that separates the input space into two regions based on weighted sums and a threshold.
- 🔗 The Perceptron's operation can be represented by a system of linear inequalities, which can be solved to find the optimal weights and bias for a given problem.
- 🚀 The Perceptron model was proposed by Frank Rosenblatt in 1958 and further refined by Minsky and Papert, making it a foundational concept in neural networks and machine learning.
Q & A
What is the main limitation of the McCulloch-Pitts neuron model discussed in the script?
-The main limitation of the McCulloch-Pitts neuron model is that it only deals with Boolean inputs and does not have a mechanism to learn thresholds or weights for inputs.
What is the perceptron and how does it differ from the McCulloch-Pitts neuron?
-The perceptron is an improvement over the McCulloch-Pitts neuron model, allowing for real-valued inputs and introducing weights for each input. It also includes a learning algorithm to determine these weights and a threshold (bias), which are not present in the McCulloch-Pitts model.
Why are weights important in the perceptron model?
-Weights are important in the perceptron model because they allow the model to assign different importance to different inputs, enabling it to make decisions based on the significance of each input feature.
What is the role of the bias (w0) in the perceptron model?
-The bias (w0) in the perceptron model represents the prior knowledge or assumption about the inputs. It is often called the bias because it adjusts the decision boundary, allowing the perceptron to fit the data more accurately.
How does the perceptron decide whether to output 1 or 0?
-The perceptron outputs 1 if the weighted sum of the inputs is greater than a threshold (including the bias), and outputs 0 if the weighted sum is less than or equal to the threshold.
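This firing rule can be sketched in a few lines of Python. This is an illustrative sketch, not code from the lecture; the function name and the example weights are my own:

```python
def perceptron_output(weights, inputs, theta):
    """Fire (output 1) iff the weighted sum of the inputs reaches the threshold theta."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= theta else 0

# With unit weights and theta = 1, this behaves like a Boolean OR of two inputs
print(perceptron_output([1.0, 1.0], [1, 0], 1))  # -> 1 (sum 1.0 meets the threshold)
print(perceptron_output([1.0, 1.0], [0, 0], 1))  # -> 0 (sum 0.0 is below the threshold)
```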
What is the significance of the example given about the oil mining company?
-The example of the oil mining company illustrates the need for real-valued inputs in decision-making problems. It shows that not all problems can be solved with Boolean inputs, and that real-world applications often require considering multiple factors with continuous values.
What is the perceptron learning algorithm and why is it necessary?
-The perceptron learning algorithm is a method used to adjust the weights and bias of a perceptron model. It is necessary because it allows the model to learn from data, improving its ability to make accurate predictions or classifications without manual parameter setting.
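The details of the algorithm are covered in a later lecture (Lec 2.5), but the standard Rosenblatt update rule can be sketched as below. This is a hedged illustration rather than the lecture's own code, and the helper names and example data are mine:

```python
def train_perceptron(data, epochs=100):
    """Rosenblatt-style perceptron learning: add x to w on a missed
    positive example, subtract x on a missed negative one. Each input
    vector includes x0 = 1, so the bias w0 is learned with the weights."""
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        converged = True
        for x, y in data:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0
            if pred != y:
                converged = False
                sign = 1 if y == 1 else -1
                w = [wi + sign * xi for wi, xi in zip(w, x)]
        if converged:
            break
    return w

# Learn Boolean OR; each input tuple is (x0=1, x1, x2)
or_data = [((1, 0, 0), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 1, 1), 1)]
w = train_perceptron(or_data)  # converges because OR is linearly separable
```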
How does the script relate the perceptron model to the problem of predicting movie preferences?
-The script uses the perceptron model to illustrate how one might predict movie preferences based on factors like the actor, director, and genre. It explains how weights can be assigned to these factors based on their importance to the viewer, and how the bias can adjust the decision threshold.
What is the significance of the term 'linearly separable' in the context of the perceptron?
-Linearly separable refers to the ability of the perceptron to divide the input space into two distinct regions using a linear boundary. It highlights the perceptron's capability to solve problems where a clear linear decision boundary can separate different classes of inputs.
What are some of the challenges the script mentions when dealing with non-linearly separable functions?
-The script mentions that even within the restricted Boolean case, there can be functions that are not linearly separable. This presents a challenge because the perceptron model, as it stands, can only solve linearly separable problems. It implies the need for more advanced models to handle complex, non-linear problems.
Outlines
🤖 Introduction to Perceptron and its Limitations
The video script begins with an introduction to the Perceptron, a computational model that extends the McCulloch-Pitts neuron by incorporating weights for each input. The speaker discusses the limitations of the McCulloch-Pitts neuron, which only handles Boolean inputs, and how the Perceptron addresses this by allowing real-valued inputs. The Perceptron also introduces the concept of learning thresholds and weights, which is a significant advancement over the previous model. The historical context is provided, mentioning that the Perceptron was proposed by Frank Rosenblatt in 1958. The script also raises questions about the capability of Perceptrons to handle non-linearly separable functions and the importance of weighing inputs differently, setting the stage for a deeper exploration of the Perceptron model.
📚 Perceptron Model and Learning Algorithm
This paragraph delves into the specifics of the Perceptron model, highlighting its operation where it outputs 1 if the weighted sum of inputs exceeds a threshold, otherwise it outputs 0. The speaker introduces a mathematical notation for the Perceptron equation and explains the concept of bias, represented by w0, which is essentially the negative of the threshold theta. The importance of weights in the Perceptron is emphasized, as they allow the model to prioritize certain inputs over others, an essential feature for making decisions based on various factors. The speaker uses the example of movie preferences to illustrate how weights can be assigned based on past viewing experiences. The paragraph also touches on the perceptron's ability to implement Boolean functions and sets the stage for discussing the Perceptron learning algorithm.
🔍 Perceptron's Function Implementation and Threshold Determination
The final paragraph of the script discusses the Perceptron's ability to implement Boolean functions and how it differs from the McCulloch-Pitts neuron. The speaker explains that the Perceptron, like the McCulloch-Pitts neuron, divides the input space into two halves, but it does so with the added capability of learning weights and a threshold. The script presents a system of linear inequalities that can be solved to determine the weights and threshold, providing an example with specific weight values that satisfy certain conditions. The speaker also points out that the McCulloch-Pitts neuron could theoretically have a similar set of conditions and inequalities, suggesting that the main distinction between the two models lies in the Perceptron's ability to learn these parameters automatically.
Keywords
💡Perceptron
💡Boolean Inputs
💡Weights
💡Threshold (Theta)
💡Learning Algorithm
💡Real-Valued Inputs
💡Linearly Separable
💡Bias
💡McCulloch Pitts Neuron
💡Threshold Logic Unit
Highlights
Introduction to the Perceptron model, which extends beyond Boolean inputs to handle real numbers.
The Perceptron model's ability to make decisions based on various real-world factors such as pressure and salinity, unlike the McCulloch-Pitts neuron.
The limitation of McCulloch-Pitts neurons in handling only Boolean inputs and the need for a model that can process real numbers.
The concept of learning the threshold automatically in Perceptrons, as opposed to manually setting it in McCulloch-Pitts neurons.
The introduction of weights in the Perceptron model to weigh the importance of different inputs.
The question of how to handle functions that are not linearly separable, which the basic Perceptron cannot represent even in the restricted Boolean case.
The historical context of the Perceptron, proposed by Frank Rosenblatt in 1958, and its evolution from the McCulloch-Pitts neuron.
The perceptron's learning algorithm that automatically adjusts weights and the threshold (bias) based on input data.
The practical example of using the Perceptron to decide whether to mine oil based on various environmental factors.
The explanation of how weights in the Perceptron can reflect the importance of different factors in decision-making.
The role of the bias (w0) in the Perceptron, representing a prior or baseline that influences the decision threshold.
The Perceptron's operation principle, which outputs 1 if the weighted sum of inputs exceeds the threshold, otherwise outputs 0.
The mathematical notation and formulation of the Perceptron equation, including the introduction of x0 and w0.
The discussion on why weights are necessary for implementing Boolean functions and the role of the bias in this context.
The exploration of different possible solutions for weights and biases in the Perceptron model, emphasizing the flexibility in decision boundaries.
The comparison between the Perceptron and McCulloch-Pitts neuron in terms of their ability to implement functions and the introduction of weights.
The conclusion that the Perceptron model can implement the same functions as the McCulloch-Pitts neuron but with the added capability of learning weights and biases.
Transcripts
Now, let us go to the next module which is Perceptron.
So far, the story has been about Boolean inputs,
but are all the problems that we deal with really of that kind?
Do we always only deal with Boolean inputs?
So, yeah so, what we spoke about is Boolean functions, right.
Now, consider this example.
This worked fine for the movie example, where we had inputs such as the actor, the director,
and so on.
But now consider the example where you are trying to decide.
You are in oil mining company and you are trying to decide whether you should mine or
drill at a particular station or not, right.
Now, this could depend on various factors like what is the pressure on the surface,
on the ocean surface at that point, what is the salinity of the water at that point, what
is the marine life at that point and so on, right.
So, these are not really Boolean inputs, right.
The salinity is a real number, density would be a real number, pressure would be a real
number and so on, right, and this is a very valid decision problem, right.
Companies would be interested in doing this, right.
So, in such cases our inputs are going to be real, but so far McCulloch Pitts neuron
only deals with boolean inputs, right.
So, we still need to take care of that limitation.
Now, how did we decide the threshold in all these cases?
I just asked you, you computed it and you told me right, but that is not going to work
out.
I mean it does not scale to larger problems where you have many more dimensions and the
inputs are not Boolean and so on, right.
So, we need a way of learning this threshold.
Ok Now, again returning to the movie example;
maybe for me the actor is the only thing that matters and all the other inputs are not so
important.
Then, what do I need actually?
I need some way of weighing these inputs, right.
I should be able to say that this input is more important than the others, right.
Now, I am treating all of them equal.
I am just taking a simple sum.
If that sum crosses the threshold, I am fine, otherwise I am not fine, right, but maybe I
want to raise the weight for some of these inputs or lower the weight for some of these
inputs, right.
So, whether it is raining outside or not maybe does not matter.
I have a car, I could go, or I could wear a jacket or carry an umbrella or something, right.
So, that input is probably not so important, right.
And what about functions which are not linearly separable, right?
We have just been dealing with the goody goody stuff which is all linearly separable, but
we will see that even in the restricted Boolean case, there could be some functions which
are not linearly separable and if that is the case, how do we deal with it, right.
So, these are some questions that we need to answer.
Ok.
So, first we will start with the perceptron which
tries to fix some of these things and then, we will move forward from there.
So, as we had discussed in the history lecture that this was proposed in 1958 by Frank Rosenblatt
and this is what the perceptron looks like.
Do you see any difference with the McCulloch Pitts neuron? Weights, right?
You have a weight associated with each of the inputs; otherwise everything seems ok, right.
So, this is a more general computational model than the McCulloch Pitts neuron.
The other interesting thing is that of course we have introduced these weights and you also
have a mechanism for learning these weights.
So, remember in the earlier case, our only parameter was theta which we are kind of hand
setting right, but now with the perceptron, we will have a learning algorithm which will
not just help us learn theta, but also these weights for the inputs, right.
How do I know that actor is what matters or director is what matters?
Given a lot of past viewing experience, that is, given a lot of data about the movies
which I have watched in the past, how do I know which weights to assign,
right?
So, we will see an algorithm which will help us do that, right and the inputs are no longer
limited to be Boolean values.
They can be real values also, right.
So, that is the classical perceptron, but what I am talking about here and the rest
of the lecture is the refined version which was proposed by Minsky and Papert which is
known as the perceptron model, right.
So, when I say perceptron, I am referring to this model.
So, this diagram also corresponds to that.
Ok.
So, now let us see what the perceptron does.
This is how it operates.
It will give an output of 1 if the weighted sum of the inputs is greater than a threshold,
right.
So, remember that in the MP neuron we did not have these weights, but now we have these
weighted sum of the inputs and the output is going to be 0 if this weighted sum is less
than threshold, right.
Not very different from the MP neuron, right.
Now, I am just going to do some trickery and try to get it to a better notation or a better
form, right.
So, is this ok?
I have just taken the theta on this side ok.
Now is this ok?
Notice this: here the indices were 1 to n.
Now, I have made it 0 to n and the theta has suddenly disappeared.
So, what has happened?
w0 is minus theta, right, and x0 is 1.
Does anyone not get this?
If I split off the i equal to 0 term, then the sum becomes summation i equal to 1 to n of wi xi plus
w0 x0, but I am just saying w0 is equal to minus theta and x0 is equal to 1, which exactly
gives me back the original condition, right. So, very simple: x0 equal to 1 and w0 equal to minus theta.
So, in effect what I am assuming is that instead of having this threshold as a separate quantity,
I just think that that is one of my inputs which is always on and the weight of that
input is minus theta.
So, now the job of all the other inputs and their weights is to make sure that their
sum is greater than this quantity, right. Does that make sense? Ok, fine.
So, this is the more accepted convention for writing the perceptron equation,
right.
So, it fires when this summation is greater than equal to 0, otherwise it does not fire,
ok.
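The rewriting step described above, folding the threshold theta into the weights as w0 = -theta with a constant input x0 = 1, can be checked numerically. This sketch uses arbitrary example weights of my own choosing:

```python
theta = 1.0
w = [1.0, 1.0]  # arbitrary example weights

def fires_with_threshold(x):
    # Original form: fire iff sum_i w_i x_i >= theta
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0

# Absorb the threshold: prepend a constant input x0 = 1 with weight w0 = -theta
w_aug = [-theta] + w

def fires_with_bias(x):
    # Rewritten form: fire iff sum over i = 0..n of w_i x_i >= 0
    return 1 if sum(wi * xi for wi, xi in zip(w_aug, [1] + list(x))) >= 0 else 0

# The two forms agree on every input
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    assert fires_with_threshold(x) == fires_with_bias(x)
```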
Now, let me ask a few questions, right.
So, why are we trying to implement Boolean functions?
I have already answered this, but I will keep repeating this question so that it really
gets drilled in.
Why do we need weights?
Again we briefly touched upon that and why is w naught which is negative of theta often
called the bias?
So, again let us return back to the task of predicting whether you would like to watch
a movie or not and suppose we base our decisions on three simple inputs; actor, genre and director,
right.
Now, based on our past viewing experience, we may give a high weight to Nolan as compared
to the other inputs.
So, what does that mean?
It means that as long as the director is Christopher Nolan, I am going to watch this movie irrespective
of who the actor is or what the genre of the movie is, right.
So, that is exactly what we want and that is the reason why we want these weights.
Now, w0 is often called the bias as it represents the prior.
So, now let me ask a very simple question.
Suppose you are a movie buff.
What would theta be?
Zero, right?
I mean you will watch any movie irrespective of the actor, director and genre, right.
Now, suppose you are a very niche movie watcher who only watches those movies in which
the genre is thriller, the director is Christopher Nolan and the actor is Damon; then what would
your threshold be?
3 right?
High in this case. I always ask this question: do you know of any such movie? It always takes
a while.
Interstellar. So, the weights and the bias will depend on the data, which in this case
is the viewer history, right.
So, that is the whole setup, right?
That is why you want these weights and that is why you want these biases and that is why
we want to learn them.
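The two viewer profiles from this passage can be put into numbers. A minimal sketch, assuming unit weights for the three Boolean inputs; the input ordering and the movie tuples are my own:

```python
# Hypothetical Boolean inputs per movie: (isActorDamon, isGenreThriller, isDirectorNolan)
def watch(inputs, weights, theta):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

movies = [(1, 1, 1), (0, 1, 1), (0, 0, 0)]
weights = [1, 1, 1]

# A movie buff: theta = 0, so every movie clears the threshold
buff = [watch(m, weights, 0) for m in movies]    # [1, 1, 1]

# A niche viewer: theta = 3, so only the movie matching all three inputs fires
niche = [watch(m, weights, 3) for m in movies]   # [1, 0, 0]
```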
Now, before we see whether or how we can learn these weights and biases, one question that
we need to ask is what kind of functions can be implemented using the perceptron and are
these functions any different from those of the McCulloch Pitts neuron?
So, before I go to the next slide, any guesses?
I am hearing some interesting answers which are at least partly correct.
Ok.
So, this is what a McCulloch Pitts neuron
looks like and this is what a perceptron looks like.
The only difference is this red part, which is the weights that have been added, right.
So, it is again clear that what the perceptron also does is, it divides the input space into
two halves where all the points for which the output has to be one, would lie on one
side of this plane and all the points where which the output should be 0 would lie on
the other side of this plane, right.
So, it is not doing anything different from what the McCulloch Pitts neuron was doing.
So, then what is the difference?
You have these weights and you have a mechanism for learning these weights as well as a threshold.
We are not going to hand code them.
ok.
So, we will first revisit some Boolean functions and then, see the perceptron learning algorithm,
ok.
So, now let us see what the first condition says,
right.
This condition, if I actually expand it out, then this is what it turns out to be, right,
and what is that condition telling me? Actually that w naught should be less than 0, clear.
So, now based on these, what do you have here?
Actually what is this?
A system of linear inequalities, right, and you know you could solve this, right.
You have algorithms for solving this; not always, but you could find some solution, right, and
one possible solution which I have given you here is w0 is equal to minus 1, w1 equal to
1.1 and w2 equal to 1.1.
So, just let us just draw that line, ok.
So, what is the line?
It is 1.1 x1 plus 1.1 x2 is equal to 1, right.
That is the line, and this is the line here, and you see it satisfies the conditions.
Is this the only solution possible?
No, right, I could have this also as a valid line.
If I could draw properly, all of these are valid lines, right.
So, these result in different values of w0, w1 and w2, ok.
So, all of these are possible solutions.
In fact, I have been telling you that you had to set the threshold by hand for the McCulloch
Pitts neuron, but that is not true because you could have written similar equations there
and then, decided what the value of theta should be, right.
So, you could try this out for the McCulloch Pitts neuron also; you will get a similar set
of conditions, I mean a similar set of inequalities, and you can just solve for the value of
theta, right.
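The candidate solution from the slide (w0 = -1, w1 = 1.1, w2 = 1.1) can be checked against the four inequalities, which here correspond to the Boolean OR function. The check below is my own sketch:

```python
# Candidate solution from the slide: w0 = -1, w1 = 1.1, w2 = 1.1
w0, w1, w2 = -1.0, 1.1, 1.1

def perceptron(x1, x2):
    # Fires iff w0 * 1 + w1 * x1 + w2 * x2 >= 0
    return 1 if w0 + w1 * x1 + w2 * x2 >= 0 else 0

# The four OR constraints as linear inequalities on (w0, w1, w2):
#   (0,0) -> 0 : w0           <  0
#   (0,1) -> 1 : w0 + w2      >= 0
#   (1,0) -> 1 : w0 + w1      >= 0
#   (1,1) -> 1 : w0 + w1 + w2 >= 0
for (x1, x2), y in [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]:
    assert perceptron(x1, x2) == y
```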
Ok.
So, that ends that module.