Lecture 1.3 — Some simple models of neurons — [ Deep Learning | Geoffrey Hinton | UofT ]

Artificial Intelligence - All in One
24 Sept 2017 · 08:24

Summary

TLDR: The video introduces several relatively simple models of neurons, starting with simple linear and threshold neurons and moving to slightly more complicated models. Although these models are much simpler than real neurons, they are still rich enough to build neural networks that do very interesting kinds of machine learning. The video stresses the importance of idealization: simplifying away complicated details so that mathematics and analogies can be applied, while taking care not to remove the essential properties. Several types of neurons are introduced: linear neurons, binary threshold neurons, rectified linear neurons, sigmoid neurons, and stochastic binary neurons. Each model computes its output in its own way, and these characteristics are central to understanding how neural networks work. The video also discusses the role of the logistic function in neuron models, and how these models relate to the way the brain processes information.

Takeaways

  • 🧠 Simplified models help us understand complex systems: simplifying the neuron makes it easier to apply mathematics and analogies, and so to understand how the brain might work.
  • 📈 Limitations of linear neurons: linear neurons are simple but computationally limited, and they may not capture the complexity of real neurons.
  • 🔄 Binary output of threshold neurons: a McCulloch-Pitts threshold neuron computes a weighted sum and outputs 1 if it exceeds the threshold, otherwise 0, modelling the truth values of logical propositions.
  • 📊 Importance of activation functions: activation functions such as the sigmoid give smooth, bounded outputs, which is crucial for learning in machine learning.
  • 🔢 Logic neurons and brain computation: early work treated brain computation as logic; later interest shifted to how the brain combines many kinds of unreliable evidence.
  • ⚡️ Two equivalent formulations of the threshold neuron: the total input Z can be written with or without a bias term; the two formulations are mathematically equivalent.
  • 📉 Nonlinearity of the ReLU (rectified linear unit): a ReLU outputs 0 when its input is below 0 and grows linearly above 0, a behaviour that is very common in neural networks.
  • 🎢 Smooth derivatives of sigmoid neurons: the derivative of a sigmoid neuron changes continuously and smoothly, which makes these neurons well behaved in learning algorithms.
  • 🎰 Probabilistic decisions of stochastic binary neurons: these neurons use the logistic function to compute an output probability, then make a random 1-or-0 decision based on that probability.
  • 🔁 Randomness with ReLUs: a ReLU can determine the rate at which spikes are produced, but the actual spike times are random, modelling a Poisson process.
  • 🤖 Practical usefulness of neural networks: even when a neuron model does not match real neuron behaviour exactly, it can still be very useful for machine learning in practice.
  • 🚀 The value of understanding simplified models: even when we know a model is wrong, understanding it helps us build more complex models that come closer to reality.

Q & A

  • What is the simplest neuron model mentioned in the video?

    - The simplest model mentioned is the linear neuron, whose output Y is a function of the neuron's bias B and the weighted sum of the activities on all its incoming connections. (A minimal sketch in code follows.)
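
A minimal sketch of this computation in Python (the function name and the example values are illustrative, not from the lecture):

```python
# Linear neuron: y = b + sum over inputs of x_i * w_i
def linear_neuron(x, w, b):
    """Bias plus the weighted sum of the activities on the input lines."""
    return b + sum(xi * wi for xi, wi in zip(x, w))

x = [1.0, 0.5, -2.0]   # activities on the input lines (illustrative)
w = [0.3, -0.8, 0.1]   # synaptic weights (illustrative)
b = 0.5                # bias
print(linear_neuron(x, w, b))  # 0.3 - 0.4 - 0.2 + 0.5 = 0.2
```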

  • What are the characteristics of the binary threshold neuron model proposed by McCulloch and Pitts?

    - A McCulloch-Pitts binary threshold neuron first computes a weighted sum of its inputs and then sends out a spike of activity if that sum exceeds a threshold. They treated spikes as the truth values of propositions: each neuron combines the truth values it gets from other neurons to produce a truth value of its own. (A sketch follows.)
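
A minimal sketch of this model in Python. The lecture does not pin down the behaviour exactly at the threshold, so treating "exceeds" as >= here is an assumption, as is the AND example:

```python
# McCulloch-Pitts binary threshold neuron: spike (1) if the weighted
# sum of the inputs reaches the threshold, otherwise stay silent (0).
def binary_threshold_neuron(x, w, threshold):
    z = sum(xi * wi for xi, wi in zip(x, w))  # weighted sum, no bias term
    return 1 if z >= threshold else 0         # boundary convention assumed

# With unit weights and threshold 2 this computes logical AND,
# in the spirit of treating spikes as truth values of propositions.
print(binary_threshold_neuron([1, 1], [1, 1], threshold=2))  # 1 (true AND true)
print(binary_threshold_neuron([1, 0], [1, 1], threshold=2))  # 0 (true AND false)
```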

  • What is a rectified linear neuron?

    - A rectified linear neuron first computes a linear weighted sum of its inputs and then outputs a nonlinear function of that sum: if the weighted sum Z is below zero, the output is zero; otherwise the output equals Z. It combines properties of the linear neuron and the binary threshold neuron. (A sketch follows.)
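
A minimal sketch in Python, assuming the bias-included form of the total input (names and values are illustrative):

```python
# Rectified linear neuron: z = b + sum_i x_i * w_i, then y = max(0, z)
def rectified_linear_neuron(x, w, b):
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    return max(0.0, z)  # zero below zero, equal to z above zero

print(rectified_linear_neuron([1.0, 2.0], [0.5, 0.25], b=-2.0))  # max(0, -1.0) = 0.0
print(rectified_linear_neuron([1.0, 2.0], [0.5, 0.25], b=0.5))   # max(0, 1.5) = 1.5
```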

  • What kind of function is the output of a sigmoid neuron?

    - A sigmoid neuron outputs a real value that is a smooth, bounded function of its total input, typically the logistic function, where the total input is the bias plus the weighted activities on the input lines. (A sketch follows.)
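
A sketch of the logistic neuron in Python (illustrative names and inputs):

```python
import math

# Logistic (sigmoid) neuron: z = b + sum_i x_i * w_i, then y = 1 / (1 + e^-z)
def logistic_neuron(x, w, b):
    z = b + sum(xi * wi for xi, wi in zip(x, w))  # total input
    return 1.0 / (1.0 + math.exp(-z))             # smooth, bounded in (0, 1)

print(logistic_neuron([0.0], [1.0], b=0.0))    # 0.5 when the total input is 0
print(logistic_neuron([10.0], [1.0], b=0.0))   # close to 1 for big positive input
print(logistic_neuron([-10.0], [1.0], b=0.0))  # close to 0 for big negative input
```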

  • Why are sigmoid neurons useful in machine learning?

    - Sigmoid neurons are useful because their derivatives change continuously, which makes them well behaved during learning, especially with gradient-based algorithms such as gradient descent. (A numerical check of the derivative follows.)
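
The convenient fact behind this (a standard identity; the lecture defers the learning details to lecture 3) is that the logistic output y = 1/(1 + e^-z) has derivative dy/dz = y(1 - y), which is smooth everywhere. A quick numerical check:

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# Compare the analytic derivative y * (1 - y) with a central finite difference.
z, eps = 0.7, 1e-6
analytic = logistic(z) * (1.0 - logistic(z))
numeric = (logistic(z + eps) - logistic(z - eps)) / (2.0 * eps)
print(abs(analytic - numeric) < 1e-9)  # True: the derivative is smooth and cheap
```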

  • How do stochastic binary neurons work?

    - Stochastic binary neurons compute their total input with the same equation as logistic units and use the logistic function to compute a real value: the probability that they will output a spike. Then, instead of outputting that probability as a real number, they make a probabilistic decision, so the actual output is a 1 or a 0; they are intrinsically random. (A sketch follows.)
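
A sketch in Python, assuming the standard random module as the source of randomness:

```python
import math
import random

# Stochastic binary neuron: same total input and logistic function as a
# logistic unit, but the output is a random 1-or-0 decision.
def stochastic_binary_neuron(x, w, b):
    z = b + sum(xi * wi for xi, wi in zip(x, w))
    p = 1.0 / (1.0 + math.exp(-z))          # probability of outputting a spike
    return 1 if random.random() < p else 0  # intrinsically random decision

# Very positive total input -> almost always 1; very negative -> almost always 0.
samples = [stochastic_binary_neuron([2.0], [1.0], b=0.0) for _ in range(1000)]
print(sum(samples) / len(samples))  # close to logistic(2.0), about 0.88
```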

  • Why do we need idealization to understand complex systems?

    - Idealization simplifies a complex system by removing inessential details, which lets us apply mathematics and draw analogies with other familiar systems. It helps us grasp the main principles; once we understand them, we can gradually add complexity and make the model more faithful to reality.

  • In machine learning, why is it still worth understanding models that are known to be wrong?

    - Even when we know a model is wrong, it is still worth understanding, because it can reveal the basic principles, and such simplified models can be very useful in practice, especially in some machine-learning applications.

  • How should we understand the relationship between a linear neuron's output Y and its inputs?

    - In a linear neuron, the output Y is a function of the weighted sum of the inputs plus the bias. Plotting the bias plus the weighted input activities on the x-axis and the output Y on the y-axis gives a straight line through the origin.

  • How can the input-output function of a binary threshold neuron be written?

    - In two equivalent ways: if the weighted input Z exceeds the threshold, the output is 1, otherwise 0; or, with a bias term included in the total input, the output is 1 if the total input is above zero and 0 otherwise. The threshold in the first form equals the negative of the bias in the second. (A small check follows.)
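
A small illustrative check of the equivalence, setting the bias to the negative of the threshold (the values are made up):

```python
# Two equivalent formulations of the binary threshold neuron.
def threshold_form(x, w, threshold):
    z = sum(xi * wi for xi, wi in zip(x, w))      # no bias term
    return 1 if z >= threshold else 0

def bias_form(x, w, b):
    z = b + sum(xi * wi for xi, wi in zip(x, w))  # bias folded into the total input
    return 1 if z >= 0 else 0

# The two forms agree on every input when b = -threshold.
w, threshold = [0.8, 0.4], 0.3
for x in ([1.0, -0.5], [0.0, 0.0], [-1.0, 1.0]):
    assert threshold_form(x, w, threshold) == bias_form(x, w, b=-threshold)
print("equivalent")
```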

  • What does the input-output curve of a rectified linear neuron look like?

    - The output is zero when Z is below zero and equal to Z when Z is above zero, so the neuron makes a hard decision at zero but is linear above zero.

  • Why might logic no longer be the best paradigm for how the brain works?

    - Because people thinking about how the brain computes have become much more interested in the idea that the brain combines many different sources of unreliable evidence, rather than just performing logical operations. On this view, the brain's workings involve probability and uncertainty rather than pure logic.

Outlines

00:00

😀 Introduction to simple neuron models

This segment introduces several relatively simple neuron models, starting with basic linear and threshold neurons and moving toward more complex models. Although simpler than real neurons, these models are rich enough to build neural networks that do interesting machine learning. The speaker stresses the importance of idealization: simplifying away complicated details to grasp the main principles, and deepening understanding through mathematics and analogy, while warning that idealization must not remove the essential properties. The segment also covers how linear neurons, binary threshold neurons, and rectified linear neurons work, along with their input-output functions.

05:03

😉 Activation functions and stochastic binary neurons

The second segment examines further types of neuron: sigmoid neurons and stochastic binary neurons. Sigmoid neurons give a real-valued output that is a smooth, bounded function of their total input, typically the logistic function, whose derivative is smooth and continuous and therefore convenient for learning. Stochastic binary neurons use the same equations as logistic neurons, but the final output is a 1 or a 0, decided at random according to the probability computed by the logistic function. A similar trick works with rectified linear units (ReLUs): the unit deterministically determines a rate of producing spikes, while the actual spike times follow a random (Poisson) process, as in the sketch below.
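
A hedged sketch of that last idea in Python: the ReLU output is read as the rate parameter of a Poisson process, and individual spike times are then sampled at random (function names and parameters are illustrative, not from the lecture):

```python
import random

def relu_rate(z):
    """Deterministic part: a rectified linear unit's output, read as a spike rate."""
    return max(0.0, z)

def sample_spike_times(rate, duration):
    """Random part: spike times from a Poisson process with the given rate;
    inter-spike intervals are exponentially distributed."""
    times, t = [], 0.0
    while rate > 0.0:
        t += random.expovariate(rate)  # next inter-spike interval
        if t > duration:
            break
        times.append(t)
    return times

# A unit with total input z = 2.5 fires at about 2.5 spikes per unit time,
# but exactly when each spike occurs is intrinsically random.
print(sample_spike_times(relu_rate(2.5), duration=4.0))
```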

Keywords

💡Neuron models

A neuron model, as described in the video, is a simplified account of how a neuron works, used to capture certain properties of real neurons. Although simplified, these models are rich enough to build neural networks capable of machine learning. The video moves from simple linear and threshold neurons to more complex models, which form the basis for understanding complex neural networks.

💡Linear neuron

A linear neuron is a basic neuron model whose output Y is a function of the neuron's bias B and the weighted sum of the activities on all its incoming connections. The model is computationally limited, but as the video notes, it may give insight into how more complicated neurons work, even though it may be somewhat misleading.

💡Threshold neuron

Threshold neurons were introduced by McCulloch and Pitts. They first compute a weighted sum of their inputs and send out a spike of activity if that sum exceeds a threshold. In the video, these neurons are treated as the basis of logical computation: spikes are seen as the truth values of propositions, and each neuron combines truth values from other neurons to produce a truth value of its own.

💡Rectified linear neuron (ReLU)

A rectified linear neuron combines the properties of linear and threshold neurons. It first computes a linear weighted sum of its inputs and then decides its output from that sum: if the sum is below zero, the output is zero; if it is above zero, the output equals the sum. The video notes that this combination of a hard decision with linear behaviour is very common in neural networks.

💡Sigmoid neuron

A sigmoid neuron gives a real-valued output that is a smooth, bounded function of its total input. The typical sigmoid is the logistic function, whose output is 1 divided by 1 plus e to the minus the total input. Its defining feature is a smooth derivative, which makes it well behaved during the learning discussed in the video.

💡Stochastic binary neuron

A stochastic binary neuron computes its total input with the same equation as a logistic unit and uses the logistic function to compute a real value: the probability of outputting a spike. Instead of outputting that probability as a real number, it makes a random decision, outputting either a 1 or a 0. The video highlights these neurons because they introduce intrinsic randomness.

💡Idealization

Idealization, as used in the video, is the simplification needed to understand something complicated. By removing inessential details, idealization lets us apply mathematics and draw analogies with other familiar systems. It helps us grasp the main principles, but we must be careful not to remove the thing that gives the system its main properties.

💡Machine learning

Machine learning, as mentioned in the video, uses these simplified neuron models to build neural networks that perform interesting tasks. Although the models differ from real neurons, they are very useful for machine learning in practice.

💡Weights

A weight, an important concept in the video, is the multiplier applied to the activity on one of a neuron's input lines; it affects the neuron's output. Weights are multiplied by the input activities and summed to determine the neuron's activation.

💡Bias

The bias is a parameter of a neuron model that, in the video, adjusts the neuron's activation threshold. The bias is added to the weighted sum of the inputs and helps determine whether the neuron should activate.

💡Activation function

The activation function is a key component of a neuron model. The video mentions several, such as the sigmoid function and the rectified linear unit (ReLU). These functions determine how a neuron produces an output from its inputs and weights, and they introduce the nonlinearity that lets neural networks learn and perform more complex tasks.

💡Logistic function

The logistic function discussed in the video is an S-shaped curve used in sigmoid neurons to compute the output. It maps its input into the interval (0, 1), which is useful for binary classification. In the video, the logistic function is also used to compute the output probability of a stochastic binary neuron.

Highlights

The video introduces several relatively simple neuron models, including linear neurons and threshold neurons.

Simplifying a model removes inessential details and helps us understand the main principles.

Models that are known to be wrong are still worth understanding, as long as we remember that they are wrong.

A linear neuron's output is a function of the neuron's bias and the weighted sum of the activities on its input connections.

Threshold neurons were introduced by McCulloch and Pitts; they compute a weighted sum of their inputs and send out a spike of activity when it exceeds a threshold.

A threshold neuron combines the truth values of its inputs to produce a truth value of its own.

In the 1940s, logic was the main paradigm for how the mind might work.

People later became interested in how the brain combines many different sources of unreliable evidence, so logic is no longer such a good paradigm.

A threshold neuron's input-output function: if the weighted input exceeds the threshold, the output is 1; otherwise it is 0.

The rectified linear neuron combines properties of linear and threshold neurons: it first computes a linear weighted sum of its inputs and then gives a nonlinear output.

When the input is above zero, a rectified linear neuron's output equals the input; below zero, the output is zero.

Sigmoid neurons give a continuous, bounded, real-valued output, typically using the logistic function.

When the total input is large and positive, a sigmoid neuron's output is close to 1; when it is large and negative, the output is close to 0.

The derivatives of a sigmoid neuron are continuous and well behaved, which makes learning easy.

Stochastic binary neurons use the same equations as logistic units: they compute a total input and a probability, then randomly decide to output a 1 or a 0.

If the input is very positive, a stochastic binary neuron almost always produces a 1; if it is very negative, it almost always produces a 0.

A rectified linear unit can determine the rate of producing spikes, while the actual times at which spikes are produced follow a random process.

Transcripts

In this video I'm going to describe some relatively simple models of neurons. I'll describe a number of different models, starting with simple linear and threshold neurons, and then describe some slightly more complicated models. These are much simpler than real neurons, but they're still complicated enough to allow us to make neural nets that do some very interesting kinds of machine learning.

In order to understand anything complicated, we have to idealize it. That is, we have to make simplifications that allow us to get a handle on how it might work. With atoms, for example, we simplified them as behaving like little solar systems. Idealization removes the complicated details that are not essential for understanding the main principles. It allows us to apply mathematics and to make analogies to other familiar systems. And once we understand the basic principles, it's easy to add complexity and make the model more faithful to reality. Of course, we have to be careful when we idealize something not to remove the thing that's giving it its main properties.

It's often worth understanding models that are known to be wrong, as long as we don't forget that they're wrong. So, for example, a lot of work on neural networks uses neurons that communicate real values rather than discrete spikes of activity, and we know cortical neurons don't behave like that, but it's still worth understanding systems like that, and in practice they can be very useful for machine learning.

The first kind of neuron I want to tell you about is the simplest: it's a linear neuron. It's simple, and it's computationally limited in what it can do. It may allow us to get insights into more complicated neurons, but it may be somewhat misleading. In a linear neuron, the output y is a function of a bias of the neuron, b, and the sum over all its incoming connections of the activity on an input line times the weight on that line, that is, the synaptic weight on the input line. And if we plot that as a curve, with the bias plus the weighted activities on the input lines on the x-axis, we get a straight line that goes through zero.

Very different from linear neurons are binary threshold neurons, which were introduced by McCulloch and Pitts. They actually influenced von Neumann when he was thinking about how to design a universal computer. In a binary threshold neuron, you first compute a weighted sum of the inputs, and then you send out a spike of activity if that weighted sum exceeds the threshold. McCulloch and Pitts thought that the spikes were like the truth values of propositions. So each neuron is combining the truth values it gets from other neurons to produce the truth value of its own, and that's like combining some propositions to compute the truth value of another proposition. At the time, in the 1940s, logic was the main paradigm for how the mind might work. Since then, people thinking about how the brain computes have become much more interested in the idea that the brain is combining lots of different sources of unreliable evidence, and so logic isn't such a good paradigm for what the brain's up to.

For a binary threshold neuron, you can think of its input-output function as: if the weighted input is above the threshold, it gives an output of 1; otherwise it gives an output of 0. There are actually two equivalent ways to write the equations for a binary threshold neuron. We can say that the total input z is just the activities on the input lines times the weights, and then the output y is 1 if that z is above the threshold, and 0 otherwise. Alternatively, we could say that the total input includes a bias term: the total input is what comes in on the input lines times the weights, plus this bias term. Then we can say the output is 1 if that total input is above zero, and is zero otherwise. The equivalence is simply that the threshold in the first formulation is equal to the negative of the bias in the second formulation.

A kind of neuron that combines the properties of both linear neurons and binary threshold neurons is a rectified linear neuron. It first computes a linear weighted sum of its inputs, but then it gives an output that's a nonlinear function of this weighted sum. So we compute z in the same way as before; if z is below zero, we give an output of zero, otherwise we give an output that's equal to z. So above zero it's linear, and at zero it makes a hard decision. The input-output curve is definitely not linear, but above zero it is linear. So in a neuron like this, we can get a lot of the nice properties of linear systems when it's above zero, and we can also get the ability to make decisions at zero.

The neurons that we'll use a lot in this course, and that are probably the commonest kinds of neurons to use in artificial neural nets, are sigmoid neurons. They give a real-valued output that is a smooth and bounded function of their total input. It's typical to use the logistic function, where the total input is computed as before, as a bias plus what comes in on the input lines, weighted. The output for a logistic neuron is 1 over 1 plus e to the minus the total input. If you think about that: if the total input is big and positive, e to the minus a big positive number is zero, and so the output will be 1. If the total input is big and negative, e to the minus a big negative number is a large number, and so the output will be 0. When the total input is 0, e to the minus 0 is 1, so the output is 1/2. And the nice thing about a sigmoid is that it has smooth derivatives: the derivatives change continuously, and so they're nicely behaved and they make it easy to do learning, as we'll see in lecture 3.

Finally, the stochastic binary neurons. They use just the same equations as logistic units: they compute their total input the same way, and they use the logistic function to compute a real value, which is the probability that they will output a spike. But then, instead of outputting that probability as a real number, they actually make a probabilistic decision, and so what they actually output is either a 1 or a 0. They're intrinsically random: they're treating p as the probability of producing a 1, not as a real number to be output. Of course, if the input is very big and positive, they will almost always produce a 1, and if the input is big and negative, they will almost always produce a 0.

We can do a similar trick with rectified linear units. We can say that the output, the real value that comes out of a rectified linear unit if its input is above 0, is the rate of producing spikes. So that's deterministic. But once we've figured out this rate of producing spikes, the actual times at which spikes are produced is a random process, a Poisson process. So the rectified linear unit determines the rate, but intrinsic randomness in the unit determines when the spikes are actually produced.


Related Tags
Neuron models, machine learning, linear neurons, threshold neurons, logistic function, activation functions, neural networks, biological modelling, computational principles, idealized models, stochastic decisions, signal transmission