FRM: Normal probability distribution

Bionic Turtle
15 Jan 2008, 09:21

Summary

TLDR: David Harper of Bionic Turtle gives a tutorial on the normal distribution, a fundamental concept in probability. He explains its characteristics, including the bell-curve shape, symmetry, and the two parameters that define it: mean and standard deviation. Harper also explains why applying the normal distribution to financial asset returns is a misuse: actual returns are typically fat-tailed (and, in his Google example, positively skewed), whereas the normal distribution is symmetrical with thin tails. He concludes that the distribution is so prevalent because of its simplicity and the central limit theorem.

Takeaways

  • The normal distribution, also known as the bell curve or Gaussian distribution, is the most common and familiar probability distribution.
  • It is often used to describe asset returns for convenience, even though this is a misuse: the normal curve is symmetrical and thin-tailed, whereas actual financial returns are typically fat-tailed.
  • The normal distribution is symmetrical and fully described by two parameters: the mean and the standard deviation.
  • A histogram of Google's daily stock returns illustrates the mismatch: the returns are positively skewed and have fatter tails than the normal curve.
  • The normal density function is a formula involving π and e, with the mean and standard deviation as its parameters.
  • The video plots the normal density two ways, with Excel's built-in function and with a manual formula, and shows that the two are equivalent.
  • Changing the mean shifts the curve along the x-axis and changing the standard deviation changes its dispersion, but the bell shape itself is unchanged.
  • Because only two parameters are needed, the normal distribution is convenient for many applications.
  • The '68-95-99.7' rule gives the approximate share of the area under the normal curve within one, two, and three standard deviations of the mean.
  • The normal distribution is continuous, symmetrical (skew of zero), and has a kurtosis of three, meaning no excess kurtosis and no fat tails.
  • The normal distribution is prevalent because of its simplicity and the central limit theorem, which states that the means of repeated samples tend to be normally distributed. This is a statement about central tendency rather than the tails, so it gives the normal no special importance in risk management.

Q & A

  • What is the normal distribution also known as?

    -The normal distribution is also known as the normal bell curve and occasionally referred to as a Gaussian distribution.

  • Why is the normal distribution sometimes misused to describe asset returns?

    -The normal distribution is used to describe asset returns because of its convenience and elegant properties, even though it is a poor fit: actual financial returns are typically fat-tailed and can be skewed.

  • What does the histogram of Google's stock price daily returns show about the fit of the normal distribution?

    -The histogram of Google's stock price daily returns shows that the normal distribution does not fit well, as there is a positive skew and the presence of outlier returns, indicating fat-tailed distributions.

  • What is meant by the term 'leptokurtosis' in the context of asset returns?

    -Leptokurtosis refers to the phenomenon where asset returns have fatter tails than the normal distribution, meaning there is a higher likelihood of extreme values occurring.

  • What are the two parameters needed for the normal distribution function?

    -The two parameters needed for the normal distribution function are the mean and the standard deviation, which determine the peak and the dispersion of the curve, respectively.

  • What is the standard normal distribution and what are its parameters?

    -The standard normal distribution is a specific case of the normal distribution with a mean of zero and a standard deviation of one.

  • How does changing the mean affect the normal distribution curve?

    -Changing the mean shifts the peak of the normal distribution curve along the x-axis without altering the shape of the curve.

  • How does changing the standard deviation affect the normal distribution curve?

    -Changing the standard deviation affects the width of the normal distribution curve, with a larger standard deviation resulting in a wider, more dispersed curve.

  • What are the 'rules of thumb' regarding the area under the normal distribution curve?

    -The rules of thumb state that approximately 68% of the area under the curve is within one standard deviation of the mean, 95.5% within two standard deviations, and 99.7% within three standard deviations.

  • What are the characteristics of the normal distribution?

    -The characteristics of the normal distribution include being continuous, fully described by two parameters (mean and standard deviation), symmetrical with a skew of zero, and having no excess kurtosis (kurtosis of three).

  • Why is the normal distribution so common in statistics?

    -The normal distribution is common for two reasons: it is simple, needing only two parameters, and the central limit theorem states that the means of repeated samples tend to be normally distributed.
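
The rules of thumb for the area under the curve can be reproduced from the cumulative distribution function. This is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within +/-{k} standard deviations: {area_within(k):.2%}")
```

Because every normal distribution is a shifted and rescaled standard normal, the same shares apply for any mean and standard deviation.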

Outlines

00:00

Introduction to the Normal Distribution

David Harper introduces the normal distribution, commonly known as the bell curve or Gaussian distribution, emphasizing its prevalence in probability theory. He explains why using it for financial asset returns is a misuse, plotting Google's daily stock returns to show the positive skew and fat tails of actual returns. Harper contrasts the symmetrical, clean normal curve with the leptokurtosis of financial returns, then introduces the normal density function and its parameters: mean and standard deviation.

05:01

Characteristics and Properties of the Normal Distribution

This section covers the characteristics of the normal distribution, starting with the fact that it is fully described by only two parameters: the mean and the standard deviation. Harper shows how changing these parameters moves and widens the curve, and notes that the normal's thin tails, relative to real returns, are a weakness. He also outlines the rules of thumb for the area under the normal curve within one, two, and three standard deviations. Harper concludes by explaining why the normal distribution is so common, attributing it to its simplicity and the central limit theorem, which states that the means of repeated samples tend to be normally distributed.
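
The effect of changing the two parameters can be sketched numerically. The example below is an illustration only, assuming Python's standard-library `statistics.NormalDist` and reusing the values from the demonstration (mean changed to 10, standard deviation changed to 5); the variable names are hypothetical. It evaluates each curve at the same number of standard deviations from its own mean and rescales the height by the standard deviation, so all three columns should match.

```python
from statistics import NormalDist

standard = NormalDist(mu=0.0, sigma=1.0)
shifted = NormalDist(mu=10.0, sigma=1.0)  # mean changed to 10
widened = NormalDist(mu=10.0, sigma=5.0)  # standard deviation changed to 5

z_values = [-2.0, -1.0, 0.0, 1.0, 2.0]
for z in z_values:
    # Evaluate each curve z standard deviations away from its own mean.
    base = standard.pdf(z)
    moved = shifted.pdf(10.0 + z)
    spread = widened.pdf(10.0 + 5.0 * z) * 5.0  # rescale height by sigma
    print(f"z={z:+.0f}: {base:.4f} {moved:.4f} {spread:.4f}")
```

Matching columns show that the mean only relocates the peak and the standard deviation only stretches the curve: the bell shape itself is the same.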

Keywords

Normal Distribution

The normal distribution, also known as the Gaussian distribution or bell curve, is a probability distribution that is symmetrical and defined by two parameters: the mean and the standard deviation. It is central to the video's theme as it is used to describe the distribution of many phenomena, including asset returns, despite its limitations. The script mentions the normal distribution's prevalence due to its convenient properties and the central limit theorem.

Bell Curve

The term 'bell curve' is a colloquial reference to the normal distribution, characterized by its symmetrical, bell-shaped graph. In the video, the bell curve is superimposed on a histogram of Google's stock returns, highlighting the discrepancy between the theoretical normal distribution and actual financial data.

Gaussian Distribution

The Gaussian distribution is a technical term for the normal distribution, named after the mathematician Carl Friedrich Gauss. It is mentioned in the script as an alternative name for the normal distribution, emphasizing its historical and technical significance in statistics and probability theory.

Asset Returns

Asset returns refer to the profits or losses generated by an investment over a specific period. The script discusses the misuse of the normal distribution to model asset returns, pointing out that while it is a convenient model, it does not accurately represent the true distribution of financial returns due to their fat-tailed nature.

Histogram

A histogram is a graphical representation used to show the distribution of data, with bins representing ranges of values and heights representing frequencies. In the video, Google's daily stock returns are plotted on a histogram so they can be compared visually with the normal curve.

Skewness

Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable. The video notes that Google's stock returns exhibit positive skewness, meaning the distribution has a longer tail on the right, which is a departure from the symmetrical normal distribution (skew of zero).

Leptokurtosis

Leptokurtosis refers to the phenomenon where the tails of a probability distribution are 'fat' compared to the normal distribution. The script uses this term to describe the characteristic of financial returns, which have fatter tails than the normal distribution, indicating a higher likelihood of extreme values.

Volatility

Volatility in finance refers to the degree of variation of a price or return series over time. The video uses volatility interchangeably with standard deviation, one of the two parameters of the normal distribution: it measures the dispersion of returns around the mean and sets the width of the curve.

Standard Deviation

Standard deviation is a measure of the amount of variation or dispersion in a set of values. In the video, it is one of the two parameters defining the normal distribution, indicating how spread out the data is from the mean and affecting the distribution's width.

Cumulative Distribution Function (CDF)

The cumulative distribution function (CDF) describes the probability that a random variable with a given distribution is found at a value less than or equal to x. The script mentions the CDF in the context of the normal distribution, explaining how the area under the curve represents the probability of the data falling within certain ranges.
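
The link between the density and the cumulative function can be checked numerically. This is a rough sketch, assuming Python's standard-library `statistics.NormalDist`; the helper `area_under_density`, the lower cutoff of -8, and the step count are illustrative assumptions.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def area_under_density(upper: float, lower: float = -8.0, steps: int = 20_000) -> float:
    """Approximate the area under the density from lower to upper (trapezoid rule)."""
    width = (upper - lower) / steps
    total = 0.5 * (standard_normal.pdf(lower) + standard_normal.pdf(upper))
    for i in range(1, steps):
        total += standard_normal.pdf(lower + i * width)
    return total * width


for x in (-1.0, 0.0, 1.0, 2.0):
    area = area_under_density(x)
    print(f"x={x:+.1f}  area={area:.4f}  cdf={standard_normal.cdf(x):.4f}")
```

The accumulated area and the library's CDF should agree closely, since the area to the left of -8 is negligible.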

Central Limit Theorem

The central limit theorem states that the distribution of sample means approaches a normal distribution as the sample size gets larger, largely regardless of the shape of the original distribution. The video gives this theorem as a reason for the prevalence of the normal distribution in statistical analysis, while noting that it concerns central tendency rather than the tails.
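
A small simulation can make the theorem concrete. The sketch below is illustrative only: the choice of an exponential distribution, the sample size, the number of samples, and the seed are all assumptions made for the example, not values from the video.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 30      # observations per sample (assumed value)
NUM_SAMPLES = 10_000  # number of samples drawn (assumed value)

# Exponential draws with rate 1: strongly right-skewed, mean 1, standard deviation 1.
sample_means = [
    fmean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

# The theorem suggests the sample means should be roughly normal with
# mean 1 and standard deviation 1 / sqrt(SAMPLE_SIZE).
expected_sd = 1.0 / SAMPLE_SIZE ** 0.5
within_one_sd = sum(abs(m - 1.0) <= expected_sd for m in sample_means) / NUM_SAMPLES

normal_share = NormalDist().cdf(1.0) - NormalDist().cdf(-1.0)
print(f"share of sample means within one standard deviation: {within_one_sd:.3f}")
print(f"share predicted by the normal distribution: {normal_share:.3f}")
```

Even though the individual draws are far from normal, the share of sample means within one standard deviation should land close to the normal's roughly 68%.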

Kurtosis

Kurtosis is a measure of the 'tailedness' of a probability distribution. The normal distribution has a kurtosis of three (excess kurtosis of zero), which serves as the thin-tailed baseline. The video contrasts this with the leptokurtosis observed in financial returns, which have fatter tails and thus kurtosis above three.
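
Skew and kurtosis can be estimated from data using the standardized third and fourth moments. The following is a hedged sketch on simulated data: the helper `skew_and_kurtosis`, the seed, and the fat-tailed mixture are hypothetical choices for illustration and do not use real return data.

```python
import random
from statistics import fmean

random.seed(1)  # fixed seed so the sketch is repeatable


def skew_and_kurtosis(values):
    """Return (skewness, kurtosis) from standardized third and fourth moments."""
    mean = fmean(values)
    variance = fmean([(v - mean) ** 2 for v in values])
    third = fmean([(v - mean) ** 3 for v in values])
    fourth = fmean([(v - mean) ** 4 for v in values])
    return third / variance ** 1.5, fourth / variance ** 2


N = 100_000  # number of simulated observations (assumed value)

# Draws from a normal distribution: skew near 0 and kurtosis near 3.
normal_draws = [random.gauss(0.0, 1.0) for _ in range(N)]

# Hypothetical fat-tailed series: mostly calm draws, with an occasional
# draw from a much wider normal. Illustrative only, not real return data.
fat_tailed_draws = [
    random.gauss(0.0, 3.0) if random.random() < 0.10 else random.gauss(0.0, 1.0)
    for _ in range(N)
]

for label, draws in (("normal", normal_draws), ("fat-tailed", fat_tailed_draws)):
    skew, kurt = skew_and_kurtosis(draws)
    print(f"{label:>10}: skew={skew:+.2f}  kurtosis={kurt:.2f}  excess={kurt - 3:+.2f}")
```

The simulated normal series should show excess kurtosis near zero, while the mixture shows clearly positive excess kurtosis, which is what leptokurtosis means.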

Highlights

The normal distribution, also known as the bell curve or Gaussian distribution, is the most common and familiar probability distribution.

The normal distribution is often used to describe asset returns because of its convenient properties, even though this is known to be a misuse.

Financial returns, such as Google's stock, typically do not match the normal distribution due to positive skew and fat tails.

The normal density function is a formula involving pi and the constant e (approximately 2.718).

The normal distribution is characterized by two parameters: the mean and the standard deviation.

Excel's built-in function NORMDIST (NORM.DIST in newer versions of Excel) can be used to calculate the normal density or the cumulative distribution.

Changing the mean and standard deviation in the normal distribution alters the curve's position and dispersion but not its shape.

The normal distribution is fully described by only two parameters, making it convenient for use.

Rules of thumb for the normal distribution include percentages of area under the curve within one, two, and three standard deviations.

The normal distribution is continuous, symmetrical, and has a kurtosis of three, indicating no excess kurtosis or fat tails.

The prevalence of the normal distribution is due to its simplicity and the central limit theorem, which states that the sampling mean of samples tends to be normally distributed.

The central limit theorem does not necessarily imply special significance in risk management, as it pertains to central tendency rather than tail risks.

The normal distribution's properties make it a popular choice despite its limitations in accurately representing financial asset returns.

David Harper of the Bionic Turtle provides a tutorial on the normal distribution, emphasizing its characteristics and common applications.

The tutorial includes a practical demonstration using Google's stock returns to illustrate the mismatch with the normal distribution.

The video explains the mathematical formula for the normal distribution and its implementation in Excel.

The tutorial concludes by discussing the limitations of the normal distribution in risk metrics and the importance of understanding its properties.

Transcripts

play00:05

hello this is David Harper of the Bionic

play00:07

turtle with a brief tutorial on the

play00:09

normal distribution easily the most

play00:11

common popular and familiar probability

play00:15

distribution of all probability

play00:16

distributions the normal is also called

play00:19

the normal bell curve and occasionally

play00:21

we hear it referred to as a Gaussian

play00:24

distribution that's Gaussian with a G a

play00:26

Gaussian distribution is the same thing

play00:28

as a normal it's just a little more

play00:29

technical term okay to show you the

play00:32

normal curve let me show you in practice

play00:33

oftentimes we'll use it to describe

play00:37

asset returns even as we use it we

play00:40

typically know that it's a misuse that

play00:43

it's sort of inappropriate to financial

play00:45

returns but it's so convenient that

play00:48

sometimes we'll use it as sheer

play00:49

convenience it has elegant and

play00:51

convenient properties but just to show

play00:53

you why it's a misuse here I pulled

play00:57

periodic returns for Google's stock

play00:59

price this ending mid December of 2007

play01:02

so these are the daily periodic returns

play01:05

that I've then plotted onto a histogram

play01:06

or frequency chart and so you might say

play01:09

at first glance well that looks like the

play01:11

normal bell curve and if I collected

play01:13

more data I would definitely get a

play01:14

smoother distribution but if I

play01:17

superimpose the normal density function

play01:21

so here's that bell curve plotted in

play01:22

blue if it's for most financial asset

play01:26

returns I'm not going to get a good

play01:28

match so Google's pretty typical here

play01:31

in the sense that there's not a match

play01:32

you can see the normal curve is

play01:35

symmetrical and clean whereas my Google

play01:39

returns well first of all they are

play01:40

positively skewed see how there's a

play01:42

positive skew over to the right leaving

play01:44

the space here secondly what's harder to

play01:46

see is Google has something because I

play01:48

have a fat thick a thick blue line here

play01:50

Google has some real outlier returns

play01:52

here on the positive side the actual

play01:55

asset returns are typically fat-tailed

play01:58

they have fatter tails technically

play02:00

that's called leptokurtosis so now let

play02:03

me take a closer look at the normal

play02:05

distribution function

play02:06

density function so here it is plotted

play02:09

the plot is an implementation here of

play02:12

this

play02:13

formula which has pi in it as well as

play02:17

the natural e that has a value about 2.71

play02:20

in addition to that really there are

play02:23

embedded the two parameters that we need

play02:26

for the normal distribution here's

play02:28

volatility that's that measure of

play02:30

dispersion or that second moment measure

play02:33

we need that

play02:34

and here's Z a standardized variable

play02:39

that is a function of the mean and the

play02:44

standard deviation so embedded in the Z

play02:46

we've got the mean and the volatility

play02:49

those are the two parameters we need for

play02:53

the standard deviation I've got a mean

play02:55

here input and I've got a standard

play02:57

deviation here of one so I can have any

play03:00

numbers in here I'm going to go ahead

play03:01

and put in standard values the standard

play03:04

normal distribution by definition when I

play03:07

say standard it means there's a mean of

play03:09

zero and a standard deviation of one

play03:11

such that my X values here equal my

play03:15

standard variables and then I can plot

play03:18

this normal distribution this is a

play03:22

density function also called probability

play03:24

mass function here over to the right so

play03:27

as you can see I've done that two ways

play03:29

first here I use the built in excel

play03:33

function equals norm dist takes four

play03:37

parameters first here I've got b7 first

play03:42

it takes the x value so that's the value

play03:45

of x on the x-axis it takes the mean

play03:49

which I've got a zero

play03:50

it takes the standard deviation I've got

play03:52

a 1 and then it takes a true if we want

play03:55

a cumulative distribution or a false if

play03:57

we want the density function which we do

play03:59

want so that's my function that's my

play04:02

straight use of the excel function and

play04:04

then just to prove to you that the

play04:05

formula is correct here instead of a

play04:08

function I've got a formula and that

play04:10

formula right here matches this so it

play04:13

gives me Y as a function of X the

play04:19

standard variable here the in this case

play04:23

for this first point it's a negative 4

play04:24

so that negative 4 is my Z right

play04:27

here having that and then also accessing

play04:30

my also using as an input my standard

play04:32

deviation I solve for y which is then

play04:35

going to be that probability here so

play04:38

that's the Y axis value here a Y value

play04:40

here here's here's the formula which

play04:43

implements this which is equivalent to

play04:45

the built-in function and that gives me

play04:49

none

play04:50

I can now I've got a mean of 0 here but

play04:53

if you watch this chart I'm going to

play04:56

just change the mean to 10 for example

play04:57

and the curve doesn't change at all the

play05:01

curves very well behaved that way but

play05:03

the mean changes to 10 and now I'm going

play05:05

to change the standard deviation to 5

play05:07

for example so watch the chart well ok

play05:11

it's hard to see the x axis changes but

play05:13

this the x axis change so the dispersion

play05:17

became greater and i'll change the

play05:18

standard deviation back to 1 but I can

play05:20

change the mean or the standard

play05:21

deviation the point of this that I

play05:24

wanted to show you is that I only change

play05:28

those two values one of the

play05:30

characteristics of the normal

play05:31

distribution is that it's fully

play05:33

described by only two parameters the

play05:35

mean that sets this that sets the peak

play05:39

of the curve here and the what I've been

play05:42

calling interchangeably the volatility

play05:45

or the standard deviation that's the

play05:47

measure of dispersion or scatter that

play05:51

tells me how wide this is that's it

play05:54

aside from that this tendency of the

play05:57

tails to be skinny relative to real

play05:59

returns that's just a weakness or flaw

play06:01

of the normal so another characteristic

play06:04

is we have these rules of thumb which

play06:06

you may have seen before that refer to

play06:09

area under the curve so if I just look

play06:12

back at this this is the density

play06:14

function the cumulative function is

play06:16

implied by the area under the curve so

play06:20

if we talk about how much area under the

play06:22

curve goes plus or minus one standard

play06:24

deviation to the left and right well

play06:27

then we have these rules of thumb about

play06:29

68% of the area under the curve is

play06:32

covered by plus or minus one standard

play06:34

deviation about 95 and a half percent is

play06:37

covered by two standard deviations about

play06:40

99.7

play06:41

is covered by three standard

play06:43

deviations so those are rules of thumb

play06:45

that tell us how much of the area under

play06:48

the curve is covered by the distance

play06:51

plus or minus one two and three standard

play06:53

deviations respectively so finally

play06:56

before I finish let me just explain the

play06:58

properties of the normal distribution

play07:01

first it's continuous it's not discrete

play07:05

notice this line is solid not dotted

play07:10

technically speaking if I want to check

play07:12

for the probability of value I have to

play07:14

check an interval because there are an

play07:16

infinite number of points on this line

play07:18

that would be as opposed to a discrete

play07:19

variable or discrete distribution such

play07:22

as the binomial or Poisson as I

play07:26

mentioned second characteristic there's

play07:27

only two parameters fully described by

play07:30

mean and volatility or standard

play07:33

deviation this makes it very convenient

play07:35

and handy to use many distributions are

play07:38

more complex require more parameters and

play07:42

therefore subject to well just more

play07:45

model risk or input variation third we

play07:50

say the normal is symmetrical it has a

play07:53

or technically it has a skew of zero

play07:55

fourth we say there is no excess

play07:58

kurtosis the there are no fat tails the

play08:02

kurtosis therefore on the normal is

play08:04

going to be three so we say the kurtosis

play08:06

is three or the excess kurtosis is zero

play08:08

same thing why is it so common well

play08:12

there are really two reasons and I would

play08:14

argue there are only two reasons there

play08:16

is nothing inherently special about the

play08:19

normal distribution it is not

play08:21

necessarily appropriate for risk metrics

play08:25

and nothing elevates it to any special

play08:28

importance in the study of risk simply

play08:31

because risk is concerned with the tail

play08:32

the strength of the normal is about the

play08:35

central tendency the two reasons it's

play08:38

common are one it's easy due to these

play08:40

parameters and two this is very profound

play08:43

the central limit theorem the central

play08:46

limit theorem says the sampling mean of

play08:49

the samples tends toward become normal

play08:55

normally distributed well that is a

play08:58

statement about the central tendency of

play09:01

the normal it's not really a statement

play09:03

about the tails and the properties of the

play09:04

tails therefore it's not necessarily

play09:07

doesn't necessarily have any special

play09:08

significance in the context of risk but

play09:11

the central limit theorem does explain

play09:13

why it's so prevalent as a distribution

play09:16

so this is David Harper of the Bionic

play09:18

turtle thanks for your time

Related Tags
Normal Distribution, Financial Analysis, Probability Theory, Asset Returns, Statistical Tutorial, Data Histogram, Bell Curve, Gaussian Distribution, Leptokurtosis, Risk Metrics, Central Limit Theorem