Statistics for Psychology

SWARTWOODPREP
13 May 2018 09:01

Summary

TLDR: This video script offers an informative overview of the normal distribution, a fundamental concept in statistics. It explains the characteristics of a normal distribution curve, including symmetry, the mean, median, and mode aligning at the center, and its asymptotic nature. The script delves into the significance of the mean and standard deviation, illustrating how they define the spread of data points. It also covers the '68-95-99.7' empirical rule, which quantifies the proportion of data within one, two, or three standard deviations from the mean. The presenter uses the example of gummy bear consumption to demonstrate how to calculate and interpret z-scores, providing a practical application of normal distribution in real-world scenarios.

Takeaways

  • The normal distribution is a fundamental concept in statistics, characterized by its bell shape and symmetry.
  • The mean, median, and mode of a normal distribution all coincide at the center of the distribution curve.
  • The normal distribution is asymptotic, meaning it extends indefinitely in both directions without ever reaching zero.
  • The distribution is useful for modeling real-world phenomena because it captures the majority of data within a few standard deviations from the mean.
  • The probability of data points falling within one standard deviation from the mean is approximately 68%, two standard deviations capture about 95%, and three standard deviations about 99.7%.
  • Normal distributions can vary in their means and standard deviations, but the key property of capturing a certain percentage of data within specific standard deviations remains consistent.
  • The mean represents the average value, while the standard deviation measures the spread or dispersion of the data around the mean.
  • Understanding the standard deviation helps in gauging how typical a data point is; for example, if people eat a mean of 100 pounds of gummy bears per day with a standard deviation of 10, someone eating between 90 and 110 pounds would be considered fairly normal.
  • To work with different normal distributions, z-scores are used to standardize the data, making it easier to compare and interpret using a single table.
  • The z-score is calculated by subtracting the mean from the data point and then dividing by the standard deviation, indicating how many standard deviations away from the mean the data point lies.
  • By using z-scores and referring to a standard normal distribution table, one can determine the probability of a data point occurring beyond a certain threshold, like the likelihood of someone eating more than 140 pounds of gummy bears per day.

Q & A

  • What is the shape of a normal distribution curve?

    -The normal distribution curve is bell-shaped, symmetrical around its mean, with the mean, median, and mode all coinciding at the center of the curve.

  • Why is the normal distribution considered to be asymptotic?

    -The normal distribution is considered asymptotic because it theoretically extends indefinitely in both directions without ever reaching zero, although for practical purposes, it captures nearly all data within a few standard deviations from the mean.

  • What is the significance of the mean, median, and mode being equal in a normal distribution?

    -The equality of the mean, median, and mode in a normal distribution signifies that the data is perfectly symmetrical, and the central tendency measures are consistent, reflecting a balanced distribution of data points around the center.

  • How does the normal distribution help in modeling real-world phenomena?

    -The normal distribution is useful for modeling real-world phenomena because nearly all of the data falls within three standard deviations of the mean, so the curve's infinite tails matter little in practice. It also arises naturally through the central limit theorem: the sum of a large number of independent and identically distributed variables tends toward a normal distribution, which makes it a common statistical model for various natural and social phenomena.

  • What is the Empirical Rule in relation to the normal distribution?

    -The Empirical Rule states that for a normal distribution, approximately 68% of the data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
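The percentages in the Empirical Rule can be checked numerically. The following is a minimal sketch, assuming only Python's standard library; the function name `fraction_within` is made up for illustration and does not come from the video.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    # For a standard normal variable Z, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.4f}")
```

The three printed fractions round to 68%, 95%, and 99.7%, and they do not depend on which mean or standard deviation the distribution has.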

  • What is a z-score and how is it calculated?

    -A z-score is a measure of how many standard deviations an element is from the mean in a normal distribution. It is calculated by subtracting the mean from the data point and then dividing by the standard deviation.
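The calculation described here can be sketched in a few lines. The function name `z_score` is hypothetical; the 140, 100, and 10 values are the ones used in the video, while the score of 80 is an illustrative value chosen here to produce the z-score of -2 that the speaker mentions.

```python
def z_score(score: float, mean: float, sd: float) -> float:
    """Number of standard deviations that `score` lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    # Subtract the mean, then divide by the size of one standard deviation.
    return (score - mean) / sd


print(z_score(140, 100, 10))  # 40 points above the mean, 10 points per standard deviation
print(z_score(80, 100, 10))   # 20 points below the mean gives a negative z-score
```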

  • Why is standardizing scores into z-scores useful in statistics?

    -Standardizing scores into z-scores is useful because it allows for easy comparison of data across different normal distributions, as it transforms the data into a common scale where the mean is 0 and the standard deviation is 1, facilitating the use of standard tables for probability calculations.
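As a short sketch of why the standardized scale has mean 0 and standard deviation 1, write X for a raw score with mean μ and standard deviation σ, and Z for its z-score. The symbols X and Z are notation assumed here, not taken from the video.

```latex
Z = \frac{X - \mu}{\sigma}, \qquad
\operatorname{E}[Z] = \frac{\operatorname{E}[X] - \mu}{\sigma} = 0, \qquad
\operatorname{Var}(Z) = \frac{\operatorname{Var}(X)}{\sigma^{2}} = 1 .
```

Because every normal distribution maps onto this same scale, one table of areas is enough.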

  • What is the relationship between a z-score and the probability of a data point occurring?

    -The z-score indicates the number of standard deviations a data point is from the mean, and by looking up the z-score in a standard normal distribution table, one can determine the probability or percentage of data points occurring at that distance from the mean.

  • How can you find the probability of a data point being beyond a certain value in a normal distribution?

    -To find the probability of a data point being beyond a certain value, calculate the z-score for that value and look up the corresponding area in a standard normal distribution table. If the table gives the area between the mean and the z-score, subtract it from 0.5 to get the one-tail probability beyond that value; if the table gives the cumulative area to the left of the z-score, subtract it from 1. A table that lists the tail area directly needs no adjustment.
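The adjustment depends on what kind of area the table reports. The sketch below assumes two common layouts, a "mean to z" table and a cumulative "left of z" table, and computes the table entries with the standard library instead of looking them up. All function names are hypothetical, and z = 2 is an illustrative value, not one from the video.

```python
import math


def cumulative_area(z: float) -> float:
    """Area to the left of z under the standard normal curve (cumulative-table entry)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


def mean_to_z_area(z: float) -> float:
    """Area between the mean (z = 0) and z (mean-to-z table entry)."""
    return cumulative_area(abs(z)) - 0.5


def upper_tail_from_mean_to_z(table_area: float) -> float:
    # Half the curve lies above the mean, so remove the part between the mean and z.
    return 0.5 - table_area


def upper_tail_from_cumulative(table_area: float) -> float:
    # The whole curve has area 1, so remove everything to the left of z.
    return 1.0 - table_area


z = 2.0  # illustrative positive z-score
print(upper_tail_from_mean_to_z(mean_to_z_area(z)))
print(upper_tail_from_cumulative(cumulative_area(z)))
```

Both routes give the same upper-tail probability for a positive z-score, which mirrors the speaker's 0.500 minus 0.499 step.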

  • What is an example of a practical scenario where the normal distribution is applied as described in the script?

    -An example given in the script is determining the likelihood of a person eating more than 140 pounds of gummy bears per day, assuming the average consumption is 100 pounds with a standard deviation of 10, by calculating the z-score for 140 and using a standard normal distribution table to find the probability.
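For readers who want an actual number, the gummy bear example can be worked through with Python's `statistics.NormalDist` (available from Python 3.8). This is a sketch under the video's stated assumptions of a mean of 100 and a standard deviation of 10; the constant names are made up here. Note that the 0.001 quoted in the transcript is explicitly a made-up table value, so it will not match.

```python
from statistics import NormalDist

MEAN_LBS = 100.0       # mean daily consumption, from the video
SD_LBS = 10.0          # standard deviation, from the video
THRESHOLD_LBS = 140.0  # "more than 140 pounds"

consumption = NormalDist(mu=MEAN_LBS, sigma=SD_LBS)

# Step 1: convert the raw score to a z-score.
z = (THRESHOLD_LBS - MEAN_LBS) / SD_LBS

# Step 2: upper-tail probability = 1 minus the cumulative area up to the threshold.
p_more = 1.0 - consumption.cdf(THRESHOLD_LBS)

print(f"z = {z:.1f}")
print(f"P(X > 140) = {p_more:.2e}")
```

A z-score of 4 leaves roughly three in 100,000 in the upper tail, which is why the speaker calls the number "ridiculously high" and notes that many printed tables do not go that far.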

  • How does the script illustrate the concept of 'freakishly high' or 'freakishly low' in the context of the normal distribution?

    -The script uses the phrase 'freakishly high' or 'freakishly low' to describe data points that are more than one standard deviation above or below the mean, indicating that these occurrences are less common and deviate significantly from the norm.

Outlines

00:00

Introduction to the Normal Distribution

The speaker introduces the concept of the normal distribution, highlighting its importance and common characteristics. They explain that the normal distribution is symmetrical, with the mean, median, and mode all located at the center. The concept of asymptotic behavior is also mentioned, where the curve never touches the x-axis. The speaker emphasizes the practical relevance of the normal distribution in modeling various phenomena.

05:02

Standard Deviation and the 68-95-99.7 Rule

The speaker explains the properties of the normal distribution related to standard deviations. They describe how moving one standard deviation away from the mean captures about 68% of the data, two standard deviations capture about 95%, and three standard deviations capture almost all of the data (99.7%). This section underscores the utility of the normal distribution in understanding data spread and probabilities.

🍬 Applying the Normal Distribution: Gummy Bears Example

A practical example involving gummy bear consumption is used to illustrate the application of the normal distribution. The speaker sets up a problem where the mean consumption is 100 pounds per day with a standard deviation of 10 pounds. They discuss how to calculate the probability of randomly selecting someone who eats more than 140 pounds of gummy bears per day by converting this problem into a z-score and looking up the corresponding probability.

Understanding and Using Z-Scores

The concept of z-scores is introduced as a method to standardize scores from different normal distributions. The speaker explains how to convert raw scores into z-scores, which represent the number of standard deviations a data point is from the mean. The utility of z-scores in simplifying the lookup of probabilities using standard tables is discussed, along with an example calculation involving a score of 140.

Calculating Probabilities Using Z-Scores

The speaker demonstrates the procedure for calculating probabilities using z-scores. They explain how to look up z-scores in tables and interpret the results. An example is provided where a z-score of +4 is used to find the probability of consuming more than 140 pounds of gummy bears per day. The speaker discusses different types of tables and how to adjust calculations based on the provided data.

Recap and Procedure for Calculating Z-Scores

A summary of the steps involved in calculating z-scores is provided. The speaker reiterates the process: subtracting the mean from the raw score and then dividing by the standard deviation. This section serves as a concise recap of the key points discussed throughout the video, emphasizing the practical application of the normal distribution and z-scores in statistical analysis.

Keywords

Normal Distribution

Normal distribution, also known as Gaussian distribution, is a probability distribution that is characterized by its symmetric bell-shaped curve. In the video, it is explained as a fundamental concept for understanding statistical concepts. The mean, median, and mode all coincide at the center of this distribution, which is a key feature used to describe data that follows a normal distribution pattern. The script uses the normal distribution to discuss the likelihood of certain outcomes, such as eating a certain amount of gummy bears.

Mean

The mean, often referred to as the average, is a measure of central tendency in statistics. It is calculated by summing all the values in a data set and then dividing by the number of values. In the context of the video, the mean is the central point of the normal distribution where the average value lies, and it is used as a reference point to calculate deviations in the data.

Median

The median is another measure of central tendency, representing the middle value of a data set when it is ordered from least to greatest. In the video, it is mentioned that in a normal distribution, the median coincides with the mean, indicating that half of the data lies above and half below this central value.

Mode

The mode is the value that appears most frequently in a data set. In the video, it is noted that in a normal distribution, the mode also coincides with the mean and median, meaning the most common value is at the center of the distribution.

Standard Deviation

Standard deviation is a measure of the amount of variation or dispersion in a set of values. It indicates how much individual data points in the set typically deviate from the mean. In the video, standard deviation is used to describe the spread of the normal distribution and to calculate z-scores, which are essential for determining the likelihood of certain events.

Asymptotic

Asymptotic refers to a property of a function or curve that approaches a certain value or line but never actually reaches it. In the script, it is mentioned that the normal distribution is asymptotic, meaning it extends indefinitely without ever touching the x-axis, which is used to illustrate the behavior of data that is far from the mean.

Z-Score

A z-score is a standard score that indicates how many standard deviations an element is from the mean. In the video, z-scores are used to standardize the data from different normal distributions, allowing for the use of a single table to determine probabilities associated with different scores.

Sigma (σ)

Sigma is the Greek letter used to represent the standard deviation in statistics. In the video, 'Sigma' is used to denote the standard deviation when discussing the properties of the normal distribution and when calculating z-scores.

Empirical Rule

The empirical rule, also known as the 68-95-99.7 rule, is a shorthand used to remember the percentage of values that lie within one, two, and three standard deviations from the mean in a normal distribution. In the video, this rule explains why most people fall within one standard deviation of the mean, for example between 90 and 110 pounds of gummy bears per day.

Frequency Distribution

Frequency distribution is a way to summarize data by showing the number of occurrences of each value or range of values in a data set. In the video, the concept is used to describe how data is organized and visualized in a normal distribution, with the frequency of certain outcomes like eating a specific amount of gummy bears.

Probability

Probability is a measure of the likelihood that a particular event will occur, often expressed as a number between 0 and 1. In the video, probability is discussed in the context of the normal distribution to determine the likelihood of rare events, such as consuming an unusually high amount of gummy bears.

Highlights

Introduction to the normal distribution and its importance in statistical analysis.

Explanation of the bell curve shape of the normal distribution and its properties.

The mean, median, and mode all coincide in a normal distribution, indicating symmetry.

Asymptotic nature of the normal distribution and its implications for modeling real-world phenomena.

The practicality of the normal distribution in modeling despite its asymptotic behavior.

Understanding the 68-95-99.7 empirical rule for standard deviations in a normal distribution.

The role of the mean as the central value in a normal distribution.

Clarification of the standard deviation as a measure of spread in the data.

Intuitive understanding of standard deviation in relation to the mean.

The concept of 'normal' in the context of standard deviations from the mean.

Illustration of the distribution of gummy bear consumption as an example of a normal distribution.

Procedure for calculating the likelihood of a random individual consuming an unusual amount of gummy bears.

The process of standardizing scores into z-scores for easier comparison across different normal distributions.

Explanation of z-scores as a way to express how many standard deviations a score is from the mean.

Using z-scores to find the probability of a score occurring in a normal distribution.

How to look up z-scores in a standard normal distribution table to find the area under the curve.

Adjusting the area found in the table to find the exact probability for a given score.

General procedure for converting a raw score to a z-score and using it to find probabilities.

Transcripts

play00:03

[Music]

play00:07

so what are you guys hi guys so before

play00:13

we can get started with anything we need

play00:14

to understand a little bit about the

play00:15

normal distribution so um let me talk

play00:17

briefly about it although you've probably

play00:27

already seen this I mean from the last

play00:30

midterm it is good to talk briefly about

play00:32

it just to roll up to speed

play00:34

the normal distribution looks kind of like

play00:36

this, well, not this bad but kind of like

play00:38

this okay

play00:39

and you know the mean the population

play00:41

mean sits right here so the average sits

play00:43

right there but also the median the mode

play00:44

also sit right here so a couple things

play00:46

you have the mean is equal to the median

play00:50

is equal to the mode okay the second

play00:55

thing is he's pretty symmetric so if you

play00:57

look at this guy he's symmetric and you

play01:03

can kind of see it on the picture, save for

play01:04

the fact that I kind of messed it up but

play01:06

number three is of course he's

play01:07

asymptotic this one's not so big a deal

play01:14

all it means is this guy just keeps

play01:15

going on and on forever and never quite

play01:17

flatlines at zero okay so you might be

play01:19

thinking something like so what good is

play01:20

that gonna do us right cuz how many

play01:22

things are asymptotic like things we're

play01:23

gonna model they're not gonna go

play01:24

infinitely high and infinitely low right

play01:26

but for all effective purposes once you

play01:28

get outside of three standard deviations above and

play01:30

below the mean you've pretty much gotten

play01:32

this entire curve so in that sense the

play01:34

normal distribution is good okay it's

play01:36

going to model a lot of things really

play01:37

really well and we'll see it is the

play01:39

distribution to go to in a second okay

play01:41

but first how do I use this sucker so

play01:43

these normal distributions have

play01:45

different means and different standard

play01:46

deviations right but they all have this

play01:49

property that when you look at the mean

play01:50

if you go out one standard deviation to

play01:53

the right or to the left you'll always

play01:55

capture basically 68% of this curve some

play02:02

profs have you memorize that

play02:04

some don't if you go out two standard

play02:06

deviations so two Sigma

play02:10

right then you're gonna capture about

play02:16

95% okay and if you go out three like we

play02:20

said get pretty much everything I think

play02:23

you get like ninety nine point seven

play02:25

okay and that's for three standard

play02:27

deviations so three sigma okay so I

play02:30

don't wanna spend too much time in this

play02:31

since you saw this on your last

play02:32

midterm but we should talk: mu (μ) of

play02:34

course represents the population mean so

play02:36

this is the average in this standard

play02:41

sense of the word right and Sigma the

play02:44

standard deviation that represents how

play02:46

far from the average is a guy on average

play02:48

okay so how far from the average are you

play02:50

on average so it's a measure of spread

play02:55

okay okay so what's that kinda mean

play02:57

intuitively let's say the mean is a

play02:58

hundred and let's say the standard

play03:00

deviation is like ten that means if you

play03:02

pick somebody at random it's not that

play03:04

unusual to find someone between say 100

play03:06

and 110 and it's not that unusual to find

play03:08

someone between say 90 and 100 so you

play03:10

can go up and down by a standard

play03:12

deviation and you're kind of average in

play03:13

a sense okay or pretty normal okay

play03:15

beyond that then you start to get a

play03:17

little freakishly high or a little

play03:18

freakishly low and all that good stuff

play03:20

okay okay it's no big deal so I get it

play03:23

the average is the average the standard

play03:25

deviation is generally how spread out

play03:26

guys are if you're within a standard

play03:27

deviation you're pretty quote normal

play03:29

whatever that means right and if you're

play03:30

more than one standard deviation above

play03:31

or more than one standard deviation

play03:32

below then you're starting to get like a

play03:34

little freakishly high or freakishly low

play03:35

okay so no big deal okay so um good then

play03:40

let's do a problem with this again I'm

play03:42

gonna do it kind of quickly because I've

play03:44

seen from the previous midterm you're

play03:45

comfortable with this but just in case

play03:49

okay

play03:52

so let's say you've got a normal

play03:54

distribution where the average person eats 100

play03:56

pounds of gummy bears per day okay and

play03:58

let's say for example like the standard

play04:00

deviation is 10 okay so that means most

play04:03

people out there will eat between 90 to

play04:05

110 pounds of gummy bears per day okay and

play04:07

I want to know what's the likelihood you

play04:09

pick somebody at random and that person

play04:11

eats more than so let's say more than

play04:20

let's say 140 pounds of gummy bears okay

play04:24

so I want to know what's the likelihood

play04:25

that you pick somebody at random from

play04:27

the population and you find that they

play04:28

eat more than 140 pounds of gummy bears

play04:30

per day okay okay so what's the set up

play04:33

as always we'll draw in that average

play04:35

here for 100 right okay remember this is

play04:38

a frequency distribution so you line up

play04:39

all the different possible scores and

play04:41

technically you're going on the

play04:42

population you're asking people how many

play04:44

pounds of gummy bears do you eat right and

play04:45

if a bunch of people get 100 then they

play04:47

get a high mark over here and if not so

play04:48

many people eat 150 pounds of gummy

play04:50

bears per day then they're kind of over

play04:51

here

play04:52

you know the mark for that's low okay no

play04:53

big deal okay so let's try that so first

play04:56

thing I do is you can talk about numbers

play04:59

right but this would suck because if I

play05:01

had a different normal distribution I'd

play05:03

have a different mean a different

play05:04

standard deviation I'd have to look each

play05:05

one of these guys up and I'd have to

play05:07

have a separate table for each and

play05:08

that's really a pain in the butt so what

play05:10

I want to do instead is we want to what

play05:12

we want to do is we want to standardize

play05:13

it so to standardize it what you do

play05:15

is you take this guy and you convert the

play05:17

actual scores into something called

play05:19

z-scores

play05:20

this is just for archimedes z-scores are

play05:22

a nice nice way of converting these

play05:24

scores so we could use just one table to

play05:25

figure things out okay and you know that

play05:28

kind of makes sense because remember the

play05:29

defining property of the normal curve in

play05:30

a way is the fact that once you get one

play05:32

standard deviation above and below you

play05:34

always hit that same percentage like 68%

play05:36

okay and if you go two you always hit 95

play05:38

et cetera et cetera okay so let's do that so

play05:41

how do I what's a z-score mean so you

play05:44

guys remember it represents the number of

play05:45

standard deviations above or below the

play05:47

mean your score is so the first thing I

play05:50

do is convert so if you look at that 140

play05:51

right I want to know how many standard

play05:53

deviations above or below the mean is

play05:54

140 so the first thing is I take 140 and

play05:56

I subtract 100 from it, everybody

play05:59

agrees so let's just plot it down here

play06:00

so here's 140 and we know it's here and

play06:03

we know this difference is 40 and that's

play06:06

what we got right but let's say you had

play06:08

24 eggs and I wanted to know how many

play06:10

dozen you have if that's the case you

play06:12

would take your 24 eggs you divide by

play06:14

the number in a dozen to say you have

play06:16

two dozen

play06:16

same thing right so over here you've got

play06:18

40 points of difference but I want to

play06:20

know how many standard deviations fit in

play06:22

there so you look at your 40 points and

play06:25

you divide by the number of points in

play06:26

the standard deviation which would be

play06:28

ten and that would give you four well

play06:31

that makes sense right you're forty

play06:32

points above the mean right

play06:34

but each standard deviation is worth ten

play06:36

points so if you're forty points above

play06:38

and each standard deviation's worth ten

play06:39

you really are four standard deviations

play06:41

above the mean so that's what a z-score

play06:43

represents so if you had a z-score of

play06:45

say plus four that means you are four

play06:48

standard deviations above the mean if

play06:50

you had a z-score say like negative two

play06:51

that would mean you're two standard

play06:52

deviations below the mean okay so this

play06:54

number that we ended up with is

play06:55

ridiculously high but that's fine we can

play06:57

go with that so we now have a z-score of

play07:00

plus four okay we get a couple of tricks

play07:03

depending on the sort of table your book

play07:05

uses you got to remember the fact that

play07:07

the entire curve is 100% and half the

play07:09

curve is of course 50% and you can use

play07:11

tricks for example some books are really

play07:12

nice once we look up the z-score and we

play07:15

convert the z-score to plus four right

play07:17

so that's our z-score then you can

play07:19

look this up in a table or it's just too high a

play07:20

number but let's pretend let's pretend

play07:21

four is actually in your book then it

play07:23

would go here maybe I would draw all

play07:25

this right if it gives the area of this

play07:27

shaded region or the percentage for this

play07:29

shaded region then we're good so if this

play07:31

were like for example make up a fake

play07:32

number point zero zero one like that

play07:35

totally fake number but let's say that's

play07:37

what the table gave you then that's the

play07:38

likelihood that you're going to get a

play07:40

score 140 or more and that's what we

play07:42

want okay

play07:42

however if your table didn't give you

play07:45

this but only gave you this and let's

play07:48

say this was something like point four

play07:50

nine nine like that right then you would

play07:52

know 140 to a hundred well to the mean

play07:55

would be point four nine nine but you

play07:58

want 140 and up and you agree the whole

play08:00

thing over here is 50%

play08:02

so that would be 0.500 minus 0.499

play08:07

which would be point zero zero one okay

play08:10

just talking about you know you might

play08:12

have to make minor adjustments depending

play08:13

on the sort of table they give you but

play08:15

the procedure's still the same take your

play08:17

score convert it to a z-score look up

play08:19

that z-score value in the table and then

play08:21

use whatever they give you in the table to

play08:23

figure out your answer okay so let me

play08:25

outline the general procedure again oh by

play08:28

the way you can run it backwards but

play08:30

since I feel like most people are

play08:31

probably comfortable at this you don't

play08:33

want to take the back we'll do a problem

play08:34

with this in a second okay but first

play08:35

let's remember the procedure my bad I

play08:38

should have written this out last time

play08:39

at least I'm doing it now remember the

play08:41

general procedure

play08:42

z-score you took your score you

play08:45

subtracted the mean right so remember

play08:48

before we had 140 we subtracted the mean

play08:50

is a hundred and then we divided by the

play08:52

number of points in a standard deviation

play08:53

so that's a general procedure for

play08:55

getting the z-score okay
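The recap above, together with the speaker's remark that the procedure can be run backwards, might be sketched as follows. This assumes Python's `statistics.NormalDist` (Python 3.8 or later); both function names are hypothetical, and the score of 120 is an illustrative value rather than one from the video.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def score_to_upper_tail(score: float, mean: float, sd: float) -> float:
    """Forward direction: raw score -> z-score -> probability of scoring higher."""
    z = (score - mean) / sd
    return 1.0 - STANDARD_NORMAL.cdf(z)


def upper_tail_to_score(tail_probability: float, mean: float, sd: float) -> float:
    """Backward direction: upper-tail probability -> z-score -> raw score."""
    z = STANDARD_NORMAL.inv_cdf(1.0 - tail_probability)
    return mean + z * sd


tail = score_to_upper_tail(120, 100, 10)
print(tail)
print(upper_tail_to_score(tail, 100, 10))
```

Running the two functions back to back recovers the starting score, which is a quick way to check that the forward and backward steps agree.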

Related Tags
Normal Distribution, Statistics, Z-Scores, Mathematics, Standard Deviation, Mean, Probability, Data Analysis, Education, Tutorial