Intro to Z-scores

Matt Teachout
17 Apr 2020 · 12:36

Summary

TLDR: This video introduces the concept of Z-scores, focusing on their importance in statistics, particularly with normal, quantitative data. It explains how Z-scores are calculated using the mean and standard deviation, and how they help compare individual data points to the group average. The presenter uses examples like IQ scores to demonstrate Z-score calculation and interpretation, emphasizing how Z-scores indicate whether a value is above or below the mean, and how they help identify typical or unusual data points. The video provides a foundational understanding for further discussions on Z-scores.

Takeaways

  • 📊 Z-scores are crucial for normal quantitative data and are used in various statistical scenarios like critical values, confidence intervals, and test statistics for proportions.
  • 📉 Z-scores are based on normal distribution, where the mean and standard deviation accurately represent the data. The data must be normally distributed for the Z-score to be valid.
  • 📏 The formula for calculating a Z-score involves taking the data value, subtracting the mean, and dividing by the standard deviation.
  • 👍 A positive Z-score indicates a value above the mean, while a negative Z-score means the value is below the mean.
  • 🧠 Example: In an IQ test with a mean of 100 and a standard deviation of 15, Maria’s IQ of 147 has a Z-score of 3.13, meaning her score is 3.13 standard deviations above the mean.
  • 📐 Z-scores can be used to identify outliers: values two or more standard deviations above or below the mean (Z-scores greater than or equal to 2, or less than or equal to -2) are considered unusual.
  • 👥 Z-scores between -1 and 1 are typical, as they represent data that falls within one standard deviation of the mean, covering around 68% of normally distributed data.
  • 🧮 Z-scores are not percentages, proportions, or units like dollars or miles. They are measured in terms of standard deviations, a way to standardize and compare data across different scales.
  • 🔍 Z-scores also relate to statistical significance: values two or more standard deviations from the mean (Z ≥ 2 or Z ≤ -2) are treated as unusual and may be significant, and the video notes that these cutoffs are refined later with critical values.
  • 📚 The script emphasizes the importance of understanding Z-scores as a tool for comparing data and identifying whether data points are typical, unusual, or outliers.
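The word formula in the takeaways can also be written symbolically. In the notation the video mentions, x is the data value, x̄ and s are the sample mean and sample standard deviation, and μ and σ are their population counterparts. The second line simply plugs in Maria's IQ figures from the video.

```latex
z = \frac{x - \bar{x}}{s}
\qquad \text{or} \qquad
z = \frac{x - \mu}{\sigma}

z_{\text{Maria}} = \frac{147 - 100}{15} = \frac{47}{15} \approx 3.13
```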

Q & A

  • What is a z-score?

    -A z-score represents the number of standard deviations a data point is from the mean. It is used to compare an individual data point to the overall dataset.

  • When should z-scores be used?

    -Z-scores should be used when working with normal or bell-shaped data, as the calculation relies on accurate mean and standard deviation values.

  • How do you calculate a z-score?

    -To calculate a z-score, subtract the mean from the data value, then divide the result by the standard deviation. The formula is: (data value - mean) / standard deviation.

  • What does a positive z-score indicate?

    -A positive z-score indicates that the data value is above the mean.

  • What does a negative z-score indicate?

    -A negative z-score indicates that the data value is below the mean.

  • How do you interpret z-scores in terms of outliers?

    -A z-score greater than or equal to 2 indicates a high outlier, while a z-score less than or equal to -2 indicates a low outlier.

  • What is considered a typical z-score range?

    -A typical z-score falls between -1 and 1, which corresponds to the middle 68% of values in a normal distribution.

  • How would you calculate the z-score for Maria's IQ of 147 if the mean is 100 and the standard deviation is 15?

    -To calculate Maria's z-score, subtract 100 from 147 to get 47, then divide 47 by 15. The result is a z-score of 3.13, indicating that Maria's IQ is 3.13 standard deviations above the mean.

  • What does a z-score of 3.13 for Maria’s IQ mean?

    -Maria’s z-score of 3.13 means her IQ is 3.13 standard deviations above the mean, indicating that she has an unusually high IQ compared to the general population.

  • How do z-scores relate to significance in statistics?

    -Z-scores are often used in significance testing. Values greater than or equal to 2 (or less than or equal to -2) are considered unusual and may indicate statistical significance.
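The calculation described in the answers above can be sketched in a few lines of Python. The function name `z_score` and its parameter names are illustrative choices, not something from the video; the numbers are the IQ examples the video uses (mean 100, standard deviation 15).

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


maria = z_score(147, 100, 15)  # (147 - 100) / 15 = 47 / 15
rick = z_score(87, 100, 15)    # (87 - 100) / 15 = -13 / 15

# Round to the hundredths place, as the video does, and keep the sign visible.
print(f"Maria: {maria:+.2f}")  # Maria: +3.13
print(f"Rick:  {rick:+.2f}")   # Rick:  -0.87
```

The explicit sign in the output mirrors the presenter's habit of writing the plus or minus next to the value, since the sign decides whether the sentence says "above" or "below" the mean.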

Outlines

00:00

📊 Introduction to Z-Scores and Their Importance in Statistics

This paragraph introduces the concept of Z-scores, highlighting their significance in statistics, especially in analyzing normal quantitative data. The Z-score is presented as a critical value and test statistic, particularly for proportions, and is useful for comparing individual data points to the overall distribution. It emphasizes the importance of having normally distributed data for the Z-score to be accurate, as the calculation is based on the mean and standard deviation of the data.

05:01

🧮 Calculating Z-Scores with an Example

This paragraph explains how to calculate a Z-score using a practical example involving IQ scores. Maria's IQ of 147 is compared to the mean IQ of 100 with a standard deviation of 15, resulting in a Z-score of 3.13. The paragraph also touches on the rounding conventions in Z-score calculations, influenced by historical practices. The positive Z-score indicates that Maria's IQ is above the average, illustrating the concept of standard deviations as a measure of comparison.

10:03

📈 Interpreting Z-Scores and Their Significance

Here, the paragraph discusses the interpretation of Z-scores, specifically what a positive or negative Z-score indicates. It further elaborates on how Z-scores can help identify outliers and unusual values in data. The concept of typical values, represented by Z-scores between -1 and 1, is introduced. The paragraph uses Maria's high Z-score to illustrate how values above 2 are considered unusually high, thus making Maria's IQ unusually high.

🧠 Z-Scores and Typical Versus Unusual Values

The final paragraph provides another example using Rick's IQ of 87, which yields a negative Z-score of -0.87. This score is within the typical range (-1 to 1), meaning Rick's IQ is typical compared to the general population. The paragraph emphasizes the importance of understanding that Z-scores represent standard deviations, not percentages or proportions, and clarifies that not all data points fall into typical or unusual categories. This serves as a conclusion to the introductory discussion on Z-scores, with a promise of deeper exploration in future lessons.
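The rule of thumb from these sections can be expressed as a small classifier. This is a sketch under the cutoffs stated in the video (2 or more standard deviations away is unusual, between -1 and 1 is typical); the function name `classify` is hypothetical, and treating exactly -1 and 1 as typical is an assumption, since the video does not say how to handle those boundaries.

```python
def classify(z):
    """Label a z-score using the video's rule-of-thumb cutoffs."""
    if z >= 2:
        return "unusually high"
    if z <= -2:
        return "unusually low"
    if -1 <= z <= 1:  # boundary handling is an assumption
        return "typical"
    return "neither typical nor unusual"


for name, z in [("Maria", 3.13), ("Rick", -0.87), ("z = 1.5", 1.5)]:
    print(f"{name}: {classify(z)}")
```

The final branch captures the point made at the end of the video: a z-score such as 1.5 falls in the middle ground and gets neither label.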

Keywords

💡Z-score

A Z-score measures how many standard deviations a data point is from the mean of a data set. It is used to standardize scores in a normal distribution, allowing comparison of individual values across different data sets. In the video, Z-scores help determine how an individual's score, like IQ, compares to the average population.

💡Standard deviation

Standard deviation measures the amount of variation or dispersion in a set of values. A low standard deviation means the data points are close to the mean, while a high standard deviation indicates that they are spread out. In the video, it is crucial for calculating Z-scores, as the Z-score is the number of standard deviations a data value is away from the mean.

💡Mean

The mean is the average of a data set, calculated by summing all values and dividing by the number of values. In the video, the mean is used as a reference point for calculating Z-scores, which measure how far a particular data point is from the average value.

💡Normal distribution

A normal distribution is a bell-shaped curve where most data points cluster around the mean, with fewer data points appearing as you move away from the center. Z-scores are specifically used with normally distributed data, as mentioned in the video, to assess how typical or unusual a particular data value is.

💡Critical values

Critical values are the cutoff points in hypothesis testing that determine whether a test statistic falls in the rejection region. In the video, critical values are linked to Z-scores and confidence intervals, helping identify whether a result is statistically significant.

💡Confidence intervals

A confidence interval is a range of values within which a population parameter is expected to lie, with a certain degree of confidence. Z-scores are used to calculate confidence intervals in statistics, especially in determining whether a population parameter falls within a given range.

💡Population mean (μ)

The population mean, denoted by the Greek letter μ (mu), is the average of all values in a population. It is contrasted with the sample mean (X̄) in the video, where Z-scores are sometimes calculated using the population mean instead of the sample mean to understand how individual data points compare to the overall population.

💡Sample standard deviation (s)

The sample standard deviation, denoted by 's', is a measure of how spread out sample data is from the sample mean. In the video, the sample standard deviation is essential for Z-score calculations when working with sample data, as it helps measure the relative position of individual data points within the sample.

💡Unusual values

Unusual values are data points that lie two or more standard deviations away from the mean. These values are often considered outliers or significant in the context of statistical analysis. In the video, Maria’s IQ is considered unusually high because her Z-score is greater than 2, making her score an outlier.

💡Typical values

Typical values are data points that fall within one standard deviation of the mean. These values are considered common or expected within a normal distribution. In the video, Rick’s IQ score has a Z-score of -0.87, which places him within the typical range, indicating his IQ is close to the average.
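The "about 68%" figure for typical values can be checked numerically. The sketch below uses the standard identity that, for a normal distribution, the proportion of values within k standard deviations of the mean equals erf(k / √2); the helper name `share_within` is illustrative, and the video itself quotes only the 68% figure.

```python
import math


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


print(f"within 1 SD: {share_within(1):.4f}")  # within 1 SD: 0.6827
print(f"within 2 SD: {share_within(2):.4f}")  # within 2 SD: 0.9545
```

The second line shows why a z-score of 2 or beyond is called unusual: under a normal model, fewer than 5% of values land that far from the mean.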

Highlights

Introduction to Z-scores and their widespread use in statistics for normal quantitative data, critical values, confidence intervals, and test statistics.

Z-scores require data to be normally distributed for accuracy, as they rely on the mean and standard deviation.

Mean and standard deviation are the most accurate descriptors of normal data distributions.

The formula for a Z-score is the data value minus the mean, divided by the standard deviation.

Positive Z-scores indicate a value above the mean, while negative Z-scores indicate a value below the mean.

Example of calculating a Z-score: Maria's IQ of 147, with a mean of 100 and standard deviation of 15, results in a Z-score of 3.13.

Z-scores are conventionally rounded to the hundredths place; the presenter says this is not for any strong reason but a habit carried over from the pre-computer era, when printed Z-score charts were rounded that way.

A Z-score represents the number of standard deviations a value is from the mean, making it a standardized measure.

Z-scores can be used to identify outliers: values 2 or more standard deviations from the mean are considered unusual.

Typical values in a normal distribution have Z-scores between -1 and 1, encompassing about 68% of the data.

Maria's Z-score of 3.13 indicates an unusually high IQ, as it exceeds the threshold of 2 standard deviations above the mean.

Rick's IQ of 87 results in a Z-score of -0.87, indicating a typical IQ, as it falls between -1 and 1.

A Z-score of 1.5 is neither typical nor unusual, falling in a gray area between typical and outlier values.

Z-scores provide a standardized comparison, allowing values from different data sets to be compared on the same scale.

Z-scores will be used throughout the course, both for calculation and interpretation in various statistical contexts.
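To illustrate comparing values from different data sets on one scale, here is a minimal sketch using Python's standard `statistics` module. Both data sets are invented for illustration and do not come from the video; five values are far too few to check for a bell shape, so the normality the video requires is simply assumed here.

```python
import statistics

# Hypothetical samples, made up for this example.
exam_scores = [70, 75, 80, 85, 90]            # points
marathon_minutes = [230, 240, 250, 260, 270]  # finishing times


def sample_z(value, data):
    """z-score of `value` using the sample mean (x-bar) and sample standard deviation (s)."""
    return (value - statistics.mean(data)) / statistics.stdev(data)


z_exam = sample_z(90, exam_scores)
z_marathon = sample_z(230, marathon_minutes)

print(f"exam score 90:     z = {z_exam:+.2f}")      # z = +1.26
print(f"marathon time 230: z = {z_marathon:+.2f}")  # z = -1.26
```

Although points and minutes cannot be compared directly, the two z-scores can: the exam score sits as far above its mean as the marathon time sits below its own. For a marathon, a time below the mean is the faster result, so the sign still has to be read in context.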

Transcripts

play00:03

hi everyone this is Matt Teachout with

play00:05

intro stats today we are looking at Z

play00:09

scores Z scores they're very famous

play00:12

especially for normal quantitative data

play00:15

we also use them in a lot of situations

play00:18

critical values and confidence intervals

play00:21

we also use them as test statistics

play00:24

mainly for proportions so there's a lot

play00:27

of uses of Z scores in statistics and so

play00:30

I wanted to kind of introduce what is

play00:32

the z-score and sort of how does it work

play00:33

so the first thing to remember is

play00:36

z-scores really go with normal data the

play00:38

data does have to be normal for the

play00:41

z-score to be accurate if you remember

play00:43

last time when we talked about normal

play00:45

quantitative data we said that the most

play00:49

accurate average or center is the mean

play00:52

and the most accurate spread is the

play00:55

standard deviation and those two

play00:56

statistics were only accurate if the

play00:59

data was normal or bell-shaped the the

play01:03

z-score calculation is based on the mean

play01:06

and standard deviation being accurate so

play01:08

you want to make sure that your data is

play01:11

normal before you start looking at

play01:13

z-scores all right so just a couple

play01:16

things we saw the last time that the

play01:20

mean of a dataset is often denoted as an

play01:24

X with a bar over it that usually that

play01:26

symbol just means the mean of a sample

play01:28

data set also we saw that the standard

play01:32

deviation is s or the sample standard

play01:36

deviation occasionally though you will

play01:38

see and we'll kind of get more into this

play01:40

in the next next unit but there are

play01:43

other letters that you'll see in stats

play01:46

this letter right here that looks like a

play01:48

U with a tail is the Greek letter mu and

play01:51

it's often denoted as a population mean

play01:54

so if you knew the population mean

play01:57

average this symbol here is another

play01:59

Greek letter called Sigma again we'll

play02:02

get more into these letters in the next

play02:04

unit so don't worry too much about it

play02:06

right now but this this this letter

play02:08

right here this Greek letter Sigma

play02:10

is usually denoted as a population

play02:12

standard deviation so if you're talking

play02:14

about standard deviation of the entire

play02:16

population and that would be Sigma so

play02:19

you'll see these letters sometimes in

play02:21

z-score formulas in stat books all right

play02:24

so what's a z-score so a z-score

play02:26

basically counts the number of standard

play02:29

deviations above or below the mean okay

play02:33

so it's really used as sort of a

play02:36

comparison number if you want to see how

play02:37

you did compared to everybody else a

play02:40

z-score is one way to go so you take

play02:44

basically it's calculated by taking the

play02:48

data value like you're you know if you

play02:51

if you ran a marathon right you wanted

play02:53

to see how did I do compared to

play02:54

everybody else well you could take your

play02:56

time in the marathon minus the mean

play02:59

average of the all the times in the

play03:01

marathon divided by the standard

play03:02

deviation of all the times in the

play03:04

marathon and you'd get a z-score and

play03:06

that z-score would be able to tell you

play03:09

how you did compared to everybody else

play03:11

that's kind of what we use this for

play03:14

now the data value is sometimes denoted

play03:18

as an X and I think if you remember when

play03:20

we were calculating mean and standard

play03:22

deviation I was using the letter X X

play03:26

minus the mean this so X is like your

play03:29

marathon time right the data value your

play03:31

that you're come that you're looking at

play03:33

and then the mean is X bar and s is

play03:37

standard deviation also sometimes you'll

play03:40

see the formula in stat books as X minus

play03:43

mu whenever that's the population mean

play03:45

divided by Sigma the population standard

play03:48

deviation to me though especially for

play03:50

intro students I would go with the word

play03:52

the word formula right that one you

play03:55

never get messed up so you know you

play03:56

remember that you know you're maybe not

play03:58

remembering all the letters yet but just

play04:00

remember it's the data value minus the mean

play04:02

divided by the standard deviation okay

play04:06

now if this z-score actually comes out

play04:09

positive the data value must have been

play04:12

above the mean and if the if the z-score

play04:16

comes out negative

play04:17

then the data value must be below the

play04:19

mean so kind of keep in mind

play04:21

that and when you're kind of explaining

play04:23

the z-score the positive z-score you're

play04:26

going to say above in the sentence and

play04:28

negative z-scores are going to say below

play04:30

in the sentence

play04:32

okay so just kind of keep that in mind

play04:34

so let's look at a quick couple quick

play04:36

examples so let's suppose we're going to

play04:39

look at IQ tests are normally

play04:42

distributed or normal with a mean of 100

play04:45

and a standard deviation of 15 right so

play04:49

a mean of 100 and the standard deviation

play04:52

of 15 so so let's suppose that we look

play04:57

at Maria's IQ and Maria's IQ came out to

play05:00

be 147 how is that how does that compare

play05:03

to everybody else that takes an IQ test

play05:05

well we could calculate the z-score for

play05:08

Maria so all you do is you put in the

play05:11

Maria's score 147 minus the mean so

play05:15

minus 100 and then divide by 15 right

play05:20

divided by 15 so if we did that 147

play05:24

minus 100 is is 47 and 47 divided by 15

play05:27

you get positive 3 point 1 3 3 3 3 3 3 3

play05:31

and that's the z-score if you notice you

play05:35

didn't really have to put this little

play05:36

positive sign I do that anytime a lot of

play05:39

times in certain statistics the positive

play05:43

and negative is really really important

play05:45

in terms of interpretation so a lot of

play05:47

times I will make sure to put a little

play05:49

symbol next to it just reminding myself

play05:52

that it's a positive value or it's a

play05:54

negative value that's really important

play05:56

with z-scores now if you'll notice I did

play05:59

round it I usually round it to the

play06:03

hundredths place the second number to the

play06:06

right of the decimal not for really any

play06:08

good reason in the old days before

play06:12

computers we used to have these charts

play06:14

that you would look up things and that

play06:17

were organized by z-score and you would

play06:19

look up the z-score on these charts and

play06:22

the charts were always rounded to the

play06:24

hundredths place so if you're like me and

play06:27

you've been around for a while have you

play06:28

been kind of doing stats for a while you

play06:30

may be

play06:30

looking up stuff you may have looked up

play06:32

stuff on those charts and those charts

play06:34

were always rounded to the hundredths

play06:35

place so I think a lot of us old-timers

play06:38

that have been doing this before

play06:40

computers we not that I actually was

play06:43

doing charts before computers because

play06:45

computers had been invented but I have

play06:48

used the charts in the past and then

play06:50

again those charts were rounded to

play06:52

the hundredths place so that's why I

play06:54

rounded it to the hundredths place now the more

play06:56

important part of this is what does this

play06:58

mean right what does this mean first of

play07:00

all notice the z-score was positive that

play07:02

means Maria's IQ was above the mean

play07:05

right she was above the average so if we

play07:09

look at that okay well positive means

play07:13

it's above right and but remember a

play07:16

z-score is not a percentage a z-score is

play07:19

not kilograms it's not dollars it's not

play07:21

miles a z-score is number of standard

play07:26

deviations that's why we often call it a

play07:28

standardizing score it's a way of

play07:31

comparing things when you may not

play07:32

understand like you might not understand

play07:35

the physics involved in in some data

play07:37

that maybe you got but maybe if but if

play07:42

you understand that in terms of number

play07:43

of standard deviations then you can

play07:45

still kind of get an idea

play07:47

of what's going on so Maria's IQ is

play07:50

three point one three standard

play07:52

deviations above the mean now that would

play07:55

be how I would explain it notice I use

play07:58

the word above because my z-score was

play07:59

positive and obviously Maria's IQ was

play08:02

above the mean okay

play08:05

now you can use z-scores to figure out

play08:08

outliers and unusual values if you got

play08:12

if you guys remember when we did our

play08:13

mean average and standard deviation

play08:16

video and normal data we said that

play08:18

anything that's two standard deviations

play08:21

away from the mean or above is

play08:25

considered a high outlier and anything

play08:29

that's two standard deviations below the

play08:30

mean or or less would be considered a

play08:33

low outlier so if you translate that

play08:36

into a z-score that means your z-score

play08:38

would have to be greater than or equal

play08:40

to two for it to be unusually high and

play08:44

then z-scores are less than or equal to

play08:46

negative two when they're unusually low

play08:49

now later we'll see that you this two

play08:54

and negative two we can get a little

play08:56

more accurate with those later on we'll

play08:58

get into critical values and things like

play09:00

that but right now just have in your head

play09:01

okay about two or more standard

play09:03

deviations away is unusual it's also

play09:08

considered significant so we'll kind of

play09:10

get into that later too z-scores are

play09:12

sometimes used for significance measures

play09:13

if you guys remember the typical values

play09:17

in normal data are within one

play09:20

standard deviation from the mean

play09:22

so that would be translated that into a

play09:25

z-score typical values would have a

play09:27

z-score between negative 1 and positive

play09:29

1 so let's go back to Maria Maria's

play09:34

z-score was 3 point 1 3 which is

play09:37

definitely higher than 2 right so that

play09:41

means she was unusually high Maria has

play09:43

an unusually high IQ okay because her Z

play09:47

score was above 2 so that makes sense

play09:50

like you might not understand like when

play09:52

I looked at 147 I didn't know is that a

play09:54

lot or is that not a lot now I know it's

play09:56

a lot right because the z-score tells me

play09:59

let's look at another one so Rick's IQ

play10:03

is 87 what would be Rick's z-score

play10:06

well again you start with Rick's value

play10:09

right Rick's value is 87 you minus the

play10:12

mean and then you divide by the standard

play10:14

deviation so 87 minus 100 is negative

play10:18

13 divided by 15 we get negative zero point eight

play10:21

six six six six six six

play10:23

again I rounded it to the hundredths

play10:25

place so I got negative zero point eight

play10:28

seven

play10:29

now be careful this is not a proportion

play10:31

this is not a percentage do not convert

play10:33

that to 87 percent a z-score is number

play10:37

of standard deviations it's not actually

play10:40

a percentage

play10:40

or a proportion you want to be very

play10:42

careful with that you leave it as

play10:45

negative 0.87 that means that Rick's IQ

play10:49

is point eight seven standard deviations

play10:51

below the mean below the mean

play10:55

notice again negative means below notice

play10:58

I didn't say negative 0.87 standard

play11:01

deviations below the negative tells me

play11:03

it's below the point eight seven tells

play11:06

me how many standard deviations so

play11:09

better to say that it's point eight

play11:11

seven standard deviations below the mean

play11:13

now where does Rick fall compared to other

play11:15

people well didn't we say any z-score

play11:18

between negative one and positive one

play11:20

would be considered typical and this IQ

play11:23

negative point eight seven is between

play11:25

negative one and positive one on the

play11:27

number line so Rick is actually very

play11:30

typical he has a typical IQ like a lot

play11:34

of people but I think we mentioned in

play11:36

the normal data section though that

play11:38

that's about the middle 68% so Rick's

play11:42

kind of in the middle 68% of people

play11:46

people's IQ okay now remember not we

play11:51

talked about this not everybody is

play11:52

unusual or typical there's people that

play11:55

are sort of in that middle ground so

play11:58

it's like suppose I have a z-score of

play11:59

1.5 well 1.5 is not typical it's it's

play12:04

not in the typical zone but it's also

play12:06

not unusual right because not two or

play12:08

above so a 1.5 z-score is not typical

play12:12

and it's not unusual right so don't

play12:15

think that everybody has to fall into

play12:17

typical or unusual okay all right so I

play12:21

hope this helped you with the z-scores

play12:23

we'll be talking more about z-scores

play12:24

throughout the class but this is just an

play12:26

introduction to calculating them and an

play12:29

introduction just sort of starting to

play12:30

explain them like I say we'll get more

play12:32

and more into z-scores throughout the

play12:34

class all right


Related Tags
Z-scores, statistics, normal data, standard deviation, confidence intervals, test statistics, outliers, IQ analysis, proportions, data analysis