Standard deviation (simply explained)

DATAtab
19 Sept 2021 (duration 07:48)

Summary

TL;DR: This video explains standard deviation as a measure of how much data scatters around the mean. It walks through the calculation and the two formulas, one for an entire population and one for a sample. It also distinguishes standard deviation from variance and recommends standard deviation for interpretation because it keeps the unit of the original data. As a tip, it points to the online tool DATAtab for calculating the standard deviation.

Takeaways

  • 📏 Standard deviation is a measure of how much data scatters around the mean, indicating the variability within a dataset.
  • 🧮 To calculate the mean, sum all individual values and divide by the number of individuals in the dataset.
  • 📉 Deviations from the mean are calculated by subtracting the mean from each individual data point.
  • 🔢 The standard deviation is found by taking the square root of the average of the squared deviations from the mean.
  • 🌐 There are two formulas for standard deviation: one for a population (dividing by n) and one for a sample (dividing by n-1).
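  • 🔄 Dividing by n-1 in the sample formula corrects the tendency of a sample to understate the population's spread. Strictly speaking, it makes the variance estimate unbiased, and the standard deviation derived from it a better estimate.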
  • 📊 Variance is the average squared distance from the mean. It uses the same formula as the standard deviation without the final square root, so variance is the square of the standard deviation.
  • 📊 Using standard deviation is recommended over variance for data interpretation as it retains the original data's unit of measurement.
  • 🌐 An online tool like DATAtab can be used to calculate standard deviation easily by inputting data into a table on the website.
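
The calculation steps listed above can be sketched in Python. This is a minimal illustration rather than the video's own code: the function name `population_sd` and the eight heights are hypothetical, chosen so that the arithmetic comes out evenly.

```python
import math

def population_sd(values):
    """Population standard deviation: squared deviations divided by n."""
    n = len(values)
    mean = sum(values) / n                    # step 1: the mean
    deviations = [x - mean for x in values]   # step 2: deviation of each value
    squared = [d ** 2 for d in deviations]    # step 3: square each deviation
    return math.sqrt(sum(squared) / n)        # step 4: average, then take the root

# Hypothetical heights in centimeters (not the data used in the video).
heights_cm = [152, 154, 154, 154, 155, 155, 157, 159]
print(population_sd(heights_cm))  # 2.0
```

The mean of these hypothetical heights is 155 cm, and on average they lie 2 cm away from it.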

Q & A

  • What is standard deviation?

    -Standard deviation is a measure of how much data scatters around the mean. It quantifies the average amount by which individual data points differ from the mean value.

  • How do you calculate the mean?

    -The mean is calculated by summing the heights (or any other data points) of all individuals and dividing it by the number of individuals.

  • What does the standard deviation tell us about the data?

    -The standard deviation tells us how much, on average, data points deviate from the mean, indicating the dispersion or spread of the data.

  • What is the formula for calculating standard deviation?

    -The standard deviation is the square root of the sum of the squared deviations of each data point from the mean, divided by n when the data covers the entire population or by n-1 when it is a sample.

  • Why are there two different formulas for standard deviation?

    -There are two formulas because one is used when you have the entire population (dividing by n) and the other is used when you have a sample of the population (dividing by n-1) to estimate the population standard deviation.

  • What is the difference between standard deviation and variance?

    -Variance is the squared average distance from the mean, while standard deviation is the square root of the variance. Essentially, variance is the squared standard deviation.

  • Why is standard deviation preferred over variance when describing data?

    -Standard deviation is preferred because it is in the same unit as the original data, making it easier to interpret and understand, whereas variance is in squared units which can be harder to interpret.

  • What is the relationship between the standard deviation and the original data units?

    -The standard deviation is always in the same unit as the original data, which helps in making the measure of dispersion directly comparable to the data.

  • How can one calculate standard deviation easily?

    -Standard deviation can be calculated easily using online tools like DATAtab at datatab.net, where you copy your data into a table and select the variable to calculate.

  • What is the significance of the quadratic mean in standard deviation calculation?

    -The quadratic mean matters because the plain arithmetic mean of the deviations is always zero: positive and negative deviations cancel each other out. Squaring the deviations before averaging, and then taking the root, avoids this.

  • What tip does the video provide for calculating standard deviation?

    -The video suggests using online tools like DATAtab at datatab.net, where the standard deviation is calculated after you enter the data into the provided table.

Outlines

00:00

📏 Understanding Standard Deviation

This paragraph introduces the concept of standard deviation as a measure of data dispersion around the mean. It explains how to calculate the mean, which is the sum of all individual heights divided by the number of individuals. The standard deviation quantifies how much each data point deviates from this mean, using an example where the mean height is 155 centimeters. The formula for standard deviation is presented, emphasizing the use of the square root of the sum of squared deviations divided by the number of values (n). The distinction between the arithmetic mean and the quadratic mean is highlighted, with the latter being essential for standard deviation calculations to avoid a zero result. The paragraph concludes with a note on the two formulas for standard deviation, one for the entire population (dividing by n) and one for samples (dividing by n-1), explaining the choice of formula based on whether the data represents a population or a sample.

05:02

🔍 Calculating Standard Deviation and Variance

The second paragraph delves into the difference between standard deviation and variance. It clarifies that while standard deviation measures the average distance of data points from the mean, variance measures the squared average distance. The two are directly related: variance is the square of the standard deviation, and standard deviation is the square root of the variance. However, variance is expressed in squared units that do not match the original data, which makes standard deviation, in the original unit, easier to interpret and preferable for describing data. The paragraph also provides a practical tip for calculating standard deviation using the online tool DATAtab, accessible via datatab.net, where users can input their data, select the variable, and obtain the standard deviation. The video ends with a farewell, inviting viewers to look forward to the next video.
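
The unit difference can be made concrete with Python's standard `statistics` module. This is a small sketch under stated assumptions: the heights are hypothetical values, not the data from the video, and the variable names are illustrative.

```python
import math
import statistics

# Hypothetical heights in centimeters (not the data used in the video).
heights_cm = [152.0, 154.0, 154.0, 154.0, 155.0, 155.0, 157.0, 159.0]

variance_cm2 = statistics.pvariance(heights_cm)  # population variance, in cm^2
sd_cm = statistics.pstdev(heights_cm)            # population standard deviation, in cm

print(f"variance: {variance_cm2:.2f} cm^2")      # variance: 4.00 cm^2
print(f"standard deviation: {sd_cm:.2f} cm")     # standard deviation: 2.00 cm

# The variance is the squared standard deviation.
print(math.isclose(sd_cm ** 2, variance_cm2))    # True
```

A spread of 2 cm can be read directly against heights measured in centimeters, while 4 cm² cannot.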

Keywords

💡Standard Deviation

Standard deviation is a statistical measure that quantifies the amount of variation or dispersion in a set of data values. In the video, it is described as a measure of how much data scatters around the mean. It is calculated by taking the square root of the variance, which is the average of the squared differences from the mean. The video uses the example of measuring the height of a group of people to illustrate how standard deviation indicates the average amount by which each person's height deviates from the mean height.

💡Mean

The mean, often referred to as the average, is calculated by summing all the values in a data set and then dividing by the number of values. In the context of the video, the mean is used as a reference point to determine how much each individual's data point deviates. For instance, the mean height of a group is calculated to understand how much each person's height differs from the average height of the group.

💡Data Scatter

Data scatter refers to the distribution of data points around a central value, such as the mean. The video explains that standard deviation is a measure of this scatter, indicating how spread out the data points are. A larger standard deviation indicates greater scatter, while a smaller one indicates that the data points are closer to the mean.

💡Variance

Variance is a statistical measure that represents the average of the squared differences from the mean. It is used to quantify the dispersion of a set of data points. The video clarifies that variance is the squared form of standard deviation, and it is the average distance from the mean, but squared. The variance is less intuitive than standard deviation because it is in squared units, making standard deviation the preferred measure for data interpretation.

💡Sample

A sample is a subset of a larger population that is used to represent and make inferences about the whole population. The video discusses that when calculating standard deviation for a sample rather than the entire population, a different formula is used (n-1 in the denominator instead of n). This adjustment accounts for the fact that samples may not perfectly represent the population, and it helps to provide a better estimate of the population's standard deviation.
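
Why n-1 gives the better estimate can be checked by brute force on a population small enough to list every possible sample. The sketch below is an illustration under stated assumptions: the four-value population and the helper `variance` with its `ddof` parameter are hypothetical, and the check is made on the variance, which is where the n-1 correction is exact.

```python
from itertools import product
from statistics import fmean

def variance(values, ddof):
    """Sum of squared deviations divided by (n - ddof)."""
    m = fmean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - ddof)

# Tiny hypothetical population, so every possible sample can be listed.
population = [1, 2, 3, 4]
true_variance = variance(population, ddof=0)

# All 16 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

avg_dividing_by_n = fmean([variance(s, ddof=0) for s in samples])
avg_dividing_by_n_minus_1 = fmean([variance(s, ddof=1) for s in samples])

print(true_variance)              # 1.25
print(avg_dividing_by_n)          # 0.625
print(avg_dividing_by_n_minus_1)  # 1.25
```

Averaged over all samples, dividing by n understates the population variance, while dividing by n-1 recovers it exactly. Taking the square root afterwards still leaves a small bias in the standard deviation itself, but the n-1 version remains the closer estimate.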

💡Population

In statistics, a population refers to the entire set of individuals or items of interest that are the subject of a study. The video mentions that if one had data for an entire population, such as the height of all American professional soccer players, a specific formula for standard deviation would be used (dividing by n). This is in contrast to using a sample to estimate population statistics.

💡Arithmetic Mean

The arithmetic mean is the sum of all values in a data set divided by the number of values. The video points out that if the deviations from the mean were simply averaged with the arithmetic mean, the result would always be zero, because positive and negative deviations cancel each other out. This is why the quadratic mean is used instead.

💡Quadratic Mean

The quadratic mean is the square root of the arithmetic mean of the squares of the data points. It is used in the calculation of standard deviation to ensure that all deviations (whether positive or negative) contribute to the measure of dispersion. The video explains that using the quadratic mean prevents the issue of deviations canceling each other out, which would happen if the arithmetic mean were used.
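
The cancellation is easy to verify numerically. The following sketch uses hypothetical heights (not the data from the video) and compares the arithmetic mean of the signed deviations with their quadratic mean.

```python
import math

# Hypothetical heights in centimeters (not the data used in the video).
heights_cm = [152, 154, 154, 154, 155, 155, 157, 159]
mean = sum(heights_cm) / len(heights_cm)
deviations = [x - mean for x in heights_cm]

# Arithmetic mean of the signed deviations: positives and negatives cancel.
arithmetic_mean = sum(deviations) / len(deviations)

# Quadratic mean (root mean square): square first, average, then take the root.
quadratic_mean = math.sqrt(sum(d ** 2 for d in deviations) / len(deviations))

print(arithmetic_mean)  # 0.0
print(quadratic_mean)   # 2.0
```

The quadratic mean of the deviations is exactly the population standard deviation.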

💡Sigma (σ)

In the video, sigma (σ) is the symbol for the standard deviation in the formula. The formula sums the squared deviations from the mean, divides by the number of values (n for a population, n-1 for a sample), and takes the square root; the result is σ.
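
Written out with the symbols the video names (σ for the standard deviation, n for the number of values, x_i for a single value, x̄ for the mean), the two formulas take the following form. The symbol s for the sample version is a common convention and an assumption here; the video does not name it.

```latex
% Population standard deviation: divide by n
\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}

% Sample standard deviation: divide by n - 1
s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}
```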

💡DATAtab

DATAtab is mentioned in the video as a tool for calculating standard deviation. It is an online resource where users can input their data, select the variable they wish to analyze, and obtain the standard deviation. This tool is highlighted as a practical way for viewers to apply the concepts discussed in the video to real-world data sets.

Highlights

Standard deviation measures how much data scatters around the mean.

Calculating the mean involves summing all data points and dividing by their count.

Standard deviation quantifies the average deviation from the mean.

The formula for standard deviation involves squaring the deviations and taking the square root.

The arithmetic mean of the deviations cannot be used for the standard deviation, as it would always be zero.

There are two formulas for standard deviation: one for population and one for sample data.

The population standard deviation is calculated using the formula with division by n.

The sample standard deviation uses the formula with division by n minus one to estimate the population standard deviation.

Variance is the squared average distance from the mean, as opposed to the standard deviation.

Variance is the square of the standard deviation, and its unit does not correspond to the original data.

Standard deviation is preferred over variance for data interpretation due to its ease of understanding and same unit as the original data.

A tip is provided for calculating standard deviation using the online tool DATAtab.

datatab.net is recommended for easily calculating standard deviation by inputting data into a table.

The video concludes with a reminder to use standard deviation for sample data interpretation.

The presenter invites viewers to watch more videos and says goodbye.

Transcripts

00:00

Today is about standard deviation. After this video you will know what standard deviation is, how you can calculate it, why there are two different formulas, and finally how it differs from the variance. At the end of this video I have a tip for you, so let's get started.

00:19

So what is the standard deviation? The standard deviation is a measure of how much your data scatters around the mean. So the standard deviation has something to do with the scatter of your data, for example how different the answers of your respondents are.

00:38

Here's an example. Let's say you measure the height of a small group of people. The standard deviation tells us how much your data scatters around the mean.

00:51

So we first need to calculate the mean. You can get the mean simply by summing the heights of all individuals and dividing by the number of individuals. Let's say we get a mean value of 155 centimeters.

01:09

Now we want to know how much each person deviates from the mean. So we look at the first person, who deviates 18 centimeters from the mean value; the second person deviates 8 centimeters from the mean value, and so on. Finally, person number six deviates 6 centimeters from the mean value. So, simply said, people who are very small or very tall deviate more from the mean value.

01:42

Now of course you're not interested in the deviation of each individual person from the mean value, but you want to know how much the persons deviate from the mean value on average. So how much do these persons deviate from the mean value on average? This is what the standard deviation tells us. In our example the average deviation from the mean value is 12.06 centimeters.

02:09

And now of course the next question is: how can we calculate the standard deviation? You can calculate the standard deviation with the following formula.

02:22

Sigma (σ) is the standard deviation, n is the number of persons, x_i is the height of one single person, and x-bar (x̄) is the mean value of all people. So the standard deviation is the root of the sum of squared deviations divided by the number of values.

02:45

For our example this means that we calculate the height of the first person minus the mean and square that, then the height of the second person minus the mean and square that, and so on until we arrive at the last person. Then we divide the sum of these squared values by the number of people, so 6, and take the root of it. The result is then 12.06 centimeters.

03:14

So each individual person has some deviation from the mean, but on average the people deviate 12.06 centimeters from the mean, which is now our standard deviation.

03:28

Now you might notice one thing: I always talk about the average deviation from the mean. But for the average deviation, I would actually just add up all deviations and divide by the number of participants, just like you calculate a mean value, right? You're absolutely right, but there are different mean values. In the case of the standard deviation it's not the arithmetic mean that is used, but the quadratic mean. If the arithmetic mean were used, the result would be zero every time.

04:07

So far so good, but now there's one more thing to consider. There are two slightly different formulas for the standard deviation: in the first formula there is a division by n, and in the other one there is a division by n minus one. But why is that? Why are there two different formulas?

04:28

Usually you want to know the standard deviation of the whole population. For example, you want to know the standard deviation of the height of all American professional soccer players. Now if you had the height of all American soccer players, you would take this equation with one divided by n. However, it is usually not possible to investigate the entire population, so you take a sample. Then you use this sample to estimate the standard deviation of the population. In that case you use this formula [the one with n minus 1].

05:07

Therefore, whenever you have data of the whole population and you want to calculate the standard deviation for just this data, you use 1 divided by n. If you only have one sample and you want to estimate the standard deviation, you use n minus 1.

05:39

So to keep it simple: if your survey doesn't cover the whole population, you always use the formula on the right side [the one with n minus 1]. Likewise, if you have conducted a clinical study, for example, then you also use the formula on the right side to infer the population.

let's look at the next question now

play06:01

what is the difference between the

play06:03

standard deviation and the variance

play06:06

as you now know the standard deviation

play06:09

is the average distance from the mean

play06:12

the variance now is the squared average

play06:16

distance from the mean

play06:18

so we have one and the same formula the

play06:21

only difference is that in order to

play06:24

calculate the standard deviation we take

play06:26

the root

play06:27

in order to calculate the variance we

play06:30

don't do that to put it the other way

play06:32

around the variance is the squared

play06:35

standard deviation and the standard

play06:37

deviation is the root of the variance

play06:41

however this squaring results in a

play06:44

figure which is quite difficult to

play06:46

interpret

play06:48

since the unit of the calculated

play06:50

variance does not correspond to the

play06:53

original data

play06:54

for this reason it is advisable to

play06:57

always use the standard deviation to

play06:59

describe a sample as this makes

play07:02

interpretation a lot easier for you

play07:05

the standard deviation is always in the

play07:07

same unit as the original data

play07:10

in our example this would be centimeters

07:13

And finally, as promised, I have a tip for you. If you want to calculate the standard deviation, you can easily do it online with DATAtab. Just visit datatab.net, copy your data into the table, select the variable you want to calculate, and afterwards you will get the standard deviation in a very easy way.

07:38

I hope you enjoyed the video and see you next time. Bye bye.
