Percentiles, Quantiles and Quartiles in Statistics | Statistics Tutorial | MarinStatsLectures

MarinStatsLectures-R Programming & Statistics
8 Oct 2019 | 07:29

Summary

TLDR: This educational video explains percentiles, quantiles, and quartiles and their role in data analysis. It presents the median as the 50th percentile and introduces the first quartile (25th percentile) and third quartile (75th percentile), which together divide the data into quarters. The video also shows how percentiles can be used to interpret an individual data point within a dataset. The speaker advises focusing on these concepts rather than on the calculation methods, which are typically handled by statistical software such as R. The video closes by mentioning other divisions such as tertiles, quintiles, and deciles, which may suit other analyses.

Takeaways

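  • 📊 Percentiles divide an ordered dataset into 100 equal parts, each holding one percent of the observations; quantiles are the general term for cut points that split the data into equal-sized groups.
  • 🔄 The terms 'percentile' and 'quantile' are often used interchangeably, although there is a slight difference: percentiles are stated on a 0 to 100 scale, while quantiles are typically stated as fractions between 0 and 1.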
  • 📈 Quartiles are specific types of percentiles or quantiles that divide the data into four equal parts, with the first quartile (Q1) representing the 25th percentile, the median (Q2) the 50th percentile, and the third quartile (Q3) the 75th percentile.
  • 🎯 The median, or 50th percentile, is a special case where 50% of the data points fall below this value and 50% above.
  • 📉 The first quartile (Q1) divides the dataset so that 25% of the data points are below it and 75% are above.
  • 📈 The third quartile (Q3) is the value with 75% of the data points below it and 25% above; it is also called the upper quartile.
  • 📊 Box plots are a graphical representation of the median, first quartile, third quartile, minimum, and maximum values of a dataset.
  • 🔢 Percentiles and quantiles can be calculated for any percentage, not just the common quartiles, to provide a detailed understanding of data distribution.
  • 🔄 Understanding the percentile a specific value falls into can help interpret how that value compares to the rest of the dataset.
  • 📊 Beyond quartiles, other divisions of data such as tertiles (thirds), quintiles (fifths), and deciles (tenths) can be used to analyze and summarize data.

Q & A

  • What is the difference between percentiles and quantiles?

    -There is a subtle difference between the two, but they can be used interchangeably for the most part. Percentiles divide an ordered dataset into 100 equal parts, each representing one percent of the observations, while quantiles are the more general term for cut points that divide the data into equal-sized groups.

  • What is the median in the context of percentiles?

    -The median is the 50th percentile, which means it has 50% of the ordered observations below it. It is the value that cuts the dataset in half.

  • What is the first quartile (Q1) and how is it calculated?

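    -The first quartile, or Q1, is the 25th percentile, which has 25% of the observations below it. It divides the dataset into quarters, with one quarter below and three quarters above this value. The exact value depends on the calculation method; in the video's example it falls somewhere between 64 and 67.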

  • What does the third quartile (Q3) represent?

    -The third quartile, or Q3, represents the 75th percentile, having 75% of the observations below it. It divides the dataset so that three-quarters of the data are below this value.

  • How are quartiles used to divide a dataset?

    -Quartiles divide the dataset into four equally sized quarters. The first quartile (Q1) is the 25th percentile, the second quartile (the median) is the 50th percentile, and the third quartile (Q3) is the 75th percentile.

  • What is a box plot and how does it relate to quartiles?

    -A box plot is a graphical visualization that displays the median, first quartile (Q1), and third quartile (Q3), as well as the minimum and maximum values of a dataset. It provides a quick summary of the data's distribution.

  • Can percentiles or quantiles be used to determine how an individual data point ranks within a dataset?

    -Yes, percentiles or quantiles can be used to determine the rank of an individual data point within a dataset by identifying what percentage of observations fall below that specific value.

  • What is the purpose of calculating percentiles or quantiles?

    -Percentiles and quantiles are useful for summarizing the distribution of a dataset and for comparing individual data points to the overall dataset, providing insights into the relative standing of those points.

  • What are some other types of divisions of a dataset besides quartiles?

    -Besides quartiles, datasets can be divided into tertiles (three equal parts), quintiles (five equal parts), or deciles (ten equal parts), depending on the level of detail required for analysis.

  • Why might one use statistical software like R for calculating percentiles or quantiles?

    -Statistical software like R is used for calculating percentiles or quantiles because it can handle the complexity and variations in calculation methods, and it can process large datasets efficiently.
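The point about variations in calculation methods can be illustrated with a small sketch. It is written in Python rather than R, and the 13 grades are made up for illustration (the video's actual dataset is not reproduced in this summary). Python's standard library offers only two interpolation methods, far fewer than the nine the video attributes to R, but even those two disagree on the quartiles.

```python
import statistics

# Hypothetical grades for 13 students (not the video's actual data).
grades = [55, 60, 64, 67, 70, 74, 77, 79, 82, 85, 88, 91, 96]

# Two conventions for locating the quartiles of a sample.
exclusive = statistics.quantiles(grades, n=4, method="exclusive")
inclusive = statistics.quantiles(grades, n=4, method="inclusive")

print(exclusive)  # [65.5, 77.0, 86.5]
print(inclusive)  # [67.0, 77.0, 85.0]
```

Both methods agree on the median (77) but place Q1 and Q3 slightly differently, which is why the video treats the exact values as a technicality.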

Outlines

00:00

📊 Understanding Percentiles and Quartiles

This paragraph introduces the concepts of percentiles and quantiles, highlighting that while there is a slight difference between them, they can generally be used interchangeably. The focus is on understanding the concept rather than the calculation, which is typically done using software like R. The video uses a small dataset of 13 student grades to illustrate these concepts. The median, which is the 50th percentile or quantile, is explained as the value that divides the dataset into two equal halves. The first quartile (25th percentile) and third quartile (75th percentile) are also discussed, with the video emphasizing the importance of these quartiles in dividing the data into quarters. The video concludes by mentioning that while quartiles are commonly used, any percentile can be reported, and it also touches on the idea of using percentiles to understand the position of a specific value within a dataset.
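As a rough illustration of the idea that the median cuts the data in half and the quartiles cut it into quarters, the sketch below counts observations on each side of those values. The 13 grades are hypothetical (chosen so the median is 77, as in the video), and the quartiles come from Python's default method, which is only one of several possible conventions.

```python
import statistics

# Hypothetical grades for 13 students (not the video's actual data).
grades = sorted([55, 60, 64, 67, 70, 74, 77, 79, 82, 85, 88, 91, 96])

# The median has as many observations below it as above it.
median = statistics.median(grades)
below = sum(1 for g in grades if g < median)
above = sum(1 for g in grades if g > median)
print(median, below, above)  # 77 6 6

# Quartiles from the library's default method; other methods may differ slightly.
q1, q2, q3 = statistics.quantiles(grades, n=4)
below_q1 = sum(1 for g in grades if g < q1)
below_q3 = sum(1 for g in grades if g < q3)
print(f"{below_q1} of {len(grades)} grades fall below Q1 = {q1}")
print(f"{below_q3} of {len(grades)} grades fall below Q3 = {q3}")
```

With only 13 observations the split cannot be exactly 25% and 75%, which matches the video's remark that the quartiles fall "roughly" at a given position.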

05:02

📈 Applying Percentiles to Evaluate Data Points

The second paragraph delves into how percentiles can be used to evaluate the position of a specific data point within a dataset. It explains that knowing the average and the spread of the data is crucial for understanding where a particular value stands. The paragraph uses the example of a grade of 82 to illustrate how its percentile ranking can vary depending on the dataset's range and average. The video also mentions that percentiles are not just for defining a specific value but can also be used to determine what percentile a given value falls into. Lastly, the paragraph briefly introduces the concepts of tertiles, quintiles, and deciles as alternative ways to divide data, suggesting that quartiles are a common choice but other divisions might be used depending on the context.
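The grade-of-82 comparison can be sketched as follows. Both class lists are invented to match the verbal description (a mean of 80 in each, one ranging from 50 to 100 and the other from 75 to 83), and `percentile_rank` is a hypothetical helper that uses the "percentage strictly below" convention, which is only one of several in use.

```python
import statistics

def percentile_rank(values, x):
    """Percentage of observations strictly below x (one of several conventions)."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

# Two hypothetical classes (made-up numbers), both with a mean grade of 80.
wide_class = [50, 65, 72, 78, 80, 82, 85, 90, 98, 100]
narrow_class = [75, 79, 79, 80, 80, 80, 81, 81, 82, 83]

print(statistics.mean(wide_class), statistics.mean(narrow_class))  # 80 80
print(percentile_rank(wide_class, 82))    # 50.0
print(percentile_rank(narrow_class, 82))  # 80.0
```

The same grade sits in the middle of the widely spread class but near the top of the tightly clustered one, even though the two means are identical.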

Keywords

💡Percentiles

Percentiles are statistical measures that divide a dataset into 100 equal parts, with each part representing one percent of the data. In the video, percentiles are discussed as a way to understand the distribution of grades among students. For instance, the 50th percentile, also known as the median, is the value that divides the dataset in half, with 50% of the grades below and 50% above it.

💡Quantiles

Quantiles are the general form of percentiles: cut points that divide an ordered dataset into any number of equal-sized groups, not necessarily 100. They include quartiles (four groups), tertiles (three groups), quintiles (five groups), and so on. The video notes that while there is a slight difference between percentiles and quantiles, the terms can often be used interchangeably, especially when discussing concepts rather than specific calculations.
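A minimal sketch of this idea, assuming a made-up list of 13 grades and Python's `statistics.quantiles` with its default method: splitting data into n equal-sized groups requires n - 1 cut points, and the median reappears as the second quartile and the fifth decile.

```python
import statistics

# Hypothetical grades for 13 students (not the video's actual data).
grades = [55, 60, 64, 67, 70, 74, 77, 79, 82, 85, 88, 91, 96]

names = {3: "tertiles", 4: "quartiles", 5: "quintiles", 10: "deciles"}

# n equal-sized groups are separated by n - 1 cut points.
cut_points = {n: statistics.quantiles(grades, n=n) for n in names}

for n, name in names.items():
    print(f"{name}: {len(cut_points[n])} cut points -> {cut_points[n]}")

# The median is both the 2nd quartile and the 5th decile.
print(cut_points[4][1], cut_points[10][4])  # 77.0 77.0
```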

💡Quartiles

Quartiles are specific quantiles that divide a dataset into four equal parts. The first quartile (Q1) represents the 25th percentile, the second quartile (Q2) is the median or 50th percentile, and the third quartile (Q3) represents the 75th percentile. The video uses quartiles to explain how data can be divided into quarters, providing a clear picture of the distribution of grades among students.

💡Median

The median is the middle value in a dataset when the numbers are arranged in ascending order. It is also referred to as the 50th percentile. The video script uses the median to illustrate how to find the value that divides the dataset into two equal halves. For example, in the dataset of student grades, the value of 77 is identified as the median because it has an equal number of grades below and above it.

💡First Quartile (Q1)

The first quartile, or Q1, is the value below which 25% of the data falls. It represents the 25th percentile. In the context of the video, Q1 is used to show how the data can be divided into quarters, with one quarter of the grades falling below this value. The video does not pin down an exact value for Q1, noting only that it falls somewhere between 64 and 67 depending on the calculation method.

💡Third Quartile (Q3)

The third quartile, or Q3, is the value below which 75% of the data falls, representing the 75th percentile. The video script uses Q3 to demonstrate how to find the value that has three-quarters of the data below it. This is important for understanding the upper range of the dataset, as it shows where the top 25% of grades lie.

💡Box Plot

A box plot is a graphical representation of the median, first quartile, and third quartile in a dataset. It provides a visual summary of the central tendency and dispersion of the data. The video script mentions box plots as a way to visualize the quartiles, with the 'box' representing the interquartile range (the space between Q1 and Q3) and 'whiskers' extending to the minimum and maximum values.
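The five values a basic box plot draws can be computed directly. This is a sketch under stated assumptions: the grades are hypothetical, `five_number_summary` is an illustrative helper rather than a library function, and the "inclusive" quartile method is an arbitrary choice among the available conventions.

```python
import statistics

def five_number_summary(values):
    """Return the minimum, Q1, median, Q3, and maximum of the values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}

# Hypothetical grades for 13 students (not the video's actual data).
grades = [55, 60, 64, 67, 70, 74, 77, 79, 82, 85, 88, 91, 96]

summary = five_number_summary(grades)
iqr = summary["q3"] - summary["q1"]  # width of the box

print(summary)  # {'min': 55, 'q1': 67.0, 'median': 77.0, 'q3': 85.0, 'max': 96}
print(iqr)      # 18.0
```

The box spans Q1 to Q3, the line inside it marks the median, and the whiskers reach out to the minimum and maximum.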

💡Minimum and Maximum

The minimum and maximum values of a dataset represent the smallest and largest observations, respectively. These are often referred to as the 0th and 100th percentiles. The video script includes these values in the discussion of percentiles and quartiles, emphasizing their importance in understanding the full range of a dataset.

💡Statistical Software

Statistical software, such as R, is used for calculating percentiles, quartiles, and other statistical measures. The video script mentions that these calculations are typically done using software, which can handle different methods of calculation, rather than by hand. This highlights the practical application of these concepts in data analysis.

💡Interpreting Grades

The video script uses the context of student grades to illustrate how percentiles can be used to interpret individual scores. For example, understanding that a grade of 82 falls into the 80th percentile provides insight into how that grade compares to the rest of the class. This demonstrates the practical use of percentiles in educational settings to gauge performance.

Highlights

Percentiles and quantiles differ slightly but are mostly used interchangeably.

Quartiles are specific values of percentiles or quantiles.

Focus is on the concept rather than the calculations, as software is typically used for calculation.

The R statistical software has nine different methods for calculating a percentile or quantile.

The median, or 50th percentile, is introduced as a value that divides data into 50% below and 50% above.

The first quartile (Q1), or 25th percentile, divides data into quarters, with 25% below and 75% above.

The third quartile (Q3), or 75th percentile, is another key quantile with 75% of observations below it.

Box plots visualize median, first quartile, third quartile, minimum, and maximum values.

Quartiles divide data into four equally sized quarters and are commonly reported.

Any percentile can be reported, not just the quartiles; the video gives the 40th and 90th percentiles as examples.

Understanding percentiles helps interpret individual data points in the context of the dataset.

The video discusses how to determine the percentile rank of a specific value within a dataset.

Other divisions of data include tertiles, quintiles, and deciles, each breaking data into different group sizes.

The importance of knowing what quantiles or percentiles are and their practical applications is emphasized.

Software is typically used for calculating percentiles and quartiles, rather than manual calculation.

Transcripts

play00:00

in this video we're going to talk a bit

play00:02

about percentiles and quantiles as well

play00:05

as quartiles so percentiles and

play00:07

quantiles while there is a slight subtle

play00:09

difference between the two we can use

play00:11

them interchangeably for the most part

play00:13

and then we'll talk about quartiles

play00:15

which are specific values of a

play00:17

percentile or quantile a quick reminder

play00:20

to subscribe and click on the bell to

play00:22

receive notifications when we upload new

play00:24

videos we're going to focus on the

play00:27

concept and not the calculations and

play00:29

that's for a few reasons the first being

play00:31

that typically we're going to calculate

play00:33

these using a piece of software and

play00:35

we're not going to do it by hand and the

play00:37

second reason is the software that I use

play00:39

I use the statistical software R it

play00:42

has nine different ways of calculating a

play00:45

percentile or quantile so we don't want

play00:47

to get stuck on you know the

play00:48

technicalities of way one versus way two

play00:51

versus way three we want to focus on the

play00:53

concept of what are they and what are

play00:54

they useful for so in order to do this

play00:57

I've got this example here looking at

play00:59

the grades for 13 students I've kept the

play01:02

dataset small and simple so that we can

play01:05

focus on the concepts so here we got the

play01:07

13 grades as well as I place them on a

play01:10

number line here for visualization so

play01:12

let's start with a specific or a

play01:14

commonly looked at percentile or

play01:16

quantile the fiftieth percentile

play01:20

so this gets a special name and this

play01:23

gets called the median is just the 50th

play01:29

percentile or quantile again this has

play01:38

50% of the ordered observations below it

play01:43

it's looking at which value cuts the

play01:46

data in half if we look at it here the

play01:48

value of 77 has 1 2 3 4 5 6 below and 6

play01:53

above ok so this is the value that cuts

play01:56

the data set in half 50% below 50% above

play02:00

ok so that's looking at this right here

play02:03

half below half above now let's

play02:06

talk about another commonly looked at

play02:09

percentile or quantile the 25th

play02:11

percentile

play02:13

and this again gets its own special name

play02:16

it gets called the first quartile or

play02:21

abbreviated q1 okay what this is well

play02:25

it's the 25th percentile or the

play02:28

twenty-fifth quantile and what that

play02:31

means is it has 25 percent or one

play02:34

quarter of observations below it so the

play02:39

median is the value that cuts the data

play02:41

in half half below half above the first

play02:43

quartile cuts it into quarters one quarter below and

play02:47

three quarters above okay well we said

play02:50

there's slightly different ways of

play02:51

calculating exactly what this value is

play02:53

we can see it falls roughly in here

play02:56

right this would cut one quarter below three quarters above

play02:59

and we'll not get stuck on the details of

play03:01

is it 67 or 64 what value exactly in

play03:06

between but it's falling roughly around

play03:08

here let's label this this is q1 this is

play03:12

the median forgot to label that earlier

play03:15

again another important percentile

play03:17

or quantile is the 75th percentile

play03:22

again this gets its own special name it

play03:26

gets called the third quartile the third

play03:31

quartile so again abbreviated q3 and this

play03:36

here is the 75th percentile or the 75th

play03:42

quantile and this has 75% or

play03:46

three-quarters of observations below it

play03:51

and again not getting stuck on the exact

play03:54

number looks like it's roughly around

play03:56

here cutting it to have three

play03:57

quarters below one quarter above so it's kind of

play04:01

in the range about there so this is the

play04:03

third quartile

play04:04

or the 75th percentile or 75th quantile

play04:08

some other important I guess points to

play04:11

mention are the minimum value as well as

play04:14

the maximum or the zeroth and the

play04:16

hundredth percentile now something

play04:18

encountered in a separate video but

play04:20

worth mentioning here is that the box

play04:23

plot is actually a visualization a

play04:26

graphical visualization

play04:27

of the median first quartile third

play04:29

quartile minimum and max so it draws a

play04:32

box on these and a line extending to

play04:35

those now quartiles are commonly used

play04:38

percentiles or quantiles as they divide

play04:40

the data into four equally sized

play04:42

quarters and they're a common

play04:44

description you see but really you can

play04:46

report any value of percentile or

play04:48

quantile so just as an example the 90th

play04:52

percentile this gives us the value that

play04:57

90% of observations are below again the

play05:02

40th percentile which value are 40

play05:04

percent of observations less than okay

play05:07

so the first quartile and third quartile

play05:09

or the 25th and 75th percentile those

play05:13

ones are often reported as they're kind

play05:14

of nice percentiles to look at but

play05:17

really we can report any percentile we

play05:19

want it's also important to note that

play05:21

right now our discussion has been on

play05:23

defining a percentile say the 75th

play05:26

percentile and finding out which value

play05:28

that corresponds to we can also look at

play05:30

it the other direction we might take an

play05:32

observed value say something like this

play05:35

here right the 82 and try and decide

play05:38

what percentile is that value at in

play05:42

other words if I told you that someone

play05:44

scored a grade of 82 it's really hard to

play05:46

know is that a high grade or low grade

play05:48

you need to know what was the average

play05:49

you also need to know how spread out are

play05:51

things right and if I tell you someone

play05:53

scored a grade of 82 and grades range

play05:56

somewhere between 50 percent up to a

play05:58

hundred percent with an average say of

play06:00

80 the grade of 82 is fairly average

play06:03

right slightly above the mean of 80 if I

play06:07

were to tell you there's a different

play06:07

class right they also had a mean of 80

play06:10

but the lowest grade say was 75 the

play06:13

highest grade was 83 that's actually a

play06:16

really high grade right there up at the

play06:18

top end of the range so suppose I

play06:21

tell you that grade of 82 percent fell

play06:24

on the 80th percentile right now

play06:27

you know that that grade is higher than

play06:30

80 percent of the class okay so again

play06:32

you can take observations and find out

play06:34

what percentile they fall into so as we

play06:37

noted through this video in the real world

play06:39

you're probably never going to calculate

play06:41

any of these by hand and you're gonna

play06:42

use a piece of software to do that but

play06:44

it's important to know what a quantile

play06:46

or percentile is and what they're useful

play06:50

for one final thing to close on apart

play06:53

from quartiles sometimes you'll hear

play06:55

reported tertiles these break the

play06:58

data into the lowest third middle third

play06:59

and upper third or quintiles

play07:02

break it into five groups or sometimes

play07:04

deciles

play07:05

break into ten equally sized groups so

play07:07

breaking into quarters is a common one to

play07:09

look at but you might hear of other ones

play07:12

if they fit



Related Tags
Percentiles, Quartiles, Data Analysis, Statistical Concepts, Median, First Quartile, Third Quartile, Box Plot, Educational Content, Statistical Software