# The Population Bag Example

### Summary

TL;DR: This video script uses a hypothetical 'population bag' example to illustrate Type 1 and Type 2 errors in statistical hypothesis testing. It describes sampling from two bags, one unaltered and one with chips removed, to show how researchers can incorrectly reject, or incorrectly fail to reject, a null hypothesis. It then explains why, in real-world applications, managing Type 2 errors is more critical than managing Type 1 errors: outside the classroom example, the null hypothesis of exactly identical population means is never true, whereas in the classroom example it can be.

### Takeaways

- 📚 The video discusses the 'population bag example', which is a hypothetical scenario used to illustrate concepts in hypothesis testing.
- 🎓 The example involves two population bags, each containing chips with numbers from one to five, and the task is to determine if the bags are the same or different based on samples.
- 🔍 The video introduces a null hypothesis that the two bags have the same mean, and an alternative hypothesis that they do not.
- 🧐 The decision rule is set with an alpha level of 0.1, meaning the null hypothesis will be rejected if the observed sample mean difference is unlikely (p < 0.1).
- 🔢 The video explains the process of taking samples from each bag, calculating the mean difference, and then determining a T value and p-value to test the hypothesis.
- 🔄 The process is repeated 16 times to simulate multiple experiments, highlighting the variability in outcomes and the potential for errors in hypothesis testing.
- 🚫 Type 1 errors occur when a true null hypothesis is rejected (false positive); Type 2 errors occur when a false null hypothesis is not rejected (false negative).
- 🤔 The video argues that in real-world scenarios the null hypothesis is never exactly true, so Type 1 errors cannot strictly occur, making Type 2 errors the more critical ones to manage.
- 🌐 The 'population bag example' contrasts with real-world situations where the effect size (difference between population means) is often small, increasing the likelihood of Type 2 errors.
- 📉 The video concludes by answering a specific question about why beta (Type 2 error rate) is more important than alpha (Type 1 error rate) in real-world applications, highlighting the practical implications of hypothesis testing.
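The sampling-and-test procedure the takeaways describe can be sketched in Python. This is a minimal sketch, not the video's own code: the bag contents and alpha follow the video, while `scipy.stats.ttest_ind` stands in for the hand-computed independent-groups t-test.

```python
import random
from scipy.stats import ttest_ind

ALPHA = 0.1

# The bag from the video: six chips of each value 1..5 (mean 3.0).
full_bag = [v for v in range(1, 6) for _ in range(6)]

def one_experiment(bag1, bag2, n=10, rng=random):
    """Draw n chips from each bag (without replacement, like drawing
    physical chips), run an independent-groups t-test, and return
    (sample mean difference, p-value, reject-null?)."""
    s1 = rng.sample(bag1, n)
    s2 = rng.sample(bag2, n)
    diff = sum(s1) / n - sum(s2) / n
    _, p = ttest_ind(s1, s2)
    return diff, p, p < ALPHA

diff, p, reject = one_experiment(full_bag, full_bag)
```

Since both samples here come from identical bags, the null is true, and any rejection returned by `one_experiment` would be a type 1 error.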

### Q & A

### What is the 'population bag example' described in the video?

-The 'population bag example' is a hypothetical scenario used to illustrate the concept of hypothesis testing. It involves two bags, each containing chips with numbers on them. The bags can either have the same chips or one bag has some chips removed, altering the mean value. The task is to determine, through sampling and statistical analysis, whether the two bags have the same or different chips.

### What is the significance of the mean and standard deviation in the population bag example?

-In the population bag example, the mean and standard deviation of the chips in the bags are crucial. The mean represents the average value of the chips, and the standard deviation measures the spread of the values. These statistical measures help in understanding the distribution of the chips and are used in hypothesis testing to determine if the samples are drawn from bags with the same or different chips.

### What are the two situations described in the population bag example?

-The two situations in the population bag example are: 1) Both bags have the same chips (e.g., six ones, twos, threes, fours, and fives), and 2) One bag has the same chips as the population bag, while the other has some chips removed (e.g., three ones, six twos, six threes, six fours, and six fives), resulting in a different mean.
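The two bag compositions can be checked directly with the standard library. A quick sketch; note that the reduced bag's exact mean is 87/27 ≈ 3.22, which the video rounds to 3.2.

```python
import statistics

# Situation 1 bag: six chips of each value 1..5.
full_bag = [v for v in range(1, 6) for _ in range(6)]

# Situation 2 bag: three ones removed, leaving 3 ones and six each of 2..5.
reduced_bag = [1] * 3 + [v for v in range(2, 6) for _ in range(6)]

print(statistics.mean(full_bag))     # 3
print(statistics.pstdev(full_bag))   # ≈ 1.414, as stated in the video
print(statistics.mean(reduced_bag))  # ≈ 3.22 (the video rounds to 3.2)
```

Removing three low chips raises the mean, which is exactly the shift the sampling procedure is trying to detect.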

### Why is it important to calculate the mean difference between the samples in the population bag example?

-Calculating the mean difference between the samples is important because it helps in making a judgment about whether chips have been removed from the bag. If the two bags have the same chips, the means of the samples should be similar. However, if chips have been removed, the mean of the bag with removed chips will be higher, and the sample means will likely differ.

### What is the null hypothesis in the hypothesis testing procedure described in the video?

-The null hypothesis in the hypothesis testing procedure is that the two bags are exactly the same, meaning the mean of the chips in bag one is the same as the mean of the chips in bag two. This assumption of no difference is what is being tested against the alternative hypothesis.

### What is the decision rule used in the hypothesis testing procedure, and what is its significance?

-The decision rule used in the hypothesis testing procedure is that if the p-value is less than alpha (set at 0.1 in the example), the null hypothesis is rejected. This rule is significant because it determines whether the observed sample mean difference is statistically significant enough to conclude that the bags are different.

### What are type 1 and type 2 errors in the context of hypothesis testing?

-In hypothesis testing, a type 1 error occurs when the null hypothesis is rejected when it is actually true. A type 2 error occurs when the null hypothesis is not rejected when it is actually false. In the population bag example, type 1 errors would be incorrectly concluding that the bags are different when they are the same, while type 2 errors would be incorrectly concluding that the bags are the same when they are different.
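The long-run type 1 error rate of about alpha can be checked by simulation. A sketch under stated assumptions: the seed and repetition count are arbitrary, chips are drawn with replacement so the t-test's independence assumption holds, and `scipy.stats.ttest_ind` replaces the hand calculation.

```python
import random
from scipy.stats import ttest_ind

ALPHA = 0.1
rng = random.Random(1)
full_bag = [v for v in range(1, 6) for _ in range(6)]

# Both samples come from identical bags, so the null is true and
# every rejection is a type 1 error.
reps = 5000
false_positives = sum(
    ttest_ind(rng.choices(full_bag, k=10), rng.choices(full_bag, k=10)).pvalue < ALPHA
    for _ in range(reps)
)
rate = false_positives / reps
print(round(rate, 3))  # close to alpha = 0.1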

### Why is beta more important than alpha in real-world situations but not in the population bag example?

-In real-world situations, the means of two populations are never exactly the same, so the null hypothesis is never strictly true. This means that the only kind of error that can be made is a type 2 error (failing to reject a false null hypothesis). Therefore, managing the probability of a type 2 error (beta) is more important. In the population bag example, however, the null hypothesis can be true, making both type 1 and type 2 errors possible.

### What is the effect of sample size on the probability of making a type 2 error?

-The sample size has a significant impact on the probability of making a type 2 error. A smaller sample size reduces the power of the test, making it more likely to fail to detect a true effect (i.e., make a type 2 error). In the population bag example, the small sample size of 10 chips from each bag contributes to the high probability of type 2 errors.

### How does the effect size influence the probability of making a type 2 error in the population bag example?

-The effect size, which is the difference between the population means, plays a crucial role in the probability of making a type 2 error. If the effect size is small, the test is less likely to detect a difference even if one exists, increasing the likelihood of a type 2 error. In the population bag example, the small difference in means between the bags with and without removed chips contributes to the high probability of type 2 errors.
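Both points, that a small sample and a small effect size inflate beta, show up in a power simulation using the two bags. A sketch, not the video's procedure: chips are drawn with replacement, the seed and repetition counts are arbitrary, and only the sample size is varied (the effect size is fixed at the bags' mean difference of ≈0.22).

```python
import random
from scipy.stats import ttest_ind

ALPHA = 0.1
rng = random.Random(2)
full_bag = [v for v in range(1, 6) for _ in range(6)]               # mean 3.0
reduced_bag = [1] * 3 + [v for v in range(2, 6) for _ in range(6)]  # mean ≈ 3.22

def beta_estimate(n, reps=2000):
    """Estimate the type 2 error rate for samples of size n per bag.
    The null is false here, so every failure to reject is a type 2 error."""
    misses = sum(
        ttest_ind(rng.choices(full_bag, k=n),
                  rng.choices(reduced_bag, k=n)).pvalue >= ALPHA
        for _ in range(reps)
    )
    return misses / reps

for n in (10, 100, 400):
    print(n, beta_estimate(n))  # beta shrinks as the sample size grows
```

With n = 10, as in the video, the estimated beta is very high; pushing n into the hundreds brings it down, which is why power analyses trade off sample size against the effect size one hopes to detect.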

### Outlines

### 🎒 Introduction to the Population Bag Example

The speaker introduces a hypothetical scenario known as the population bag example, which is relevant to a short answer question (number 25) in an upcoming quiz. The scenario involves two bags, each potentially containing the same set of chips, with one bag possibly having some chips removed. The purpose is to determine whether two samples are from bags with identical chips or not. The example uses hypothesis testing to decide if the mean difference between two samples suggests that chips have been removed from one bag, affecting its mean value. The video will guide through the process of sampling, calculating means, and using those to infer about the population characteristics.

### 🔍 Hypothesis Testing and Decision Making

This paragraph delves into the specifics of the hypothesis testing process. The null hypothesis posits that the two bags are identical in terms of the chips they contain. An alpha level of 0.1 is set as the decision rule for rejecting the null hypothesis, based on the probability (p-value) associated with the observed sample mean difference. The scenario of conducting 16 separate experiments is introduced, where in most cases, the p-value does not fall below the alpha threshold, indicating no significant difference between the bags' means. However, two instances of type 1 errors are identified, where the null hypothesis is wrongly rejected despite the bags being identical, aligning with the expected error rate of 10%.

### 📈 Understanding Type 1 and Type 2 Errors in Different Scenarios

The speaker explores the implications of type 1 and type 2 errors in the context of the population bag example and contrasts it with real-world scenarios. In the bag example, the null hypothesis can be true, allowing for the possibility of type 1 errors. However, in real-world applications, the null hypothesis of no difference is usually an idealization and not strictly true, making type 1 errors a theoretical concern rather than a practical one. The focus then shifts to type 2 errors, which occur when the null hypothesis is not rejected despite it being false. The speaker illustrates this with a scenario where the bags have different means, and the sampling procedure fails to detect this difference, leading to multiple type 2 errors and a high error rate.
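The second scenario in this outline, 16 experiments in which the bags really do differ, can be replayed as a sketch. The exact count varies run to run (the video's particular run produced 13 errors out of 16); the seed here is arbitrary and chips are drawn with replacement.

```python
import random
from scipy.stats import ttest_ind

ALPHA = 0.1
rng = random.Random(3)
full_bag = [v for v in range(1, 6) for _ in range(6)]
reduced_bag = [1] * 3 + [v for v in range(2, 6) for _ in range(6)]

# 16 experiments in the "chips removed" situation: the null is false,
# so every non-rejection is a type 2 error.
type2_errors = 0
for _ in range(16):
    p = ttest_ind(rng.choices(full_bag, k=10),
                  rng.choices(reduced_bag, k=10)).pvalue
    if p >= ALPHA:
        type2_errors += 1
print(type2_errors, "type 2 errors out of 16")  # typically most of the 16
```

Because the effect size is small and n = 10, the test misses the real difference in the large majority of the 16 runs, just as in the video.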

### 🌐 Real-World Significance of Beta Over Alpha

The final paragraph addresses question number 25, explaining why beta (the probability of a type 2 error) is more critical in real-world applications compared to alpha. The speaker clarifies that in real-world scenarios, it is impossible for two populations to have identical means, making the null hypothesis of no difference between means always false to some extent. As a result, the potential for type 1 errors is minimal, whereas the risk of type 2 errors—failing to detect a real difference—is substantial. The video concludes by emphasizing the importance of managing type 2 errors in practical research and decision-making.


### Keywords

- 💡 Population Bag Example
- 💡 Hypothesis Testing
- 💡 Type I Error
- 💡 Type II Error
- 💡 Alpha Level
- 💡 Beta
- 💡 Sample Mean Difference
- 💡 T Value
- 💡 P-Value
- 💡 Effect Size
- 💡 Null Hypothesis

### Highlights

- Introduction to the 'population bag' example used to explain statistical concepts in hypothesis testing.
- Reference to short answer question number 25, which asks why beta is more important than alpha in the real world but not in the population bag example.
- Description of two hypothetical situations involving two population bags with either identical or different chips.
- Explanation of the task for researchers to determine if samples are from bags with the same or different chips based on mean differences.
- Hypothesis testing procedure using the logic of rejecting the null hypothesis if the p-value is less than a set alpha level.
- Sampling 10 values from each bag and calculating the sample mean difference and associated T value.
- Decision rule where a p-value less than alpha (0.1) leads to the rejection of the null hypothesis.
- Illustration of conducting 16 separate experiments to understand the occurrence of type 1 and type 2 errors.
- Example of not rejecting the null hypothesis when it is true, which is not an error in hypothesis testing.
- Demonstration of type 1 errors where the null hypothesis is incorrectly rejected despite being true.
- Calculation of the probability of making a type 1 error, which is expected to be around the alpha level set.
- Explanation of type 2 errors where the null hypothesis is not rejected even though it is false.
- Discussion on the high rate of type 2 errors due to small effect size and sample size.
- Clarification on why beta (type 2 error) is more critical in real-world scenarios where the null hypothesis is never exactly true.
- Comparison between the population bag example and real-world situations regarding the plausibility of the null hypothesis being true.
- Final summary explaining the importance of managing type 2 errors in practical applications of hypothesis testing.

### Transcripts

In this video I'm going to describe what I call the population bag example. This example is referenced in the short answer questions that are going to be eligible for the last quiz, and the particular short answer question is number 25. In that question I ask: why is beta more important than alpha in the real world but not in the population bag example that we did in class? That population bag example is what I'm going through in this video.

Now I want you to imagine a case in which there are two population bags, and there are two situations with these two bags. In one situation the two bags have exactly the same chips in them: one bag has six ones, six twos, six threes, six fours, and six fives. That's your population bag, and it has a mean of three and a standard deviation of 1.414. The other bag in the pair has exactly the same chips in it as the first bag. So one situation is that you have two population bags, both with the same chips in them, and those chips are the chips in your actual population bag: six ones, twos, threes, fours, and fives. The other situation is that one of the bags is the same as your population bag (six ones, twos, threes, fours, and fives), while the other bag has had chips removed, so there aren't six ones, twos, threes, fours, and fives anymore; there are three ones, six twos, six threes, six fours, and six fives. In this bag some chips have been removed, and what has been removed is three ones: three low chips. So you've got these two situations. In situation one the two bags are the same; in situation two you have one bag with no chips removed and one bag with chips removed. The problem for you as the researcher is to guess which situation you have: you have to guess whether the two samples you've drawn came from bags with the same chips in them or from bags with different chips in them. That's your job as a researcher.

So what's going to happen here is that we're going to take a sample out of each of the two bags, giving you two samples of chips. We'll calculate the mean of each sample and use the mean difference between the samples to make a judgment about whether chips have been removed from the bag. If you're sampling from two bags that are the same, you should expect the means of the two samples to be pretty similar, because you're sampling from bags with the same chips in them. But if you're sampling from one bag with no chips removed (six ones, twos, threes, fours, and fives) and from another bag in which chips have been removed, you would expect the sample mean from the bag with chips removed to be a little different from the sample mean from the bag with no chips removed. Specifically, what would you expect? Well, if you've taken out three low chips, the mean of the chips in that bag is going to be a little higher: it's 3.2, because you've removed three chips with a value of one. You've removed low numbers, so the mean goes up. So you would expect the means of samples drawn from the bag with a mean of 3.2 to be, on average, a little higher than the means of samples drawn from the bag with no chips removed, the bag with a mean of three.

What we're going to do is go through a hypothesis testing procedure: we'll use the logic of hypothesis testing to decide whether the two samples have been drawn from bags that are the same or from bags that are different. Here's the logic. We start with a null hypothesis, and your null hypothesis is that the two bags are exactly the same: that the mean of the chips in bag one is the same as the mean of the chips in bag two, three in both bags. So you start with the null assumption of no difference between the bags. We've also got a decision rule: if p is less than alpha, and we're going to set alpha at 0.1, we reject the null hypothesis. In other words, if the probability of the sample mean difference we observe is small given that the null hypothesis is true, we conclude that the null hypothesis is not true. The study we're going to do is to sample 10 values from each bag, calculate the sample mean difference (the difference between the means of the two samples), and then calculate a t value. This is the t value for an independent-groups t-test that we saw in a previous video. We use that t value to calculate the probability of observing the sample mean difference we observed, given that the two samples have been drawn from bags with exactly the same means. Then we make a decision: if the p-value is less than our alpha of 0.1, we reject the null and conclude that we've sampled from bags with different means.

Now, what I want you to imagine is that we're not just doing this sampling procedure once. Imagine we do it 16 times, 16 separate experiments, and we get a result like this: 16 cases in which we draw samples of size 10 from each of the two bags, and these are cases in which no chips have been removed from either bag. We draw ten chips out of each bag, calculate the mean of the numbers on the chips in each sample, and find the difference between the two sample means, x-bar 1 minus x-bar 2. Imagine that mean difference is 0.15. We then calculate a t value and get the associated p-value: the probability of this sample mean difference or something bigger, given that the two samples have been drawn from bags with equal means, which in this case is actually true, because we really are sampling from two bags with the same mean. Let's imagine that in the first case the p-value is 0.82. That means there's a 0.82, or 82 percent, chance of getting a sample mean difference of 0.15 or bigger if we've sampled from bags with the same means. That probability is very high: the probability of what we've observed is high if the null is true, so why would we reject the null? We don't reject it. Now, is that an error? In fact it's not. We have sampled from bags with the same means, so the null is actually true; we have not rejected it, so we've not made an error.

Now let's imagine we do the whole procedure again. We throw the chips back in the bags, sample another 10 chips from each bag, and calculate the sample mean difference; let's imagine it's -0.2. We calculate a p-value and make our decision: p is not less than our alpha of 0.1, so we do not reject the null, and we have not made an error. I've done this procedure 16 times, and going down the list there are only two cases in which the p-value we calculated, the probability of the sample mean difference we observed, is less than 0.1. In one of them the sample mean difference was 0.83, which is, relatively speaking, quite large, and if you're sampling from bags with the same means you wouldn't expect large sample mean differences; they're unlikely. So we've got a sample mean difference of 0.83, and what's its probability? 0.09, which is getting to the low side. Because we've set alpha at 0.1, we reject the null hypothesis that we have sampled from bags with the same means. But of course we're wrong: we've concluded that we've not sampled from bags with the same means, and we actually have. That's an error, and this kind of error is a type 1 error: we have rejected the null hypothesis, but the null hypothesis is actually true; we sampled from bags with the same mean. Looking further down, we make another type 1 error: we got a pretty large sample mean difference, which we wouldn't expect if we're sampling from bags with the same means; the chance of it is pretty low, and because it's less than our alpha of 0.1 we reject the null and make another type 1 error. So down the list we've made two type 1 errors. The probability of making a type 1 error is 0.1, meaning we would make a type 1 error about 10 percent of the time. We've done 16 experiments and made two type 1 errors; that's not exactly 10 percent, but it's around 10 percent. So this result is about what we would expect: we've made type 1 errors about as often as we'd expect to in this situation.

Now let's imagine we're in the second circumstance: one sample comes from the unchanged bag and the other comes from the bag with chips removed, so we are not sampling from populations with the same means. Imagine we run 16 experiments again and get these results. In the first case, take 10 values from each bag, calculate the mean of the 10 values from each population, and calculate the mean difference; imagine it's 0.43. We then calculate the p-value: the probability of observing this sample mean difference if we'd sampled from populations with the same means. That's 0.65, which is not less than our alpha of 0.1, so we do not reject the null hypothesis. In this case, that's an error. We have not concluded that the null is false, but the null actually is false, because one of the samples came from a bag with a different mean: mu for bag 1 is not equal to mu for bag 2. The null is false, we haven't rejected it, and so we've made an error. This kind of error, not rejecting the null when the null is false, is a type 2 error. Going down to the next case, the mean difference is 0.68 and the p-value is 0.45; we do not reject the null, again an error. Then we get a mean difference of 0.92, a fairly large difference; the probability of getting a mean difference that large or larger if we'd sampled from bags with the same means is 0.07, so it's unlikely. We conclude we have not sampled from bags with the same means and reject the null hypothesis. Is that an error? No: we've rejected the null, and the null is false. Going down the whole list, we have been correct only three times out of 16; we've been incorrect 13 times out of 16. That error rate, 13 out of 16, is analogous to beta in this example: the probability that we do not reject the null when the null is false. The problem is that we've made a lot of type 2 errors, and the question is why; it doesn't seem that the decision procedure we're using is a very good one if it leads to so many errors. Part of the reason is that the mean in the bag from which chips have been removed is not very different from the mean in the bag where chips have not been removed. The effect size, the difference between the population means, is not very big, and when the effect size is not very big, beta is big. Another reason is that the sample size is small: we only drew 10 values out of each bag. So we have a combination of a small effect size and a small sample size, and that's what gives us a large probability of a type 2 error.

Now, to answer question number 25, let's look at it again: why is beta more important than alpha in the real world but not in the population bag example we did in class? We just need to understand the difference between this situation and a real-world situation. In this situation the null hypothesis can actually be true: the means of the populations can be exactly the same. If you have six ones, twos, threes, fours, and fives in each bag, then the mean is exactly three in both cases; the two means can be exactly the same in this example. But in the real world, the means of two populations are never exactly the same. Imagine the null hypothesis that the mean of population 1 minus the mean of population 2 is 0, and we mean exactly zero: 0.000 all the way out. This is what's called a point hypothesis: a specific, exact mathematical statement of the difference between two parameters, here the statement that the two parameters are exactly the same. Now imagine a real-world case: say population 1 is taking some kind of placebo and population 2 is taking some kind of drug. The null hypothesis says that even though everybody in one population took a placebo and everybody in the other took a drug, their scores on some measure, for instance a depression measure, are exactly the same; not different by 0.00001, exactly the same. Of course, the drug has an active ingredient and the placebo does not, so these two populations are being treated differently. Their means may be very similar, but they will not be exactly the same, because they have been treated differently. This is analogous to an idea we spoke about in class a long time ago: the idea that a coin could be perfectly equally weighted on both sides. Recall that we had a probability distribution with 0.5 on one axis and p(x) on the other, and the issue was whether this could describe a real coin or whether it is an idealization, a model of a coin that couldn't actually represent a real coin perfectly. Recall that I said there is no real coin in the world that is perfectly equally weighted on both sides. So a null hypothesis like "the probability of a head equals 0.5", a point hypothesis, cannot actually be true in the real world, because there can't be a real coin that is perfectly equally weighted. It's an idealization of a coin, just as the point null is an idealization of the difference between two populations. So in the real world there is no such thing as two populations with identical means, but in the population bag example the two bags can have identical means. In an example like this it is possible to make a type 1 error, possible to reject the null when the null is true, because the null can be true. How many type 1 errors did we make? Two out of 16, which is roughly our alpha in this example. So here it's possible to make this kind of error. In the real world, though, if the null hypothesis is never strictly speaking true, then it's not possible to make a type 1 error: you cannot reject the null when the null is true, because the null is never perfectly true. The only kind of error you can make in the real world is a type 2 error, and that's why in the real world it's so important to manage type 2 errors, not type 1 errors. Okay, so that's the answer to question number 25; make up your own example and try to explain it in your own way. I hope the video helped, and we'll see you in the next video.
