The Population Bag Example
Summary
TLDR: This video script works through a hypothetical 'population bag' example to illustrate Type 1 and Type 2 errors in statistical hypothesis testing. It describes sampling from two bags, one with chips removed and one without, to show how researchers might incorrectly reject, or incorrectly fail to reject, a null hypothesis. The script then explains why, in real-world applications, managing Type 2 errors is more critical than Type 1 errors: the null hypothesis of exactly identical population means is never true in practice, unlike in the classroom example, where it can be.
Takeaways
- 📚 The video discusses the 'population bag example', which is a hypothetical scenario used to illustrate concepts in hypothesis testing.
- 🎓 The example involves two population bags, each containing chips with numbers from one to five, and the task is to determine if the bags are the same or different based on samples.
- 🔍 The video introduces a null hypothesis that the two bags have the same mean, and an alternative hypothesis that they do not.
- 🧐 The decision rule is set with an alpha level of 0.1, meaning the null hypothesis will be rejected if the observed sample mean difference would be unlikely under the null hypothesis (p < 0.1).
- 🔢 The video explains the process of taking samples from each bag, calculating the mean difference, and then determining a T value and p-value to test the hypothesis (a short code sketch of this procedure appears after this list).
- 🔄 The process is repeated 16 times to simulate multiple experiments, highlighting the variability in outcomes and the potential for errors in hypothesis testing.
- 🚫 A Type 1 error occurs when a true null hypothesis is rejected (false positive); a Type 2 error occurs when a false null hypothesis is not rejected (false negative).
- 🤔 The video emphasizes that in real-world scenarios, the null hypothesis is never exactly true, making Type 1 errors impossible and Type 2 errors more critical to manage.
- 🌐 The 'population bag example' mirrors real-world situations in which the effect size (the difference between population means) is small, which increases the likelihood of Type 2 errors.
- 📉 The video concludes by answering a specific question about why beta (Type 2 error rate) is more important than alpha (Type 1 error rate) in real-world applications, highlighting the practical implications of hypothesis testing.
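The procedure outlined above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the video: the bag contents, the sample size of 10, and the alpha of 0.1 come from the example, while the function name run_experiment, the use of numpy/scipy, and the random seed are assumptions made here for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Situation 1: both bags hold six chips of each value 1-5 (mean 3, SD ~1.414).
bag_one = np.repeat([1, 2, 3, 4, 5], 6)
bag_two = np.repeat([1, 2, 3, 4, 5], 6)

def run_experiment(bag_a, bag_b, n=10, alpha=0.1):
    # Draw n chips from each bag without replacement, as in a physical draw.
    sample_a = rng.choice(bag_a, size=n, replace=False)
    sample_b = rng.choice(bag_b, size=n, replace=False)
    # Independent-groups t-test on the two samples.
    t, p = stats.ttest_ind(sample_a, sample_b)
    # Decision rule: reject the null hypothesis if p < alpha.
    return sample_a.mean() - sample_b.mean(), p, p < alpha

mean_diff, p, rejected = run_experiment(bag_one, bag_two)
print(f"sample mean difference = {mean_diff:+.2f}, p = {p:.2f}, reject null: {rejected}")
```

Running this once gives one "experiment" like those tabulated in the video; rerunning it reproduces the variability in outcomes described above.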
Q & A
What is the 'population bag example' described in the video?
-The 'population bag example' is a hypothetical scenario used to illustrate the concept of hypothesis testing. It involves two bags, each containing chips with numbers on them. The bags can either have the same chips or one bag has some chips removed, altering the mean value. The task is to determine, through sampling and statistical analysis, whether the two bags have the same or different chips.
What is the significance of the mean and standard deviation in the population bag example?
-In the population bag example, the mean and standard deviation of the chips in the bags are crucial. The mean represents the average value of the chips, and the standard deviation measures the spread of the values. These statistical measures help in understanding the distribution of the chips and are used in hypothesis testing to determine if the samples are drawn from bags with the same or different chips.
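As a quick check of these figures (an illustrative snippet, not from the video), the mean and standard deviation of the full bag and the altered bag can be computed directly:

```python
import numpy as np

full_bag = np.repeat([1, 2, 3, 4, 5], 6)                    # six chips of each value 1-5
altered_bag = np.repeat([1, 2, 3, 4, 5], [3, 6, 6, 6, 6])   # three ones removed

print(full_bag.mean(), round(full_bag.std(), 3))   # 3.0 and ~1.414 (population SD)
print(round(altered_bag.mean(), 2))                # ~3.22, the slightly higher mean
```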
What are the two situations described in the population bag example?
-The two situations in the population bag example are: 1) Both bags have the same chips (e.g., six ones, twos, threes, fours, and fives), and 2) One bag has the same chips as the population bag, while the other has some chips removed (e.g., three ones, six twos, six threes, six fours, and six fives), resulting in a different mean.
Why is it important to calculate the mean difference between the samples in the population bag example?
-Calculating the mean difference between the samples is important because it helps in making a judgment about whether chips have been removed from the bag. If the two bags have the same chips, the means of the samples should be similar. However, if chips have been removed, the mean of the bag with removed chips will be higher, and the sample means will likely differ.
What is the null hypothesis in the hypothesis testing procedure described in the video?
-The null hypothesis in the hypothesis testing procedure is that the two bags are exactly the same, meaning the mean of the chips in bag one is the same as the mean of the chips in bag two. This assumption of no difference is what is being tested against the alternative hypothesis.
What is the decision rule used in the hypothesis testing procedure, and what is its significance?
-The decision rule used in the hypothesis testing procedure is that if the p-value is less than alpha (set at 0.1 in the example), the null hypothesis is rejected. This rule is significant because it determines whether the observed sample mean difference is statistically significant enough to conclude that the bags are different.
What are type 1 and type 2 errors in the context of hypothesis testing?
-In hypothesis testing, a type 1 error occurs when the null hypothesis is rejected when it is actually true. A type 2 error occurs when the null hypothesis is not rejected when it is actually false. In the population bag example, type 1 errors would be incorrectly concluding that the bags are different when they are the same, while type 2 errors would be incorrectly concluding that the bags are the same when they are different.
Why is beta more important than alpha in real-world situations but not in the population bag example?
-In real-world situations, the means of two populations are never exactly the same, so the null hypothesis is never strictly true. This means that the only kind of error that can be made is a type 2 error (failing to reject a false null hypothesis). Therefore, managing the probability of a type 2 error (beta) is more important. In the population bag example, however, the null hypothesis can be true, making both type 1 and type 2 errors possible.
What is the effect of sample size on the probability of making a type 2 error?
-The sample size has a significant impact on the probability of making a type 2 error. A smaller sample size reduces the power of the test, making it more likely to fail to detect a true effect (i.e., make a type 2 error). In the population bag example, the small sample size of 10 chips from each bag contributes to the high probability of type 2 errors.
How does the effect size influence the probability of making a type 2 error in the population bag example?
-The effect size, which is the difference between the population means, plays a crucial role in the probability of making a type 2 error. If the effect size is small, the test is less likely to detect a difference even if one exists, increasing the likelihood of a type 2 error. In the population bag example, the small difference in means between the bags with and without removed chips contributes to the high probability of type 2 errors.
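A rough Monte Carlo sketch (my own illustration, with an assumed number of repetitions and a hypothetical "bigger effect" bag that is not part of the video) shows how both sample size and effect size drive beta:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
full_bag = np.repeat([1, 2, 3, 4, 5], 6)                    # mean 3.0
altered_bag = np.repeat([1, 2, 3, 4, 5], [3, 6, 6, 6, 6])   # three ones removed, mean ~3.22
bigger_effect_bag = np.repeat([2, 3, 4, 5], 6)              # hypothetical: all ones removed, mean 3.5

def estimate_beta(bag_a, bag_b, n, alpha=0.1, reps=2000):
    """Estimate the type 2 error rate: how often we fail to reject a false null."""
    misses = 0
    for _ in range(reps):
        a = rng.choice(bag_a, size=n, replace=False)
        b = rng.choice(bag_b, size=n, replace=False)
        _, p = stats.ttest_ind(a, b)
        if p >= alpha:               # failing to reject a false null is a type 2 error
            misses += 1
    return misses / reps

print("small effect,  n=10:", estimate_beta(full_bag, altered_bag, 10))
print("small effect,  n=20:", estimate_beta(full_bag, altered_bag, 20))
print("bigger effect, n=10:", estimate_beta(full_bag, bigger_effect_bag, 10))
```

With the small effect and n = 10, beta comes out high, consistent with the many type 2 errors in the example; increasing either the sample size or the effect size pulls it down.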
Outlines
🎒 Introduction to the Population Bag Example
The speaker introduces a hypothetical scenario known as the population bag example, which is relevant to a short answer question (number 25) in an upcoming quiz. The scenario involves two bags, each potentially containing the same set of chips, with one bag possibly having some chips removed. The purpose is to determine whether two samples are from bags with identical chips or not. The example uses hypothesis testing to decide if the mean difference between two samples suggests that chips have been removed from one bag, affecting its mean value. The video will guide through the process of sampling, calculating means, and using those to infer about the population characteristics.
🔍 Hypothesis Testing and Decision Making
This paragraph delves into the specifics of the hypothesis testing process. The null hypothesis posits that the two bags are identical in terms of the chips they contain. An alpha level of 0.1 is set as the decision rule for rejecting the null hypothesis, based on the probability (p-value) associated with the observed sample mean difference. The scenario of conducting 16 separate experiments is introduced, where in most cases, the p-value does not fall below the alpha threshold, indicating no significant difference between the bags' means. However, two instances of type 1 errors are identified, where the null hypothesis is wrongly rejected despite the bags being identical, aligning with the expected error rate of 10%.
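A minimal sketch of this first situation (assuming numpy/scipy and a fixed seed; not the instructor's code) runs 16 experiments with two identical bags and counts how often a true null hypothesis is rejected:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
bag = np.repeat([1, 2, 3, 4, 5], 6)     # both samples come from identical bags

type1_errors = 0
for _ in range(16):
    a = rng.choice(bag, size=10, replace=False)
    b = rng.choice(bag, size=10, replace=False)
    _, p = stats.ttest_ind(a, b)
    if p < 0.1:                         # rejecting a true null is a type 1 error
        type1_errors += 1

print(f"type 1 errors in 16 experiments: {type1_errors}")  # expect roughly 10% of runs
```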
📈 Understanding Type 1 and Type 2 Errors in Different Scenarios
The speaker explores the implications of type 1 and type 2 errors in the context of the population bag example and contrasts it with real-world scenarios. In the bag example, the null hypothesis can be true, allowing for the possibility of type 1 errors. However, in real-world applications, the null hypothesis of no difference is usually an idealization and not strictly true, making type 1 errors a theoretical concern rather than a practical one. The focus then shifts to type 2 errors, which occur when the null hypothesis is not rejected despite it being false. The speaker illustrates this with a scenario where the bags have different means, and the sampling procedure fails to detect this difference, leading to multiple type 2 errors and a high error rate.
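The companion sketch for this second situation (again my own illustration under the same assumptions) draws one sample from the altered bag, so the null hypothesis is false and every non-rejection is a type 2 error:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
full_bag = np.repeat([1, 2, 3, 4, 5], 6)
altered_bag = np.repeat([1, 2, 3, 4, 5], [3, 6, 6, 6, 6])   # three ones removed

type2_errors = 0
for _ in range(16):
    a = rng.choice(full_bag, size=10, replace=False)
    b = rng.choice(altered_bag, size=10, replace=False)
    _, p = stats.ttest_ind(a, b)
    if p >= 0.1:                        # failing to reject a false null is a type 2 error
        type2_errors += 1

print(f"type 2 errors in 16 experiments: {type2_errors}")   # usually most of the 16
```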
🌐 Real-World Significance of Beta Over Alpha
The final paragraph addresses question number 25, explaining why beta (the probability of a type 2 error) is more critical in real-world applications compared to alpha. The speaker clarifies that in real-world scenarios, it is impossible for two populations to have identical means, making the null hypothesis of no difference between means always false to some extent. As a result, the potential for type 1 errors is minimal, whereas the risk of type 2 errors—failing to detect a real difference—is substantial. The video concludes by emphasizing the importance of managing type 2 errors in practical research and decision-making.
Keywords
💡Population Bag Example
💡Hypothesis Testing
💡Type I Error
💡Type II Error
💡Alpha Level
💡Beta
💡Sample Mean Difference
💡T Value
💡P-Value
💡Effect Size
💡Null Hypothesis
Highlights
Introduction to the 'population bag' example used to explain statistical concepts in hypothesis testing.
Reference to short answer question number 25, which asks why beta is more important than alpha in the real world but not in the population bag example.
Description of two hypothetical situations involving two population bags with either identical or different chips.
Explanation of the task for researchers to determine if samples are from bags with the same or different chips based on mean differences.
Hypothesis testing procedure using the logic of rejecting the null hypothesis if the p-value is less than a set alpha level.
Sampling 10 values from each bag and calculating the sample mean difference and associated T value.
Decision rule where a p-value less than alpha (0.1) leads to the rejection of the null hypothesis.
Illustration of conducting 16 separate experiments to understand the occurrence of type 1 and type 2 errors.
Example of not rejecting the null hypothesis when it is true, which is not an error in hypothesis testing.
Demonstration of type 1 errors where the null hypothesis is incorrectly rejected despite being true.
Calculation of the probability of making a type 1 error, which is expected to be around the alpha level set.
Explanation of type 2 errors where the null hypothesis is not rejected even though it is false.
Discussion on the high rate of type 2 errors due to small effect size and sample size.
Clarification on why beta (type 2 error) is more critical in real-world scenarios where the null hypothesis is never exactly true.
Comparison between the population bag example and real-world situations regarding the plausibility of the null hypothesis being true.
Final summary explaining the importance of managing type 2 errors in practical applications of hypothesis testing.
Transcripts
in this video I'm going to describe what
I call the population bag example now
this example is made reference to in the
short answer questions they're going to
be eligible for the last quiz and the
particular short answer question is
number 25 and in that question I asked
why is beta more important than alpha in
the real world but not in the population
bag example that we did in class and so
this population bag example that's what
I'm going through in the video ok now I
want you to imagine a case here in which
there are two population bags now there
are two situations here with these two
bags in one situation the two bags could
have exactly the same chips in them so
in one of the situations you've got one
bag that has six ones six twos six threes
six fours and six fives
that's your population bag that has a
mean of three and a standard deviation
of one point four one four and then the
other bag in the pair here has exactly
the same chips in it as the first bag so
so one situation is you have two
population bags both population bags
have the same chips in them and those
chips are the chips that you have in
your actual population bag now six ones
twos threes fours and fives now the
other situation is that one of the bags
is the same as your population bag now
six ones twos threes fours and fives and
the other bag is like this the other bag
has had chips removed so there aren't
six ones twos threes fours and fives
anymore
there's three ones six twos six threes
six fours and six fives so in this bag there
have been some chips removed and what
has been removed three ones three low
chips okay so you've got these two
situations situation one the two bags
are the same and situation two you have
one bag with no chips removed and one
bag with chips removed now the problem
for you as the researcher in this
situation is to guess which situation
you have
so you have to guess whether or not the
two samples that you've drawn have been
drawn from bags with the same chips in
them or whether or not the two samples
have been drawn from bags with different
chips in them okay so that's your job as
a researcher so what's going to happen
here is we're going to take a sample out
of each of two bags and so you're gonna
have two samples of chips we're gonna
calculate the mean of each of the
samples and we're going to use the mean
difference between the samples to make a
judgment about whether or not chips have
been removed from the bag now if you're
sampling from two bags that are the same
you should expect the means of the two
samples that you get to be pretty
similar that's because you're sampling
from bags that have the same chips in
them but if you're sampling from one bag
in which no chips have been removed
there's six ones twos threes fours and
fives and you're sampling from another
bag in which chips have been removed you
would expect the sample mean in the bag
where chips have been removed to be a
little different from the sample mean in
the bag where chips have not been
removed now specifically what would you
expect well if you've taken out three
low chips here the mean of the chips in
the bag is going to be a little higher
it's three point two because you've
removed three chips with a value of one
you've removed low numbers so the mean
is going to go up so you would expect
the means of the samples drawn from a
bag with a mean of three point two to be
on average a little bit higher than the
means of the samples drawn from a bag
with no chips removed from the bag with
a mean of three so what we're going to
do here is we're going to go through a
hypothesis testing procedure we're going
to use the logic of hypothesis testing to
make a decision about whether or not
these two samples that you're going to
draw have been drawn from bags are the
same or drawn from bags that are
different all right so here's the
hypothesis testing logic here you can
see that we start with a null hypothesis
and so your null hypothesis is going to
be that the two bags are exactly the
same okay so that the mean
of the chips in bag one is the same as
the mean of the chips in bag two so the
mean is three here and the mean is three
in your second bag okay so you're going
to start with the null assumption of no
difference between the bags now we've
got a decision rule here and the
decision rule is if P is less than alpha
and we're gonna set alpha at 0.1 we're
going to reject the null hypothesis so
if the probability of the sample mean
difference we observe is small if the
null hypothesis is true we're going to
conclude that the null hypothesis is not
true now the study we're going to do is
to sample 10 values from each bag so
we're gonna sample 10 values from one
bag and 10 values from the other bag
we're going to calculate the sample mean
difference so the difference between the
means of these two samples and then
we're going to calculate a T value now
this is the T value for an independent
groups t-test that we saw in a previous
video and we're going to use that T
value to calculate the probability of
observing the sample mean difference
that we observed given that the two
samples have been drawn from bags with
exactly the same means so given that the
two samples have been drawn from these
two bags here now we're then going to
make a decision and the decision that
we're going to make is that if the p
value is less than our alpha of 0.1
we're going to reject the null and we're
going to conclude that we've sampled
from bags with different means all right
now what I want you to imagine in this
situation is that we're not just doing
this sampling procedure once let's
imagine that we do it 16 times so we do
16 separate experiments here okay and
let's imagine we get a result like this
okay so imagine there are 16 experiments
16 cases in which we draw samples of
size 10 from each of the two bags and
imagine that these are cases in which no
chips have been removed from these two
bags okay so let's imagine we draw ten
chips out of this bag we draw ten chips
out of this bag we calculate the mean of
the numbers on the chips in this bag and
the mean of the numbers on
the chips in this bag and we find the
difference between the two sample means
that's x-bar 1 minus x-bar 2 imagine
that mean difference is 0.15 we then
calculate a T value and get the
Associated p-value the probability of
this sample mean difference or something
bigger
given that the two samples have been
drawn from bags with equal means which
in this case is true because we are
actually sampling from two bags with the
same mean now let's imagine that in the
first case that p-value is 0.82 so that
means that there's a 0.82 or 82
percent chance of getting a sample mean difference
of 0.15 or bigger if we've sampled from
bags with the same means now that
probability is very high so the
probability of what we've observed is
pretty high if the null is true
therefore we don't reject the null we've
observed something likely if the null is
true so why would we reject the null so
we don't reject the null now question is
is that an error well in fact it's not
an error we have sampled from bags with
the same means so the null is actually
true in this case we have not rejected
the null so we've not made an error
okay now let's imagine we do this whole
procedure again so we throw the chips
back in the bag and we sample another 10
chips from this bag and another ten
chips from this bag we calculate the
sample mean difference let's imagine
it's minus 0.2 we calculate a p-value
make our decision P is not less than our
alpha point 1 so we do not reject the
null and we have not made an error so
here I've done this procedure 16 times
and if we go down the list we see that
there are only two cases here in which
the p-value that we calculated the
probability of the sample mean
difference we observed is less than 0.1
that's this case here and this case here
now in this case the sample mean
difference was 0.83 the difference
between the sample means was relatively
speaking quite large and if you're
sampling from bags that have the same
means you wouldn't expect to get large
sample mean differences it's unlikely
to get large sample mean differences so
we've got a sample mean difference of
0.83 here what's the probability of
that 0.09 it's getting to the low side
so because we've set an alpha of 0.1 we
reject the null
hypothesis that we have sampled from
bags that have the same means but of
course we're wrong because we have
concluded that we've not sampled from
bags with the same means and we actually
have so that's an error now this kind of
error is a type 1 error we have rejected
the null hypothesis but the null
hypothesis is actually true we sampled
from bags with the same mean so that's a
type 1 error there and if you look down
we make another type 1 error here see
here we got a pretty large sample mean
difference we wouldn't expect that if
we're sampling from bags with the same
means and the chance of that is pretty
low and so because it's less than our
alpha of 0.1 we reject the null and
we've made another type 1 error so if
you look down the list here you see that
we've made two type 1 errors now the
probability of making a type 1 error is
0.1 meaning we would make a type 1
error about 10 percent of the time
now we've done 16 experiments here we've
made 2 type 1 errors that's you know
it's not 10 percent but it's around 10
percent of the time that we've made a
type 1 error so that result is what we
would sort of expect here we've made
type 1 errors about as often as we would
expect to make them in this situation ok
now let's imagine that we're in this
second circumstance let's imagine
that we are sampling from this bag one
of the samples comes from this bag and
the other sample comes from this bag so
now we are not sampling from populations
that have the same means now let's
imagine that we went through 16
experiments again so we did the sampling
procedure 16 times and got these results
okay so in the first case take 10 values
from here take 10 values from here
calculate the mean of the 10 values that
came from this population the mean of
the 10 values that came from this
population calculate the mean difference
imagine that the mean difference is
0.43 now we calculate the p-value the
probability of observing this sample
mean
difference if we'd sample from
populations with the same means so this
p-value that we've calculated here is
the probability of seeing a sample mean
difference of 0.43 had we drawn our
samples from these two bags
now that's 0.65 that's not less than
alpha of 0.1 so we do not reject the
null hypothesis now in this case that's
an error we have not concluded that the
null is false but the null is actually
false because one of the samples came
from a bag with a mean that is different
than the other population so mu bag 1 is
not equal to mu bag 2 in this case and
so the null is false we haven't rejected
the null and so we have made an error
now this kind of error is an error in
which you do not reject the null and the
null is false that's a type 2 error so
we've made a type 2 error here now if
you go down to the next mean difference
is 0.68 the p-value is 0.45 we do not
reject the null again an error now here
we got a mean difference of 0.92 so
that's a fairly large mean difference
the probability of getting a mean
difference that large or larger if we
sampled from bags with the same means is
0.07 so it's unlikely to see
a mean difference that big if we've
sampled from bags with the same means
so we conclude we have not sampled from
bags with the same means we reject the
null hypothesis now is that an error no
it's not an error we've rejected the
null and the null is false now if you go
down the list here you'll see that we
have only been correct three times out
of 16 we've been incorrect 13 times out
of 16 so the rate of error is 13 out of
16 in this particular example that is
analogous to beta so this is the
probability here that we do not reject
the null when the null is false so the
problem is that we've made a lot of type
2 errors and so the question is well why
is that it doesn't seem that the
decision procedure we're using is a very
good decision
procedure if it leads to so many errors
okay well part of the reason for this is
that the mean in the bag in which chips
have been removed is not very different
from the mean in the bag where chips had
not been removed the effect size the
difference between the population means
is not very big and when the effect size
is not very big beta is big another
reason is that the sample size is small
we only drew 10 values out of this bag
and 10 values out of this bag so we have
a combination of a small effect size and
small sample size and that's what's
causing us to have a large probability
of a type 2 error here okay now to
answer question number 25 then let's
just take a look at it again
why is beta more important than alpha in
the real world but not in the population
bag example we did in class we just need
to understand the difference between
this situation and a real-world
situation now in this situation the null
hypothesis can actually be true it can
be the case that the means of the
populations are exactly the same if you
have six ones twos threes fours and
fives here and six ones twos threes
fours and fives here then the mean is
exactly three in both cases so this can
be true the two means can be exactly the
same in this example but in the real
world the means of two populations are
never exactly the same let's imagine we
have the null hypothesis that the mean
of population 1 minus the mean of
population 2 is 0 now we mean here
exactly zero so zero point zero zero
zero all the way out this is what's
called a point hypothesis and a point
hypothesis is a specific exact
mathematical statement of the difference
between two parameters so here the
statement is that the two parameters are
exactly the same now if you imagine a
real-world case that we would be
imagining here let's say we imagine
population one is taking some kind of a
placebo and we imagine population
two is taking some kind of a drug now the
hypothesis here is that even though
everybody in this population took a
placebo and everybody in this population
took a drug their scores on some measure
like for instance a depression measure
are exactly the same they're not
different by point zero zero zero zero
one they're exactly the same now of
course this drug has an active
ingredient in it this does not have an
active ingredient in it so these two
populations are being treated
differently so they may have a mean
that's very similar to each other but
they will not be exactly the same and
that is because they have been treated
differently now this is analogous to the
idea we spoke about in class a long time
ago
which was the idea that a coin could be
perfectly equally weighted on both sides
so recall that we had a probability
distribution like this with 0.5 here and
P(X) here and the issue was whether or not
this could describe a real coin or
whether or not this is an idealization
of the coin this is a model for a coin
but couldn't actually perfectly
represent a real coin now recall that I
said that there is no real coin in the
real world that is perfectly equally
weighted on both sides so it cannot be
that a null hypothesis like this that
the probability of a head let's say
equals 0.5 it cannot be that a point
hypothesis like that is actually true in
the real world because it can't be the
case that there's such a thing as a real
coin that is perfectly equally weighted
on both sides this is an idealization of
a coin just like this is an idealization
of the difference between two
populations okay so in the real world
there is no such thing as two
populations that have identical means
but in the population bag example the
two bags can have identical means so in
an in an example like this it is
possible to make a type 1 error it's
possible to reject the null
when the null is true because the null
can be true so how many type 1 errors
did we make here well we made 2 out of
16 so this is our alpha roughly in this
example so here it's possible to make
this kind of error rejecting the null
when the null is true in the real world
though if the null hypothesis is never
strictly speaking or explicitly true
then it's not possible to make a type 1
error you cannot reject the null when
the null is true because the null is
never perfectly true so the only kind of
error that you can make in the real
world is a type 2 error and that's why
in the real world it's so important to
manage type 2 errors not type 1 errors
okay so that's the answer to question
number 25 make up your own example and
try to explain it in your own way ok
well I hope that video helped and we'll
see you in the next video