Mean and variance of Bernoulli distribution example | Probability and Statistics | Khan Academy
Summary
TLDR: The video works through a simple example of a Bernoulli distribution, in which every member of a population is surveyed for their opinion of the president. Respondents give either a favorable or an unfavorable rating, and the mean and variance of this discrete probability distribution are calculated. The example shows how to find the expected value and variance using probability-weighted sums, and points out that the expected value (0.6) is not itself a possible outcome. It closes by previewing the general formulas for the mean, variance, and standard deviation of a Bernoulli distribution, which are left for the next video, and notes that the Bernoulli distribution is the simplest case of the binomial distribution.
Takeaways
- 📊 The speaker imagines surveying every member of a population about the president, with two possible responses: favorable or unfavorable.
- 🎯 The probability distribution is discrete with two outcomes: 40% have an unfavorable view, and 60% have a favorable view.
- 📈 The expected value (mean) of the distribution is calculated by assigning 0 to unfavorable (u) and 1 to favorable (f) views.
- 🔢 The expected value of the distribution is 0.6, the probability-weighted sum of the two values: 0.4 * 0 + 0.6 * 1.
- 🙅 No individual can have an actual value of 0.6; individuals will either have a favorable or unfavorable rating (0 or 1).
- 💡 The variance of the population is the probability-weighted sum of the squared distances from the mean.
- 🔍 Variance is calculated using the differences between each outcome (0 or 1) and the mean (0.6), resulting in a variance of 0.24.
- 📐 The standard deviation is the square root of the variance, which in this case is approximately 0.49.
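- 🧠 The speaker describes the distribution as skewed to the right, in the informal sense that most individuals have a favorable view (value 1); under the formal definition of skewness, a distribution with more mass on 1 than on 0 has negative (left) skew.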
- 📚 This specific example introduces the Bernoulli distribution, the simplest case of the binomial distribution; its general formulas are left for the next video.
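The calculations in the takeaways above can be reproduced with a minimal Python sketch. The dictionary layout and the function names `weighted_mean` and `weighted_variance` are illustrative assumptions, not something shown in the video; only the numbers (0.4, 0.6, and the 0/1 coding) come from the example.

```python
import math

# Outcome values mapped to their probabilities (0 = unfavorable, 1 = favorable).
distribution = {0: 0.4, 1: 0.6}

def weighted_mean(dist):
    """Probability-weighted sum of the outcome values."""
    return sum(value * prob for value, prob in dist.items())

def weighted_variance(dist):
    """Probability-weighted sum of squared distances from the mean."""
    mu = weighted_mean(dist)
    return sum(prob * (value - mu) ** 2 for value, prob in dist.items())

mean = weighted_mean(distribution)
variance = weighted_variance(distribution)
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))  # prints: 0.6 0.24 0.49
```

Rounding is applied only for display, since floating-point arithmetic may leave a tiny residue in the last digits.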
Q & A
What is the purpose of surveying every single member of a population in this scenario?
-Surveying every member gives the actual distribution of opinions about the president across the whole population, rather than an estimate from a sample. The speaker notes that this is not normally practical.
What are the two options available for the survey respondents?
-The respondents can either have an unfavorable rating or a favorable rating for the president.
What percentage of the population had an unfavorable rating according to the survey?
-According to the survey, 40% of the population had an unfavorable rating.
What percentage of the population had a favorable rating?
-60% of the population had a favorable rating.
How is the probability distribution represented in this scenario?
-The probability distribution is represented as a discrete distribution with two values: unfavorable (0) and favorable (1).
What is the expected favorability rating of a randomly picked member of the population?
-The expected favorability rating is the mean of the distribution, which is calculated as 0.4 * 0 + 0.6 * 1 = 0.6.
Why is the mean of 0.6 not a value that an individual can actually take on?
-The mean of 0.6 is not a value that an individual can take on because each person must choose either a favorable or unfavorable rating, which are represented as 1 or 0, respectively.
How is the variance of the distribution calculated?
-The variance is calculated as the probability-weighted sum of the squared distances from the mean. In this case, it is 0.4 * (0 - 0.6)^2 + 0.6 * (1 - 0.6)^2 = 0.24.
What is the standard deviation of this distribution?
-The standard deviation is the square root of the variance, which is approximately 0.49.
What does the distribution's skew to the right indicate?
-As the speaker uses the phrase, it indicates that the distribution is not symmetric: more of the probability sits on the favorable value (1) than on the unfavorable value (0).
What is the Bernoulli Distribution mentioned in the script?
-The Bernoulli Distribution is a discrete probability distribution that takes value 1 with success probability p and value 0 with failure probability q = 1 - p. It is the simplest case of the binomial distribution.
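As a reference sketch, the general formulas that the video defers to its follow-up can be derived from the same probability-weighted sums, with p as the success probability. The symbols X, \mu, and \sigma are notation assumed here, not taken from the video.

```latex
\begin{align*}
\mu      &= E[X] = (1-p)\cdot 0 + p \cdot 1 = p \\
\sigma^2 &= (1-p)(0-p)^2 + p(1-p)^2 = p(1-p)\bigl[p + (1-p)\bigr] = p(1-p) \\
\sigma   &= \sqrt{p(1-p)}
\end{align*}
```

With p = 0.6 these give a mean of 0.6, a variance of 0.24, and a standard deviation of approximately 0.49, matching the worked example.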
Outlines
📊 Understanding Population Favorability Ratings
In this paragraph, the speaker explains a scenario where every member of a population is surveyed about their opinion of the president, with only two response options: favorable or unfavorable. After surveying the entire population, 40% give an unfavorable rating, while 60% give a favorable rating. This forms a discrete probability distribution since only two values are possible. The speaker introduces the concept of expected value (mean) and explains how it’s calculated for this distribution, assigning 0 to unfavorable and 1 to favorable. The mean is found to be 0.6, but no individual has this rating; rather, it's an average across the population. This discrepancy between the actual ratings (0 or 1) and the mean (0.6) is explored.
🧮 Calculating Variance and Standard Deviation in Discrete Distributions
This paragraph introduces variance, which measures how far the values in the distribution spread from the mean. Using the same population distribution (40% unfavorable, 60% favorable), the speaker shows how variance is calculated as the probability-weighted sum of squared differences from the mean. The variance comes out to 0.24, and the standard deviation (the square root of the variance) is approximately 0.49. The speaker also notes that standard deviation is harder to build intuition for in a discrete distribution, because the values one standard deviation from the mean cannot actually occur, and describes the distribution as skewed to the right since more people gave a favorable rating.
Keywords
💡Probability Distribution
💡Discrete Distribution
💡Mean (Expected Value)
💡Variance
💡Standard Deviation
💡Bernoulli Distribution
💡Binomial Distribution
💡Weighted Sum
💡Success and Failure
💡Squared Distance
Highlights
Surveying an entire population to measure the favorability rating of the president is typically impractical, but hypothetically possible.
The population has two response options: favorable or unfavorable rating.
In this example, 40% of the population has an unfavorable rating, and 60% has a favorable rating.
The probability distribution in this scenario is discrete, with only two possible values.
The mean (or expected value) is calculated as the probability-weighted sum of the possible values of the distribution.
Defining the unfavorable rating as 0 and the favorable rating as 1 allows for calculating the mean of the distribution.
The mean of the distribution is calculated as 0.6, representing the expected favorability rating.
Even though the mean is 0.6, no individual can have a favorability value of 0.6; individuals can only choose 1 or 0.
The mean represents the expected proportion of favorable responses in the population, not an individual outcome.
Variance is defined as the probability-weighted sum of the squared distances from the mean.
To calculate variance, the distances between the possible values (0 and 1) and the mean (0.6) are squared and weighted by their respective probabilities.
The variance is calculated to be 0.24.
The standard deviation of the distribution is the square root of the variance, which is approximately 0.49.
The speaker describes the distribution as skewed to the right: the mean (0.6) sits closer to the favorable rating (1) than to the unfavorable rating (0).
This scenario demonstrates a basic case of the Bernoulli distribution, which is the simplest form of a binomial distribution.
Transcripts
Let's say that I'm able to go out and survey every single
member of a population, which we know is not normally
practical, but I'm able to do it.
And I ask each of them, what do you think of the president?
And I ask them, and there's only two options, they can
either have an unfavorable rating or they could have a
favorable rating.
And let's say after I survey every single member of this
population, 40% have an unfavorable rating and 60%
have a favorable rating.
So if I were to draw the probability distribution, and
it's going to be a discrete one because there's only two
values that any person can take on.
They could either have an unfavorable view or they could
have a favorable view.
And 40% have an unfavorable view, and let me color code
this a little bit.
So this is the 40% right over here, so 0.4 or maybe I'll
just write 40% right over there.
And then 60% have a favorable view.
Let me color code this.
60% have a favorable view.
And notice these two numbers add up to 100% because
everyone had to pick between these two options.
Now if I were to go and ask you to pick a random member of
that population and say what is the expected favorability
rating of that member, what would it be?
Or another way to think about it is what is the mean of this
distribution?
And for a discrete distribution like this, your
mean or your expected value is just going to be the
probability weighted sum of the different values that your
distribution can take on.
Now the way I've written it right here, you can't take a
probability weighted sum of u and f-- you can't say 40%
times u plus 60% times f, you won't get
any type of a number.
So what we're going to do is define u and f to be
some type of value.
So let's say that u is 0 and f is 1.
And now the notion of taking a probability weighted sum makes
some sense.
So that mean, or you could say the mean, I'll say the mean of
this distribution it's going to be 0.4-- that's this
probability right here times 0 plus 0.6 times 1, which is
going to be equal to-- this is just going to be
0.6 times 1 is 0.6.
So clearly, no individual can take on the value of 0.6.
No one can tell you I 60% am favorable and 40% am
unfavorable.
Everyone has to pick either favorable or unfavorable.
So you will never actually find someone who has a 0.6
favorability value.
It'll either be a 1 or a 0.
So this is an interesting case where the mean or the expected
value is not a value that the distribution can
actually take on.
It's a value some place over here that
obviously cannot happen.
But this is the mean, this is the expected value.
And the reason why that makes sense is if you surveyed 100
people, you'd multiply 100 times this number, you would
expect 60 people to say yes, or if you'd summed them all
up, 60 would say yes, and then 40 would say 0.
You sum them all up, you would get 60% saying yes, and that's
exactly what our population distribution told us.
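The "survey 100 people" intuition can be sketched in Python. The list `population` below is a hypothetical group of 100 respondents built to match the 40% / 60% split; it is an illustration, not data from the video.

```python
from statistics import fmean

# Hypothetical population of 100 respondents matching the 40% / 60% split:
# 0 = unfavorable, 1 = favorable.
population = [0] * 40 + [1] * 60

favorable_count = sum(population)    # adding up the 0/1 responses counts the 1s
population_mean = fmean(population)  # the same total divided by 100

print(favorable_count, population_mean)
```

Because every response is coded as 0 or 1, the mean of the responses is the same thing as the fraction of favorable responses.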
Now what is the variance?
What is the variance of this population right over here?
So the variance-- let me write it over here, let me pick a
new color-- the variance is just-- you could view it as
the probability weighted sum of the squared distances from
the mean, or the expected value of the squared distances
from the mean.
So what's that going to be?
Well there's two different values that
anything can take on.
You can either have a 0 or you could either have a 1.
The probability that you get a 0 is 0.4-- so there's a 0.4
probability that you get a 0.
And if you get a 0 what's the distance from 0 to the mean?
The distance from 0 to the mean is 0 minus 0.6, or I can
even say 0.6 minus 0-- same thing because we're going to
square it-- 0 minus 0.6 squared-- remember, the
variance is the weighted sum of the squared distances.
So this is the difference between 0 and the mean.
And then plus, there's a 0.6 chance that you get a 1.
And the difference between 1 and 0.6, 1 and our
mean, 0.6, is that.
And then we are also going to square this over here.
Now what is this value going to be?
This is going to be 0.4 times 0.6 squared-- this is 0.4
times point-- because 0 minus 0.6 is negative 0.6.
If you square it you get positive 0.36.
So this value right here-- I'm going to color code it.
This value right here is times 0.36.
And then this value right here-- let me do this in
another-- so then we're going to have plus 0.6 times 1 minus
0.6 squared.
Now 1 minus 0.6 is 0.4.
0.4 squared is 0.16.
So let me do this.
So this value right here is going to be 0.16.
So let me get my calculator out to actually calculate
these values.
So this is going to be 0.4 times 0.36, plus 0.6 times
0.16, which is equal to 0.24.
So the variance of this distribution is 0.24.
And the standard deviation of this
distribution, which is just the square root of this, the
standard deviation of this distribution is going to be
the square root of 0.24, and let's calculate what that is.
That is going to be-- let's take the square root of 0.24,
which is equal to 0.48-- well I'll just round it up-- 0.49.
So this is equal to 0.49.
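Written out step by step, the arithmetic spoken above is as follows. The symbols \sigma^2 and \sigma for variance and standard deviation are notation assumed here; the numbers are the ones from the example.

```latex
\begin{align*}
\sigma^2 &= 0.4\,(0 - 0.6)^2 + 0.6\,(1 - 0.6)^2 \\
         &= 0.4 \times 0.36 + 0.6 \times 0.16 \\
         &= 0.144 + 0.096 = 0.24 \\
\sigma   &= \sqrt{0.24} \approx 0.4899 \approx 0.49
\end{align*}
```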
So if you were to look at this distribution, the mean of this
distribution is 0.6.
So 0.6 is the mean.
And the standard deviation is 0.5.
So the standard deviation is-- so it's actually out here--
because if you go add one standard deviation you're
almost getting to 1.1, so this is one standard deviation
above, and then one standard deviation below gets you right
about here.
And that kind of makes sense.
It's hard to kind of have a good intuition for a discrete
distribution because you really can't take on those
values, but it makes sense that the distribution is
skewed to the right over here.
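The one-standard-deviation marks described above can be checked with a short Python sketch. The variable names `lower` and `upper` are illustrative; the mean and variance are the values computed in the example.

```python
import math

mean = 0.6
std_dev = math.sqrt(0.24)  # about 0.49

# One standard deviation on either side of the mean.
lower = mean - std_dev
upper = mean + std_dev

# Neither endpoint is a value a respondent can report (only 0 or 1),
# and the upper endpoint falls beyond the largest possible value.
print(round(lower, 2), round(upper, 2))  # prints: 0.11 1.09
```

This is one way to see why the intuition is awkward for a two-valued distribution: the marks land on numbers that no individual response can take.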
Anyway, I did this example with particular numbers
because I wanted to show you why this
distribution is useful.
In the next video I'll do these with just general
numbers where this is going to be p, where this is the
probability of success and this is 1 minus p, which is
the probability of failure.
And then we'll come up with general formulas for the mean
and variance and standard deviation of this
distribution, which is actually called the Bernoulli
Distribution.
It's the simplest case of the binomial distribution.
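As a preview of the general case, the sketch below compares the probability-weighted definition of variance with the standard closed form p * (1 - p) for a few values of p. The function names and the particular values of p other than 0.6 are assumptions for illustration; the video leaves the general formulas to its follow-up.

```python
import math

def weighted_variance(p):
    """Variance from the definition: weighted squared distances from the mean."""
    mean = (1 - p) * 0 + p * 1
    return (1 - p) * (0 - mean) ** 2 + p * (1 - mean) ** 2

def closed_form_variance(p):
    """Standard closed form for a Bernoulli variable: p * (1 - p)."""
    return p * (1 - p)

# The two computations should agree for any success probability p.
for p in (0.1, 0.5, 0.6, 0.9):
    assert math.isclose(weighted_variance(p), closed_form_variance(p))

print(round(closed_form_variance(0.6), 2))  # prints: 0.24
```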