Sampling: Population vs. Sample, Random Sampling, Stratified Sampling
Summary
TLDR: The video explains sampling as a way for researchers to gather data about a population without surveying every member of it. It distinguishes a population from a sample and stresses how impractical it would be to survey every individual in a large population, such as all Americans. It introduces two broad forms of sampling, non-probability and probability sampling, and focuses on two probability methods, random sampling and stratified sampling, which rely on random selection so that every member of the population has an equal chance of being included. The video also discusses why a representative sample is needed to draw accurate inferences about the population. It closes by applying these methods to a hypothetical study of American support for marijuana legalization and suggests stratified sampling to account for the country's diversity.
Takeaways
- **Population vs Sample**: The population is the entire set of subjects you wish to study, while a sample is a part or subset of that population.
- **Representative Sample**: A good sample should be representative of the population in terms of various characteristics like age, gender, and income to allow for accurate inferences.
- **Sampling Methods**: There are two main forms of sampling: non-probability sampling (based on convenience or volunteer basis) and probability sampling (where every member has an equal chance to be included).
- **Random Sampling**: Involves random selection from the population, ensuring every member has an equal chance to be included in the sample.
- **Stratified Sampling**: Divides the population into strata (subgroups) based on certain characteristics, then randomly selects from each stratum to ensure representation of each subgroup.
- **External Validity**: A representative sample leads to high external validity, meaning the findings can be generalized to the larger population.
- **Sample Size Importance**: The size of the sample matters, depending on the size and variability of the population being studied.
- **Zip Codes for Random Sampling**: As an example, using zip codes can simplify random sampling by reducing a large population to a more manageable number of units.
- **Stratified Sampling for Variability**: Especially useful for large and varied populations, ensuring that each subgroup is represented in the sample.
- **Sampling Bias**: A non-representative sample can lead to sampling bias, resulting in low external validity and potential errors in study conclusions.
- **Generalization to Population**: The goal of sampling is to make inferences about the population based on the sample, which is more feasible and cost-effective than studying the entire population.
- **Practical Application**: The concepts of sampling can be applied to real-world research questions, such as determining the percentage of Americans supporting the legalization of marijuana.
Q & A
What is the primary challenge in asking every American about their support for the legalization of marijuana?
-The primary challenge is that it would be too time-consuming and expensive. The United States has over 330 million people, making it nearly impossible to reach every individual.
What is the difference between a population and a sample?
-A population represents the entire set of something that you wish to study, which could be people, objects, or a specific subgroup. A sample is a part or a subset of the population, representing a smaller percentage of the total.
Why is sampling used in research?
-Sampling is used because it is often impractical or impossible to study an entire population. By taking a representative sample, researchers can make inferences about the population based on the sample's characteristics.
What are the two main forms of sampling?
-The two main forms of sampling are non-probability sampling and probability sampling. Non-probability sampling is often based on convenience or volunteer participation, while probability sampling involves random selection, giving every member of the population an equal chance to be included.
How does random sampling ensure that every member of the population has an equal chance to be included in the study?
-Random sampling uses a process akin to picking names out of a hat, where individuals are selected randomly from the population. This can be done using a computer or an Excel sheet that randomizes the selection process.
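As a minimal sketch of the "names out of a hat" idea, the Python snippet below draws a simple random sample with the standard library. The seven-person population mirrors the video's whiteboard example; the function name `simple_random_sample` and the fixed seed are illustrative assumptions, not something the video specifies.

```python
import random

# Hypothetical population of seven participants (N = 7), as in the video's example.
population = [f"participant {i}" for i in range(1, 8)]


def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; each member is equally likely to be picked."""
    rng = random.Random(seed)  # the seed only makes the draw reproducible
    return rng.sample(population, n)


# Draw a sample of size n = 3 from the population of size N = 7.
sample = simple_random_sample(population, n=3, seed=42)
print(sample)
```

Because `random.sample` selects without replacement, no participant can appear twice, which matches drawing names from a hat without putting them back.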
What is the importance of a sample being representative of the population?
-A representative sample ensures that all characteristics and features of the population, such as different ages, genders, incomes, and backgrounds, are reflected in the sample. This allows for better inferences and conclusions to be drawn about the population from the sample.
What is the concept of external validity in research?
-External validity refers to the ability to generalize the findings of a study to the real world or the general population. A study with high external validity can confidently apply its results to the population at large.
What is sampling bias and how can it affect a study?
-Sampling bias occurs when the sample is not representative of the population. This can lead to low external validity, meaning the results cannot be generalized to the population, potentially leading to errors in the study.
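One simple way to spot this kind of bias is to compare each group's share in the sample with its known share in the population. The sketch below uses the video's numbers (50 percent women in the population, 10 percent in the sample); the helper names `group_shares` and `representation_gaps` and the record layout are hypothetical choices made for illustration.

```python
from collections import Counter


def group_shares(records, key):
    """Return each group's share of the records, e.g. {'women': 0.1, 'men': 0.9}."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def representation_gaps(sample, population_shares, key):
    """Sample share minus population share for every group in the population."""
    sample_shares = group_shares(sample, key)
    return {
        group: sample_shares.get(group, 0.0) - share
        for group, share in population_shares.items()
    }


# Assumed population make-up: half women, half men.
population_shares = {"women": 0.5, "men": 0.5}

# A biased sample of 100 people with only 10 women.
sample = [{"gender": "women"}] * 10 + [{"gender": "men"}] * 90

gaps = representation_gaps(sample, population_shares, "gender")
for group, gap in gaps.items():
    print(f"{group}: {gap:+.0%}")
```

A gap near zero for every group suggests the sample is representative on that characteristic; a large negative gap, as for women here, flags under-representation.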
How does the size of the population and its variation influence the sample size needed for a study?
-The size of the population and the amount of variation within it affect the sample size needed. Larger and more varied populations typically require a larger sample size to ensure that the sample is representative and that the study has high external validity.
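The video gives no formula for sample size. One common textbook approach for estimating a proportion is Cochran's formula with a finite population correction; the sketch below assumes that formula, a 95 percent confidence level (z = 1.96), a margin of error of 5 percentage points, and a hypothetical function name `required_sample_size`, none of which come from the video.

```python
import math


def required_sample_size(N, p=0.5, margin=0.05, z=1.96):
    """Sample size for estimating a proportion (assumed formula, not from the video).

    N      -- population size
    p      -- expected proportion; p * (1 - p) is largest (most variation) at 0.5
    margin -- acceptable margin of error, as a fraction
    z      -- z-score for the confidence level (1.96 for about 95 percent)
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2  # size for a very large population
    n = n0 / (1 + (n0 - 1) / N)                # finite population correction
    return math.ceil(n)


print(required_sample_size(50))                  # a classroom of 50
print(required_sample_size(200_000_000))         # roughly all American adults
print(required_sample_size(200_000_000, p=0.1))  # same population, less variation
```

Under these assumptions, both inputs from the answer above show up: lowering the variation (`p=0.1`) lowers the required size, and a larger population needs more respondents, although the effect of population size levels off once the population is large.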
How can zip codes be used in random sampling to study a large population like the United States?
-Zip codes can be used to randomly select areas for study instead of individuals. By randomizing a list of zip codes, researchers can contact people within those selected areas, making the process more manageable and representative.
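The zip-code approach can be sketched as a two-stage draw: first pick zip codes at random, then pick people within each selected zip code. Statisticians usually call this cluster or multi-stage sampling, and individuals only end up with equal chances if the differing sizes of the zip codes are accounted for. The function `two_stage_sample` and the toy `residents_by_zip` data are hypothetical stand-ins; a real study would load an actual list of zip codes and contact records.

```python
import random


def two_stage_sample(residents_by_zip, n_zips, n_per_zip, seed=None):
    """Randomly choose zip codes, then randomly choose residents within each one."""
    rng = random.Random(seed)
    # Stage 1: random selection of zip codes.
    chosen_zips = rng.sample(sorted(residents_by_zip), n_zips)
    # Stage 2: random selection of residents inside each chosen zip code.
    sample = []
    for zip_code in chosen_zips:
        residents = residents_by_zip[zip_code]
        sample.extend(rng.sample(residents, min(n_per_zip, len(residents))))
    return chosen_zips, sample


# Toy data: 20 made-up zip codes with 50 made-up residents each.
residents_by_zip = {
    f"{z:05d}": [f"resident {z}-{i}" for i in range(50)] for z in range(1, 21)
}

chosen_zips, sample = two_stage_sample(residents_by_zip, n_zips=5, n_per_zip=4, seed=1)
print(chosen_zips)
print(len(sample))
```

The same function scales to the video's numbers by passing roughly 40,000 zip codes and `n_zips=3000`.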
What is stratified sampling and when is it preferred over random sampling?
-Stratified sampling involves dividing the population into subgroups, or strata, based on specific characteristics and then randomly selecting from each stratum. It is preferred over random sampling when the population has a lot of variability to ensure that every subgroup is represented in the sample.
How can stratified sampling be applied to determine the percentage of Americans who support the legalization of marijuana?
-Stratified sampling can be applied by dividing the population into strata based on factors like gender, race, income, and educational background. Then, a random selection is made from each stratum to ensure that the sample is representative of the diverse population, allowing for more accurate inferences about the population's views on marijuana legalization.
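The two steps above (divide into strata, then select randomly within each) can be sketched in Python as follows. This is an illustration under stated assumptions: every person belongs to exactly one stratum, each stratum is sampled at the same fraction (proportional allocation), and the names `stratified_sample` and `stratum_of` and the toy income data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Split the population into strata, then draw randomly from each stratum."""
    rng = random.Random(seed)
    # Step 1: divide everybody into their stratum.
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    # Step 2: random selection inside every stratum (at least one person each).
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Toy population of 1,000 people in three income strata.
population = (
    [{"id": i, "income": "low"} for i in range(600)]
    + [{"id": i, "income": "middle"} for i in range(600, 900)]
    + [{"id": i, "income": "high"} for i in range(900, 1000)]
)

sample = stratified_sample(population, lambda p: p["income"], fraction=0.1, seed=7)
counts = Counter(p["income"] for p in sample)
print(counts)
```

To stratify on several characteristics at once, `stratum_of` could return a tuple such as `(gender, income)`, so that each combination forms its own stratum.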
Outlines
Understanding Population and Sample Basics
The first paragraph introduces the challenge of determining the percentage of Americans who support the legalization of marijuana. It explains the impracticality of surveying every American and introduces the concept of sampling as a solution. The difference between a population, which is the entire set of subjects being studied, and a sample, which is a subset of the population, is clarified. The importance of having a representative sample is emphasized to make accurate inferences about the population.
Probability Sampling: Random and Stratified
This paragraph delves into the types of probability sampling, which includes random sampling and stratified sampling. Random sampling is based on random selection, ensuring every member of the population has an equal chance to be included in the study. Stratified sampling, on the other hand, involves dividing the population into subgroups or 'strata' based on specific characteristics before conducting random selection from each stratum. The goal is to account for individual differences and make better inferences from the sample to the population.
Applying Random Sampling to Surveys
The third paragraph discusses the practical application of random sampling, suggesting the use of zip codes to randomly select respondents across the United States. This method is more manageable than surveying 200 million people directly and ensures that every part of the country has an equal chance of being represented in the study. The paragraph also touches on the importance of sample size in relation to the size of the population and the amount of variation within it.
Stratified Sampling for Varied Populations
The final paragraph focuses on stratified sampling as a method particularly useful for populations with significant variability. It illustrates how to divide the population into strata based on factors like gender, race, income, and educational background to ensure that each subgroup is represented in the sample. This approach is recommended for answering the question about marijuana legalization support, as it would account for the diverse demographics in America.
Keywords
Population
Sample
Sampling
Random Sampling
Stratified Sampling
Non-probability Sampling
Representation
External Validity
Sampling Bias
Zip Codes
Generalization
Highlights
The challenge of surveying every American on the legalization of marijuana is addressed by using sampling methods.
Population is defined as the entire set of something that you wish to study, which can vary in size and composition.
A sample is a part or subset of the population, denoted by a lowercase 'n'.
Sampling is suggested as a practical alternative to surveying the entire population due to feasibility issues.
Different types of sampling include non-probability and probability sampling, each with its own methodology and use cases.
Random sampling involves random selection, giving every member of the population an equal chance to be included in the study.
Stratified sampling divides the population into subgroups or strata before random selection to ensure representation of each group.
The importance of sample representativeness for accurate inferences about the population is emphasized.
Sampling bias occurs when the sample is not representative of the population, leading to low external validity.
The size of the sample depends on the size of the population and the amount of variation within it.
Using zip codes can be an efficient way to conduct random sampling for a large population like the United States.
Stratified sampling is particularly useful for populations with a lot of variability to ensure every subgroup is represented.
The video provides an example of how to apply stratified sampling to the question of American support for marijuana legalization.
The concept of external validity is introduced as the ability to generalize findings to the real world or the general population.
The video discusses the practicality of conducting research on a large scale, such as a country, using sampling techniques.
The role of technology in facilitating random selection and ensuring equal chances for participation in a study is highlighted.
The video restricts its example population to Americans over the age of 18, because adults can give consent and can answer the survey question.
An interactive example problem is provided in the comments section for viewers to test their understanding of sampling concepts.
Transcripts
i want you to take on the role of a
researcher and you've been put in charge
of answering the question
what percentage of americans support the
legalization of marijuana
but you quickly run into a problem how
could you possibly ask
every american this question it would be
too time consuming it'd be too expensive
just imagine trying to track down over
330 million people it's near
impossible so what do you do a colleague
suggests
to engage in sampling but what is
sampling
how does it work what are the different
types of sampling
well that's what we'll talk about today
so stick around
so let's start by breaking down two
fundamental concepts
what is the difference between a
population and a sample
now let's start with population the
population represents the entire
set of something that you wish to study
and the reason we use the word something
is because the population could be many
different things
your population for example might be all
the people who live in a specific city
it could also be a specific subgroup
your population could be all men
or all women or all newborn babies
and your population can also be specific
things or objects
maybe your population is all the cars on
the road and you want to know
you know how many of them are electric
so the population is the entirety
of something now another way to look at
a population
is the fact that it can vary in size
you can have a giant population right it
could be the entire size of a country
for example the united states right that's
gigantic
to as small as a nursing home right
maybe you want to
do a research study on aging and you
visit a nursing home that's your
population
to as small as a classroom right maybe a
classroom of
50 preschoolers so the population can
vary in many different ways
subgroups things but also vary in
size also note that the population
is represented by the letter n the
capital letter n
so if you ever see capital n equals
and a number you know they're referring
to the
population so then it becomes what is
our sample
the sample is a part of the
population or it is a subset
of a population okay so like a smaller
percentage
of the population instead of the capital
letter n
you might denote a sample with a
lowercase n
okay so we have capital for a population
and a sample would be a lowercase n
now you might be thinking why have a
sample in the first place
right if i'm handing out a survey why
not just give it to everybody in my
population
well if it's a classroom that's pretty
manageable but if i'm trying to hand out
a survey to
330 million people it's difficult to
almost
near impossible so because of that we
take a small
percentage of that population and that
becomes our sample
and the specific number of people in our
sample becomes the sample size
and if you have a really good sample in
other words if the sample is
representative
of the population in terms of age and
educational background and income
and we'll dive into representation in a
moment you can draw
inferences about the population right so
i can draw inferences meaning draw
conclusions
for instance if i have a really good
sample
i don't necessarily need to ask
everybody right i only need
a small percentage so let's come back to
our question
what percentage of americans support the
legalization of marijuana what would be
our population
now i bet a lot of you are thinking well
it's all americans
but are we sure can you ask babies that
question
can you ask toddlers that question so
when you think about it that way our
population is actually not the entire
united states
we're going to see our population as
everybody over the age of 18
because they can give consent and they
can actually answer our questions
so for our purposes remember uppercase n
we're going to say our population
is everybody over the age of 18. and
that is roughly
200 million americans i feel like doing
austin powers 200 million americans
and our lowercase n right our sample
we're going to say is about 2 000 people
okay and i'm kind of just making that up
but
you might be thinking you can have 2 000
people and
generalize your results to an entire
country well you can if it's a good
sample
and what good means we will cover in the
next few moments
okay so now that we have our population
and sample what's next i want us to
understand there are various ways
you can do sampling so let me give an
example
sampling as i've written up here comes
in two forms we have one called
non-probability sampling
and we also have
probability sampling non-probability sampling
our top one is often based on the idea
of convenience
right who's around me right who are the
people close to me to make this easier
you're close to me
all right you could be my study this is
why we call this convenience sampling
or maybe it's volunteer i want to be in
your study okay well that's easy
but what we really want to focus on is
probability sampling
because this is based on chance and the
reason this is important
is every member of a population has an
equal chance
to be in your study and that way we can
account for individual differences among
groups
age gender race and everybody is
accounted for
you can make better inferences from the
sample to the population
so for this video we're going to focus
on two types of probability sampling
they are called random sampling and
stratified sampling all right so what's
the difference let's start with the
first one
so random sampling begins with a
population and for a population let's
just imagine
there are one two three four
five six seven people in our population
well how do we get those people in our
sample what we do is a process called
and this is extremely important random
selection okay random selection is the
root
of probability sampling and essentially
it's like picking names out of a hat
right you're in my study or not in my
study you're in my study or not in my
study right that's picking out of a hat
it ensures that everybody in my
population has an equal chance
to be in my sample now typically it's
not putting names in the hat
typically you'll put names in a computer
or an excel sheet
and you just randomize them and it just
randomly picks people
right so i put these names in a computer
and it randomly picks them
and my sample becomes participant three
participant
five and participant six right totally
random
okay so there's our population and that
becomes our sample
remember that random sampling is based
on random selection
now let's focus on the sample the key to
a good sample
is that it is representative of the
population
so what does that mean it means that all
the characteristics and
features of the population right
different ages different genders
uh different incomes different
backgrounds are represented
in the sample so for example if 50 percent
of my population are women what
percentage of my sample should also be
women
50 percent that makes it representative so we
can essentially have two types of
samples
we can have a representative sample
and the reason this is important
as we talked about before is you're able
to generalize those results right make
go back
to the population essentially we are
labeling this
as high
external validity
okay external validity refers to the idea
of being able to
generalize your findings to the real
world all right to the general
population
so if you have a good representative
sample which is done through random
selection
you'll have high external validity but
what if you don't have a good sample
right what if your sample isn't
representative of the population
right let's say 50 percent of your population is
women but your sample only has 10
percent women
well that's going to be a biased sample
okay
or we can label it as sampling bias
okay a biased sample or sampling bias
and because of this instead of having
high external validity which is key to a
good study
this is going to result in low external
validity
okay in other words we are not
going to be able to generalize
our results to the population and this
is going to lead to a lot
of errors
in our study so there's random sampling
now one question i always get from my
students is does it matter how many
people are in your sample
yeah it does i mean there's really two
big things to think about
how big is your population and how much
variation is in it right
if you're studying a classroom you might
not need a big sample
but if you're studying an entire country
then yeah you need a little bigger of a
sample size
and also it's due to the amount of
variation
in that population right if you're a
psychological researcher you're studying
rats
well there's not much difference
between rats i mean a rat is a rat
they're all the same
i know i'm going to get hate mail about
people who own rats but people are very
different
right we have different ages weights
heights religions backgrounds
everything is so different so because of
those differences
you'll need a larger sample size so how
can we use random sampling
in our question what percentage of
americans support the legalization of
marijuana
well you could technically get everybody
a number right 200 million people
a number and you put them in a generator
some sort of excel sheet computer and
you press a button and it randomizes
them you could do that but that would be
pretty hard
but what's a little easier is what if we
did zip codes
right what if we identify all the zip
codes
in america and by the way i looked this
up there
are over 40 000 zip codes so instead of
200 million we got 40 000 zip codes
you put them in a computer you press
randomize and it spits out let's say
3 000 zip codes okay and from those zip
codes
you're able to gather information about
the people who live within that zip code
and you're able to send out you know
mail to them and email them and call
them or do door to door you're able to
contact them
and that way you're not doing everybody
but you're essentially
you know going to different parts right
each one of these green dots represents
a zip code
so we're not going door to door in every
single town in america
we're doing it so we can space it out
everybody gets uh
equal chance in america to be in our
study and it's easier it's less time
consuming
we'll make phone calls we'll send out
telegrams we'll do we'll do mail
all those kind of things don't forget
hawaii don't forget alaska right
so that way everybody has an
equal chance to be represented so
that is a nice way to do it you can use
zip codes all right so what's our last
example of probability sampling that is
called stratified sampling
now why would we use stratified over
random
as i just said before the greater the
population right the bigger the
population the more variation
this is a better method let's break it
down
so the reason we call this stratified
sampling is because
the word strata means layers
right you might have heard the word
stratified like the stratosphere right
one of the layers of the atmosphere okay
so we have
many layers that we kind of dissect
okay
so here's our population and imagine
that each one of these colors
represents gender race and income okay
so we'll say
you know uh gender is going to be this
color
and race is going to be this color
and income is this color okay
when you have a population that has a
lot of differences in it okay like a
country
how do you ensure that every group is
represented
because historically a lot of subgroups
are not represented
right low-income people or specific
minority groups it's a
it's really hard to get a hold of those
people you know through mail and
telephone and things like that
so we need a way to make sure that every
group every group of population
is represented and the way to do that is
stratified sampling
and here's how it works each one of
these circles
represents what we call a strata
so here's a strata and here's a strata
and here's a strata so these represent
strata a subgroup okay a subgroup of a
population
okay and the first thing we do instead
of doing random selection from here
we first divide people equally so we're
gonna put all of the
you know all females all women into
our strata okay so we first put them here
and then we will
put everybody who let's say is you know
african-american
in this strata
here we go and then we will put
everybody let's say in the middle class
right people who are middle class
into this strata
okay so we first take everybody in a
population
and we divide them into specific strata
so we'll have an arrow here
arrow here and arrow here okay
all right so now that we've divided
everybody into their specific strata
what do we do we do what we did in our
random sampling which is
then we do a random selection from each
strata okay so here's our
random selection like you know picking
names out of a hat
so we might put you know all you know
women and everybody who's african-american
everybody in middle class income
into a database and it randomizes them
so it's an equal chance to be in our
sample
so randomly you know this person happens to
be chosen and this person is chosen
and then this person is chosen and then
those three
become part of our sample there's first
person
and second person and our third
person right so this way
using stratified we ensure especially in
a population with a lot of variability
that every specific subgroup is
accounted for
all right so that way we can make better
inferences
about the population and each subgroup
as well and how might this apply to our
question
what percentage of americans support the
legalization of marijuana
well you'd probably do a stratified
sampling because of the size
of america and countries in general right
so you could take
you know we have gender race and income
you could also do
educational background you could do i
don't know age right you can do a lot of
different things to make sure that
everybody is accounted for
and that's a nice way to answer that
question alright guys thanks for
watching i really hope you learned
something
please look down in the comments section
i put an example problem
where you identify the sample and
population test your knowledge
thanks for watching i'll see you next
time don't forget to like the video
subscribe
take care