Hypothesis Testing - One Sample Proportion

Dan Kernler
20 Jan 2022 (11:31)

Summary

TLDR: In this educational video, Professor Dan Kernler from Elgin Community College introduces hypothesis testing for proportions. He discusses entry-level math classes and explores whether high school success is predictive of college performance. Using a sample of 63 students, he demonstrates the hypothesis testing process, including setting up null and alternative hypotheses, calculating the test statistic, and interpreting the p-value and z-score. The video also covers practical significance versus statistical significance, emphasizing the importance of context in interpreting results.

Takeaways

  • 🎓 Professor Dan Kernler introduces a video on hypothesis testing about a proportion in a statistics series.
  • 📚 Elgin Community College offers various entry-level math classes including statistics, gen ed math, math for elementary educators, and college algebra.
  • 📈 The college has different pathways to enter math classes, such as Math 095, Math 98, Math 99, SAT or ACT scores, and recently, high school GPA.
  • 🤔 The video poses a question about whether students with high school success are more likely to succeed in college, focusing on a specific group admitted based on high school GPA.
  • 📊 A sample proportion of 79% success rate is compared to an overall college success rate of 71%, prompting a hypothesis test.
  • 📘 The criteria for performing a hypothesis test are explained, including the requirement that n*p*(1-p) should be at least 10 and the sample should be less than 5% of the population.
  • 📉 The mean and standard deviation of the sample proportion are calculated, setting the stage for hypothesis testing.
  • 🔢 A z-score of 1.46 is computed, indicating how many standard deviations the sample proportion is from the population proportion.
  • 📝 The six steps of hypothesis testing are outlined, including defining hypotheses, determining alpha, computing the test statistic, finding the p-value and critical value, making a decision, and drawing a conclusion.
  • 🚫 The conclusion for the first example is that there is not enough evidence at the 0.05 level to support the claim that the proportion is more than 0.71, due to a small sample size.
  • 📊 StatCrunch software is demonstrated for conducting hypothesis tests, including how to input data and interpret results.
  • 🗳️ A second example involving voter registration rates among children of immigrants is presented, showing a statistically significant difference compared to the general population.
  • 🤔 The importance of distinguishing between statistical significance and practical significance is highlighted, emphasizing the need to consider the meaningfulness of results.
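
The numbers in the takeaways above can be reproduced by hand. The following is a worked sketch using the figures quoted in the video (50 successes out of 63, compared with 0.71); the symbols \hat{p}, p_0, and \sigma_{\hat{p}} are notation chosen here for illustration and are not taken from the video.

```latex
\begin{align*}
n\,p_0\,(1-p_0) &= 63 \times 0.71 \times 0.29 \approx 12.97 \ge 10
  && \text{(sample-size condition)} \\
\hat{p} &= \frac{50}{63} \approx 0.794
  && \text{(sample proportion)} \\
\sigma_{\hat{p}} &= \sqrt{\frac{p_0\,(1-p_0)}{n}}
  = \sqrt{\frac{0.71 \times 0.29}{63}} \approx 0.0572
  && \text{(standard deviation of } \hat{p}\text{)} \\
z &= \frac{\hat{p} - p_0}{\sigma_{\hat{p}}}
  \approx \frac{0.794 - 0.71}{0.0572} \approx 1.46
  && \text{(test statistic)}
\end{align*}
```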

Q & A

  • What is the main topic of Professor Dan Kernler's video?

    -The main topic of the video is hypothesis testing about a proportion in statistics.

  • What are the four entry-level college math classes mentioned?

    -The four entry-level college math classes mentioned are statistics, general education math, math for elementary educators (Math 110), and college algebra.

  • What are the different ways students can get into these math classes?

    -Students can get into these math classes through Math 095 (preparation for gen ed math), Math 98 (intermediate algebra), Math 99 (combined beginning and intermediate algebra), SAT/ACT scores, ALEKS placement exam, or using their high school GPA.

  • What was the success rate for students who entered using only their high school GPA and a fourth year of math?

    -The success rate for students who entered using only their high school GPA and a fourth year of math was 50 out of 63 students, or 79%.

  • What is the population proportion used for comparison in the hypothesis test?

    -The population proportion used for comparison in the hypothesis test is 71%.

  • What is the z-score and p-value obtained in the hypothesis test?

    -The z-score obtained in the hypothesis test is 1.46, and the p-value is 0.072.

  • What is the critical value for the test?

    -The critical value for the test, with an alpha level of 0.05, is 1.645.

  • What conclusion did Professor Kernler reach regarding the hypothesis test?

    -Professor Kernler concluded that there was not enough evidence at the 0.05 significance level to support the claim that the proportion is more than 71%. The sample proportion of 79% was higher, but not significantly higher in the statistical sense.

  • What is the distinction between statistical significance and practical significance?

    -Statistical significance means that the result would be unlikely to occur by chance alone if the null hypothesis were true, while practical significance refers to whether the result has meaningful real-world implications. For example, a small difference in proportions, such as 71% versus 72%, may be statistically significant but not practically meaningful.

  • What tool does Professor Kernler use to perform the hypothesis testing?

    -Professor Kernler uses StatCrunch to perform the hypothesis testing about proportions.

Outlines

00:00

📚 Introduction to Hypothesis Testing for Proportions

Professor Dan Kernler from Elgin Community College introduces a video on hypothesis testing for proportions. He begins with an overview of the math classes available at the college and the various pathways to enroll in them, including the use of high school GPA as a placement criterion. The professor then presents a scenario where he investigates whether students with high school success are more likely to succeed in college. Using a sample of 63 students with a success rate of 79%, he compares it to the overall college success rate of 71%. The video explains the criteria for conducting a hypothesis test: n*p*(1-p) must be at least 10, and the sample must be less than 5% of the population. The professor demonstrates how to calculate the mean and standard deviation for the sample proportion and how to determine the p-value and z-score for the test.
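
The two conditions and the resulting mean and standard deviation can be checked in a few lines. This is a minimal sketch, not the professor's own procedure: the function name `check_conditions` and the population size of 5000 are assumptions made for illustration, since the video only says the sample is less than 5 percent of all possible students.

```python
from math import sqrt


def check_conditions(n, p0, population_size):
    """Return (large_enough, small_fraction) for the normal approximation.

    large_enough:   n * p0 * (1 - p0) is at least 10
    small_fraction: the sample is no more than 5% of the population
    """
    large_enough = n * p0 * (1 - p0) >= 10
    small_fraction = n <= 0.05 * population_size
    return large_enough, small_fraction


n, p0 = 63, 0.71

# population_size is a made-up figure; the video does not state one.
conditions = check_conditions(n, p0, population_size=5000)

# If both conditions hold, the sample proportion is approximately normal
# with this mean and standard deviation.
mean = p0
sd = sqrt(p0 * (1 - p0) / n)

print(conditions)
print(f"n*p*(1-p) = {n * p0 * (1 - p0):.2f}, mean = {mean}, sd = {sd:.4f}")
```

The standard deviation printed here rounds to the 0.0572 quoted in the video.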

05:02

🔍 Hypothesis Testing Steps and StatCrunch Application

Continuing the discussion on hypothesis testing, Professor Kernler outlines the six steps involved: defining the null and alternative hypotheses, selecting an alpha level, computing the test statistic, finding the p-value or critical value, making a decision, and stating a conclusion. In the first example, the sample proportion of 79% is not significantly higher than the 71% college-wide rate, in part because of the small sample size. The professor then demonstrates how to perform a hypothesis test using StatCrunch software, showing how to input data, set up the test, and interpret the results, including the z test statistic and p-value. He also discusses another example involving voter registration rates among children of immigrants compared to the general population, highlighting the difference between statistical significance and practical significance.
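
The six steps can be sketched as a single function. This is an illustrative implementation using Python's standard library rather than StatCrunch; the function name `one_proportion_z_test` and its parameter names are assumptions introduced here, not anything shown in the video.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="greater", alpha=0.05):
    """One-sample z test for a proportion (normal approximation).

    Step 1: H0 is p = p0; `alternative` picks "greater", "less", or "two-sided".
    Step 2: `alpha` is the level of significance.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1

    # Step 3: test statistic.
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se

    # Step 4: p-value and critical value.
    # Step 5: decision (reject H0 or not).
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    elif alternative == "less":
        p_value = std_normal.cdf(z)
        critical = std_normal.inv_cdf(alpha)
        reject = z < critical
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

    return {"z": z, "p_value": p_value, "critical": critical, "reject_null": reject}


# First example from the video: 50 successes out of 63, compared with 0.71.
result = one_proportion_z_test(50, 63, 0.71, alternative="greater")

# Step 6: conclusion in words.
verdict = "enough" if result["reject_null"] else "not enough"
print(f"z = {result['z']:.2f}, p-value = {result['p_value']:.3f}")
print(f"There is {verdict} evidence at the 0.05 level that the proportion exceeds 0.71.")
```

With the video's numbers, the p-value method and the critical value method agree: the null hypothesis is not rejected.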

10:02

📉 Understanding Statistical vs. Practical Significance

In the concluding part of the video, Professor Kernler emphasizes the distinction between statistical significance and practical significance. He illustrates this with the voter registration example, where a 6% difference between the sample statistic and the comparison rate is both statistically and practically significant. The professor advises viewers to consider the meaningfulness of their results, questioning whether a statistically significant result has practical implications. He wraps up the video by inviting viewers to subscribe for more content on hypothesis testing and thanks the Elgin Community College Board of Trustees for supporting his sabbatical, which enabled him to create the video series.

Keywords

💡Hypothesis Testing

Hypothesis testing is a statistical method used to make decisions about a population parameter based on a sample of data. It involves setting up a null hypothesis and an alternative hypothesis and then determining whether the sample data provide enough evidence to reject the null in favor of the alternative. In the video, hypothesis testing is used to determine whether the success rate of students placed by high school GPA is significantly higher than the overall success rate of 71%.

💡Proportion

A proportion in statistics refers to the ratio of the number of successes to the total number of trials or observations. It is a way of expressing a fraction that compares part to whole. In the context of the video, the professor is testing the hypothesis about the proportion of students who are successful in their classes based on different criteria for admission.

💡Elgin Community College

Elgin Community College is the institution where Professor Dan Kernler works and where the video is set. It is relevant to the video as it provides the context for the classes and the student population being discussed. The script mentions specific math classes offered at Elgin Community College and how students can get into these classes.

💡Entry-Level Math Classes

These are introductory college-level math courses that students typically take at the beginning of their academic journey. In the video, the professor lists several entry-level math classes at Elgin Community College, such as Statistics, Gen Ed Math, Math 110, and College Algebra, which are part of the educational offerings being discussed.

💡Placement Exam

A placement exam is a test used by educational institutions to determine a student's level of proficiency in a particular subject and to place them in the appropriate course. In the script, the professor discusses different ways students can get into math classes, including through SAT or ACT scores, the ALEKS placement exam, or high school GPA.

💡Sample Proportion

A sample proportion is the ratio of the number of successes in a sample to the total number of observations in the sample. It is used as an estimate of the population proportion. In the video, the sample proportion of successful students among those placed by high school GPA is 50 out of 63, or 79%, which is then compared to the overall success rate.

💡Statistical Significance

A result is statistically significant when it would be unlikely to occur by chance alone if the null hypothesis were true, which is judged by comparing the p-value with the chosen significance level. Significance does not by itself make a result reliable or important. The professor explains that although the sample proportion of 79% is higher than the overall rate, the difference is not statistically significant, in part because the sample of 63 is small.

💡Z-Score

A z-score measures how many standard deviations a value lies from the mean. In hypothesis testing it standardizes the sample statistic so that probabilities can be read from the standard normal distribution. In the video, the professor computes a z-score of 1.46 for the sample proportion of 79%, which is then used to judge how likely a sample proportion that high or higher would be if the true proportion were 0.71.

💡P-Value

The p-value is the probability of obtaining results at least as extreme as the observed results, assuming the null hypothesis is true. A small p-value (typically ≤ 0.05) indicates strong evidence against the null hypothesis. The professor finds a p-value of about 0.072, which is above the 0.05 threshold, so the null hypothesis is not rejected.
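
The definition can be made concrete by computing the tail probability directly. Note that the sketch below swaps in a different technique from the one in the video: it sums exact binomial probabilities instead of using the normal approximation, so its result is not expected to match 0.072 exactly. The function name `binomial_upper_tail` is an assumption introduced here.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) when X ~ Binomial(n, p): the chance of k or more successes."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


# "At least as extreme" for a right-tailed test means 50 or more successes
# out of 63, computed as if the null hypothesis p = 0.71 were true.
exact_p = binomial_upper_tail(50, 63, 0.71)
print(f"exact right-tail p-value: {exact_p:.3f}")
```

Either way, the p-value answers the same question: if the true success rate were 71%, how often would a sample of 63 show 50 or more successes?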

💡Critical Value

A critical value is the cutoff for the test statistic beyond which the null hypothesis is rejected. For a right-tailed test, the null hypothesis is rejected when the test statistic is greater than the critical value. The professor refers to the critical value method and uses z = 1.645 as the cutoff for a right-tailed test at the 0.05 level of significance.
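
Critical values come from the inverse of the standard normal distribution. The following is a small sketch using Python's `statistics.NormalDist`; the helper name `critical_values` is an assumption for illustration. The video only uses the right-tailed value, and the left-tailed and two-sided values are shown for comparison.

```python
from statistics import NormalDist


def critical_values(alpha):
    """Critical z values for each kind of alternative at significance alpha."""
    std_normal = NormalDist()
    return {
        "right": std_normal.inv_cdf(1 - alpha),          # area alpha to the right
        "left": std_normal.inv_cdf(alpha),               # area alpha to the left
        "two_sided": std_normal.inv_cdf(1 - alpha / 2),  # alpha split across both tails
    }


cv = critical_values(0.05)
z = 1.46  # test statistic from the first example in the video

print(f"right-tailed critical value: {cv['right']:.3f}")
print("in critical region:", z > cv["right"])
```

Because 1.46 does not exceed the right-tailed cutoff, the test statistic is not in the critical region.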

💡StatCrunch

StatCrunch is a statistical software package that can perform a variety of statistical analyses, including hypothesis testing. In the video, the professor demonstrates how to use StatCrunch to conduct a one-sample proportion test, providing a practical example of applying the concepts discussed.

💡Practical Significance

Practical significance refers to the real-world importance or meaningfulness of the results of a study, beyond just statistical significance. The professor emphasizes the importance of considering practical significance, using the example of a 6% difference in voter registration rates, which is both statistically and practically significant.
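
The 71 percent versus 72 percent contrast mentioned in the video can be illustrated numerically. In this sketch the two sample sizes (63 and 100,000) are hypothetical choices made for illustration, and the helper name `z_statistic` is an assumption; the point is only that the same one-point difference can move from non-significant to highly significant as the sample grows.

```python
from math import sqrt


def z_statistic(p_hat, p0, n):
    """z test statistic for a sample proportion p_hat against p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


critical = 1.645  # right-tailed cutoff at alpha = 0.05, as quoted in the video

# Same one-percentage-point difference, two hypothetical sample sizes.
z_small = z_statistic(0.72, 0.71, n=63)
z_large = z_statistic(0.72, 0.71, n=100_000)

print(f"n = 63:      z = {z_small:.2f}, significant: {z_small > critical}")
print(f"n = 100000:  z = {z_large:.2f}, significant: {z_large > critical}")
```

A large enough sample makes the one-point gap statistically significant, but whether a one-point gap matters in practice is a separate judgment.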

Highlights

Professor Dan Kernler introduces a video on hypothesis testing about a proportion.

Discusses math classes at Elgin Community College and how to enroll in them.

Explains the prerequisites for hypothesis testing using the normal distribution.

Calculates the mean and standard deviation for the sample proportion.

Determines the p-value and z-score for the sample proportion of 79%.

Outlines the six steps of hypothesis testing, including defining hypotheses and determining alpha.

Uses a real-world example involving high school GPA and college success rates.

Notes that with a sample of only 63 students, the 79% sample proportion is not statistically significant.

Demonstrates how to perform hypothesis testing using StatCrunch software.

Presents an example from the US Census on voter registration rates.

Explains the difference between statistical significance and practical significance.

Concludes that there is not enough evidence to support the claim that the proportion is more than 0.71 based on the sample.

Shows how to use StatCrunch for one-sample proportion tests with summary data.

Discusses the importance of considering the practical implications of statistical results.

Provides a method to perform hypothesis testing for proportions in StatCrunch with data exclusions.

Concludes that there is enough evidence to support the claim that the proportion of registered voters is different for children of immigrants.

Emphasizes the importance of understanding the practical significance of statistical results beyond just statistical significance.
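
The data-exclusion step from the second example (keeping only eligible voters before counting who is registered) can be sketched outside StatCrunch as follows. Everything about the data here is hypothetical: the rows, the field names `citizen_now` and `registered`, and the category labels are made up for illustration and are not the actual children-of-immigrants data set.

```python
# Hypothetical rows standing in for the real data set.
rows = [
    {"citizen_now": "Citizen by birth", "registered": "Yes"},
    {"citizen_now": "Naturalized citizen", "registered": "Yes"},
    {"citizen_now": "Not a citizen", "registered": "No"},
    {"citizen_now": "Citizen by birth", "registered": "No"},
    {"citizen_now": "Naturalized citizen", "registered": "Yes"},
]

# Equivalent of the "where" condition: citizenship is not equal to
# "Not a citizen", which keeps only people who are eligible to vote.
eligible = [row for row in rows if row["citizen_now"] != "Not a citizen"]

# Summary counts for a one-sample proportion test on the eligible group.
n = len(eligible)
successes = sum(1 for row in eligible if row["registered"] == "Yes")

print(f"{successes} registered out of {n} eligible")
```

With real data, these two counts would be the inputs to a two-sided one-sample proportion test against 0.71; a toy sample this small would not meet the sample-size condition.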

Transcripts

play00:00

hello this is professor dan kernler of

play00:01

elgin community college back with

play00:03

another video in my statistics series in

play00:05

this one we're going to dive into the

play00:06

specifics of hypothesis testing about a

play00:08

proportion okay let's get to it

play00:11

[Music]

play00:15

we're going to start off today actually

play00:16

talking about elgin community college

play00:18

and some of the math classes we have and

play00:20

how you can get into those classes and

play00:22

then we'll do a little hypothesis

play00:24

testing so we have four kind of

play00:27

entry-level college-level math classes

play00:29

we have statistics

play00:30

gen ed math we have math 110 which

play00:33

is math for elementary educators and

play00:35

then we have college algebra there's a

play00:37

variety of ways you can get into these

play00:39

math 095 is basically preparation for

play00:43

gen ed math so that just gets you into

play00:45

102 and 104. math 98 is intermediate

play00:48

algebra that can get you into all of

play00:50

these there's also math 99 which is a

play00:52

combined beginning and intermediate

play00:55

then there's you can get an sat or act

play00:58

score that's high enough same thing on

play01:00

the aleks placement exam and then

play01:02

relatively recently we added this high

play01:04

school gpa which was a statewide

play01:07

initiative there was legislation that

play01:08

was passed we had been investigating

play01:10

this as well we know students that do

play01:12

well in high school are likely to do

play01:13

well in college as well so what if i

play01:16

wonder if that last group who did well

play01:19

in high school actually is more likely

play01:21

to succeed in college than the other

play01:24

groups that were in there

play01:26

well what we could do is we could look

play01:28

at the overall success rate in these

play01:29

classes it's about 71 percent and then

play01:33

we have a sample of students who've come

play01:35

into ecc and they got into those classes

play01:38

without any placement exam without any

play01:40

sat or act they just had their high

play01:42

school gpa score plus you had to have um

play01:46

a fourth year math class as well and of

play01:49

those 50 out of 63 were successful in

play01:53

one of those four classes that they took

play01:55

this was in a single semester i believe

play01:57

it can't remember exactly when this was

play01:59

implemented might have been fall 2020.

play02:02

so we have a sample proportion then is

play02:04

79 percent clearly higher but only 63

play02:08

maybe it's not statistically higher

play02:10

so let's test it and we'll go through

play02:12

this hypothesis testing process remember

play02:15

this only works we can only do this

play02:17

hypothesis test

play02:18

if we have this criteria met that n

play02:20

times p

play02:21

times 1 minus p is at least 10 and we

play02:24

have less than 5 percent of the

play02:25

population if those conditions are met

play02:28

then we'll fit this normal distribution

play02:30

so we have 63 here the p proportion will

play02:34

be the 0.71 plug those in yes that is at

play02:36

least 10 and we do have less than 5

play02:39

percent of all possible students taking

play02:41

these classes

play02:43

okay so if those conditions are met then

play02:45

the mean of the sample proportions will

play02:47

be the same as the population proportion

play02:49

and the standard deviation will be

play02:50

square root of p times 1 minus p all

play02:52

over n fill in those values we get 0.71

play02:56

is our proportion and we get a standard

play02:58

deviation about 0.0572

play03:00

and now that we have the mean 0.71 and

play03:03

the standard deviation we can look at

play03:05

our value that we have here we have 71

play03:07

percent that we're comparing with and

play03:09

then 50 or 79 percent is our sample we

play03:12

can figure out where those go on the

play03:14

curve so 50 over 63 is over here the

play03:17

p-value be the probability of getting

play03:19

that value or more extreme in this case

play03:22

that's about 0.072

play03:25

we could also compute a z-score here 50

play03:28

over 63 minus the p over the standard

play03:31

deviation we get about 1.46

play03:36

that's a z-score we can treat that as a

play03:38

z and do the same thing this would be if

play03:40

we wanted to do the critical value

play03:42

method to do the critical value method

play03:44

we need the value that has .05 to the

play03:47

right or whatever our alpha is in this

play03:49

case z point zero five is one point six

play03:52

four five

play03:54

so now let's go through those six steps

play03:57

we have to first define the null and

play03:59

alternative hypotheses our null

play04:01

hypothesis is that the proportion is the

play04:03

same as it was for the other groups for

play04:04

the alternative remember there are three

play04:07

possibilities here so here's the

play04:09

distribution under the null hypothesis

play04:11

now we could suspect hey do we think

play04:14

it's greater than that

play04:16

or do we think it's less than that are

play04:18

we not sure could it be less than or

play04:20

greater than so we put it not equal to

play04:22

so there's three possibilities for the

play04:23

alternative

play04:24

in our case we were wondering i was

play04:26

wondering if these students did better

play04:29

so i was wondering if the success rate

play04:30

was higher so greater than for this

play04:32

particular example

play04:34

now we have the alpha determine alpha

play04:36

the level of significance a good default

play04:38

choice here is 0.05 you don't have to

play04:40

choose that uh determine the compute the

play04:44

test statistic so that's this z stats

play04:47

test statistic in this case you take the

play04:49

sample minus its mean which would be the

play04:52

population proportion and divide by the

play04:54

standard deviation and that's the 1.46

play04:57

for the p-value you're just going to

play04:59

find the probability of being to the

play05:01

right of that 50 over 63 and that's

play05:04

0.072

play05:06

for the critical value we need to

play05:08

convert these to z's so we have our 1.46

play05:11

we need to find the z with 0.05 to the

play05:13

right

play05:14

and that was

play05:15

1.645

play05:17

all right we have those two in there now

play05:19

now we need to make our decision do we

play05:20

reject the null hypothesis or not

play05:23

in both cases we look at the p value

play05:26

it's not above our not below our

play05:27

threshold and the 1.46 is not in the

play05:31

critical region it's not above 1.645

play05:34

so that would mean we do not reject the

play05:36

p equals 0.71 now this is really

play05:40

important

play05:42

i'm using the language we're not going

play05:44

to reject the null hypothesis

play05:46

the null hypothesis that the proportion

play05:48

for this group is also 0.71 might not be

play05:51

correct it might be 0.72 0.74 it might

play05:55

even be 0.84

play05:57

we don't know

play05:58

all we can say is that we're not going

play06:01

to reject that it's 0.71

play06:04

okay that's all we can say and when we

play06:06

have our conclusion we say okay there's

play06:09

not enough evidence at the 0.05 level to

play06:11

support the claim that the proportion is

play06:14

more than 0.71

play06:16

so we had 79 was our sample proportion

play06:19

but because we only had a small sample

play06:20

size of 63 that wasn't statistically

play06:23

significant we do not have enough

play06:25

evidence to say that it's higher than 71

play06:27

percent

play06:29

all right let's talk about how to do

play06:30

this in statcrunch this is going to be

play06:32

stat proportion stats one sample and

play06:35

here we're going to do with summary it

play06:37

feels like we have data but we actually

play06:39

just have the summary we have that 50

play06:40

out of 63. so now we go and we enter in

play06:44

our counts 50 out of 63 for the

play06:46

proportion we're going to do we're going

play06:47

to compare to 0.71

play06:50

and then we'll hit compute and we'll see

play06:52

our p value in fact we'll see our z test

play06:55

statistic there as well here's another

play06:57

example i found this information from

play06:59

the us census that of those who are

play07:03

eligible to vote 71 percent were

play07:06

registered to vote for the last

play07:08

presidential election

play07:10

so we might wonder we have our children

play07:12

of immigrants database and i wonder if

play07:16

the proportion of those who are eligible

play07:18

to vote

play07:20

who registered is different now it's

play07:21

important to note we have their

play07:23

citizenship status they're not all

play07:25

eligible to vote we have 84 percent of

play07:28

them are eligible to vote so the

play07:31

question we're going to ask we're going

play07:32

to ask this before looking at the data

play07:34

is is the proportion of eligible voters

play07:36

who are registered different for

play07:38

children of immigrants than for the

play07:40

general population so pay attention to

play07:42

that phrasing if we look at actually i

play07:45

made a graph here for the proportion of

play07:48

those who are eligible who are

play07:50

registered and it's actually 77 percent

play07:54

now keep in mind we've talked about this

play07:56

before we should read the details of

play07:57

this database and how these data were

play07:59

collected it's possible that this sample

play08:02

doesn't represent all children of

play08:04

immigrants we had a much higher

play08:06

education rate but it it we the only

play08:09

thing we can do is take it at face value

play08:11

so we have a 77 voter registration

play08:15

rate for those who are eligible to vote

play08:17

in this children of immigrants database

play08:19

let's do this in statcrunch this is a

play08:21

little tricky so we're going to go to

play08:24

stat proportion stats one sample with

play08:27

data

play08:28

the variable we want to look at here is

play08:31

registered so we'll scroll all the way

play08:32

down but this is a little tricky we want

play08:34

to just pick those who are eligible so

play08:38

we have to exclude those who are not

play08:40

citizens and we can do that where

play08:42

there's this where box so we'll go in

play08:44

we're going to build a formula

play08:46

and we're going to build a formula where

play08:48

if you scroll down that's their current

play08:50

citizenship so we want citizen now

play08:53

is not equal to so it's a little where

play08:55

it's an exclamation point and equal to

play08:57

is not equal to

play08:59

and then not a citizen

play09:01

okay so a little tricky there but then

play09:03

we can do our null hypothesis uh p

play09:06

equals 0.71 alternative will be not equal

play09:09

to and then we'll go down and hit

play09:11

compute

play09:12

and we have our results so we have our

play09:14

test statistic pretty high here

play09:17

7.51 for the p value we'll go take a

play09:20

look at the statcrunch output again

play09:23

and we can see with a z of 7.51

play09:26

you're going to that's basically off the

play09:28

scale remember three standard deviations

play09:29

each way is 99.7 percent so we're just

play09:32

going to have p value less than .001

play09:35

so here we did have an extreme

play09:38

observation

play09:39

so we would reject the null hypothesis

play09:42

so our conclusion then

play09:44

is in this case there is enough evidence

play09:46

at the 0.05 level of significance to

play09:48

support the claim that the proportion of

play09:50

eligible voters who are registered is

play09:52

different for children of immigrants so

play09:54

in this case there is enough evidence to

play09:57

support our alternative claim one last

play10:00

little note i want to make here it's

play10:02

important to understand the difference

play10:03

between statistical significance and

play10:05

practical significance you have to look

play10:08

at your sample statistic and the one

play10:10

you're comparing with and just because

play10:12

they're statistically significant

play10:14

doesn't mean that it has any practical

play10:16

meaning in this case we had a sample

play10:18

proportion of eligible voters who were

play10:21

registered with 77 percent and the

play10:23

comparison was 71 percent in my opinion

play10:26

that difference of six percent is pretty

play10:28

significant that has some meaning it was

play10:30

statistically significant but it seems

play10:32

to be practical whereas if you had a

play10:34

difference between 71 percent and 72

play10:37

percent yeah it might be statistically

play10:39

significant but does it actually have

play10:41

any meaning so be sure you're kind of

play10:43

thinking deeply about your results and

play10:45

just because you get a statistically

play10:46

significant result does it actually mean

play10:48

anything is there actually

play10:50

is a is there actually a meaningful

play10:52

difference between those all right that

play10:54

is it for this video on hypothesis

play10:56

testing about proportions i hope this

play10:57

was helpful if you're interested in

play10:59

seeing more of these you can subscribe

play11:01

hit the bell to get notified we've got a

play11:02

whole series of these coming out a bunch

play11:04

more about different hypothesis tests as

play11:07

always thank you to the elgin community

play11:08

college board of trustees who approved

play11:10

my sabbatical for the spring 2021

play11:12

semester and that's how i was able to

play11:14

record all these videos for you and

play11:16

thank you so much for watching i will

play11:18

see you in the next one

Related Tags
Hypothesis Testing, Statistics Education, College Math, Placement Exams, Success Rates, Sample Proportions, Z-Score Analysis, StatCrunch Tutorial, Voter Registration, Statistical Significance, Practical Significance