Statistical Significance versus Practical Significance

jbstatistics
29 Jan 2013 · 04:47

Summary

TL;DR: This video explores the distinction between statistical and practical significance in hypothesis testing. Statistical significance, signaled by a small p-value, means the observed sample effect would be unlikely to occur by chance alone if the null hypothesis were true. However, this does not necessarily imply the effect has practical importance. The video uses the example of a call center's average wait time to illustrate how a small difference can be statistically significant purely because of a large sample size, yet carry little practical meaning. It emphasizes that statisticians identify statistical significance and leave the assessment of practical importance to subject matter experts, and it highlights the value of reporting confidence intervals to convey the size of the effect.

Takeaways

  • 🔍 Hypothesis testing is used to determine statistical significance, which indicates the observed effect in a sample is unlikely due to chance alone.
  • 📊 Statistical significance does not necessarily equate to practical significance; the latter is about the real-world impact of the observed effect.
  • ⏱ An example given is a call center's average wait time, where a statistically significant difference of 0.6 seconds may or may not be practically significant.
  • 📉 A small p-value, like 0.002, provides strong evidence against the null hypothesis, suggesting the population mean is different from the hypothesized value.
  • 🔑 The practical importance of a statistically significant result is subjective and depends on the context and the stakeholders' perspective.
  • 🔬 Statistical significance depends heavily on sample size; with a large enough sample, even minor differences become statistically significant.
  • 📈 Conversely, with a small sample it can be hard to detect a statistically significant difference, even when a real difference exists.
  • 📊 The script illustrates this with box plots: the same sample means yield large (non-significant) p-values when n = 10 and tiny (highly significant) p-values when n = 5,000.
  • 📐 Even with the same effect size, a larger sample size can lead to a statistically significant result, emphasizing the role of sample size in hypothesis testing.
  • 🤔 The practical significance of a statistically significant result is a separate question that should be considered by domain experts.
  • 📊 Reporting a confidence interval alongside the hypothesis test results can provide additional context on the effect size and its potential practical importance.

Q & A

  • What is the difference between statistical significance and practical significance in hypothesis testing?

    -Statistical significance means the observed effect in a sample would be unlikely to occur by chance alone if the null hypothesis were true, while practical significance is about whether the observed effect has real-world importance or impact.

  • Why might a small difference in sample mean result in a statistically significant p-value?

    -A small difference can be statistically significant if the sample size is large enough, as even minor deviations from the hypothesized value can be detected with high precision.

  • What is a p-value in the context of hypothesis testing?

    -A p-value is the probability of observing the sample results, or something more extreme, assuming the null hypothesis is true. A small p-value provides strong evidence against the null hypothesis.
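
For a concrete illustration, here is a minimal sketch of the video's call-center test in Python. The hypothesized mean (30 seconds) and the sample mean (about 30.6 seconds) come from the video; the standard deviation of 6 seconds, the sample size of 1,000, and the seed are illustrative assumptions, since the video does not report them.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Simulated wait times: true mean 30.6 s; the SD (6 s) and n (1,000) are assumed
waits = rng.normal(loc=30.6, scale=6.0, size=1_000)

# One-sided one-sample t-test of H0: mu = 30 against Ha: mu > 30
t_stat, p_value = stats.ttest_1samp(waits, popmean=30.0, alternative="greater")
print(f"sample mean = {waits.mean():.2f} s, p-value = {p_value:.4f}")
```

A p-value of this order, like the 0.002 reported in the video, says only that such a sample mean would be surprising if the true mean were 30 seconds; it says nothing about whether a 0.6-second difference matters.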

  • How does sample size affect the outcome of a hypothesis test?

    -A larger sample size increases the power of the test to detect even small differences as statistically significant, whereas a smaller sample size may not provide enough evidence to reject the null hypothesis even if a practical difference exists.
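
A quick simulation makes the sample-size effect visible. The hypothesized mean of 10, the small difference, and the two sample sizes (10 and 5,000) mirror the video's box plot example; the standard deviation of 1 and the seed are assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
for n in (10, 5_000):
    # Same underlying difference (true mean 10.1 vs hypothesized 10); SD of 1 is assumed
    sample = rng.normal(loc=10.1, scale=1.0, size=n)
    _, p = stats.ttest_1samp(sample, popmean=10.0)
    print(f"n = {n:5d}: p-value = {p:.4g}")
```

Typically the n = 10 run gives a large p-value and the n = 5,000 run a tiny one, even though the underlying difference is identical.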

  • What is the role of a confidence interval in hypothesis testing?

    -A confidence interval provides a range of values within which the true population parameter is likely to fall, and it helps to illustrate the size of the effect, which can be useful in determining practical significance.
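
As a sketch of this, a 95% t-interval for the call-center mean can be computed from the sample statistics; the simulated data below (SD of 6 seconds, n = 400) is assumed, since the video reports only the sample mean of 30.6 seconds:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
waits = rng.normal(loc=30.6, scale=6.0, size=400)  # assumed SD and n

mean = waits.mean()
se = waits.std(ddof=1) / np.sqrt(len(waits))   # standard error of the mean
crit = stats.t.ppf(0.975, df=len(waits) - 1)   # two-sided 95% critical value
print(f"95% CI: ({mean - crit * se:.2f} s, {mean + crit * se:.2f} s)")
```

An interval concentrated just above 30 seconds makes the small size of the effect visible in a way a p-value alone does not.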

  • Why is it important to consider both statistical and practical significance?

    -While statistical significance tells us if the results are likely not due to chance, practical significance is crucial for understanding if the results have meaningful implications in a real-world context.

  • Can a hypothesis test result in a non-significant p-value with a large sample size?

    -Yes, if the effect size is truly small and the variation within the sample is high, even a large sample size might not yield a statistically significant result.

  • What does it mean if the p-value is very small in a hypothesis test?

    -A very small p-value indicates that it is highly unlikely that the observed results occurred by chance if the null hypothesis were true, suggesting strong evidence in favor of the alternative hypothesis.

  • How might a company interpret a statistically significant result with a small effect size?

    -A company might view a statistically significant result with a small effect size as important if it aligns with their goals or operational thresholds, even if the practical impact is minimal.

  • In the given call center example, why might a 0.6-second difference be considered practically insignificant to some but significant to others?

    -The perception of practical significance can vary based on individual or organizational goals. A 0.6-second difference might be negligible in one context but critical for meeting specific service level agreements in another.

  • What is the purpose of hypothesis testing in a research or business context?

    -Hypothesis testing serves to make inferences about a population parameter based on sample data. It supports decision-making by providing evidence to either reject or fail to reject a claim about the population.

Outlines

00:00

📊 Statistical vs. Practical Significance in Hypothesis Testing

This paragraph discusses the concept of statistical significance versus practical significance in the context of hypothesis testing. It explains that statistical significance indicates the observed effect in a sample is unlikely due to chance, providing strong evidence against the null hypothesis. However, the practical significance, or the real-world importance of the observed effect, is a separate consideration. The example of a call center's average wait time illustrates how a statistically significant difference (0.6 seconds) might not be practically significant to everyone but could be to the company. The paragraph also highlights the influence of sample size on statistical significance, demonstrating that with a large enough sample size, even minor differences can be statistically significant, while with a smaller sample size, it might be challenging to find such differences.

Keywords

💡Statistical Significance

Statistical significance means the observed results of a study would be unlikely to occur by chance alone if the null hypothesis were true. In the video, it is used to determine whether the effects observed in the sample could plausibly have arisen by random chance under the null hypothesis. The example of a p-value of 0.002 indicates strong evidence against the null hypothesis, suggesting that the population mean wait time is indeed greater than 30 seconds.

💡Practical Significance

Practical significance is about whether the observed effect has real-world implications or is meaningful in a practical context. The script contrasts this with statistical significance, noting that a small difference of 0.6 seconds in the call center example might not be practically significant to most people, but could be important to the company.

💡Hypothesis Testing

Hypothesis testing is a statistical method used to make decisions about the population based on sample data. The video script discusses testing the null hypothesis that the population mean wait time is 30 seconds against the alternative hypothesis that it is greater than 30 seconds, using sample data to make this determination.

💡Null Hypothesis

The null hypothesis is a statement of no effect or no difference that is tested in an experiment. In the script, the null hypothesis is that the average wait time is 30 seconds, which is then tested against the alternative hypothesis that the mean is greater.

💡Alternative Hypothesis

The alternative hypothesis is a statement that contradicts the null hypothesis, proposing an effect or difference. The script presents it as the hypothesis that the population mean wait time is greater than 30 seconds, which is what the sample data suggests.

💡P-Value

The p-value is the probability of obtaining results as extreme as, or more extreme than, the observed results, assuming the null hypothesis is true. A small p-value, like the 0.002 mentioned in the script, indicates strong evidence against the null hypothesis, suggesting that the observed difference is statistically significant.

💡Sample Size

Sample size refers to the number of observations or elements included in a sample. The script explains that statistical significance is strongly related to sample size, with larger sample sizes making even small differences statistically significant, as illustrated in the box plot examples.

💡Confidence Interval

A confidence interval provides a range of values that are likely to contain the population parameter, with a certain level of confidence. The script suggests that reporting a confidence interval alongside hypothesis test results can help determine the practical significance of the effect by illustrating its size.

💡Effect Size

Effect size is a measure of the magnitude of the difference or effect in the study. The script implies that the confidence interval can help illustrate the effect size, which is important for assessing whether the observed differences have practical importance.

💡Variance

Variance is a measure of the spread or dispersion of a set of data points. In the script, variance is set to be equal in different situations to ensure a fair comparison when testing the null hypothesis with different sample sizes.

💡Box Plot

A box plot is a graphical representation of the distribution of a dataset, showing the median, quartiles, and potential outliers. The script uses box plots to visually compare the impact of sample size on the statistical significance of observed differences from the hypothesized value.
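
To reproduce a figure in the spirit of the video's, here is a matplotlib sketch. The sample means (10.05, 10.1, 10.2), the hypothesized value of 10, and the sample sizes of 10 and 5,000 come from the video; the common standard deviation of 1 and the seed are assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
means = (10.05, 10.1, 10.2)  # sample means quoted in the video
sd = 1.0                     # the video only says the variances are equal; 1 is assumed

fig, axes = plt.subplots(1, 2, figsize=(9, 4), sharey=True)
for ax, n in zip(axes, (10, 5_000)):
    # Draw three samples whose means sit near the quoted values
    data = [rng.normal(m, sd, n) for m in means]
    ax.boxplot(data, labels=[f"mean {m}" for m in means])
    ax.axhline(10, color="red", linestyle="--")  # hypothesized value mu = 10
    ax.set_title(f"sample size n = {n}")
plt.tight_layout()
plt.show()
```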

Highlights

Hypothesis testing is used to determine statistical significance, which assesses if an observed effect in a sample is likely due to chance.

Statistical significance does not necessarily imply practical significance, which concerns the real-world importance of an observed effect.

A practical example is given involving a call center's average wait time, where a statistically significant difference may not be practically significant.

A small p-value, such as 0.002, indicates strong evidence against the null hypothesis, suggesting the observed effect is unlikely due to chance.

Even a small difference, like 0.6 seconds in the call center example, can be statistically significant but may not have practical implications.

The importance of a difference is subjective and can vary based on the context or the company's goals.

Statistical significance is influenced by sample size; larger samples can detect smaller differences as statistically significant.

Small sample sizes may struggle to find statistically significant differences, even if they are present.

Six scenarios with equal variance but different sample sizes illustrate how sample size drives statistical significance.

In the first scenario with a sample size of 10, even small deviations from the hypothesized mean do not yield a statistically significant result.

Increasing the sample size to 5,000 in the same scenarios results in very small p-values, indicating strong statistical significance.

The practical significance of the observed differences remains a separate question from their statistical significance.

The impact of a 0.1 difference from the hypothesized value can vary greatly depending on the context, such as drug effectiveness or marketing.

A small p-value only tells us that the observed difference is unlikely if the null hypothesis is true, not its practical importance.

In statistics, the statistician's role is to determine statistical significance, leaving the assessment of practical importance to experts in the field.

Reporting a confidence interval alongside hypothesis test results can provide insight into the effect size and its potential practical significance.

Transcripts

play00:01

Let's take a look at statistical significance versus practical significance in hypothesis testing.

play00:08

Hypothesis testing tests for statistical significance.

play00:12

Statistical significance means the effect observed in the sample was unlikely to have occurred due to chance alone.

play00:19

Or in other words, it would be very unlikely to see what was observed in the sample

play00:24

if the null hypothesis is in fact true.

play00:29

Whether that effect has any practical importance is an entirely different question.

play00:35

Suppose a call center claims the average wait time is 30 seconds.

play00:40

We decide to test the null hypothesis that the population mean is 30

play00:44

against the alternative hypothesis that the mean is greater than 30.

play00:48

We go ahead, get a sample, find a sample mean of 30.6 seconds,

play00:52

and carry out a test and get a resulting p-value of 0.002.

play00:59

A p-value of that size gives very very strong evidence against the null hypothesis

play01:04

and very strong evidence in this case that

play01:06

the population mean is actually greater than 30 seconds.

play01:10

But in the sample we observed this sample mean to be 30.6 seconds,

play01:14

a difference of only 0.6 seconds from the null hypothesis.

play01:18

I think in this situation most people would feel that that difference of 0.6 seconds

play01:22

doesn't really have any practical meaning.

play01:25

But perhaps the company feels differently.

play01:27

Perhaps the company wants to make sure the wait time is no more than 30 seconds on average,

play01:32

and so they feel that 0.6 seconds is an important difference; we simply don't know.

play01:37

What we can say in statistics is that this p-value is very small,

play01:42

giving strong evidence against the null hypothesis.

play01:46

And we'll leave it up to others to decide if that has practical importance.

play01:51

How could there be such strong evidence against the null hypothesis

play01:54

if the difference from the hypothesized value is so small?

play02:02

Well, statistical significance is strongly related to sample size.

play02:06

If the sample size is large enough,

play02:08

even tiny differences from the hypothesized value will be found statistically significant.

play02:15

On the flip side of things, if the sample size is small,

play02:18

then it can be difficult to find a statistically significant difference.

play02:24

To illustrate, suppose we're testing the null hypothesis that mu=10 in six different situations.

play02:31

I've set the variance to be exactly equal in the six different spots to level the playing field.

play02:37

In this first set of box plots, the sample size is 10 for each of the three samples.

play02:42

I've plotted the hypothesized value of 10 here with a red line.

play02:47

The first sample had a sample mean of 10.05,

play02:50

the second one had a sample mean of 10.1

play02:54

and the third one had a sample mean of 10.2.

play02:58

The p-value of the test is very large in all three spots,

play03:02

indicating no evidence against the null hypothesis.

play03:08

In this second set of boxplots, everything is exactly the same,

play03:11

other than the sample size is now five thousand instead of 10.

play03:15

The sample means are all exactly the same as what they were above.

play03:19

When we carry out the test, the p-values here are tiny,

play03:22

indicating very very strong evidence against the null hypothesis.

play03:28

That much much larger sample size

play03:31

has found those differences to be highly statistically significant.

play03:38

But whether these differences from the hypothesized value

play03:41

have any practical meaning whatsoever is an entirely different question.

play03:45

If we were to look at this sample mean of 10.1 for a moment,

play03:49

maybe that difference of 0.1 from the hypothesized value

play03:52

will result in a more effective drug or better marketing campaign,

play03:56

or maybe in the context of the problem a difference of 0.1 is completely and utterly meaningless.

play04:03

The tiny p-value simply tells us

play04:06

that it was very unlikely to observe a difference of this size

play04:10

if the null hypothesis is actually true.

play04:16

In statistics we determine if there is statistical significance

play04:20

and let experts in the field of interest determine whether the results have any practical importance.

play04:28

In addition to the results of a hypothesis test, it is best

play04:31

to also report an appropriate confidence interval.

play04:34

The confidence interval illustrates the size of the effect,

play04:38

and can help to determine if the effect has any practical importance.
