P-values Broke Scientific Statistics—Can We Fix Them?
Summary
TLDR: This SciShow episode delves into the controversy over p-values in scientific research. It opens with a humorous anecdote about a dead salmon 'completing' a mental task, which illustrates how easily statistics can be misinterpreted. The video explains what p-values are, where they came from, and how they are commonly misused, highlighting the arbitrary nature of the 0.05 threshold. It then covers the multiple comparisons problem and the rise of 'p-hacking', and presents Bayesian statistics and two-step manuscript submission as possible ways to make scientific findings more reliable and easier to interpret.
Takeaways
- 🧠 A neuroscientist conducted a study involving a dead salmon in an MRI to demonstrate the potential misuse of statistics in scientific research.
- 📊 The 'p-value', introduced by Ronald Fisher in 1925, is central to statistical significance in research: it is the probability of obtaining results at least as extreme as those observed if chance alone were at work.
- 🔍 P-values are frequently misused and misunderstood, leading to debates about their reliability and the necessity of using them in scientific studies.
- 🎯 The 'null hypothesis' is a fundamental part of statistical testing, where researchers assume no effect and then test the likelihood of observed data under this assumption.
- 🚫 The arbitrary threshold of p < 0.05 is often used to determine statistical significance, despite being a subjective choice by Fisher and not always indicative of true effects.
- 🐟 The 'zombie fish' example illustrates how multiple comparisons can lead to false positives, emphasizing the need for caution in data interpretation.
- 🔄 Multiple comparison corrections are statistical methods used to adjust for the increased chance of false positives when many comparisons are made.
- 📈 P-hacking is a problematic practice where researchers manipulate data or analysis methods to achieve significant p-values, potentially distorting scientific findings.
- 📊 Bayesian statistics is an alternative approach that considers the probability of both the null and alternative hypotheses, offering a different perspective on statistical evidence.
- 🔄 The proposal to abandon p-values in favor of other statistical methods reflects a broader discussion on improving the rigor and transparency of scientific research.
Q & A
What was the unusual experiment conducted with a dead Atlantic salmon?
-The experiment involved placing a dead Atlantic salmon in an MRI machine and 'asking' it to determine the emotions of people in photos. The analysis showed statistically significant activation in the neural tissue of the dead fish, a false positive of the kind that arises when many comparisons are made. This was actually a stunt to make a point about the misuse of statistics.
What is a p-value in the context of scientific research?
-A p-value is the probability of obtaining results at least as extreme as those observed if chance alone were responsible, that is, if the null hypothesis were true. It is used to judge the statistical significance of findings, with a p-value below 0.05 being the common threshold.
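As a rough illustration of that definition, the sketch below computes an exact one-sided p-value for a coin-flip experiment. The data (16 heads in 20 flips) and the function name `binomial_p_value` are made up for this example and do not come from the video.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 16 heads in 20 flips; the null hypothesis is a fair coin.
p = binomial_p_value(16, 20)
print(f"p = {p:.4f}")  # p = 0.0059
```

A fair coin would produce 16 or more heads in 20 flips less than 1% of the time, so this result would clear the usual 0.05 threshold. The number describes how surprising the data are under the null hypothesis, not how likely the coin is to be biased.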
Who first proposed the concept of p-values and why?
-Ronald Fisher first proposed the concept of p-values in 1925. He was interested in determining if the results of a study were meaningful beyond chance, and his idea was sparked by an experiment involving the ability to taste the difference in the order of milk and tea in a cup.
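The tea-tasting idea can be worked through numerically. The sketch below assumes the classic design usually attributed to Fisher, with eight cups, four of them milk-first, and the taster told to pick out the four milk-first cups. The summary does not give these numbers, so treat them as an assumption; `tea_tasting_p_value` is a name invented here.

```python
from math import comb


def tea_tasting_p_value(correct: int, cups_per_kind: int = 4) -> float:
    """P(at least `correct` milk-first cups are picked) under pure guessing.

    Assumes 2 * cups_per_kind cups in total and a taster who selects exactly
    cups_per_kind of them as milk-first (an assumed design, see lead-in).
    """
    total_ways = comb(2 * cups_per_kind, cups_per_kind)
    favourable = sum(
        comb(cups_per_kind, k) * comb(cups_per_kind, cups_per_kind - k)
        for k in range(correct, cups_per_kind + 1)
    )
    return favourable / total_ways


print(round(tea_tasting_p_value(4), 4))  # all four correct: 1/70
print(round(tea_tasting_p_value(3), 4))  # three or more correct: 17/70
```

Under this assumed design, only a perfect score falls below 0.05; getting three of four right happens by guessing about 24% of the time.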
What is the null hypothesis in statistical testing?
-The null hypothesis is the assumption in an experiment that there is no effect or relationship between the variables being tested. It represents the 'by chance' scenario against which the alternative hypothesis is tested.
Why is the threshold of 0.05 for p-values considered arbitrary?
-The threshold of 0.05 for p-values is considered arbitrary because it was based on Fisher's personal preference and was described as 'convenient' in his 1925 book. It was later admitted to be somewhat subjective.
What is the issue with using a p-value of 0.049 versus 0.051 to determine significance?
-Using a p-value of 0.049 versus 0.051 to determine significance is problematic because a p-value does not indicate the truth of an alternative hypothesis. A slightly lower p-value does not make a result more correct; it only means the data would be slightly less likely if chance alone were at work. The two values represent nearly identical evidence, yet they fall on opposite sides of the cutoff.
What is p-hacking and why is it a concern in scientific research?
-P-hacking is the practice of manipulating data analysis or collection methods to achieve a significant p-value. It is a concern because it can lead to the publication of false positives and undermines the reliability of scientific findings.
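One form of p-hacking that touches collection methods is optional stopping: checking the p-value as data come in and stopping as soon as it dips below 0.05. The simulation below is a minimal sketch of that idea, not something shown in the video. It assumes normally distributed data with known unit variance and no true effect, and it uses a simple z-test; the function name and default parameters are hypothetical.

```python
import math
import random


def optional_stopping_rate(peek_every=10, max_n=100, n_studies=2000,
                           alpha=0.05, seed=0):
    """Share of null studies declared 'significant' when peeking at the data.

    Each simulated study draws up to max_n values from N(0, 1), so there is
    no real effect. After every `peek_every` observations a two-sided z-test
    of 'mean is zero' is run, and the study stops at the first p < alpha.
    """
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_studies):
        total = 0.0
        for n in range(1, max_n + 1):
            total += rng.gauss(0.0, 1.0)
            if n % peek_every == 0:
                z = total / math.sqrt(n)  # sample mean times sqrt(n)
                p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
                if p < alpha:
                    false_positives += 1
                    break
    return false_positives / n_studies


honest = optional_stopping_rate(peek_every=100)  # one test, at the end
peeking = optional_stopping_rate(peek_every=10)  # ten looks at the data
print(f"single look: {honest:.3f}, ten looks: {peeking:.3f}")
```

With a single planned test the false positive rate stays near the nominal 5%, while peeking ten times pushes it well above that even though no effect exists.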
What are multiple comparison corrections and why are they important?
-Multiple comparison corrections are statistical techniques used to adjust the significance threshold when making multiple comparisons in a single study. They are important to control the family-wise error rate and reduce the likelihood of false positives.
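To make the family-wise error rate concrete, the sketch below computes it for independent tests and applies the Bonferroni correction, one common multiple comparison correction. The video summary does not name a specific method, so the choice of Bonferroni is an assumption here, and the p-values are invented for illustration.

```python
def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent true nulls."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag a p-value as significant only if it is below alpha / n_tests."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


for n_tests in (1, 20, 100):
    rate = family_wise_error_rate(0.05, n_tests)
    print(f"{n_tests:>3} tests: {rate:.1%} chance of a false positive")

# Hypothetical p-values from five comparisons in a single study.
p_values = [0.004, 0.03, 0.049, 0.20, 0.51]
print(bonferroni_significant(p_values))
```

With 20 independent tests at the 0.05 level, the chance of at least one false positive is already about 64%. In the five-comparison example the corrected threshold is 0.01, so only the first p-value survives.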
Why have some researchers and journals called for the abandonment of p-values?
-Some researchers and journals have called for the abandonment of p-values due to their frequent misuse and misunderstanding, which can lead to incorrect conclusions about the significance of study results. They argue for alternative statistical methods that provide a more nuanced understanding of the data.
What is Bayesian statistics and how does it differ from the use of p-values?
-Bayesian statistics is an alternative approach that considers the probability of both the null and alternative hypotheses, providing a ratio (Bayes factor) of how likely one explanation is compared to another. Unlike p-values, which only examine the null hypothesis, Bayesian statistics offer a more direct measure of the probability of the alternative hypothesis being true.
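A Bayes factor can be computed by hand for simple cases. The sketch below compares a fair-coin null hypothesis against an alternative in which every bias between 0 and 1 is equally plausible beforehand. That uniform prior, the data, and the name `bayes_factor_10` are assumptions made for this example, not details from the video.

```python
from math import comb


def bayes_factor_10(successes: int, trials: int) -> float:
    """Bayes factor for H1 (bias ~ Uniform(0, 1)) over H0 (bias = 0.5).

    Under H0 the likelihood is the binomial probability with p = 0.5.
    Under H1 the binomial likelihood averaged over the uniform prior
    equals 1 / (trials + 1) for any number of successes.
    """
    likelihood_h0 = comb(trials, successes) * 0.5**trials
    marginal_h1 = 1 / (trials + 1)
    return marginal_h1 / likelihood_h0


# Hypothetical data: 16 heads in 20 flips.
print(round(bayes_factor_10(16, 20), 1))  # about 10
# Hypothetical data: 10 heads in 20 flips.
print(round(bayes_factor_10(10, 20), 2))  # below 1, favouring the fair coin
```

A value near 10 means the data are about ten times more likely under the biased-coin explanation than under the fair-coin one. A value below 1 counts as evidence for the null, something a p-value alone cannot express.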
What is the two-step manuscript submission process and how might it improve scientific publishing?
-The two-step manuscript submission process involves submitting an introduction and method description first, with the journal deciding on publication before seeing the results. This approach aims to publish studies based on the quality of the science rather than the significance of the results, potentially reducing the pressure to achieve arbitrary statistical thresholds.