MINI LESSON 7: P-Values and P-Value Hacking: a simplified lecture.

N N Taleb's Probability Moocs

22 May 2021 09:25

Summary

TLDR: In this lecture, the speaker expresses skepticism about the p-value, a statistical concept often used in hypothesis testing. They argue that p-values lack a solid probabilistic basis and are frequently misunderstood. The speaker emphasizes that metrics are often stochastic and can be influenced by survivorship bias, leading to skewed results. They explain that p-values are stochastic and may not accurately represent the probability of a true effect. The lecture warns against the misuse of p-values and suggests considering alternative statistical methods, particularly in fields like psychology where sample sizes are often small.

Takeaways

😕 The speaker expresses skepticism about the p-value, suggesting it lacks a solid probabilistic basis despite its widespread use.
🔍 The lecture emphasizes that metrics are often stochastic variables, meaning they can vary with each sample, especially when the sample size is small.
📚 It's highlighted that stochastic variables can lead to survivorship bias, where only the most successful outcomes are observed, skewing the understanding of the variable's distribution.
📉 The p-value is explained as the probability of observing a statistic as extreme as, or more extreme than, the one observed, under the null hypothesis.
🤔 The speaker points out that the p-value is itself stochastic and can be skewed, with most values falling below the true mean.
🧐 The p-value's problem is exacerbated by the fact that it does not account for the sample size 'n', which is crucial for understanding its significance.
🔬 The speaker warns against 'gaming the metric' by running multiple experiments and reporting only the best one, that is, the minimum p-value, which can lead to misleading results.
📈 The minimum p-value across repeated experiments is distributed well below the true p-value, so repeating experiments can artificially lower the reported p-value and misrepresent significance.
📚 Reference is made to the speaker's book 'Statistical Consequences of Fat Tails', where further discussion on this topic can be found.
🚫 The speaker advises caution when using p-values and suggests considering alternative methods for statistical analysis.
🤓 The preference for p-values among psychologists is critiqued, implying that larger sample sizes in other fields might offer more robust results.
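
The takeaways about the p-value being stochastic and skewed can be illustrated with a small simulation. This is a minimal sketch and not the speaker's own code: the one-sided z-test, the known standard deviation of 1, the true mean of 0.2, the sample size of 30 and the function names are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean, median

STD_NORMAL = NormalDist()  # standard normal distribution

def one_sided_p_value(sample, mu0=0.0, sigma=1.0):
    """P(Z > z) under the null, where z is the z-score of the sample mean."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    return 1.0 - STD_NORMAL.cdf(z)

def simulate_p_values(true_mean, n, trials, seed=0):
    """Repeat the same experiment many times and collect each p-value."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(trials):
        # Assumed data-generating process: Gaussian with unit variance.
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        p_values.append(one_sided_p_value(sample))
    return p_values

p_values = simulate_p_values(true_mean=0.2, n=30, trials=5000)
avg = mean(p_values)
share_below_mean = sum(p < avg for p in p_values) / len(p_values)

print(f"mean p-value:   {avg:.3f}")
print(f"median p-value: {median(p_values):.3f}")
print(f"share of p-values below the mean: {share_below_mean:.2f}")
```

Under these assumptions the same experiment produces a wide spread of p-values, with the median sitting below the mean, which is the skew the takeaways describe.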

Q & A

What is the speaker's opinion on p-values?
-The speaker is not in favor of p-values, considering them to be a concept without a strong probabilistic basis and not very solid, despite their widespread use.
What are the two central points the speaker wants to recap from the session on correlation and metrification?
-The first point is that metrics are often stochastic variables that converge to something by the law of large numbers but can vary in sample, especially with small sample sizes. The second point is the presence of survivorship bias in stochastic variables, where one may only see the upper bound of outcomes.
What does the speaker mean by 'hacking of variables'?
-The speaker refers to the manipulation of stochastic variables to achieve the upper bound of results, which can be misleading as the distribution of the maximum is different from the distribution of the variable itself.
What is the p-value and how is it calculated in the context of the speaker's explanation?
-The p-value is the probability of observing a statistic as extreme as, or more extreme than, the observed value, assuming the null hypothesis is true. It is calculated as the probability that the z-score (the sample mean minus a hypothesized value, divided by the standard deviation of the sample mean) exceeds the observed threshold.
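
The calculation described in this answer can be sketched in a few lines. The sketch assumes a one-sided test with a known standard deviation; the function name `p_value_from_mean` and the example numbers are hypothetical and not taken from the lecture.

```python
from statistics import NormalDist

def p_value_from_mean(sample_mean, mu0, sigma, n):
    """One-sided p-value: P(Z > z) for a standard normal Z under the null."""
    # z-score: distance of the sample mean from the hypothesized value,
    # measured in standard deviations of the sample mean (sigma / sqrt(n)).
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    # Probability of exceeding z, i.e. one minus the normal CDF.
    return 1.0 - NormalDist().cdf(z)

# Hypothetical example: mean 0.5 over 16 observations gives z = 2.
p = p_value_from_mean(sample_mean=0.5, mu0=0.0, sigma=1.0, n=16)
print(f"p-value: {p:.4f}")
```

Here the sample size n enters only through the z-score; the p-value itself is just the tail probability beyond that z.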
Why does the speaker believe p-values are problematic?
-The speaker argues that p-values are problematic because they are stochastic and do not inherently account for the sample size (n), which can lead to misunderstandings about the significance of the results.
What is the issue with p-values being stochastic according to the speaker?
-The issue is that p-values themselves can vary and are skewed, with most values falling below their true mean, which can lead to incorrect conclusions about the significance of the results.
What is the 'survival function' mentioned by the speaker?
-The survival function is the probability that a random variable is greater than a certain value. In the context of p-values, it refers to the probability of exceeding a certain threshold under the assumption of a null hypothesis.
Why does the speaker suggest that the p-value should be considerably smaller than 1?
-The speaker suggests this because if the true p-value is significantly higher (e.g., 0.11), then achieving a p-value of 0.01 through multiple experiments is misleading and not representative of the actual probability.
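
The effect of repeating experiments and keeping the best one can be sketched as follows. This is an illustration under stated assumptions, not the speaker's derivation: the z-statistic is drawn directly from a normal distribution whose shift of 1.73 is a hypothetical value chosen so that the average p-value is roughly 0.11, and five repetitions is an arbitrary choice.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def draw_p_value(rng, shift):
    """One experiment: draw a z-statistic and convert it to a one-sided p-value."""
    z = rng.gauss(shift, 1.0)
    return 1.0 - STD_NORMAL.cdf(z)

def share_below(threshold, shift, experiments, runs, seed=0):
    """Share of runs where the smallest of `experiments` p-values beats `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        best = min(draw_p_value(rng, shift) for _ in range(experiments))
        if best < threshold:
            hits += 1
    return hits / runs

SHIFT = 1.73  # hypothetical: gives an average p-value of roughly 0.11

single = share_below(0.01, SHIFT, experiments=1, runs=20000)
hacked = share_below(0.01, SHIFT, experiments=5, runs=20000)

print(f"share of single experiments with p < 0.01: {single:.2f}")
print(f"share of best-of-5 results with p < 0.01:  {hacked:.2f}")
```

Keeping only the smallest of several p-values makes a result below 0.01 far more likely than in a single experiment, even though the underlying effect has not changed.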
What does the speaker imply about the use of p-values in psychology?
-The speaker implies that psychologists may prefer p-values because they allow for smaller sample sizes, which might be easier to manage in a college campus setting but can lead to flawed quantitative analysis.
What advice does the speaker give regarding the use of p-values?
-The speaker advises caution with p-values and recommends considering alternative methods of analysis, to avoid the pitfalls of their stochastic nature and their potential for misinterpretation.