If you’ve ever taken an introductory course in statistics or tried to read a publication in a scientific journal, you know what p-values are. Or at least you’ve seen them. Most of the time they appear in the “Results” section of a paper, attached to claims that need verification. For example:
- “Ratings of the target person’s ‘dating desirability’ showed the predicted effect of prior stimuli, […], p < .002.”
The stuff in the square brackets is usually other relevant statistics, such as the mean difference between experimental groups. If the p-value is below a certain threshold, the result is labeled “statistically significant”; otherwise, it’s labeled “not significant”. But what does that mean? What is the result significant for? And for whom? And what does any of that say about the credibility of the claim the p-value is attached to?
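To make the labeling convention concrete, here’s a minimal sketch in Python. It assumes the conventional threshold of 0.05 (the function name and the threshold value are just illustrative choices; the post defines the underlying concepts properly later):

```python
def significance_label(p_value: float, alpha: float = 0.05) -> str:
    """Label a result the way papers typically report it:
    'statistically significant' if the p-value falls below the
    chosen threshold (alpha), 'not significant' otherwise."""
    return "statistically significant" if p_value < alpha else "not significant"


# Example usage with the p-value from the quote above and a larger one:
print(significance_label(0.002))  # statistically significant
print(significance_label(0.27))   # not significant
```

Note that the label by itself says nothing about *why* that threshold was chosen or what crossing it actually tells you, which is exactly the question this post addresses.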
P-values and the related concept of “statistical significance” are widely misinterpreted. In this post, I’m going to properly define both concepts and build the intuition behind their correct interpretation.
If you don’t have much experience with probabilities, I suggest you take a look at the introductory sections of my post about Bayes’ theorem, where I also introduce some basic probability theory concepts and notation.