Null hypothesis Archives

What is statistics about?

Well, imagine you obtained some data from a particular collection of things. It could be the heights of individuals within a group of people, the weights of cats in a clowder, the number of petals in a bouquet of flowers, and so on.

Such collections are called samples and you can use the obtained data in two ways. The most straightforward thing you can do is give a detailed description of the sample. For example, you can calculate some of its useful properties:

The average of the sample
The spread of the sample (how much individual data points differ from each other), also known as its variance
The number or percentage of individuals who score above or below some constant (for example, the number of people whose height is above 180 cm)
Etc.

You only use these quantities to summarize the sample. And the discipline that deals with such calculations is descriptive statistics.

But what if you wanted to learn something more general than just the properties of the sample? What if you wanted to find a pattern that doesn’t just hold for this particular sample, but also for the population from which you took the sample? The branch of statistics that deals with such generalizations is inferential statistics and is the main focus of this post.

The two general “philosophies” in inferential statistics are frequentist inference and Bayesian inference. I’m going to highlight the main differences between them — in the types of questions they formulate, as well as in the way they go about answering them.

But first, let’s start with a brief introduction to inferential statistics.

[Read more…]

If you ever took an introductory course in statistics or attempted to read a publication in a scientific journal, you know what p-values are. Оr at least you’ve seen them. Most of the time they appear in the “results” section of a paper, attached to claims that need verification. For example:

“Ratings of the target person’s ‘dating desirability’ showed the predicted effect of prior stimuli, […], p < 002.”

The stuff in the square brackets is usually other relevant statistics, such as the mean difference between experimental groups. If the p-value is below a certain threshold, the result is labeled “statistically significant” and otherwise it’s labeled “not significant”. But what does that mean? What is the result significant for? And for whom? What does all of that say about the credibility of the claim preceded by the p-value?

There are common misinterpretations of p-values and the related concept of “statistical significance”. In this post, I’m going to properly define both concepts and show the intuition behind their correct interpretation.

If you don’t have much experience with probabilities, I suggest you take a look at the introductory sections of my post about Bayes’ theorem, where I also introduce some basic probability theory concepts and notation.

[Read more…]

Frequentist and Bayesian Approaches in Statistics

Intuitive Explanation of P-Values