Imagine you obtained some data from a particular collection of things. It could be the heights of individuals within a group of people, the weights of cats in a clowder, the number of petals in a bouquet of flowers, and so on.
Such collections are called samples and you can use the obtained data in two ways. The most straightforward thing you can do is give a detailed description of the sample. For example, you can calculate some of its useful properties:
- The average of the sample
- The spread of the sample (how much individual data points differ from each other), commonly measured by its variance
- The number or percentage of individuals who score above or below some constant (for example, the number of people whose height is above 180 cm)
These quantities only summarize the sample itself. The discipline that deals with such calculations is descriptive statistics.
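To make the three summaries above concrete, here is a minimal Python sketch. The heights are made-up values chosen purely for illustration, and the variable names (`heights_cm`, `threshold_cm`) are my own. I'm assuming the variance is meant to describe the sample itself, so the sketch uses `statistics.pvariance`, which divides by the number of data points.

```python
from statistics import mean, pvariance

# Hypothetical heights in cm (made-up values, for illustration only)
heights_cm = [165, 172, 181, 158, 190, 176, 183, 169]

# The average of the sample
sample_mean = mean(heights_cm)

# The spread of the sample, measured as the variance of these data points
# (pvariance divides by n, treating the data as the whole group described)
sample_variance = pvariance(heights_cm)

# The number and percentage of individuals above a constant
threshold_cm = 180
count_above = sum(1 for h in heights_cm if h > threshold_cm)
share_above = count_above / len(heights_cm)

print(f"Mean height: {sample_mean} cm")
print(f"Variance: {sample_variance} cm^2")
print(f"Above {threshold_cm} cm: {count_above} people ({share_above:.1%})")
```

Everything computed here is a statement about these eight numbers and nothing else. The same module also offers `statistics.variance`, which divides by n - 1 and is the usual choice once the goal shifts from describing a sample to estimating something about a larger population.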
But what if you wanted to learn something more general than just the properties of the sample? What if you wanted to find a pattern that holds not just for this particular sample, but also for the population from which you took it? The branch of statistics that deals with such generalizations is inferential statistics, and it is the main focus of this post.
The two general “philosophies” in inferential statistics are frequentist inference and Bayesian inference. I’m going to highlight the main differences between them, both in the types of questions they formulate and in the way they go about answering them.
But first, let’s start with a brief introduction to inferential statistics.