Binomial Distribution Mean and Variance Formulas (Proof)

Posted on May 19, 2020 Written by The Cthaeh

[Image: A skyscraper resembling a binomial distribution]

This is a bonus post for my main post on the binomial distribution. Here I want to give a formal proof for the binomial distribution mean and variance formulas I previously showed you.

This post is part of my series on discrete probability distributions.

In the main post, I told you that these formulas are:

    \[ \textrm{Mean} = np \]

    \[ \textrm{Variance} = np (1-p) \]


I also gave you an intuitive derivation of these formulas there. The intuition was related to the properties of sums of independent random variables: namely, the mean and variance of such a sum are equal to the sum of the means/variances of the individual random variables that form it. We could prove this statement itself too, but I don’t want to do that here and I’ll leave it for a future post.
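
By the way, if you’d like to see this intuition in action before the formal proof, here’s a minimal simulation sketch in Python (the parameters n = 10 and p = 0.3, the sample size, and the seed are all just illustrative choices):

    import numpy as np

    rng = np.random.default_rng(seed=0)
    n, p = 10, 0.3  # illustrative parameters

    # A binomial random variable is the sum of n independent Bernoulli(p) trials.
    samples = rng.binomial(n, p, size=200_000)

    print(samples.mean())  # close to n*p       = 3.0
    print(samples.var())   # close to n*p*(1-p) = 2.1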

Instead, I want to take the general formulas for the mean and variance of discrete probability distributions and derive the specific binomial distribution mean and variance formulas from the binomial probability mass function (PMF):

    \[ P(k) = \binom{n}{k} p^k(1-p)^{n-k} \]
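
If it helps to have something concrete to play with, here’s a direct transcription of this PMF into Python (a small sketch; math.comb computes the binomial coefficient):

    from math import comb

    def binomial_pmf(k: int, n: int, p: float) -> float:
        """P(K = k) for a binomial distribution with parameters n and p."""
        return comb(n, k) * p**k * (1 - p)**(n - k)

    # The probabilities over the full support k = 0..n sum to 1:
    print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))  # 1.0 (up to rounding)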

To do that, I’m first going to derive a few auxiliary arithmetic properties and equations. We’re going to use those as pieces of the main proofs, but their usefulness extends well beyond this post and you can apply them in many other derivations.

I think this post will be a great exercise for those of you who don’t have much experience with formal derivations of mathematical formulas. I’m going to be as explicit as I can and try not to skip even the smallest steps. So, don’t be scared by the quantity of equations in this post. I promise, you’ll be able to follow everything!

And if you happen to get stuck somewhere, I’m going to answer all of your questions in the comments. Don’t hesitate to ask me anything.

Table of Contents

  • Auxiliary properties and equations
    • A property of the binomial coefficient
  • Mean of binomial distributions derivation
    • Proof steps
    • The final proof
  • Variance of binomial distributions derivation
    • Proof steps
    • The final proof
  • Summary
    • Mean of binomial distributions proof
    • Variance of binomial distributions proof

Auxiliary properties and equations

To make it easy to refer to them later, I’m going to label the important properties and equations with numbers, starting from 1. These identities are all we need to prove the binomial distribution mean and variance formulas. The derivations I’m going to show you also generally rely on arithmetic properties and, if you’re not too experienced with those, you might benefit from going over my post breaking down the main ones.

The first two equations are two important identities involving the sum operator which I proved in my recent post on the topic:

(1)   \begin{equation*}\sum_{k}^{n} c \cdot x_k = c \cdot \sum_{k}^{n} x_k\end{equation*}


(2)   \begin{equation*}\sum_{k}^{n} (x_k + y_k) = \sum_{k}^{n} x_k + \sum_{k}^{n} y_k\end{equation*}


Next, in another recent post on different variance formulas, I showed you the following alternative variance formula for a random variable X with mean M:

(3)   \begin{equation*}\textrm{Variance}(X) = \mathbb{E}[X^2] - M^2\end{equation*}


It’s a pretty nice formula used in many derivations, not just the ones I’m about to show you (for more intuition, check out the link above). According to this formula, the variance can also be expressed as the expected value of X^2 minus the square of its mean. As a reminder (and for comparison), here’s the main variance formula:

    \[\textrm{Variance}(X) = \sum_{n=1}^{N} (M - x_n)^2 \cdot P(x_n)\]
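
To convince yourself that equation (3) and the main formula always agree, you can check them against each other on any small discrete distribution. Here’s a sketch where the values and probabilities are arbitrary:

    # An arbitrary discrete distribution: values x_n with probabilities P(x_n)
    xs = [0, 1, 2, 5]
    ps = [0.1, 0.4, 0.3, 0.2]

    M = sum(x * p for x, p in zip(xs, ps))  # the mean

    # Main formula: probability-weighted squared deviations from the mean
    var_main = sum((M - x) ** 2 * p for x, p in zip(xs, ps))

    # Alternative formula (3): E[X^2] - M^2
    var_alt = sum(x**2 * p for x, p in zip(xs, ps)) - M**2

    print(var_main, var_alt)  # both ~2.6, identical up to floating-point rounding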

A property of the binomial coefficient

Finally, I want to show you a simple property of the binomial coefficient which we’re going to use in proving both formulas. Remember the binomial coefficient formula:

    \[ \binom{n}{k} = \frac{n!}{(n-k)!k!} \]


The first useful result I want to derive is for the expression \binom{n-1}{k-1}. Let’s apply the formula to this expression and simplify:

    \[ \binom{n-1}{k-1} = \frac{(n-1)!}{((n-1) - (k-1))!(k-1)!} \]

    \[ = \frac{(n-1)!}{(n-k)!(k-1)!} \]


Therefore:

    \[\binom{n-1}{k-1} = \frac{(n-1)!}{(n-k)!(k-1)!}\]



Now let’s do something else. One of the simplest properties of the factorial function is:

    \[ x! = x \cdot (x - 1)! \]


I want to use this to derive the main property of binomial coefficients we’re interested in here. First, let’s use it to rewrite the right-hand side of the binomial coefficient formula in the following way:

    \[ \binom{n}{k} = \frac{n!}{(n-k)!k!} = \frac{n(n-1)!}{(n-k)!k(k-1)!} \]


And using the commutative property of multiplication (x \cdot y = y \cdot x), we can rewrite the right-hand side as:

    \[ \frac{n}{k} \cdot \frac{(n-1)!}{(n-k)!(k-1)!} \]


Now let’s say we start with another expression: k \cdot \binom{n}{k}. Using the result above, we can equate it to:

    \[ k \cdot \binom{n}{k} = k \cdot \frac{n}{k} \cdot \frac{(n-1)!}{(n-k)!(k-1)!} \]

    \[ = n \cdot \frac{(n-1)!}{(n-k)!(k-1)!} \]


In the last step, I simply canceled out the two k’s. Finally, using \binom{n-1}{k-1} = \frac{(n-1)!}{(n-k)!(k-1)!} (derived above), we get the following identity:

(4)   \begin{equation*}k \cdot \binom{n}{k} = n \cdot \binom{n-1}{k-1}\end{equation*}
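
Since equation (4) does the heavy lifting in both proofs, here’s a quick brute-force check of it in Python (the ranges of n and k are arbitrary):

    from math import comb

    # Check k * C(n, k) == n * C(n-1, k-1) for all 1 <= k <= n
    assert all(
        k * comb(n, k) == n * comb(n - 1, k - 1)
        for n in range(1, 50)
        for k in range(1, n + 1)
    )
    print("equation (4) holds for all tested n and k")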


And with all that out of the way, let’s get to the main proofs of today’s post!

Mean of binomial distributions derivation

Well, here we reach the main point of this post! Let’s use these equations and properties to derive the formulas we’re interested in.

First is the mean. Here’s the general formula again:

    \[ \textrm{Mean}(X) = \mathbb{E}[X] = \sum_{n=1}^{N} x_n \cdot P(x_n) \]

Let’s plug the binomial distribution PMF into this formula. To be consistent with the binomial distribution notation, I’m going to use k for the argument (instead of x), and the index of the sum will naturally range from 0 to n. So, with M(K) = \mathbb{E}[K] in mind, we have:

    \[ \mathbb{E}[K] = \sum_{k=0}^{n} k \cdot \binom{n}{k} p^k(1-p)^{n-k} \]


But notice that when k = 0, the whole first term is zero and doesn’t contribute to the overall sum. Therefore, we can also write the formula with the index starting from k = 1:

    \[ \mathbb{E}[K] = \sum_{k=1}^{n} k \cdot \binom{n}{k} p^k(1-p)^{n-k} \]


Now let’s see how we can manipulate the right-hand side to get the desired \mathbb{E}[K] = np. Let’s do the proof step by step.

Proof steps

The first step in the derivation is to apply the binomial property k \cdot \binom{n}{k} = n \cdot \binom{n-1}{k-1} from equation (4) to the right-hand side:

    \[ \mathbb{E}[K] = \sum_{k=1}^{n} n \cdot \binom{n-1}{k-1} p^k(1-p)^{n-k} \]

    \[ = n \cdot \sum_{k=1}^{n} \binom{n-1}{k-1} p^k(1-p)^{n-k} \]


In the second line, I simply used equation (1) to get n out of the sum operator (because it doesn’t depend on k).

Next, we’re going to use the product rule of exponents:

    \[ x^n \cdot x^m = x^{n+m} \]


A special case of this rule is:

    \[ x^n = x \cdot x^{n-1} \]



And we’re going to use it to rewrite p^k as p \cdot p^{k-1}:

    \[ \mathbb{E}[K] = n \cdot \sum_{k=1}^{n} \binom{n-1}{k-1} pp^{k-1}(1-p)^{n-k} \]

    \[ = np \cdot \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1}(1-p)^{n-k} \]


Again, in the last line I simply moved the constant factor p outside of the sum operator. Finally, let’s apply the identity n - k = (n - 1) - (k - 1) to the exponent of (1 - p) (you’ll see why we do this in a moment):

    \[ \mathbb{E}[K] = np \cdot \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1}(1-p)^{(n - 1) - (k - 1)} \]

The final proof

Believe it or not, we’re almost done here. Notice that, after the last manipulation, there are a lot of terms like n - 1 and k - 1 inside the sum operator. To make the expression a little more readable, let’s rewrite it by applying the following variable substitutions:

    \[ n - 1 = m \]

    \[ k - 1 = j \]


This results in:

    \[ \mathbb{E}[K] = np \cdot \sum_{j=0}^{m} \binom{m}{j} p^j(1-p)^{m-j} \]


Here j starts from 0 because j = k – 1 (the k index used to start from 1 before the variable substitution). And because the number of terms in the sum must be preserved, the index runs until n – 1 = m.
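
If the new bounds still feel suspicious, here’s a tiny sketch confirming that this kind of reindexing preserves the sum. The function f stands in for an arbitrary summand and n = 7 is an arbitrary choice:

    n = 7                       # arbitrary
    f = lambda j: (j + 1) ** 2  # arbitrary stand-in for the summand

    # Summing f(k - 1) for k = 1..n equals summing f(j) for j = 0..n-1:
    lhs = sum(f(k - 1) for k in range(1, n + 1))
    rhs = sum(f(j) for j in range(0, n))  # with m = n - 1, j runs from 0 to m
    print(lhs == rhs)  # True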

Now, do you recognize the term inside the sum operator?

    \[ \binom{m}{j} p^j(1-p)^{m-j} \]


It looks exactly like the binomial PMF, doesn’t it? Only k has been replaced with j and n with m. And since the sum is from 0 to m, this is simply the sum of probabilities of all outcomes, right? Then, by definition:

    \[ \sum_{j=0}^{m} \binom{m}{j} p^j(1-p)^{m-j} = 1 \]


And plugging this last result into what we have so far, we get:

    \[ \mathbb{E}[K] = np \cdot \sum_{j=0}^{m} \binom{m}{j} p^j(1-p)^{m-j} \]

    \[ = np \cdot 1 = np \]


Therefore, we can now confidently state:

    \[ \mathbb{E}[K] = np \]


Q.E.D.
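
Although it’s no substitute for the proof, we can also confirm the result numerically by evaluating the defining sum with exact rational arithmetic. The parameters n = 12 and p = 1/3 below are illustrative:

    from fractions import Fraction
    from math import comb

    n, p = 12, Fraction(1, 3)  # illustrative; Fraction keeps the arithmetic exact

    # E[K] computed directly from the definition of the mean
    mean = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
    print(mean == n * p)  # True (both equal 4)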

Variance of binomial distributions derivation

Now it’s time to prove the variance formula. Remember the general variance formula for discrete probability distributions:

    \[ \textrm{Variance}(X) = \sum_{n=1}^{N} (M - x_n)^2 \cdot P(x_n) \]

Like before, for the argument of the PMF I’m going to use k, instead of x. Furthermore, I’m going to use the alternative variance formula from equation (3) we derived earlier:

    \[ \textrm{Variance}(K) = \mathbb{E}[K^2] - \mathbb{E}[K]^2 \]


Using the result from the previous section:

    \[ \mathbb{E}[K] = np \]


And we can plug in \mathbb{E}[K]^2 = (np)^2 to get:

    \[ \textrm{Variance}(K) = \mathbb{E}[K^2] - (np)^2 \]


We already solved half of the problem! Now let’s focus on the \mathbb{E}[K^2] term. Using the binomial PMF, this expected value is equal to:

    \[ \mathbb{E}[K^2] = \sum_{k=0}^{n} k^2 \cdot \binom{n}{k} p^k(1-p)^{n-k} \]


Like before, when k = 0 the first term in the sum becomes zero again. So we can similarly write the same sum with the index starting from 1:

    \[ \mathbb{E}[K^2] = \sum_{k=1}^{n} k^2 \cdot \binom{n}{k} p^k(1-p)^{n-k} \]


With this setup, let’s start the actual proof. The focus is going to be on manipulating the last equation. When we’re done with that, we’re going to plug the final result into the main formula.

You’ll see that the mathematical tricks we use are going to be very similar to the ones we used in the previous proof.

Proof steps

First, using equation (4), let’s rewrite the k^2 \cdot \binom{n}{k} part inside the sum as:

    \[ k^2 \cdot \binom{n}{k} = k \cdot k \cdot \binom{n}{k} = k \cdot n \cdot \binom{n-1}{k-1} \]


Plugging this in and taking the constant term n out of the sum operator, we get:

    \[ \mathbb{E}[K^2] = n \cdot \sum_{k=1}^{n} k \cdot \binom{n-1}{k-1} p^k(1-p)^{n-k} \]


Next, let’s use the p^k = pp^{k - 1} identity to rewrite this as:

    \[ \mathbb{E}[K^2] = np \cdot \sum_{k=1}^{n} k \cdot \binom{n-1}{k-1} p^{k-1}(1-p)^{n-k} \]

    \[ = np \cdot \sum_{k=1}^{n} k \cdot \binom{n-1}{k-1} p^{k-1}(1-p)^{(n-1) - (k-1)} \]

Simplifying the sum

Now let’s ignore the constant product np for a moment and just focus on the sum. Let’s apply the same variable substitution rules as before:

    \[ n - 1 = m \]

    \[ k - 1 = j \]


to rewrite the sum as:

    \[ \sum_{k=1}^{n} k \cdot \binom{n-1}{k-1} p^{k-1}(1-p)^{(n-1) - (k-1)} = \]

    \[ \sum_{j=0}^{m} (j + 1) \cdot \binom{m}{j} p^j(1-p)^{m-j} \]


Next, let’s use equation (2) to split this sum into two sums by expanding (j + 1) \binom{m}{j} p^j(1-p)^{m-j} with the distributive property:

    \[ = \sum_{j=0}^{m} j \cdot \binom{m}{j} p^j(1-p)^{m-j} + \sum_{j=0}^{m} \binom{m}{j} p^j(1-p)^{m-j} \]


Well, these individual sums are nothing but the expected value and the sum of probabilities of a binomial distribution:

    \[ \sum_{j=0}^{m} j \cdot \binom{m}{j} p^j(1-p)^{m-j} = mp = (n-1)p \]

    \[ \sum_{j=0}^{m} \binom{m}{j} p^j(1-p)^{m-j} = 1 \]


Therefore, the final sum reduces to:

    \[ \sum_{j=0}^{m} (j + 1) \cdot \binom{m}{j} p^j(1-p)^{m-j} = (n-1)p + 1 \]


And when we plug this into the full expression for \mathbb{E}[K^2], we get:

    \[ \mathbb{E}[K^2] = np \cdot \sum_{k=1}^{n} k \cdot \binom{n-1}{k-1} p^{k-1}(1-p)^{n-k} \]

    \[ = np \cdot \left((n-1)p + 1\right) = (np)^2 - np^2 + np \]


Which we can rewrite as:

    \[ (np)^2 - np^2 + np = (np)^2 + np(1-p) \]


So, finally we get:

    \[ \mathbb{E}[K^2] = (np)^2 + np(1-p) \]
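
Before assembling the final result, here’s a quick exact check of this intermediate identity (with the same illustrative parameters as in the mean check earlier):

    from fractions import Fraction
    from math import comb

    n, p = 12, Fraction(1, 3)  # illustrative parameters

    # E[K^2] computed directly from the definition
    e_k2 = sum(k**2 * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
    print(e_k2 == (n * p) ** 2 + n * p * (1 - p))  # True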

The final proof

We’re in the homestretch. Let’s remember what we started with.

    \[ \textrm{Variance}(K) = \mathbb{E}[K^2] - \mathbb{E}[K]^2 = \mathbb{E}[K^2] - (np)^2 \]


In the previous section we established that:

    \[ \mathbb{E}[K^2] = (np)^2 + np(1-p) \]


Therefore:

    \[ \textrm{Variance}(K) = (np)^2 + np(1-p) - (np)^2 \]

    \[ = np(1-p) \]


Which is what we wanted to prove!

Q.E.D.
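
As one last numerical sanity check, the variance computed straight from the main definition agrees with np(1-p) (again with illustrative parameters and exact rational arithmetic):

    from fractions import Fraction
    from math import comb

    n, p = 12, Fraction(1, 3)  # illustrative parameters

    mean = n * p
    variance = sum((mean - k) ** 2 * comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(n + 1))
    print(variance == n * p * (1 - p))  # True (both equal 8/3)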

Summary

In this post, I showed you a formal derivation of the binomial distribution mean and variance formulas. This is the first formal proof I’ve ever done on my website and I’m curious if you found it useful. Let me know if it was easy to follow.

Before the actual proofs, I showed a few auxiliary properties and equations.

The two properties of the sum operator (equations (1) and (2)):

    \[ \sum_{k}^{n} c \cdot x_k = c \cdot \sum_{k}^{n} x_k \]

    \[ \sum_{k}^{n} (x_k + y_k) = \sum_{k}^{n} x_k + \sum_{k}^{n} y_k \]


An alternative formula for the variance of a random variable (equation (3)):

    \[ \textrm{Variance}(X) = \mathbb{E}[X^2] - M^2 = \mathbb{E}[X^2] - \mathbb{E}[X]^2 \]


The binomial coefficient property (equation (4)):

    \[ k \cdot \binom{n}{k} = n \cdot \binom{n-1}{k-1} \]

Using these identities, as well as a few simple mathematical tricks, we derived the binomial distribution mean and variance formulas. In the last two sections below, I’m going to give a summary of these derivations.

I know there was a lot of mathematical expression manipulation, some of which was a little bit on the hairy side. However, I’m firmly convinced that even less experienced readers can understand these proofs. If you struggled to follow any part of this post (or even the post as a whole), don’t hesitate to ask me any question!

By the way, if you’re new to mathematical proofs but find it an interesting subject, check out this Wikipedia article on mathematical proofs which gives a good overview of the subject.

Mean of binomial distributions proof

We start by plugging in the binomial PMF into the general formula for the mean of a discrete probability distribution:

    \[ \textrm{Mean}(K) = \mathbb{E}[K] \]

    \[ = \sum_{k=0}^{n} k \cdot \binom{n}{k} p^k(1-p)^{n-k} = \sum_{k=1}^{n} k \cdot \binom{n}{k} p^k(1-p)^{n-k} \]


Then we use k \cdot \binom{n}{k} = n \cdot \binom{n-1}{k-1} and p^k = pp^{k-1} to rewrite it as:

    \[ = n \cdot \sum_{k=1}^{n} \binom{n-1}{k-1} p^k(1-p)^{n-k} \]

    \[ = np \cdot \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1}(1-p)^{(n - 1) - (k - 1)} \]


Finally, we use the variable substitutions m = n – 1 and j = k – 1 and simplify:

    \[ = np \cdot \sum_{j=0}^{m} \binom{m}{j} p^j(1-p)^{m-j} \]

    \[ = np \cdot 1 = np \]


Q.E.D.

Variance of binomial distributions proof

Again, we start by plugging in the binomial PMF into the general formula for the variance of a discrete probability distribution:

    \[ \textrm{Variance}(K) = \mathbb{E}[K^2] - \mathbb{E}[K]^2 \]

    \[ = \sum_{k=0}^{n} \left( k^2 \cdot \binom{n}{k} p^k(1-p)^{n-k} \right) - (np)^2 = \sum_{k=1}^{n} \left( k^2 \cdot \binom{n}{k} p^k(1-p)^{n-k} \right) - (np)^2 \]


Then we use k \cdot \binom{n}{k} = n \cdot \binom{n-1}{k-1} and p^k = pp^{k-1} to rewrite it as:

    \[ = n \cdot \sum_{k=1}^{n} \left( k \cdot \binom{n-1}{k-1} p^k(1-p)^{n-k} \right) - (np)^2 \]

    \[ = np \cdot \sum_{k=1}^{n} \left( k \cdot \binom{n-1}{k-1} p^{k-1}(1-p)^{(n - 1) - (k - 1)} \right) - (np)^2 \]


Next, we use the variable substitutions m = n – 1 and j = k – 1:

    \[ = np \cdot \sum_{j=0}^{m} \left( (j + 1) \cdot \binom{m}{j} p^j(1-p)^{m-j} \right) - (np)^2 \]

    \[ = np \cdot \sum_{j=0}^{m} j \cdot \binom{m}{j} p^j(1-p)^{m-j} + np \cdot \sum_{j=0}^{m} \binom{m}{j} p^j(1-p)^{m-j} - (np)^2 \]


Finally, we simplify:

    \[ = np \cdot p(n-1) + np - (np)^2 \]

    \[ = (np)^2 + np(1-p) - (np)^2 = np(1-p) \]


Q.E.D.


Comments

  1. Sergey says

    June 4, 2020 at 1:21 pm

    Great job! I definitely found the proofs useful and easy to follow. The refresher on the various mathematical properties was a nice reminder (it’s been a while…). Thanks!

  2. Durjoy says

    December 20, 2020 at 8:22 pm

    Thank you.

  3. Swami says

    January 23, 2021 at 9:17 am

    Excellent proof…Simple to understand

  4. Andrew kingsley says

    February 20, 2021 at 7:18 pm

    I find it difficult to understand some of the manipulations

    • The Cthaeh says

      February 21, 2021 at 3:28 am

      Hey Andrew, until which part do you follow everything before you reach the first difficulty? Which step(s) in particular confuse you? Let me know and I’ll help you get through it.

  5. deinma says

    May 17, 2021 at 1:41 pm

    thank you so much, it was thoroughly explained, what I don’t understand is why m replaced n and j replaced k

    • The Cthaeh says

      May 18, 2021 at 2:55 am

      Hi Deinma, I’m glad you found the explanation helpful! The change of variables is really just for readability purposes. If a computer was doing the same proof, this step might be unnecessary.

      Take a look at the final proof section for the mean, for example. In particular, notice the last expression for the mean formula just before that section. There are all these (n-1) and (k-1) terms inside the sum, right?

          \[\sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1}(1-p)^{(n - 1) - (k - 1)}\]

      Now, do you recognize that this sum is the binomial distribution PMF formula? I personally might not immediately recognize it if I see it in this form. On the other hand, if we apply the variable substitution j = k-1 and m = n-1, the same sum becomes:

          \[\sum_{j=0}^{m} \binom{m}{j} p^j(1-p)^{m-j}\]

      Which is an identical expression, but much easier to read. And now that we recognize it as the sum over all possible values of a PMF, we know the whole sum is equal to 1 and we can completely ignore it. In principle, I could have used the same letters (k and n) but that would introduce a different type of confusion when comparing the current expression to previous steps.

      Variable substitution (also known as ‘change of variables’) is a very common technique in algebra and math in general. It’s typically used for simplifying expressions or putting expressions into a familiar form where we can then substitute the whole thing with a known formula (as was the case in the proofs here).

      Let me know if I managed to clarify things for you.

    • Symon says

      June 16, 2022 at 12:09 am

      Hi,

      Can you explain why the upper index of summation should not be m+1 since
      m=n-1. Note that the original summation had n as the upper index, hence
      n=m+1 if we go by the substitution that you just enforced.

      Otherwise, everything else is very insightful!

      I like your proofs!

      Symon

    • The Cthaeh says

      June 18, 2022 at 4:07 pm

      Hi Symon,

      This is among the confusing parts of the post, it’s not just you. Could you take a look at my reply to Yuki’s question further down the page? I think you are asking the same thing. Of course, let me know if you still find something unclear, I’ll be happy to clarify.

  6. AZ says

    June 3, 2021 at 10:48 am

    Thanks

  7. mehrad says

    June 17, 2021 at 12:26 pm

    thank you

  8. Yan Jiang says

    August 24, 2021 at 10:26 am

    Small error: the second last step should be
    np(n-1)p+np-(np)^2
    Rather than
    npn(p-1)+np-(np)^2

    • The Cthaeh says

      August 24, 2021 at 1:59 pm

      Oh yeah, another annoying typo 🙂 Thanks for pointing it out, Yan. Fixed!

  9. Emilio Cendejas says

    October 8, 2021 at 3:55 am

    Hi Andrew, thank you for this.

    Right before reaching the end, when you write “Finally, we simplify:”, what do you do to the binomial coefficient? The same exact process you did before changing the variables? I mean, like now z = y- 1 and r = m – 1? And because of that, we know it sums up to 1?

    Thanks again.

    • The Cthaeh says

      October 8, 2021 at 7:34 am

      Hey Emilio,

      My name isn’t Andrew, I’m assuming you confused it with one of the earlier commenters 🙂 No worries.

      Yes, the change of variables is simply meant to put the sums into the canonical forms for the respective formulas (the PMF, the mean, etc.). It’s a step that people sometimes skip because they just spot the pattern and do the simplification in their heads.

      Let me know if you need further clarification.

  10. Failo Taufa says

    October 21, 2021 at 1:11 am

    Very nice derivation and easy to follow. Thanks for the good work.

  11. Umi says

    October 28, 2021 at 4:48 pm

    When you derive E[k^2], why don’t all k become k^2?

    • The Cthaeh says

      November 1, 2021 at 2:19 pm

      Hi Umi, I’m not sure I understand your question. Could you show the exact step at which you stop following?

    • Conor says

      September 17, 2024 at 10:33 pm

      Hi Umi,

      I’ll follow up on this one as I am somewhat unclear on this also. If we consider the expected value of a BPD to be:

      E[x] = \sum{x*P(X = x)}

      Then why is k^2 not substituted into the probability function?

      For example, the proof shows:

      E[k^2] = \sum{k^2*P(X = k)}

      whereas I was expecting:

      x = k^2 \therefore E[k^2] = (k^2) * P(X = (k^2))

      Does that make any sense where my confusion is coming from?

  12. Dr G N Dar says

    November 7, 2021 at 4:57 am

    i did by other way, my confusion was for example in mean after substitution it is ok y starts from 0 but n must have been m-1 through the formula (x-1)=y but issue stands resolved with your comment that number of terms have to remain same.

  13. Yuki says

    November 10, 2021 at 6:35 pm

    In the proof of distribution mean, you let n-1 = m. Then, the end of the summation should be m+1 instead of m. So, I don’t get it.

    • The Cthaeh says

      November 11, 2021 at 7:27 am

      Hi, Yuki. You’re trying to understand why the upper bound of the sum is m, instead of (m+1), correct?

      Notice that after we made the variable substitution n – 1 = m, the lower bound of the sum also changes (from 1 to 0). Basically, the idea is to have an identical sum. Why is the sum running from j=0 to m the same as a sum running from k=1 to n? To see it more clearly, take a specific value for n. For example, n=3.

      In the original sum, since k starts at 1 and ends at 3, the k’s in the expression at each iteration will be 1, 2, and 3. Which means that the (k-1) terms in the original sum will be 0, 1, and 2, respectively. Are you with me so far?

      Then, when we apply the variable substitution, the sum runs from 0 to m. Because m = n – 1, we know the j’s will run from 0 to 2, right? And inside the sum, everywhere we had (k-1), we now have j instead. So, just like they were in the original sum, the respective values inside the sum term will be 0, 1, and 2. Furthermore, everywhere we had (n-1) in the original expression, now we have m, which is equal to n – 1. Therefore, we know the two expressions are identical, one is just a little tidier.

      On the other hand, if we let the upper bound be m+1 instead, then we would have an extra term for j=3, which we didn’t have in the original sum.

      Bottom line is, if we decrease the lower bound by 1, we also need to decrease the upper bound by 1, otherwise we will have an extra term (and, therefore, the two sum expressions will be different).

      Does that make sense?

    • Symon says

      June 16, 2022 at 12:10 am

      I have the same question.

      Thank you!

  14. Pedson says

    March 7, 2022 at 6:36 pm

    i understood it very well
    thank you

  15. Ray says

    September 8, 2022 at 9:30 am

    Hey. This was perfect. Thanks.

  16. Shubham says

    October 5, 2022 at 3:46 pm

    Clear and thoroughly explained. The derivation in my book was very ambiguous and I could not clearly understand it. But your derivation really cleared my concepts.
    Thanks.

  17. Shuhul Mujoo says

    November 17, 2022 at 9:24 pm

    Thank you so much for this! It really helped me a lot

  18. Aphrodice says

    January 28, 2023 at 10:15 am

    I love how you combine different formulas to prove something!

  19. PKP says

    August 9, 2023 at 12:01 pm

    Wonderful proof!
    Initially, I too got stumbled on m=n-1!
    Thanks!
    PKP

  20. SG says

    February 24, 2024 at 3:03 pm

    Very helpful, thank you.

  21. xuqi says

    September 4, 2024 at 2:44 am

    Thanks for the detailed proof. I don’t get why would the sum of j from j = 0 to j = m equal to j tho.
