Picking Two Types of Binomial Trials

We motivate the discussion with the following example. The notation W \sim \text{binom}(n,p) denotes the statement that W has a binomial distribution with parameters n and p. In other words, W is the number of successes in a sequence of n independent Bernoulli trials where p is the probability of success in each trial.

Example 1
Suppose that a student took two multiple choice quizzes in a course for probability and statistics. Each quiz has 5 questions. Each question has 4 choices and only one of the choices is correct. Suppose that the student answered all the questions by pure guessing. Furthermore, the two quizzes are independent (i.e. results of one quiz will not affect the results of the other quiz). Let X be the number of correct answers in the first quiz and Y be the number of correct answers in the second quiz. Suppose the student was told by the instructor that she had a total of 4 correct answers in these two quizzes. What is the probability that she had 3 correct answers in the first quiz?

On the face of it, the example is all about binomial distribution. Both X and Y are binomial distributions (both \sim \text{binom}(5,\frac{1}{4})). The sum X+Y is also a binomial distribution (\sim \text{binom}(10,\frac{1}{4})). The question that is being asked is a conditional probability, i.e., P(X=3 \lvert X+Y=4). Surprisingly, this conditional probability can be computed using the hypergeometric distribution. One can always work this problem from first principle using binomial distributions. As discussed below, for a problem such as Example 1, it is always possible to replace the binomial distributions using a thought process involving the hypergeometric distribution.

Here’s how to think about the problem. This student took the two quizzes and was given the news by the instructor that she had 4 correct answers in total. She now wonders what the probability of having 3 correct answers in the first quiz is. The thought process is this. She is to pick 4 questions from 10 questions (5 of them are from Quiz 1 and 5 of them are from Quiz 2). So she is picking 4 objects from a group of two distinct types of objects. This is akin to reaching into a jar that has 5 red balls and 5 blue balls and pick 4 balls without replacement. What is the probability of picking 3 red balls and 1 blue ball? The probability just described is from a hypergeometric distribution. The following shows the calculation.

\displaystyle (1) \ \ \ \ P(X=3 \lvert X+Y=4)=\frac{\displaystyle \binom{5}{3} \ \binom{5}{1}}{\displaystyle \binom{10}{4}}=\frac{50}{210}

We will show below why this works. Before we do that, let’s describe the above thought process. Whenever you have two independent binomial distributions X and Y with the same probability of success p (the number of trials does not have to be the same), the conditional distribution X \lvert X+Y=a is a hypergeometric distribution. Interestingly, the probability of success p has no bearing on this observation. For Example 1, we have the following calculation.

\displaystyle (2a) \ \ \ \ P(X=0 \lvert X+Y=4)=\frac{\displaystyle \binom{5}{0} \ \binom{5}{4}}{\displaystyle \binom{10}{4}}=\frac{5}{210}

\displaystyle (2b) \ \ \ \ P(X=1 \lvert X+Y=4)=\frac{\displaystyle \binom{5}{1} \ \binom{5}{3}}{\displaystyle \binom{10}{4}}=\frac{50}{210}

\displaystyle (2c) \ \ \ \ P(X=2 \lvert X+Y=4)=\frac{\displaystyle \binom{5}{2} \ \binom{5}{2}}{\displaystyle \binom{10}{4}}=\frac{100}{210}

\displaystyle (2d) \ \ \ \ P(X=3 \lvert X+Y=4)=\frac{\displaystyle \binom{5}{3} \ \binom{5}{1}}{\displaystyle \binom{10}{4}}=\frac{50}{210}

\displaystyle (2e) \ \ \ \ P(X=4 \lvert X+Y=4)=\frac{\displaystyle \binom{5}{4} \ \binom{5}{0}}{\displaystyle \binom{10}{4}}=\frac{5}{210}

Interestingly, the conditional mean E(X \lvert X+Y=4)=2, while the unconditional mean E(X)=5 \times \frac{1}{4}=1.25. The fact that the conditional mean is higher is not surprising. The student was lucky enough to have obtained 4 correct answers by guessing. Given this, she had a greater chance of doing better on the first quiz.

Why This Works

Suppose X \sim \text{binom}(5,p) and Y \sim \text{binom}(5,p) and they are independent. The joint distribution of X and Y has 36 points in the sample space. See the following diagram.

Figure 1

The probability attached to each point is

\displaystyle \begin{aligned}(3) \ \ \ \  P(X=x,Y=y)&=P(X=x) \times P(Y=y) \\&=\binom{5}{x} p^x (1-p)^{5-x} \times \binom{5}{y} p^y (1-p)^{5-y}  \end{aligned}

where x=0,1,2,3,4,5 and y=0,1,2,3,4,5.

The conditional probability P(X=k \lvert X+Y=4) involves 5 points as indicated in the following diagram.

Figure 2

The conditional probability P(X=k \lvert X+Y=4) is simply the probability of one of the 5 sample points as a fraction of the sum total of the 5 sample points encircled in the above diagram. The following is the sum total of the probabilities of the 5 points indicated in Figure 2.

\displaystyle \begin{aligned}(4) \ \ \ \  P(X+Y=4)&=P(X=0) \times P(Y=4)+P(X=1) \times P(Y=3)\\&\ \ \ \ +P(X=2) \times P(Y=3)+P(X=3) \times P(Y=2)\\&\ \ \ \ +P(X=4) \times P(Y=0)  \end{aligned}

We can plug (3) into (4) and work out the calculation. But (4) is actually equivalent to the following because X+Y \sim \text{binom}(10,p).

\displaystyle (5) \ \ \ \ P(X+Y=4)=\ \binom{10}{4} p^4 \ (1-p)^{6}

As stated earlier, the conditional probability P(X=k \lvert X+Y=4) is simply the probability of one of the 5 sample points as a fraction of the sum total of the 5 sample points encircled in Figure 2. Thus we have:

\displaystyle \begin{aligned}(6) \ \ \ \  P(X=k \lvert X+Y=4)&=\frac{P(X=k) \times P(Y=4-k)}{P(X+Y=4)} \\&=\frac{\displaystyle \binom{5}{k} p^k (1-p)^{5-k} \times \binom{5}{4-k} p^{4-k} (1-p)^{5-(4-k)}}{\displaystyle  \binom{10}{4} p^4 \ (1-p)^{6}}   \end{aligned}

With the terms involving p and 1-p cancel out, we have:

\displaystyle (7) \ \ \ \  P(X=k \lvert X+Y=4)=\frac{\displaystyle \binom{5}{k} \times \binom{5}{4-k}}{\displaystyle  \binom{10}{4}}


Suppose X \sim \text{binom}(N,p) and Y \sim \text{binom}(M,p) and they are independent. Then X+Y is also a binomial distribution, i.e., \sim \text{binom}(N+M,p). Suppose that both binomial experiments \text{binom}(N,p) and \text{binom}(M,p) have been performed and it is known that there are a successes in total. Then X \lvert X+Y=a has a hypergeometric distribution.

\displaystyle (8) \ \ \ \  P(X=k \lvert X+Y=a)=\frac{\displaystyle \binom{N}{k} \times \binom{M}{a-k}}{\displaystyle  \binom{N+M}{a}}

where k=0,1,2,3,\cdots,\text{min}(N,a).

As discussed earlier, think of the N trials in \text{binom}(N,p) as red balls and think of the M trials in \text{binom}(M,p) as blue balls in a jar. Think of the a successes as the number of balls you are about to draw from the jar. So you reach into the jar and select a balls without replacement. The calculation in (8) gives the probability that you select k red balls and a-k blue balls.

The probability of success p in the two binomial distributions have no bearing on the result since it gets canceled out in the derivation. One can always work a problem like Example 1 using first principle. Once the thought process using hypergeometric distribution is understood, it is a great way to solve this problem, that is, you can by pass the binomial distributions and go straight to the hypergeometric distribution.

Additional Practice
Practice problems are found in the following blog post.

How to pick binomial trials


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s