A lazy professor who lets students do their own grading

After reading the example discussed here, any professor or instructor who is contemplating randomly distributing quizzes back to the students for grading perhaps should reconsider the idea! This is one of several posts discussing the matching problem. Go to the end of this post to find links to these previous posts.

Consider the following problem. Suppose that a certain professor is lazy and lets his students grade their own quizzes. After he collects the quizzes from his n students, he randomly assigns the quizzes back to these n students for grading. If a student is assigned his or her own quiz, we say that it is a match. We have the following questions:

  • What is the probability that each of the n students is a match?
  • What is the probability that none of the n students is a match?
  • What is the probability that exactly k of the n students are matches?
  • What is the probability that at least one of the n students is a match?

The above problem is called the matching problem, which is a classic problem in probability. In this post we solve the last question indicated above. Though the answer can be written in terms of the total number of quizzes n, it turns out to be essentially independent of n and is approximately 1-e^{-1} \approx 0.63. Thus if the professor assigns the quizzes randomly, there is a better than even chance that at least one student receives his or her own quiz.
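Before deriving this answer, it can be estimated with a quick simulation. The following sketch is not from the original post; the function name and trial count are illustrative choices:

```python
import random

def prob_at_least_one_match(n, trials=100_000):
    """Estimate the probability that randomly handing n quizzes
    back to n students produces at least one match."""
    hits = 0
    for _ in range(trials):
        quizzes = list(range(n))
        random.shuffle(quizzes)  # a random assignment of quizzes to students
        if any(student == quiz for student, quiz in enumerate(quizzes)):
            hits += 1
    return hits / trials

# For any moderately large n, the estimate hovers near 1 - 1/e ≈ 0.632.
print(prob_at_least_one_match(10))
```

Running this for n = 10 or n = 100 gives essentially the same estimate, which is the independence-of-n phenomenon derived below.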

The last question above is usually stated in other matching situations. One is that there are n couples (say, each couple consists of a husband and a wife) in a class for ballroom dancing. Suppose that the dance instructor randomly matches the men to the ladies. When a husband is assigned his own wife, we say that it is a match. What is the probability that there is at least one couple that is a match?

The key to answering this question is the theorem stated in Feller (page 99 in chapter 4 of [1]). We state the theorem and make use of it in the solution of the last question above. A sketch of the proof will be given at the end. For ideas on the solutions to the first three questions above, see this previous post.

The union of n events
For any n events E_1,E_2, \cdots,E_n that are defined on the same sample space, we have the following formula:

(1) \ \ \ \ \ P[E_1 \cup E_2 \cup \cdots \cup E_n]=\sum \limits_{m=1}^{n} (-1)^{m+1} \thinspace S_m where

S_1=\sum \limits_{r=1}^{n}P[E_r],

S_2=\sum \limits_{j<k}P[E_j \cap E_k],

S_m=\sum P[E_{i(1)} \cap E_{i(2)} \cap \cdots \cap E_{i(m)}]

Note that in the general term S_m, the sum is taken over all increasing sequences of indices i(\cdot), i.e. 1 \le i(1) < i(2) < \cdots < i(m) \le n. For n=2,3, we have the following familiar formulas:

\displaystyle P[E_1 \cup E_2]=P[E_1]+P[E_2]-P[E_1 \cap E_2]

\displaystyle \begin{aligned}P[E_1 \cup E_2 \cup E_3]=& \ \ \ \ P[E_1]+P[E_2]+P[E_3]\\&-P[E_1 \cap E_2]-P[E_1 \cap E_3]-P[E_2 \cap E_3]\\&+P[E_1 \cap E_2 \cap E_3] \end{aligned}

The Matching Problem

Suppose that the n students are labeled 1,2, \cdots, n. Let E_i be the event that the i^{th} student is assigned his or her own quiz by the professor. Then P[E_1 \cup E_2 \cup \cdots \cup E_n] is the probability that there is at least one correct match.

Note that P[E_i]=\frac{(n-1)!}{n!}=\frac{1}{n}. This is the case since we fix the i^{th} student and permute the other n-1 students. Likewise, P[E_i \cap E_j]=\frac{(n-2)!}{n!}, since we fix the i^{th} and j^{th} students and permute the other n-2 students. In general, whenever i(1),i(2),\cdots,i(m) are m distinct increasing integers, we have:

\displaystyle P[E_{i(1)} \cap E_{i(2)} \cap \cdots \cap E_{i(m)}]=\frac{(n-m)!}{n!}

We now apply the formula (1). First we show that for each m with 1 \le m \le n, S_m=\frac{1}{m!}. Since there are \binom{n}{m} ways to choose which m of the n students are fixed, we have:

\displaystyle \begin{aligned}S_m&=\sum P[E_{i(1)} \cap E_{i(2)} \cap \cdots \cap E_{i(m)}]\\&=\binom{n}{m} \frac{(n-m)!}{n!}\\&=\frac{1}{m!} \end{aligned}
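The collapse of S_m to \frac{1}{m!}, independent of n, is easy to verify numerically. The snippet below is illustrative (the function name is not from the post); it computes S_m directly from the counting argument above:

```python
from math import comb, factorial

def S(n, m):
    """S_m: binom(n, m) intersection terms, each with probability (n-m)!/n!."""
    return comb(n, m) * factorial(n - m) / factorial(n)

# For each m, S_m equals 1/m! regardless of n.
for m in range(1, 6):
    print(m, S(10, m), S(25, m), 1 / factorial(m))
```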

Applying the formula for the union of n events, we have:

\displaystyle P[E_1 \cup E_2 \cup \cdots \cup E_n]=1-\frac{1}{2!}+\frac{1}{3!}-\cdots+(-1)^{n+1}\frac{1}{n!}

\displaystyle 1-P[E_1 \cup E_2 \cup \cdots \cup E_n]=1-1+\frac{1}{2!}-\frac{1}{3!}+\cdots+(-1)^{n}\frac{1}{n!}

Note that the right-hand side of the above equality is the sum of the first n+1 terms in the series expansion of e^{-1}. Thus we have:

\displaystyle \begin{aligned}\lim_{n \rightarrow \infty}P[E_1 \cup E_2 \cup \cdots \cup E_n]&=\lim_{n \rightarrow \infty}\biggl(1-\frac{1}{2!}+\frac{1}{3!}-\cdots+(-1)^{n+1}\frac{1}{n!}\biggr)\\&=1-e^{-1}\\&=0.6321205588 \end{aligned}

The above limit converges quite rapidly. Let P_n=P[E_1 \cup E_2 \cup \cdots \cup E_n]. The following table lists the values of P_n for the first several values of n.

\displaystyle \begin{pmatrix} n&\text{   }&P_n \\{2}&\text{   }&0.50000 \\{3}&\text{   }&0.66667 \\{4}&\text{   }&0.62500 \\{5}&\text{   }&0.63333 \\{6}&\text{   }&0.63194 \\{7}&\text{   }&0.63214 \\{8}&\text{   }&0.63212\end{pmatrix}
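The table can be reproduced by summing the alternating series directly; here is a small sketch (the function name is my own, not from the post):

```python
from math import factorial

def P(n):
    """Probability of at least one match among n students,
    from the alternating series 1 - 1/2! + 1/3! - ... +- 1/n!."""
    return sum((-1) ** (m + 1) / factorial(m) for m in range(1, n + 1))

# Reproduces the table above; the values settle near 1 - 1/e very quickly.
for n in range(2, 9):
    print(n, round(P(n), 5))
```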

Sketch of Proof for the Formula
The key idea is that any sample point in the union E_1 \cup E_2 \cup \cdots \cup E_n is counted exactly once on the right-hand side of (1). Suppose that a sample point is in exactly t of the events E_1,E_2,\cdots,E_n. Then the following shows the number of times the sample point is counted in each expression:

\displaystyle \sum \limits_{m=1}^{n}P[E_m] \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ t \text{ times}

\displaystyle \sum \limits_{a<b}P[E_a \cap E_b] \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \binom{t}{2} \text{ times}

\displaystyle \sum \limits_{a<b<c}P[E_a \cap E_b \cap E_c] \ \ \ \ \ \ \ \ \ \binom{t}{3} \text{ times and so on}

Thus the sample point in question is counted exactly H times on the right-hand side of the formula.

\displaystyle H=\binom{t}{1}-\binom{t}{2}+\binom{t}{3}- \cdots + (-1)^{t+1}\binom{t}{t}

The following is the derivation that H=1.

\displaystyle 0=(1-1)^t=\sum \limits_{a=0}^{t} \binom{t}{a}(-1)^{a}(1)^{t-a}=\binom{t}{0}-\binom{t}{1}+\binom{t}{2}- \cdots +(-1)^t \binom{t}{t}

\displaystyle 1=\binom{t}{0}=\binom{t}{1}-\binom{t}{2}+ \cdots +(-1)^{t+1} \binom{t}{t}=H
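The identity H=1 can also be checked directly for small t; this throwaway check is not part of the proof:

```python
from math import comb

def H(t):
    """Alternating sum binom(t,1) - binom(t,2) + ... + (-1)^(t+1) * binom(t,t)."""
    return sum((-1) ** (a + 1) * comb(t, a) for a in range(1, t + 1))

# Every value is 1, so each sample point is counted exactly once.
print([H(t) for t in range(1, 8)])
```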


  1. Feller, W., An Introduction to Probability Theory and its Applications, Vol. I, 3rd ed., John Wiley & Sons, Inc., New York, 1968

Previous Posts on The Matching Problem
The Matching Problem

More About the Matching Problem

Tis the Season for Gift Exchange

Tis the Season for Gift Exchange

Suppose that there are 10 people in a holiday party with each person bearing a gift. The gifts are put in a pile and each person randomly selects one gift from the pile. In order not to bias the random selection of gifts, suppose that the partygoers select gifts by picking slips of paper with numbers identifying the gifts. What is the probability that there is at least one partygoer who ends up selecting his or her own gift? In this example, selecting one’s own gift is called a match. What is the probability that there is at least one match if there are more people in the party, say 50 or 100 people? What is the behavior of this probability if the size of the party increases without bound?

The above example is a description of a classic problem in probability called the matching problem. There are many colorful descriptions of the problem. One such description is that there are n married couples in a ballroom dancing class. The instructor pairs the ladies with the gentlemen at random. A match occurs if a married couple is paired as dancing partners.

Whatever the description, the matching problem involves two lists of items that are paired according to a particular order (the original order). When the items in the first list are paired with the items in the second list according to another random ordering, a match occurs if an item in the first list and an item in the second list are paired both in the original order and in the new order. The matching problem discussed here is: what is the probability that there is at least one match? Seen in this light, both examples described above are equivalent mathematically.

We now continue with the discussion of the random gift selection example. Suppose that there are n people in the party. Let E_i be the event that the i^{th} person selects his or her own gift. The event E=E_1 \cup E_2 \cup \cdots \cup E_n is the event that at least one person selects his or her own gift (i.e. there is at least one match). The probability P(E)=P(E_1 \cup E_2 \cup \cdots \cup E_n) is the solution of the matching problem as described at the beginning, and it is given by:

\displaystyle (1) \ \ \ \ \ \ P[E_1 \cup E_2 \cup \cdots \cup E_n]=1-\frac{1}{2!}+\frac{1}{3!}-\cdots+(-1)^{n+1} \frac{1}{n!}

The answer in (1) is obtained by using an idea called the inclusion-exclusion principle. We will get to that in just a minute. First let’s look at the values of (1) for several values of n.

\displaystyle (2) \ \ \ \ \ \  \begin{bmatrix} \text{n}&\text{ }& P[E_1 \cup E_2 \cup \cdots \cup E_n]  \\\text{ }&\text{ }&\text{ } \\ 3&\text{ }&0.666667  \\ 4&\text{ }&0.625000  \\ 5&\text{ }&0.633333  \\ 6&\text{ }&0.631944  \\ 7&\text{ }&0.632143  \\ 8&\text{ }&0.632118 \\ 9&\text{ }&0.632121 \\ 10&\text{ }&0.632121 \\ 11&\text{ }&0.632121 \\ 12&\text{ }&0.632121 \end{bmatrix}

In a party with random gift exchange, it appears that regardless of the size of the party, there is a very good chance that someone will end up picking his or her own gift! A match will happen about 63% of the time.

It turns out that the answers in (1) are related to the Taylor series expansion of e^{-1}. We show that the probability in (1) always converges to 1-e^{-1}=0.632121. Note that the Taylor series expansion of e^{-1} is:

\displaystyle (3) \ \ \ \ \ \ e^{-1}=\frac{1}{0!}-\frac{1}{1!}+\frac{1}{2!}-\frac{1}{3!}+\cdots+(-1)^n \frac{1}{n!}+\cdots

Consequently, the Taylor series expansion of 1-e^{-1} is:

\displaystyle \begin{aligned} (4) \ \ \ \ \ \   1-e^{-1}&=1-\biggl(1-1+\frac{1}{2!}-\frac{1}{3!}+\cdots+(-1)^n \frac{1}{n!}+\cdots \biggr) \\&=1-\frac{1}{2!}+\frac{1}{3!}-\cdots+(-1)^{n+1} \frac{1}{n!}+\cdots  \end{aligned}

Note that the sum of the first n terms of the series in (4) is the probability in (1). As the number of partygoers n increases, the probability P[E_1 \cup E_2 \cup \cdots \cup E_n] will be closer and closer to 1-e^{-1}=0.6321. We have:

\displaystyle (5) \ \ \ \ \ \ \lim \limits_{n \rightarrow \infty} P[E_1 \cup E_2 \cup \cdots \cup E_n]=1 - e^{-1}

\displaystyle (6) \ \ \ \ \ \ \lim \limits_{n \rightarrow \infty} P[(E_1 \cup E_2 \cup \cdots \cup E_n)^c]=e^{-1}

The equation (5) says that it does not matter how many people are in the random gift exchange, the answer to the matching problem is always 1-e^{-1} in the limit. The equation (6) says that the probability of having no matches approaches e^{-1}. There is a better than even chance (0.63 to be more precise) that there is at least one match. So in a random gift exchange as described at the beginning, it is much easier to see a match than not to see one.

The Inclusion-Exclusion Principle

We now want to give some indication why (1) provides the answers. The inclusion-exclusion principle is a formula for finding the probability of the union of events. For the union of two events and the union of three events, we have:

\displaystyle (7) \ \ \ \ \ \ P[E_1 \cup E_2]=P[E_1]+P[E_2]-P[E_1 \cap E_2]

\displaystyle \begin{aligned} (8) \ \ \ \ \ \   P[E_1 \cup E_2 \cup E_3]&=P[E_1]+P[E_2]+P[E_3] \\& \ \ \ \ -P[E_1 \cap E_2]-P[E_1 \cap E_3]-P[E_2 \cap E_3] \\& \ \ \ \ +P[E_1 \cap E_2 \cap E_3]  \end{aligned}

To find the probability of the union of n events, we first add up the probabilities of the individual events E_i. Because the resulting sum overshoots, we need to subtract the probabilities of the intersections E_i \cap E_j. Because the subtractions remove too much, we need to add back the probabilities of the intersections of three individual events E_i \cap E_j \cap E_k. The process of inclusion and exclusion continues until we reach the step of adding/removing of the intersection E_1 \cap \cdots \cap E_n. The following is the statement of the inclusion-exclusion principle.

\displaystyle (9) \ \ \ \ \ \ P[E_1 \cup E_2 \cup \cdots \cup E_n]=S_1-S_2+S_3- \ \ \cdots \ \ +(-1)^{n+1}S_n

In (9), S_1 is the sum of all probabilities P[E_i], S_2 is the sum of all possible probabilities of the intersection of two events E_i and E_j and S_3 is the sum of all possible probabilities of the intersection of three events and so on.

We now apply the inclusion-exclusion principle to derive equation (1). The event E_i is the event that the i^{th} person gets his or her own gift while the other n-1 people are free to select gifts. The probability of this event is \displaystyle P[E_i]=\frac{(n-1)!}{n!}. There are \displaystyle \binom{n}{1}=\frac{n!}{1! (n-1)!} many ways of fixing 1 gift. So \displaystyle S_1=\binom{n}{1} \times \frac{(n-1)!}{n!}=1.

Now consider the intersection of two events. The event E_i \cap E_j is the event that the i^{th} person and the j^{th} person get their own gifts while the other (n-2) people are free to select gifts. The probability of this event is \displaystyle P[E_i \cap E_j]=\frac{(n-2)!}{n!}. There are \displaystyle \binom{n}{2}=\frac{n!}{2! (n-2)!} many ways of fixing 2 gifts. So \displaystyle S_2=\binom{n}{2} \times \frac{(n-2)!}{n!}=\frac{1}{2!}.

By the same reasoning, we derive that \displaystyle S_3=\frac{1}{3!} and \displaystyle S_4=\frac{1}{4!} and so on. Then plugging \displaystyle S_i=\frac{1}{i!} into (9), we obtain the answers in (1).
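For small parties, the inclusion-exclusion answer can be double-checked by brute force over all n! gift assignments. This is a sketch with names of my own choosing, not code from the post:

```python
from itertools import permutations
from math import factorial

def exact_by_enumeration(n):
    """Brute-force the matching probability by checking every assignment."""
    perms = list(permutations(range(n)))
    with_match = sum(any(p[i] == i for i in range(n)) for p in perms)
    return with_match / len(perms)

def by_inclusion_exclusion(n):
    """Equation (1): the alternating series obtained from (9) with S_i = 1/i!."""
    return sum((-1) ** (i + 1) / factorial(i) for i in range(1, n + 1))

# The two computations agree exactly for every n.
for n in range(2, 7):
    print(n, exact_by_enumeration(n), by_inclusion_exclusion(n))
```

Enumeration is only feasible for small n (the number of permutations grows as n!), which is why the series in (1) is the practical answer.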

The matching problem has been discussed previously in the following posts.

The matching problem

More about the matching problem