# The matching problem

One of the classic problems in probability theory is the “matching problem.” This problem has many variations and dates back to the early 18th century. One way to describe it is in terms of matching letters with envelopes. Suppose there are $n$ letters with $n$ matching envelopes (assume that each letter has exactly one matching envelope), and suppose that the secretary stuffs the letters into the envelopes at random. What is the probability that every letter is matched correctly, that no letter is matched correctly, or that exactly $k$ letters are stuffed into the correct envelopes? In this post we solve a simpler problem: what is the probability that at least one letter is stuffed into the correct envelope? Though the answer is expressed in terms of the total number of letters $n$, it turns out to be practically independent of $n$ and is roughly $\frac{2}{3}$.

We first state the formula for the union of $n$ events. After applying this formula to obtain the answer to the matching problem, we give a sketch of a proof of the formula.

## The union of $n$ events
For any $n$ events $E_1,E_2, \cdots,E_n$ that are defined on the same sample space, we have the following formula:

$\displaystyle P[E_1 \cup E_2 \cup \cdots \cup E_n]=\sum \limits_{m=1}^{n} (-1)^{m+1} \thinspace S_m$

where

$\displaystyle S_1=\sum \limits_{r=1}^{n}P[E_r]$

$\displaystyle S_2=\sum \limits_{j<k}P[E_j \cap E_k]$

$\displaystyle S_m=\sum P[E_{i(1)} \cap E_{i(2)} \cap \cdots \cap E_{i(m)}]$

Note that in the general term $S_m$, the sum is taken over all increasing sequences $i(\cdot)$, i.e. $1 \le i(1) < i(2) < \cdots < i(m) \le n$. For $n=2,3$, we have the following familiar formulas:

$\displaystyle P[E_1 \cup E_2]=P[E_1]+P[E_2]-P[E_1 \cap E_2]$

$\displaystyle P[E_1 \cup E_2 \cup E_3]=P[E_1]+P[E_2]+P[E_3]-P[E_1 \cap E_2]-P[E_1 \cap E_3]-P[E_2 \cap E_3]+P[E_1 \cap E_2 \cap E_3]$
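As a sanity check, the union formula can be verified directly on small events over a finite uniform sample space. The following is a minimal sketch (the function name `union_prob` and the example sets are mine, not from the post): it computes each $S_m$ by intersecting every increasing combination of events and compares the alternating sum to the size of the union computed directly.

```python
from itertools import combinations
from fractions import Fraction

def union_prob(events, omega_size):
    """Inclusion-exclusion: P[E1 u ... u En] = sum over m of (-1)^(m+1) * S_m,
    where the events are subsets of a uniform sample space of omega_size points."""
    n = len(events)
    total = Fraction(0)
    for m in range(1, n + 1):
        # S_m sums P over every increasing choice of m events
        s_m = sum(Fraction(len(set.intersection(*combo)), omega_size)
                  for combo in combinations(events, m))
        total += (-1) ** (m + 1) * s_m
    return total

# Three overlapping events on a 10-point uniform sample space
E1, E2, E3 = {0, 1, 2, 3}, {2, 3, 4, 5}, {5, 6, 7}
p = union_prob([E1, E2, E3], 10)
assert p == Fraction(len(E1 | E2 | E3), 10)  # matches the direct computation
```

Using exact `Fraction` arithmetic avoids any floating-point doubt about whether the two sides agree.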

## The matching problem
Suppose that the $n$ letters are numbered $1,2, \cdots, n$. Let $E_i$ be the event that the $i^{th}$ letter is stuffed into the correct envelope. Then $P[E_1 \cup E_2 \cup \cdots \cup E_n]$ is the probability that at least one letter is matched with the correct envelope.

The probability of the intersection of $m$ events is: $\displaystyle P[E_{i(1)} \cap E_{i(2)} \cap \cdots \cap E_{i(m)}]=\frac{(n-m)!}{n!}$.
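This intersection probability can be checked by brute force for small $n$: enumerate all $n!$ stuffings and count those in which a chosen set of $m$ letters all land in their own envelopes. A sketch (the helper name `intersection_prob` is mine):

```python
from itertools import permutations
from math import factorial
from fractions import Fraction

def intersection_prob(n, fixed):
    """Fraction of the n! random stuffings in which every letter
    in `fixed` (a set of letter indices) lands in its own envelope."""
    hits = sum(1 for p in permutations(range(n))
               if all(p[i] == i for i in fixed))
    return Fraction(hits, factorial(n))

# For n = 5 and any m fixed letters, the probability is (n-m)!/n!
n = 5
for fixed in [{0}, {1, 3}, {0, 2, 4}]:
    m = len(fixed)
    assert intersection_prob(n, fixed) == Fraction(factorial(n - m), factorial(n))
```

The count of favorable stuffings is $(n-m)!$ because, once the $m$ positions are pinned down, the remaining $n-m$ letters may be permuted freely.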

For $m=1$, we have $P[E_i]=\frac{(n-1)!}{n!}=\frac{1}{n}$. Note that the $i^{th}$ position is fixed and we permute the other $n-1$ positions. For $m=2$, we have $P[E_i \cap E_j]=\frac{(n-2)!}{n!}$. Here, the $i^{th}$ and $j^{th}$ positions are fixed and we permute the other $n-2$ positions.

We now show that $S_m=\frac{1}{m!}$. There are $\binom{n}{m}$ ways to choose which $m$ of the $n$ letters are fixed. Thus we have: $\displaystyle S_m=\binom{n}{m} \frac{(n-m)!}{n!}=\frac{1}{m!}$
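The striking point here is that $S_m$ does not depend on $n$. A quick check of the identity $\binom{n}{m}\frac{(n-m)!}{n!}=\frac{1}{m!}$ for several values of $n$:

```python
from math import comb, factorial
from fractions import Fraction

# S_m = C(n, m) * (n-m)!/n! collapses to 1/m!, independent of n
for n in (5, 10, 20):
    for m in range(1, n + 1):
        s_m = comb(n, m) * Fraction(factorial(n - m), factorial(n))
        assert s_m == Fraction(1, factorial(m))
```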

Using the formula for the union of $n$ events, we have:

$\displaystyle P[\bigcup_{m=1}^{n} E_m]=1-\frac{1}{2!}+\frac{1}{3!}- \cdots + (-1)^{n+1}\frac{1}{n!}$

$\displaystyle 1-P[\bigcup_{m=1}^{n} E_m]=1-1+\frac{1}{2!}-\frac{1}{3!}+ \cdots + (-1)^{n}\frac{1}{n!}$

Note that the right hand side of the second equation above consists of the first $n+1$ terms of the series expansion of $e^{-1}$. Thus we have: $\displaystyle \lim_{n \rightarrow \infty}P[\bigcup_{m=1}^{n} E_m]=\lim_{n \rightarrow \infty}\biggl(1-\frac{1}{2!}+\frac{1}{3!}- \cdots + (-1)^{n+1}\frac{1}{n!}\biggr)$ $=1-e^{-1} \approx 0.632121$

The convergence is quite rapid. Let $P_n=P[\bigcup_{m=1}^{n} E_m]$. The following table shows the first several values:

| $n$ | $P_n$ |
|---|---|
| 2 | 0.50000 |
| 3 | 0.66667 |
| 4 | 0.62500 |
| 5 | 0.63333 |
| 6 | 0.63194 |
| 7 | 0.63214 |
| 8 | 0.63212 |
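The table values are easy to reproduce, and a Monte Carlo simulation of the matching problem itself confirms them. A sketch, assuming the helper names `p_exact` and `p_simulated` (mine, not from the post):

```python
import random
from math import factorial, exp

def p_exact(n):
    """P_n = sum over m = 1..n of (-1)^(m+1) / m!  (at least one match)."""
    return sum((-1) ** (m + 1) / factorial(m) for m in range(1, n + 1))

def p_simulated(n, trials=100_000, seed=42):
    """Monte Carlo estimate: stuff n letters at random and count the
    fraction of trials with at least one letter in its own envelope."""
    rng = random.Random(seed)
    letters = list(range(n))
    hits = 0
    for _ in range(trials):
        rng.shuffle(letters)
        if any(letters[i] == i for i in range(n)):
            hits += 1
    return hits / trials

for n in (2, 3, 8):
    print(n, round(p_exact(n), 5))   # reproduces the table values above

assert abs(p_exact(8) - (1 - exp(-1))) < 1e-5   # already close to 1 - 1/e
assert abs(p_simulated(8) - p_exact(8)) < 0.01  # simulation agrees
```

Already at $n=8$ the exact value agrees with $1-e^{-1}$ to five decimal places, which is the "practically independent of $n$" claim made at the outset.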

## Sketch of proof
We now discuss the union formula: $P[E_1 \cup E_2 \cup \cdots \cup E_n]=\sum \limits_{m=1}^{n} (-1)^{m+1} \thinspace S_m$

The key idea is that any sample point in the union $E_1 \cup \cdots \cup E_n$ is counted exactly one time in the right hand side of the formula. Suppose that a sample point is in exactly $t$ of the events, say $E_{i(1)}, \cdots, E_{i(t)}$. Then the following shows the number of times the sample point is counted in each sum:

$\displaystyle \sum \limits_{m=1}^{n}P[E_m]$: counted $\binom{t}{1}=t$ times

$\displaystyle \sum \limits_{a<b}P[E_a \cap E_b]$: counted $\binom{t}{2}$ times

$\displaystyle \sum \limits_{a<b<c}P[E_a \cap E_b \cap E_c]$: counted $\binom{t}{3}$ times

and so on.

Thus the sample point in question will be counted exactly $H$ times in the right hand side of the formula. $\displaystyle H=\binom{t}{1}-\binom{t}{2}+\binom{t}{3}- \cdots + (-1)^{t+1}\binom{t}{t}$

The following is the derivation that $H=1$. Expanding $(1-1)^t$ by the binomial theorem:

$\displaystyle 0=(1-1)^t=\sum \limits_{a=0}^{t} \binom{t}{a}(-1)^{a}(1)^{t-a}=\binom{t}{0}-\binom{t}{1}+\binom{t}{2}- \cdots +(-1)^t \binom{t}{t}$

Moving $\binom{t}{0}=1$ to the other side gives:

$\displaystyle 1=\binom{t}{0}=\binom{t}{1}-\binom{t}{2}+ \cdots +(-1)^{t+1} \binom{t}{t}=H$
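The identity $H=1$ holds for every $t \ge 1$, which is what makes each sample point count exactly once. A one-line numerical check:

```python
from math import comb

def H(t):
    """H = C(t,1) - C(t,2) + ... + (-1)^(t+1) C(t,t)"""
    return sum((-1) ** (m + 1) * comb(t, m) for m in range(1, t + 1))

# Every sample point in the union is counted exactly once
assert all(H(t) == 1 for t in range(1, 30))
```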

## 5 thoughts on “The matching problem”

1. Subhash on said:

Excellent site
Can you solve this problem for me
I have 4 (say x) identical chocolates (A,A,A, A) and 3 (say y) additional chocolates (B,C,D) which are completely distinct (as identified above) and I wish to distribute it among 3 (say z) children (P,Q.R,) such that each child P,Q.R, gets atleast one chocolate. How many ways can I distribute – I am looking at a simple formulae (in x,y,z)
[ It is easy if the condition given in last line of each person getting one chocolate is not stated in which case 4 identical gets distributed in
(4 + 5 -1) C ( 5 -1) ways and 5^3 ways for non identical chocolates and we multiply the two results ]
I can get the answer using different cases but becomes long
next part of the question is if 2 more children are added (making it > 4) how does this formula change

2. saj on said:

Hi Dan,

Can you consider a write up for the second part:
What is the probability that exactly k of the letters match their envelope?
