The game of poker dice and the multinomial theorem

This post presents an application of the multinomial theorem and the multinomial coefficients to the game of poker dice. See the previous post The multinomial theorem for a discussion of the multinomial theorem and multinomial coefficients.

The game of poker dice is different from the standard poker game. Instead of drawing five cards from a deck of $52$ cards, five fair dice are rolled. The resulting five scores from the dice form a poker dice hand. The possible hands are ranked as follows:

• Five of a kind: one score occurs five times (e.g. $2,2,2,2,2$).
• Four of a kind: two distinct scores appear, one score occurs four times and the other score occurs one time (e.g. $2,2,2,2,5$).
• Full house: two distinct scores appear, one score occurs three times and the other score occurs two times (e.g. $3,3,3,1,1$).
• Three of a kind: three distinct scores appear, one score occurs three times, the other two scores occur one time each (e.g. $2,2,2,3,6$).
• Two pair: three distinct scores appear, two of the scores occur two times each and the other score occurs once (e.g. $4,4,1,1,5$).
• One pair: four distinct scores appear, one score occurs two times and the other three scores occur one time each (e.g. $5,5,1,2,4$).
• High card: five distinct scores occur (e.g. $2,3,5,4,1$).

In rolling $5$ dice, there are $6^5=7776$ ordered outcomes. For example, assuming that the five dice are rolled one at a time, $2,3,2,6,2$ indicates outcome that the first die results in a two and the second die results in a three and so on (this is a three of a kind). To find the probability of a three of a kind, we simply divide the number of ways such hands can occur by $7776$. We use the multinomial coefficients to obtain the number of outcomes for each type of poker dice hands.

Rolling $k$ dice (or rolling a die $k$ times) can also be regarded as the occupancy problem of assigning $k$ balls into $6$ cells. As will be shown below, the problem of computing the probabilities of poker dice hands is seen through the lens of the occupancy problem of randonly placing $5$ balls into $6$ cells. For example, five of a kind is equivalent to all five balls being placed in different cells. Three of a kind is equivalent to three of the balls being in one cell, and the other two balls in two other different cells. For discussion on the occupancy problems in this blog, see the following posts:

In placing $5$ balls into $6$ cells, we use $6$-tuples to indicate how many balls are in each of the $6$ cells. For example, $(0,3,1,0,0,1)$ denotes a three of a kind hand of $2,2,2,3,6$ (the score of two appearing three times, the score of three appearing one time and the score of six appearing one time). Note that the $6$ coordinates represent the six scores of a die (six cells) and the sum of the coordinates is $5$. The $6$-tuple of $(3,1,1,0,0,0)$ is also a three of a kind hand, representing the outcome that the score of one appearing three times, the score of two appearing one time and the score of three appearing one time. We use the multinomial coefficients to determine how many of the $7776$ ordered outcomes correspond to a $6$-tuple such as $(3,1,1,0,0,0)$. With respect to the occupancy problem, such $6$-tuples are called occupancy number sets.

The Multinomial Theorem
For any positive integer $n$ and any positive integer $k$, we have the following polynomial expansion:

$\displaystyle \biggl(x_1+x_2+ \cdots +x_n\biggr)^k=\sum \limits_{a_1+ \cdots +a_n=k} \frac{k!}{a_1! \ a_2! \cdots \ a_n!} \ \ x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$

Remark
In addition to being a formula for polynomial expansion, there are two important interpretations of the multinomial theorem and multinomial coefficients. One is that of determining the number of ordered strings that can be formed using a set of alphabets. For example, with one $m$, four $i's$, four $s's$ and two $p's$, there are $\displaystyle \frac{11!}{1! \ 4! \ 4! \ 2!}=\text{34,650}$ possible $11$-letter strings that can be formed, of which $mississippi$ is one specific example.

Another interpretation is that of partitioning a set of distinct objects into several groupings where objects in each grouping are indistinguishable. For example, in a group of $11$ candidates, how many ways can we form four committees such that the Committee 1 has only one member, Committee 2 has four members, Committee 3 has four members and Committee 4 has two members (assuming that each person can only serve in one committee)? The answer, as in above example, is $\text{35,650}$.

Example 1
In a random poker dice hand, what is the probability of obtaining a $4$ two times, a $3$ one time, a $5$ one time and a $6$ one time? Note that this is a specific example of the poker dice hand of one pair.

We consider the $6$-tuple of $(0,0,1,2,1,1)$. We are trying to partition $5$ scores into four subgroups, one group having two identical scores of $4$, one group with a score of $3$, one group with a score of $5$ and one group with a score of $6$. Thus consider the following multinomial coefficient:

$\displaystyle \frac{5!}{1! \ 2! \ 1! \ 1!}=60$

So out of $\text{7,776}$ possible hands, $60$ of them satisfiy the condition that a $4$ appearing two times, a $3$ appearing one time, a $5$ appearing one time and a $6$ appearing one time. The probability is:

$\displaystyle \frac{60}{7776}=0.0077160494$

Example 2
What is the probability that one score appears two times, three other scores appear one time each in a random poker dice hand?

Here, we need to count all the possible poker dice hands of one pair. Both $(0,0,1,2,1,1)$ and $(2,1,1,1,0,0)$ are examples of one pair. In essense, we need to count all the occupancy number sets such that among the $6$ coordinates (cells), one of the cells is a $2$ and three of the cells are $1$. To this end, we apply the multinomial theorem twice, one time on the five rolls of dice and one time one the $6$ cells.

Consider the occupancy number set $(2,1,1,1,0,0)$. Note that the multinomial coefficient is $60$ as in Example $1$ (the first application of the multinomial thoerem). Now look at the $6$ coordinates of the occupancy number set $(2,1,1,1,0,0)$. We wish to partition these $6$ coordinates into three groupings, one with one $2$, one with three $1's$ and one with two $0's$. The following is the multinomial coefficient (the second application of the multinomial theorem):

$\displaystyle \frac{6!}{1! \ 3! \ 2!}=60$

Thus the number of possible poker dice hands of one pair is: $60 \times 60=\text{3,600}$ and for a random poker dice hand, the probability that it is a one pair is:

$\displaystyle \frac{3600}{7776}=0.462962963$

Remark
Example $2$ provides the algorithm for computing the remaining poker dice hand probabilities. The key is to apply the multinomial coefficients twice, one time on a representative occupancy number set, the second time on the six cells (the six faces of a die in this case). Then the number of poker dice hands in question is the product of the two multinomial coefficients.

Example 3
What is the probability that a random poker dice hand is three of a kind?

Consider the occupancy number set of $(3,1,1,0,0,0)$. The associated multinomial coefficient for the five rolls of dice is:

$\displaystyle \frac{5!}{3! \ 1! \ 1!}=20$

Now partition the six cells into three groupings (one $3$, two $1's$, three $0's$):

$\displaystyle \frac{6!}{1! \ 2! \ 3!}=60$

Thus the probability that a random poker hand is three of a kind is:

$\displaystyle \frac{20 \times 60}{7776}=0.1543209877$

Summary
The following are the probabilities of poker dice hands.

$\displaystyle P(\text{five of a kind})=\frac{6}{7776}$

$\displaystyle P(\text{four of a kind})=\frac{150}{7776}$

$\displaystyle P(\text{full house})=\frac{300}{7776}$

$\displaystyle P(\text{three of a kind})=\frac{1200}{7776}$

$\displaystyle P(\text{two pair})=\frac{1800}{7776}$

$\displaystyle P(\text{one pair})=\frac{3600}{7776}$

$\displaystyle P(\text{high card})=\frac{720}{7776}$

________________________________________________________________________
$\copyright \ \text{2010 - 2015 by Dan Ma}$ (Revised March 28, 2015)

The multinomial theorem

The multinomial theorem is a statement about expanding a polynomial when it is raised to an arbitrary power. Rather than just stating the theorem and showing examples, we motivate the theorem by a concrete example of finite random sampling. This example demonstrates that the notion of finite sampling provides another interpretation to the multinomial thoerem. As a result, the multinomial theorem and the multinomial coefficients are useful in combinatorial techniques in finite random sampling models. The multinomial coefficients are also useful in partitioning a set of objects into several subgroups where each subgroup is made up of indistinguishable objects. See the post The game of poker dice and the multinomial theorem for an example of applications of these ideas.

________________________________________________________________________

An example

Suppose we have a finite set $S$ with $n$ elements where $n$ is a positive integer. Suppose we select $k$ elements from the population $S$ with replacement. We consider ordered samples of size $k$ and unordered samples of size $k$. To make this notion more concrete, consider the population consisting of $4$ letters $\lbrace{m,i,s,p}\rbrace$. Suppose we sample $11$ times with replacement. The following are two specific ordered samples of size $11$:

$mississippi \ \ \ \ \ \ \ \ \ \ \ \ mississiipp \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1)$

Thus ordered samples can be represented as strings of characters drawn from the population presented in the order in which the objects are drawn. In this example, there are $4^{11}=4194304$ many ordered samples. Each character drawn has $4$ choices and the drawing is done $11$ times. In general, if the population has size $n$, then there are $n^k$ many ordered samples of size $k$.

We now look at unordered samples obtained from sampling $11$ times from the population $\lbrace{m,i,s,p}\rbrace$. In unordered samples, the order in which the letters are drawn is no longer important (or recorded). We only care about the number of instances each letter is selected. Thus each of the ordered samples shown above yields the same unordered sample: $1 \ m$, $4 \ i's$, $4 \ s's$ and $2 \ p's$. We represent the unordered sample in two ways:

$(1,4,4,2)$

$* \ | \ * \ * \ * \ * \ | \ * \ * \ * \ * \ | \ * \ *$

The first representation is a $4$-tuple where the coordinates are the number of $m's$, the number of $i's$, the number of $s's$ and the number of $p's$ in the $11$ selections. Thus, the sum of the coordinates must be $11$ (the total number of selections). The second representation of the unordered sample is made up of stars and bars. The three bars create $4$ spaces and the stars in each space represent the number of characters drawn. We would like to point out that the “unordered” in the unordered samples refers to the fact that the order in which the objects are drawn is immaterial. However, both the $4$-tuple notation and the stars and bars diagrams are ordered according to the $4$ letters in the population (showing how many instances each object appears in the samples).

The star and bar diagram has a combinatorial advantage. For example, how many unordered samples are there when you draw $11$ letters with replacement from the population $\lbrace{m,i,s,p}\rbrace$? In any unordered sample, there are $3$ bars and $11$ stars. The order of the stars and bars can be in any arbitrary order. Thus there are $\displaystyle \binom{11+3}{11}=364$ many unordered samples. In general, if the population size is $n$, then there are $\displaystyle \binom{k+n-1}{k}=\binom{k+n-1}{n-1}$ many unordered samples, either represented as $k$-tuples or stars and bars diagrams with $n-1$ bars and $k$ stars.

One more note about the number $364$ calculated above. This is also the total number of non-negative integer solutions to the equation $m+i+s+p=11$. Thinking of an unordered sample as a $4$-tuple, the sum of the $4$ coordinates must be $11$. This observation is important to understanding the multinomial theorem.

There is one more count associated with unordered samples. How many unordered samples are there when you draw $11$ letters with replacement from the population $\lbrace{m,i,s,p}\rbrace$ such that each letter is selected at least once? The answer is $120$. Suppose that each letter is already selected once. Then we need to sample $7$ more times out of these $4$ letters. According to the above paragraph, the total count is $\displaystyle \binom{7+3}{3}=120$. To generalize, if the population size is $n$, there are $\displaystyle \binom{k-n+n-1}{n-1}=\binom{k-1}{n-1}$ many unordered samples in which all objects in the population are represented in each sample.

We now tie unordered samples back to ordered samples. How many ordered samples are equivalent to the unordered sample $(1,4,4,2)$? Both ordered samples in $(1)$ are equivalent to $(1,4,4,2)$ (i.e. each letter is drawn the same number of times). In other words, how many $11$-character strings can be formed using $1 \ m$, $4 \ i's$, $4 \ s's$ and $2 \ p's$? The answer is:

\displaystyle \begin{aligned}\binom{11}{1} \binom{10}{4} \binom{6}{4} \binom{2}{2}&=\displaystyle \frac{11!}{1! \ 4! \ 4! \ 2!}\\&=34650\end{aligned}

The reasoning for the above calculation is that out of $11$ positions in the strings, we choose $1$ position for the $m$, choose $4$ positions for the $i$ in the remaining $10$ positions and so on.

________________________________________________________________________

Summary of the example

In sampling $11$ times with replacement from the population $\lbrace{m,i,s,p}\rbrace$, we summarize our observations about the example.

• The number of ordered samples is $4^{11}=4194304$.
• The number of unordered samples is $\displaystyle \binom{11+3}{11}=364$. Furthermore, the number of non-negative integer solutions to the equation $m+i+s+p=11$ is $364$.
• The number of unordered samples where each letter is selected is $\displaystyle \binom{7+3}{3}=120$. Furthermore, the number of positive integer solutions to the equation $m+i+s+p=11$ is $120$.
• For any unordered sample $(a,b,c,d)$ where $a+b+c+d=11$, the total number of ordered samples equivalent to this unordered sample is $\displaystyle \frac{11!}{a! \ b! \ c! \ d!}$. As we shall see, these are called multinomial coefficients.

Note the interplay between the ordered samples and unordered samples. We start out with a large number of ordered samples ($4^{11}$ many). We then collapse these ordered samples to just $364$ unordered samples. We see that each unordered sample corresponds to certain number of ordered samples according to the multinomial coefficients. Thus we have the following sum where $a,b,c,d$ are non-negative integers:

$\displaystyle \sum \limits_{a+b+c+d=11} \frac{11!}{a! \ b! \ c! \ d!}=4^{11}$

________________________________________________________________________

The Multinomial Theorem

With the preceding discussion, we now state the multinomial theorem.

The Multinomial Theorem
For any positive integer $n$ and any positive integer $k$, we have the following polynomial expansion:

$\displaystyle \biggl(x_1+x_2+ \cdots +x_n\biggr)^k=\sum \limits_{a_1+ \cdots +a_n=k} \frac{k!}{a_1! \ a_2! \cdots \ a_n!} \ x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$

Remark
The same observations we make about the example apply. For example, the number of terms in the polynomial expansion is $\displaystyle \binom{k+n-1}{k}$, which is the number of non-negative integer solutions to the equation $a_1+ \cdots +a_n=k$. Each term $x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$ in the polynomial expansion can be considered as an unordered sample in the finite sampling with replacement. Then the coefficient of each term (called multinomial coefficient) is the number of associated ordered samples. As a result, the multinomial coefficients sum to $n^k$.

We conclude with two interpretations of the multinomial coefficient.

$\displaystyle \frac{k!}{a_1! \ a_2! \cdots \ a_n!} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (2)$

If we have $n$ many distinct symbols (say $x_1,x_2, \cdots, x_n$) and we have $a_1$ many $x_1$, $a_2$ many $x_2$ and so on, then the multinomial coefficient in $(2)$ is the number of $k$-length character strings that can be formed using the available symbols.

Another interpretation is that of partitioning a set of $k$ objects into $n$ subgroups where the objects in each subgroup are indistinguishable.

Both interpretations are one and the same, just looking at the same result in a different angle. For example, all three of the following yield the same answer: $\text{34,650}$. We have $11$ letters (one $m$, four $i's$, four $s's$ and two $p's$), how many character strings can be formed with these letters?

On the other hand, we have $11$ identical candies randomly distributed to Marcus, Issac, Samantha and Paul. How many ordered samples will result if Marcus receives one candy, Issac receives four candies, Samantha receives four candies and Paul receives two candies? Here, we are trying to partition a set of $11$ objects into four subgroups where one group has one element, two of the groups have four elements each and another group has two elements.

If we have $11$ candidates for forming four distinct committees where one committee has one member, two of the committees have four members each and another one has two members. In how many ways can this be done?

Reference

1. Feller, W., An Introduction to Probability Theory and its Applications, Vol. I, 3rd ed., John Wiley & Sons, Inc., New York, 1968

________________________________________________________________________
Revised March 28, 2015
$\copyright \ \text{2010 - 2015 by Dan Ma}$

A geometric waiting time occupancy problem

In the provious post (The birthday problem as a waiting time occupancy problem), we presented a waiting time occupancy problem. We like to present another example of a waiting time occupancy problem. This time we also use a birthday problem as illustration. Suppose the proprietor of a restaurant decides that free dessert is offered to any customer dining at the restaurant on his or her birthday. On any given day, how long does it have to wait for the first birthday customer to show up? On average how many customers are expected to visit the restaurant until the first free dessert to be awarded? As we will see in this post, the average number of customers until the arrival of the first birthday customer is $365$. For a small restaurant with a daily customer count under three to four hundred, free dessert may not have to be offered at all. We will also show that the median number of customers until the arrival of the first birthday customer is $253$. So if the daily customer count is less than $253$, there is a greater than $50$% chance that a free dessert will not have to be offered. To obtain these results, we ignore leap year and assume that any day of the year is equally likely to be the birthday of a random customer. The results in this post will not hold if the free dessert offer is widely known and many customers come to the restaurant for the purpose of taking advantage of the free dessert offer.

Obviously, this birthday problem is a waiting time problem. We observe the number of customers until the arrival of the first birthday customer. We can also view this as an occupancy problem. The $365$ days in the year can be considered as cells. The customer stream can be viewed as a series of balls being randomly placed into the $365$ cells. We fix one cell in advance and we keep placing balls into the $365$ cells until the a ball is placed into the specified cell. Once a ball reaches the specified cell, we continue to randomly assign balls into the cells. In the restaurant example, after the arrival of the first birthday customer, we continue to observe customers until the arrival of the next birthday customer.

We now discuss the problem in the context of the occupancy problem. There are $n$ cells and we keep assigning balls (one at a time) into the $n$ cells until a ball is placed into a cell specfied in advance. The random placing of each ball into a cell can be viewed as a Bernoulli trial with probability of success $p=\frac{1}{n}$. Success here means a ball reaches the cell specified in advance. Let $X_1$ be the number of trials (placements of balls) until the first success. Then $X_1$ has a geometric distribution with probability of success $p=\frac{1}{n}$. We make use of some basic facts about the geometric distribution in discussing the stated birthday problem. Let $w$ be any integer greater than $1$ and let $X_w$ be the number of trials until the $w$ balls are placed into the cell specified in advance. Then $X_w$ has a negative binomial distribution. We will also discuss this fact with respect to the birthday problem at hand.

The Geometric Distribution
A random experiment resulting in two distinct outcomes (success or failure) is called a Bernoulli trial (e.g. coin tosses, whether or not the birthday of a customer is the first of January). Suppose that the probability of success in each trial is $p$ and we observe a sequence of Bernoulli trials. Let $X_1$ be the number of Bernoulli trials we observe until a trial resulting in a success. Then $X_1$ has a geometric distribution. The following are some basic facts:

$\displaystyle P(X_1=k)=(1-p)^{k-1} \thinspace p \ \ \ \ \text{where }k=1,2,3, \cdots$

$\displaystyle P(X_1>k)=(1-p)^k \ \ \ \ \ \ \ \ \ \text{where }k=1,2,3, \cdots$

$\displaystyle E(X_1)=\frac{1}{p}$

$\displaystyle Var(X_1)=\frac{1-p}{p^2}$

If the first success occurs after $k$ trials, then the first $k$ trials are all failures. Thus $P(X_1>k)$ is the same as the probability that there are no successes in the first $k$ trials. Thus $P(X_1>k)$ is as indicated above.

With respect to the occupancy problem, the probability of success (a ball is placed in a cell apecified in advance) is $p=\frac{1}{n}$. The above basic facts for the geometric distribution can be restated as follows:

$\displaystyle P(X_1=k)=\biggl(1-\frac{1}{n} \biggr)^{k-1} \thinspace \frac{1}{n} \ \ \ \ \text{where }k=1,2,3, \cdots$

$\displaystyle P(X_1>k)=\biggl(1-\frac{1}{n} \biggr)^k \ \ \ \ \ \ \ \ \ \text{where }k=1,2,3, \cdots$

$\displaystyle E(X_1)=\frac{1}{\frac{1}{n}}=n$

$\displaystyle Var(X_1)=\frac{1-\frac{1}{n}}{\frac{1}{n^2}}=n^2-n$

The Negative Binomial Distribution
Suppose that we observe the Bernoulli trials until we have $w$ successes where $w>1$. Let $X_w$ be the number of trials to obtain $w$ successes. The random variable $X_w$ follows the negative binomial distribution with parameters $w$ and $p$. The following lists some basic facts:

$\displaystyle P(X_w=k)=\binom{k-1}{w-1} (1-p)^{k-w} p^w \ \ \ \ \ \ \text{where }k=w,w+1,w+2, \cdots$

$\displaystyle P(X_w>k)=\sum \limits_{r=0}^{w-1} \binom{k}{r}p^r (1-p)^{k-r} \ \ \ \ \ \ \text{where }k=w,w+1,w+2, \cdots$

$\displaystyle E(X_w)=w E(X_1)=\frac{w}{p}$

$\displaystyle Var(X_w)=w Var(X_1)=w \thinspace \frac{1-p}{p^2}$

If the $w^{th}$ success takes place after $k$ trials, then there can be at most $w-1$ successes in the first $k$ trials. This derives $P(X_w>k)$. Note that $X_w$ is the independent sum of $w$ many random variables, each of which has a geometric distribution. This fact derives the mean and variance indicated above.

With respect to the occupancy problem at hand, the following restates the basic facts:

$\displaystyle P(X_w=k)=\binom{k-1}{w-1} \biggl(1-\frac{1}{n} \biggr)^{k-w} \biggl(\frac{1}{n}\biggr)^w \ \ \ \ \ \ \text{where }k=w,w+1,w+2, \cdots$

$\displaystyle P(X_w>k)=\sum \limits_{r=0}^{w-1} \binom{k}{r}\biggl(\frac{1}{n}\biggr)^r \biggl(1-\frac{1}{n}\biggr)^{k-r} \ \ \ \ \ \ \text{where }k=w,w+1,w+2, \cdots$

$\displaystyle E(X_w)=w E(X_1)=w \thinspace n$

$\displaystyle Var(X_w)=w Var(X_1)=w \thinspace (n^2-n)$

Some Observations about the Birthday Problem
The number of arrivals until the first birthday customer has a geometric distribution. For each customer, the probability of success is $\frac{1}{365}$ (the probability that today is his or her birthday). Thus the mean number of customers until we see a birthday customer is $365$. On the other hand, the median number of customers until the first birthday customer is $253$.

$\displaystyle P(X_1 \le 253)=1-P(X_1>253)=1-\biggl(1-\frac{1}{365}\biggr)^{253}=0.500477154$

$\displaystyle P(X_1 \le 252)=1-P(X_1>252)=1-\biggl(1-\frac{1}{365}\biggr)^{252}=0.4991048385$

Thus, if the number of customers on a given day is less than $253$, there is a greater than $50$% chance that no free dessert will be given out.

If we consider the number of customers until the second birthday customer, the mean is $730$. The median is $613$.

\displaystyle \begin{aligned}P(X_2 \le 613)&=1-P(X_2>613)\\&=1-\sum \limits_{r=0}^{1} \binom{613}{r} \biggl(\frac{1}{365}\biggr)^r \biggl(1-\frac{1}{365}\biggr)^{613-r}\\&=0.500638\end{aligned}

\displaystyle \begin{aligned}P(X_2 \le 612)&=1-P(X_2>612)\\&=1-\sum \limits_{r=0}^{1} \binom{612}{r} \biggl(\frac{1}{365}\biggr)^r \biggl(1-\frac{1}{365}\biggr)^{612-r}\\&=0.499779\end{aligned}

If the birthdays are equally probable across the $365$ days in the year and if the arrivals of customers are truly random, then on any given day it may not be easy to find a birthday customer.

One interesting fact about the waiting time until the first birthday customer is that the geometric distribution has the “no memory” property. On a given day, suppose that $45$ customers have arrived and there is no birthday customer. Would this mean that it is more likely that there will be a birthday customer in the next $10$ customers than at the beginning of the business day? It turns out that the probability of receiving a birthday customer among the next ten customers is the same as the unconditional probability of having a birthday customer in the next ten customers. It is just as likely to have a birthday customer in the next ten customers when $45$ customers have arrived as in the case that no customers have arrived. So the fact that $45$ customers have arrived makes no difference.

Reference

1. Feller, W., An Introduction to Probability Theory and its Applications, Vol. I, 3rd ed., John Wiley & Sons, Inc., New York, 1968

The birthday problem as a waiting time occupancy problem

Ignoring leap year and assuming that each of the $365$ days in the year is equally likely to be the birthday for a randomly chosen person, the birthday problem in probability is to find the minimum number of persons such that there is a better than $50$% chance of having at least two people with the same birthday. This version of the birthday problem can be viewed as a waiting time problem, i.e. as a problem of randomly selecting people (one at a time) until there is a common birthday in the sample. This problem can also be viewed as an occupancy problem. Consider the $365$ days in the year as cells and the selection of a person at random as placing a ball at random into one of the $365$ cells. Then the random placement of balls (one at a time) continues until one of the cells has two balls.

This post is to explore the connection between the birthday problem as stated and the wating time occupancy problem. As the sake of completeness, we first state the two problems.

The Birthday Problem
Ignoring leap year and assuming that each of the $365$ days in the year is equally likely to be the birthday for a randomly chosen person, how many people do we need to choose in order to have a better than $50$% chance of having a common birthday among the selected individuals?

The Waiting Time Occupancy Problem
Suppose that balls are placed randomly (one at a time) into $n$ cells. The random placing of balls continues until a ball is placed into a cell that already has one ball (i.e. until a “repeat” occurs). Let $X_n$ be the number of balls required to obtain a repeat. Our goal is to discuss the probability function $P(X_n=k)$, and the survival function $P(X_n > k)$. Of course, we will discuss the connection with the birthday problem.

Remark
As we will see, the birthday problem can be restated as finding the median of the random variable in the waiting time occupancy problem with $365$ cells. We will show that $P(X_{365} \le 23)>\frac{1}{2}$ and $P(X_{365} \le 22)<\frac{1}{2}$.

A more basic version of the occupancy problem works with a fixed number of balls (say $k$ balls) and is about randomly placing the $k$ balls into $n$ cells. In the following two posts in this blog, we discuss this fixed length occupancy problem and present ways to compute the probability that $j$ cells are empty where $j=0,1, \cdots, n-1$.

In the waiting time occupancy problem discussed in this post, we do not fix the number of balls. Instead, we keep placing balls into cells until a certain prescribed condition takes place. So the focus is on the waiting time until the occurence of a certain event. With respect to the occupancy problem, examples of waiting times include the time until a repeat occurs (discussed here), the time until a specified cell is filled and the time until all cells are filled. Some of these waiting time problems will be discussed in subsequent posts.

Solving the Birthday Problem
In a random group of $k$ people, the following is the probability that all $k$ birthdays are different:

\displaystyle \begin{aligned}P_k&=\frac{365}{365} \thinspace \frac{364}{365} \thinspace \cdots \thinspace \frac{365-(k-1)}{365}\\&=\frac{364}{365} \thinspace \frac{363}{365} \thinspace \cdots \thinspace \frac{365-(k-1)}{365}\\&=\biggl(1-\frac{1}{365}\biggr) \thinspace \cdots \thinspace \biggl(1-\frac{k-1}{365}\biggr) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (1) \end{aligned}

Note that for $k=23$, the probability of having different birthdays is $p_{23}=0.492702766$. Thus there is more than $50$% chance of having at least a common birthday among $23$ people ($1-p_{23}=0.507297234$). On the other hand, $p_{22}=0.524304692$ and $1-p_{22}<0.5$. Thus among $22$ people, the probability of having a common birthday is less than $\frac{1}{2}$.

The Discussion of the Waiting Time Occupancy Problem
In the random assignment of a series of balls (one at a time) into $n$ cells, consider the event that the random pacement of balls terminates at the $k^{th}$ step. This means that the first $k-1$ balls go into $k-1$ different cells and the $k^{th}$ ball goes into one of the $k-1$ previously occupied cell. The probability of this event is:

\displaystyle \begin{aligned}P(X_n=k)&=\frac{n (n-1) \cdots (n-(k-2)) (k-1)}{n^k}\\&=\frac{(n-1) \cdots (n-(k-2)) (k-1)}{n^{k-1}}\\&=\biggl(1-\frac{1}{n}\biggr) \thinspace \cdots \thinspace \biggl(1-\frac{k-2}{n}\biggr) \frac{k-1}{n}\end{aligned}

Note that the number of ways of assigning $k-1$ balls into distinct cells among $n$ cells is $n(n-1) \cdots (n-(k-2))$. Then there are $k-1$ ways of assigning the $k^{th}$ ball to the one of the previously assigned cells. Thus the total number of assigning $k$ balls in this fashion is:

$\displaystyle n (n-1) \cdots (n-(k-2)) (k-1)$

We now discuss $P(X_n>k)$. This is the probability that it takes more than $k$ balls to have two balls in the same cell. Consequently, the first $k$ balls all go into different cells. When $n=365$, $P(X_{365}>k)$ should agree with the probability stated in $(1)$ above. Thus we have:

$\displaystyle P(X_n>k)=\biggl(1-\frac{1}{n}\biggr) \ \cdots \ \biggl(1-\frac{k-1}{n}\biggr)$

The above survival function can also be proven by induction. To find the median of the distribution $X_n$, we need to find $m$ such that:

$P(X_n \le m)=1-P(X_n>m)>\frac{1}{2}$ and
$P(X_n \le m-1)=1-P(X_n>m-1)<\frac{1}{2}$

For the birthday problem, $n=365$. It has been shown in the above solution of the birthday problem that the median is $m=23$.

Example
This is one more example to illustrate the computation. Suppose we roll a fair die repeatedly until a repeat of a previously shown face. Each roll of the die is essentially placing a ball into one of the six cells. Let $X_6$ be the number of rolls to obtain a repeat. The following shows the probability function $P(X_6=k)$.

$\displaystyle P(X_6=2)=\frac{1}{6}$

$\displaystyle P(X_6=3)=\frac{10}{6^2}$

$\displaystyle P(X_6=4)=\frac{60}{6^3}$

$\displaystyle P(X_6=5)=\frac{240}{6^4}$

$\displaystyle P(X_6=6)=\frac{600}{6^5}$

$\displaystyle P(X_6=7)=\frac{720}{6^6}$

The median number of rolls to achieve a repeat is $4$. Note the following probabilities:

$\displaystyle P(X_6 \le 4)=1-P(X_6>4)=1-\frac{60}{6^3}=\frac{156}{6^3}=0.72$

$\displaystyle P(X_6 \le 3)=1-P(X_6>3)=1-\frac{20}{6^2}=\frac{16}{6^2}=0.44$

The following shows that the mean number of rolls to achieve a repeat is $3.77469$.

\displaystyle \begin{aligned}E(X_6)&=2 \biggl(\frac{1}{6}\biggr)+3 \biggl(\frac{10}{6^2}\biggr)+4 \biggl(\frac{60}{6^3}\biggr)+5 \biggl(\frac{240}{6^4}\biggr)+6 \biggl(\frac{600}{6^5}\biggr)+7 \biggl(\frac{720}{6^6}\biggr)\\&=\frac{176112}{6^6}\\&=3.774691358\end{aligned}

Reference

1. Feller, W., An Introduction to Probability Theory and its Applications, Vol. I, 3rd ed., John Wiley & Sons, Inc., New York, 1968

A formula for the occupancy problem

The occupancy problem in probability deals with the random assigment of $k$ balls into $n$ cells. One problem of interest is on the number of cells that are occupied or the number of cells that are empty. When randomly assigning $k$ balls into $n$ cells, what is the probability of finding any empty cell? What is the probability of finding exactly $j$ empty cells? In the previous post (The occupancy problem), we discussed the algorithm for solving this problem by the counting method based on the multinomial coefficients. In this post, we derive a direct formula for the probability of having $j$ empty cells when randomly placing $k$ balls into $n$ cells (see Theorem 2 below). The derivation is based on a theorem about the union of events (see Theorem 1 below). For more background information on the occupancy problem, see volume I of Feller in [1].

Theorem 1 concerns with the union of $n$ events. A provious post in this blog (The matching problem) presented a sketch of a proof of this theorem and a solution of one version of the matching problem. After proving Theorem 2, we present a direct calculation for the example in the previous post.

Suppose that we have the events $E_1,E_2, \cdots, E_n$. Define $S_m$ as the sum of all probabilities of $m$ many $E_i$ occuring simultaneously. Thus we have:

$\displaystyle S_1=\sum \limits_{i=1}^{n} P(E_i)$

$\displaystyle S_2=\sum \limits_{i

$\displaystyle S_3=\sum \limits_{i

$\displaystyle S_m=\sum \limits_{i(1)< \cdots < i(m)} P(E_{i(1)} \cap E_{i(2)} \cap \cdots \cap E_{i(m)})$

Thoerem 1
The following is the probability that at least one of the events $E_1,E_2, \cdots, E_n$ has occurred:

$\displaystyle P(E_1 \cup E_2 \cup \cdots \cup E_n)= \sum \limits_{m=1}^{n} (-1)^{m+1} S_m$

Note that when $m=2$ or $m=3$, we have the following familiar formulas:

$\displaystyle P(E_1 \cup E_2)=P(E_1)+P(E_2)-P(E_1 \cap E_2)$

\displaystyle \begin{aligned}P(E_1 \cup E_2 \cap E_3)&=P(E_1)+P(E_2)+P(E_3)\\&\ \ \ -P(E_1 \cap E_2)-P(E_1 \cap E_3)-P(E_2 \cap E_3)\\&\ \ \ +P(E_1 \cap E_2 \cap E_3)\end{aligned}

Theorem 2
In randomly placing $k$ balls into $n$ cells, let $X_{k,n}$ be the number of empty cells. Then the probability that all cells are occupied (zero cell is empty) and the probability that exactly $j$ cells are empty are, respectively,

$\displaystyle P(X_{k,n}=0)=\sum \limits_{i=0}^{n} (-1)^{i} \binom{n}{i} \biggl(1-\frac{i}{n}\biggr)^{k}$

$\displaystyle P(X_{k,n}=j)=\binom{n}{j} \sum \limits_{i=0}^{n-j} (-1)^{i} \binom{n-j}{i} \biggl(1-\frac{j+i}{n}\biggr)^{k}$

where $j=1,2, \cdots, n-1$.

Proof. Let $E_i$ be the event that the $i^{th}$ cell is empty, where $i=1,2, \cdots, n$. We proceed to derive the sums $S_m$ as in Theorem 1.

For the event $E_i$ to occur (the $i^{th}$ cell being empty), the $k$ balls in question are placed in the remaining $n-1$ cells. This can be done in $(n-1)^k$ ways. Thus $\displaystyle P(E_i)=\frac{(n-1)^k}{n^k}=\biggl(1-\frac{1}{n}\biggr)^k$.

For the event $E_i \cap E_j$ to occur (the $i^{th}$ and $j^{th}$ cells being empty), the $k$ balls are placed in the remaining $n-2$ cells. This can be done in $(n-2)^k$ ways. Thus we have $\displaystyle P(E_i \cap E_j)=\frac{(n-2)^k}{n^k}=\biggl(1-\frac{2}{n}\biggr)^k$.

Likewise, $\displaystyle P(E_i \cap E_j \cap E_w)=\frac{(n-3)^k}{n^k}=\biggl(1-\frac{3}{n}\biggr)^k$, and so on. For $m \le n$, we have $\displaystyle S_m=\binom{n}{m} \biggl(1-\frac{m}{n}\biggr)^k$.

Thus, the following is the probability that at least one cell is empty:

\displaystyle \begin{aligned}P(E_1 \cup E_2 \cup \cdots \cup E_n)&=\sum \limits_{m=1}^{n} (-1)^{m+1} S_m\\&=\sum \limits_{m=1}^{n} (-1)^{m+1} \binom{n}{m} \biggl(1-\frac{m}{n}\biggr)^k\end{aligned}

The probability of having no empty cell is:

\displaystyle \begin{aligned}P(X_{k,n}=0)&=1-\sum \limits_{m=1}^{n} (-1)^{m+1} \binom{n}{m} \biggl(1-\frac{m}{n}\biggr)^k\\&=\sum \limits_{m=0}^{n} (-1)^{m} \binom{n}{m} \biggl(1-\frac{m}{n}\biggr)^k\end{aligned}

If there are exactly $j$ empty cells, then the $k$ balls are placed in the remaining $n-j$ cells and these remaining $n-j$ cells are all occupied. Then the following is the probability of having exactly $j$ empty cells:

\displaystyle \begin{aligned}P(X_{k,n}=j)&=\binom{n}{j} \biggl(1-\frac{j}{n}\biggr)^k \times P(X_{k,n-j}=0)\\&=\binom{n}{j} \biggl(1-\frac{j}{n}\biggr)^k \times \sum \limits_{m=0}^{n-j} (-1)^{m} \binom{n-j}{m} \biggl(1-\frac{m}{n-j}\biggr)^k\\&=\binom{n}{j} \sum \limits_{m=0}^{n-j} (-1)^m \binom{n-j}{m} \biggl(1-\frac{j+m}{n}\biggr)^k\end{aligned}

Example
In rolling a fair die $7$ times, what is the probability of having exactly $j$ faces appearing? This is the same example in the previous post (The occupancy problem). In this post we apply the formulas in Theorem 2.

Note that the rolling a die can be considered as the placing of a ball into one of the $6$ cells. Thus we have $k=7$ balls and $n=6$ cells. The probability of having $j$ cells occupied is the probability of $6-j$ empty cells. As in Theorem 2, $\displaystyle X_{7,6}$ is the number of empty cells when placing $7$ balls into $6$ cells at random. Thus the probability of having exactly $j$ cells occupied is $\displaystyle P(X_{7,6})=6-j$.

\displaystyle \begin{aligned}P(X_{7,6}=0)&=P(\text{all 6 cells are occupied})\\&=\sum \limits_{m=0}^{6} (-1)^{m} \binom{6}{m} \biggl(1-\frac{m}{6}\biggr)^7\\&=\frac{15120}{6^7}\\&=\frac{15120}{279936}\\&=0.0540123457\end{aligned}

\displaystyle \begin{aligned}P(X_{7,6}=1)&=P(\text{exactly 5 cells are occupied})\\&=\binom{6}{1} \sum \limits_{m=0}^{5} (-1)^{m} \binom{5}{m} \biggl(1-\frac{m+1}{6}\biggr)^7\\&=\frac{100800}{6^7}\\&=\frac{100800}{279936}\\&=0.3600823045\end{aligned}

\displaystyle \begin{aligned}P(X_{7,6}=2)&=P(\text{exactly 4 cells are occupied})\\&=\binom{6}{2} \sum \limits_{m=0}^{4} (-1)^{m} \binom{4}{m} \biggl(1-\frac{m+2}{6}\biggr)^7\\&=\frac{126000}{6^7}\\&=\frac{126000}{279936}\\&=0.4501028807\end{aligned}

\displaystyle \begin{aligned}P(X_{7,6}=3)&=P(\text{exactly 3 cells are occupied})\\&=\binom{6}{3} \sum \limits_{m=0}^{3} (-1)^{m} \binom{3}{m} \biggl(1-\frac{m+3}{6}\biggr)^7\\&=\frac{36120}{6^7}\\&=\frac{36120}{279936}\\&=0.1290294925\end{aligned}

\displaystyle \begin{aligned}P(X_{7,6}=4)&=P(\text{exactly 2 cells are occupied})\\&=\binom{6}{4} \sum \limits_{m=0}^{2} (-1)^{m} \binom{2}{m} \biggl(1-\frac{m+4}{6}\biggr)^7\\&=\frac{1890}{6^7}\\&=\frac{1890}{279936}\\&=0.0067515432\end{aligned}

\displaystyle \begin{aligned}P(X_{7,6}=5)&=P(\text{exactly 1 cell is occupied})\\&=\binom{6}{5} \sum \limits_{m=0}^{1} (-1)^{m} \binom{1}{m} \biggl(1-\frac{m+5}{6}\biggr)^7\\&=\frac{6}{6^7}\\&=\frac{6}{279936}\\&=0.00002143347051\end{aligned}

Reference

1. Feller, W., An Introduction to Probability Theory and its Applications, Vol. I, 3rd ed., John Wiley & Sons, Inc., New York, 1968