A geometric waiting time occupancy problem

In the provious post (The birthday problem as a waiting time occupancy problem), we presented a waiting time occupancy problem. We like to present another example of a waiting time occupancy problem. This time we also use a birthday problem as illustration. Suppose the proprietor of a restaurant decides that free dessert is offered to any customer dining at the restaurant on his or her birthday. On any given day, how long does it have to wait for the first birthday customer to show up? On average how many customers are expected to visit the restaurant until the first free dessert to be awarded? As we will see in this post, the average number of customers until the arrival of the first birthday customer is $365$. For a small restaurant with a daily customer count under three to four hundred, free dessert may not have to be offered at all. We will also show that the median number of customers until the arrival of the first birthday customer is $253$. So if the daily customer count is less than $253$, there is a greater than $50$% chance that a free dessert will not have to be offered. To obtain these results, we ignore leap year and assume that any day of the year is equally likely to be the birthday of a random customer. The results in this post will not hold if the free dessert offer is widely known and many customers come to the restaurant for the purpose of taking advantage of the free dessert offer.

Obviously, this birthday problem is a waiting time problem. We observe the number of customers until the arrival of the first birthday customer. We can also view this as an occupancy problem. The $365$ days in the year can be considered as cells. The customer stream can be viewed as a series of balls being randomly placed into the $365$ cells. We fix one cell in advance and we keep placing balls into the $365$ cells until the a ball is placed into the specified cell. Once a ball reaches the specified cell, we continue to randomly assign balls into the cells. In the restaurant example, after the arrival of the first birthday customer, we continue to observe customers until the arrival of the next birthday customer.

We now discuss the problem in the context of the occupancy problem. There are $n$ cells and we keep assigning balls (one at a time) into the $n$ cells until a ball is placed into a cell specfied in advance. The random placing of each ball into a cell can be viewed as a Bernoulli trial with probability of success $p=\frac{1}{n}$. Success here means a ball reaches the cell specified in advance. Let $X_1$ be the number of trials (placements of balls) until the first success. Then $X_1$ has a geometric distribution with probability of success $p=\frac{1}{n}$. We make use of some basic facts about the geometric distribution in discussing the stated birthday problem. Let $w$ be any integer greater than $1$ and let $X_w$ be the number of trials until the $w$ balls are placed into the cell specified in advance. Then $X_w$ has a negative binomial distribution. We will also discuss this fact with respect to the birthday problem at hand.

The Geometric Distribution
A random experiment resulting in two distinct outcomes (success or failure) is called a Bernoulli trial (e.g. coin tosses, whether or not the birthday of a customer is the first of January). Suppose that the probability of success in each trial is $p$ and we observe a sequence of Bernoulli trials. Let $X_1$ be the number of Bernoulli trials we observe until a trial resulting in a success. Then $X_1$ has a geometric distribution. The following are some basic facts:

$\displaystyle P(X_1=k)=(1-p)^{k-1} \thinspace p \ \ \ \ \text{where }k=1,2,3, \cdots$

$\displaystyle P(X_1>k)=(1-p)^k \ \ \ \ \ \ \ \ \ \text{where }k=1,2,3, \cdots$

$\displaystyle E(X_1)=\frac{1}{p}$

$\displaystyle Var(X_1)=\frac{1-p}{p^2}$

If the first success occurs after $k$ trials, then the first $k$ trials are all failures. Thus $P(X_1>k)$ is the same as the probability that there are no successes in the first $k$ trials. Thus $P(X_1>k)$ is as indicated above.

With respect to the occupancy problem, the probability of success (a ball is placed in a cell apecified in advance) is $p=\frac{1}{n}$. The above basic facts for the geometric distribution can be restated as follows:

$\displaystyle P(X_1=k)=\biggl(1-\frac{1}{n} \biggr)^{k-1} \thinspace \frac{1}{n} \ \ \ \ \text{where }k=1,2,3, \cdots$

$\displaystyle P(X_1>k)=\biggl(1-\frac{1}{n} \biggr)^k \ \ \ \ \ \ \ \ \ \text{where }k=1,2,3, \cdots$

$\displaystyle E(X_1)=\frac{1}{\frac{1}{n}}=n$

$\displaystyle Var(X_1)=\frac{1-\frac{1}{n}}{\frac{1}{n^2}}=n^2-n$

The Negative Binomial Distribution
Suppose that we observe the Bernoulli trials until we have $w$ successes where $w>1$. Let $X_w$ be the number of trials to obtain $w$ successes. The random variable $X_w$ follows the negative binomial distribution with parameters $w$ and $p$. The following lists some basic facts:

$\displaystyle P(X_w=k)=\binom{k-1}{w-1} (1-p)^{k-w} p^w \ \ \ \ \ \ \text{where }k=w,w+1,w+2, \cdots$

$\displaystyle P(X_w>k)=\sum \limits_{r=0}^{w-1} \binom{k}{r}p^r (1-p)^{k-r} \ \ \ \ \ \ \text{where }k=w,w+1,w+2, \cdots$

$\displaystyle E(X_w)=w E(X_1)=\frac{w}{p}$

$\displaystyle Var(X_w)=w Var(X_1)=w \thinspace \frac{1-p}{p^2}$

If the $w^{th}$ success takes place after $k$ trials, then there can be at most $w-1$ successes in the first $k$ trials. This derives $P(X_w>k)$. Note that $X_w$ is the independent sum of $w$ many random variables, each of which has a geometric distribution. This fact derives the mean and variance indicated above.

With respect to the occupancy problem at hand, the following restates the basic facts:

$\displaystyle P(X_w=k)=\binom{k-1}{w-1} \biggl(1-\frac{1}{n} \biggr)^{k-w} \biggl(\frac{1}{n}\biggr)^w \ \ \ \ \ \ \text{where }k=w,w+1,w+2, \cdots$

$\displaystyle P(X_w>k)=\sum \limits_{r=0}^{w-1} \binom{k}{r}\biggl(\frac{1}{n}\biggr)^r \biggl(1-\frac{1}{n}\biggr)^{k-r} \ \ \ \ \ \ \text{where }k=w,w+1,w+2, \cdots$

$\displaystyle E(X_w)=w E(X_1)=w \thinspace n$

$\displaystyle Var(X_w)=w Var(X_1)=w \thinspace (n^2-n)$

Some Observations about the Birthday Problem
The number of arrivals until the first birthday customer has a geometric distribution. For each customer, the probability of success is $\frac{1}{365}$ (the probability that today is his or her birthday). Thus the mean number of customers until we see a birthday customer is $365$. On the other hand, the median number of customers until the first birthday customer is $253$.

$\displaystyle P(X_1 \le 253)=1-P(X_1>253)=1-\biggl(1-\frac{1}{365}\biggr)^{253}=0.500477154$

$\displaystyle P(X_1 \le 252)=1-P(X_1>252)=1-\biggl(1-\frac{1}{365}\biggr)^{252}=0.4991048385$

Thus, if the number of customers on a given day is less than $253$, there is a greater than $50$% chance that no free dessert will be given out.

If we consider the number of customers until the second birthday customer, the mean is $730$. The median is $613$.

\displaystyle \begin{aligned}P(X_2 \le 613)&=1-P(X_2>613)\\&=1-\sum \limits_{r=0}^{1} \binom{613}{r} \biggl(\frac{1}{365}\biggr)^r \biggl(1-\frac{1}{365}\biggr)^{613-r}\\&=0.500638\end{aligned}

\displaystyle \begin{aligned}P(X_2 \le 612)&=1-P(X_2>612)\\&=1-\sum \limits_{r=0}^{1} \binom{612}{r} \biggl(\frac{1}{365}\biggr)^r \biggl(1-\frac{1}{365}\biggr)^{612-r}\\&=0.499779\end{aligned}

If the birthdays are equally probable across the $365$ days in the year and if the arrivals of customers are truly random, then on any given day it may not be easy to find a birthday customer.

One interesting fact about the waiting time until the first birthday customer is that the geometric distribution has the “no memory” property. On a given day, suppose that $45$ customers have arrived and there is no birthday customer. Would this mean that it is more likely that there will be a birthday customer in the next $10$ customers than at the beginning of the business day? It turns out that the probability of receiving a birthday customer among the next ten customers is the same as the unconditional probability of having a birthday customer in the next ten customers. It is just as likely to have a birthday customer in the next ten customers when $45$ customers have arrived as in the case that no customers have arrived. So the fact that $45$ customers have arrived makes no difference.

Reference

1. Feller, W., An Introduction to Probability Theory and its Applications, Vol. I, 3rd ed., John Wiley & Sons, Inc., New York, 1968