A counting distribution is a discrete distribution with probabilities only on the nonnegative integers. Such distributions are important in insurance applications since they can be used to model the number of events such as losses to the insured or claims to the insurer. Though it plays a prominent role in statistical theory, the Poisson distribution is not appropriate in all situations, since it requires that the mean and the variance be equal. The negative binomial distribution is therefore a useful alternative to the Poisson distribution, especially when the observed variance is greater than the observed mean.

The negative binomial distribution arises naturally from a probability experiment of performing a series of independent Bernoulli trials until the occurrence of the *r*th success where *r* is a positive integer. From this starting point, we discuss three ways to define the distribution. We then discuss several basic properties of the negative binomial distribution. Emphasis is placed on the close connection between the Poisson distribution and the negative binomial distribution.

________________________________________________________________________

**Definitions**

We define three versions of the negative binomial distribution. The first two versions arise from the viewpoint of performing a series of independent Bernoulli trials until the *r*th success, where *r* is a positive integer. A Bernoulli trial is a random experiment with exactly two possible outcomes (success or failure).

Let $X_1$ be the number of Bernoulli trials required for the *r*th success to occur, where *r* is a positive integer. Let $p$ be the probability of success in each trial. The following is the probability function of $X_1$:

$$P(X_1=x)=\binom{x-1}{r-1} p^r (1-p)^{x-r}, \qquad x=r, r+1, r+2, \ldots \tag{1}$$

The idea for (1) is that for $X_1=x$ to happen, there must be $r-1$ successes in the first $x-1$ trials and one additional success occurring in the last trial (the $x$th trial).

A more common version of the negative binomial distribution counts the number of Bernoulli trials in excess of *r* needed to produce the *r*th success. In other words, we consider the number of failures before the occurrence of the *r*th success. Let $X_2$ be this random variable. The following is the probability function of $X_2$:

$$P(X_2=x)=\binom{x+r-1}{x} p^r (1-p)^x, \qquad x=0, 1, 2, \ldots \tag{2}$$

The idea for (2) is that there are $x+r$ trials and in the first $x+r-1$ trials, there are $x$ failures (or equivalently $r-1$ successes), with the last trial being the *r*th success.
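As a quick numerical check of the two probability functions above (number of trials until the *r*th success, and number of failures before the *r*th success), here is a minimal Python sketch for integer *r*. The function names `nb_trials_pmf` and `nb_failures_pmf` and the parameter values are illustrative assumptions, not part of the original discussion.

```python
from math import comb


def nb_trials_pmf(x: int, r: int, p: float) -> float:
    """P(the r-th success occurs on trial x), for x = r, r+1, ..."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


def nb_failures_pmf(x: int, r: int, p: float) -> float:
    """P(exactly x failures occur before the r-th success), for x = 0, 1, ..."""
    if x < 0:
        return 0.0
    return comb(x + r - 1, x) * p**r * (1 - p) ** x


r, p = 3, 0.4  # illustrative values

# The number of trials equals the number of failures plus r,
# so the two probability functions agree after shifting by r.
for x in range(10):
    assert abs(nb_failures_pmf(x, r, p) - nb_trials_pmf(x + r, r, p)) < 1e-12

# Summing over a long range should give a total very close to 1.
total = sum(nb_failures_pmf(x, r, p) for x in range(200))
print(total)
```

The shift check reflects the fact that the two versions describe the same experiment and differ only in whether the *r* successes are included in the count.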

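In both (1) and (2), the binomial coefficient is defined by

$$\binom{y}{x}=\frac{y!}{x!\,(y-x)!}=\frac{y(y-1)\cdots(y-(x-1))}{x!} \tag{3}$$

where $y$ is a positive integer and $x$ is a nonnegative integer. However, the right-hand side of (3) can be calculated even if $y$ is not a positive integer. Thus the binomial coefficient $\binom{y}{x}$ can be extended to all real numbers $y$. However, $x$ must still be a nonnegative integer.

For convenience, we let $\binom{y}{0}=1$. When the real number $y>x-1$, the binomial coefficient in (3) can be expressed as:

$$\binom{y}{x}=\frac{\Gamma(y+1)}{\Gamma(x+1)\,\Gamma(y-x+1)} \tag{4}$$

where $\Gamma(\cdot)$ is the gamma function.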

With the more relaxed notion of binomial coefficient, the probability function in (2) above can be defined for any real number $r>0$. Thus the general version of the negative binomial distribution has two parameters *r* and $p$, both real numbers, such that $r>0$ and $0<p<1$. The following is its probability function.

$$P(X=x)=\binom{x+r-1}{x} p^r (1-p)^x, \qquad x=0, 1, 2, \ldots \tag{5}$$

Whenever *r* in (5) is a real number that is not a positive integer, the interpretation of counting the number of failures until the occurrence of the *r*th success no longer applies. Instead we can think of it simply as a count distribution.
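For a real-valued *r*, the binomial coefficient can be evaluated through the gamma function. The following Python sketch does this on the log scale with `math.lgamma` to avoid overflow. The function name `nb_pmf` and the parameter values are assumptions made for illustration.

```python
from math import exp, lgamma, log


def nb_pmf(x: int, r: float, p: float) -> float:
    """Negative binomial probability of the count x, for real r > 0 and 0 < p < 1."""
    if x < 0:
        return 0.0
    # log of Gamma(x + r) / (Gamma(x + 1) * Gamma(r)), i.e. the
    # generalized binomial coefficient "x + r - 1 choose x"
    log_coef = lgamma(x + r) - lgamma(x + 1) - lgamma(r)
    return exp(log_coef + r * log(p) + x * log(1 - p))


# A non-integer r still gives probabilities that sum to (nearly) 1.
probs = [nb_pmf(x, 2.5, 0.3) for x in range(2000)]
print(sum(probs))
```

Working with logarithms is a design choice for numerical stability. For small counts a direct product would work equally well.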

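The following alternative parametrization of the negative binomial distribution is also useful.

$$P(X=x)=\binom{x+r-1}{x} \left(\frac{\alpha}{\alpha+1}\right)^r \left(\frac{1}{\alpha+1}\right)^x, \qquad x=0, 1, 2, \ldots \tag{6}$$

The parameters in this alternative parametrization are *r* and $\alpha>0$. Clearly, the ratio $\frac{\alpha}{\alpha+1}$ takes the place of $p$ in (5). Unless stated otherwise, we use the parametrization of (5).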

________________________________________________________________________

**What is negative about the negative binomial distribution?**

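What is negative about this distribution? What is binomial about this distribution? The name is suggested by the fact that the binomial coefficient in (5) can be rearranged as follows:

$$\begin{aligned} \binom{x+r-1}{x}&=\frac{(x+r-1)(x+r-2)\cdots r}{x!} \\ &=(-1)^x\,\frac{(-r-(x-1))(-r-(x-2))\cdots(-r)}{x!} \\ &=(-1)^x\,\frac{(-r)(-r-1)\cdots(-r-(x-1))}{x!} \\ &=(-1)^x\binom{-r}{x} \end{aligned} \tag{7}$$

The numerator in the first line has exactly $x$ factors (running from $r$ up to $x+r-1$), so factoring $-1$ out of each one produces $(-1)^x$.

The calculation in (7) can be used to verify that (5) is indeed a probability function, that is, all the probabilities sum to 1.

$$\begin{aligned} 1&=p^r p^{-r} \\ &=p^r (1-q)^{-r} \\ &=p^r \sum_{x=0}^\infty \binom{-r}{x} (-q)^x \\ &=p^r \sum_{x=0}^\infty (-1)^x \binom{-r}{x} q^x \\ &=\sum_{x=0}^\infty \binom{x+r-1}{x} p^r q^x \end{aligned} \tag{8}$$

In (8), we take $q=1-p$. The third equality, which expands $(1-q)^{-r}$ as a series, uses the following formula, known as Newton's binomial formula. It holds for any real $w$ when $\lvert t \rvert<1$.

$$(1+t)^w=\sum_{k=0}^\infty \binom{w}{k} t^k \tag{9}$$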

For a detailed discussion of (8) with all the details worked out, see the post called Deriving some facts of the negative binomial distribution.
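The rearrangement of the binomial coefficient and the use of Newton's binomial formula can both be checked numerically. The sketch below is a hedged illustration: the helper `gen_binom` is a hypothetical name, and it evaluates the generalized binomial coefficient as a running product of exactly $x$ ratios, so the upper argument may be any real number, including a negative one.

```python
from math import isclose


def gen_binom(y: float, x: int) -> float:
    """Generalized binomial coefficient y(y-1)...(y-(x-1)) / x! for real y."""
    coef = 1.0
    for k in range(x):  # exactly x factors; the empty product (x = 0) is 1
        coef *= (y - k) / (k + 1)
    return coef


r, p = 2.5, 0.4  # illustrative values; r need not be an integer
q = 1 - p

# "x + r - 1 choose x" equals (-1)^x times "-r choose x".
for x in range(12):
    assert isclose(gen_binom(x + r - 1, x), (-1) ** x * gen_binom(-r, x), rel_tol=1e-9)

# Newton's binomial series for (1 - q)^(-r), truncated at 300 terms.
series = sum(gen_binom(-r, x) * (-q) ** x for x in range(300))
print(p**r * series)  # should be very close to 1
```

Building the coefficient as a product of ratios, rather than dividing by a factorial at the end, keeps intermediate values small for large $x$.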

________________________________________________________________________

**The Generating Function**

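By definition, the following is the generating function of the negative binomial distribution, using (5):

$$g(z)=E(z^X)=\sum_{x=0}^\infty \binom{x+r-1}{x} p^r q^x z^x \tag{10}$$

where $q=1-p$. Using a similar calculation as in (8), the generating function can be simplified as:

$$g(z)=p^r (1-qz)^{-r}=\frac{p^r}{(1-qz)^r}=\frac{p^r}{(1-(1-p)z)^r}, \qquad \lvert z \rvert<\frac{1}{1-p} \tag{11}$$

As a result, the moment generating function of the negative binomial distribution is:

$$M(t)=E(e^{tX})=g(e^t)=\frac{p^r}{(1-(1-p)e^t)^r}, \qquad t<-\ln(1-p) \tag{12}$$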

For a detailed discussion of (12) with all the details worked out, see the post called Deriving some facts of the negative binomial distribution.
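The closed form of the generating function can be compared with a truncated version of its defining series. The following Python sketch is only an illustration under assumed parameter values. The names `pgf_series` and `pgf_closed` are hypothetical, and the series is only evaluated for $z$ inside the radius of convergence $1/(1-p)$.

```python
from math import exp, isclose, lgamma, log


def nb_pmf(x: int, r: float, p: float) -> float:
    """Negative binomial probability of the count x (real r > 0, 0 < p < 1)."""
    log_coef = lgamma(x + r) - lgamma(x + 1) - lgamma(r)
    return exp(log_coef + r * log(p) + x * log(1 - p))


def pgf_series(z: float, r: float, p: float, terms: int = 500) -> float:
    """Truncated series E(z^X) = sum over x of P(X = x) * z^x."""
    return sum(nb_pmf(x, r, p) * z**x for x in range(terms))


def pgf_closed(z: float, r: float, p: float) -> float:
    """Closed form p^r / (1 - (1 - p) z)^r, valid for |z| < 1 / (1 - p)."""
    return p**r / (1 - (1 - p) * z) ** r


r, p = 2.5, 0.4  # illustrative values; here 1 / (1 - p) is about 1.667
for z in (0.0, 0.5, 1.0, 1.2):
    assert isclose(pgf_series(z, r, p), pgf_closed(z, r, p), rel_tol=1e-9)

# The moment generating function is the generating function evaluated at e^t.
t = 0.1
print(pgf_closed(exp(t), r, p))
```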

________________________________________________________________________

**Independent Sum**

One useful property of the negative binomial distribution is that the independent sum of negative binomial random variables, all with the same parameter $p$, also has a negative binomial distribution. Let $T=X_1+X_2+\cdots+X_n$ be an independent sum such that each $X_i$ has a negative binomial distribution with parameters $r_i$ and $p$. Then the sum $T$ has a negative binomial distribution with parameters $r=r_1+r_2+\cdots+r_n$ and $p$.

Note that the generating function of an independent sum is the product of the individual generating functions. The following shows that the product of the individual generating functions is of the same form as (11), thus proving the above assertion.

$$g_T(z)=\prod_{i=1}^n \frac{p^{r_i}}{(1-(1-p)z)^{r_i}}=\frac{p^{r_1+\cdots+r_n}}{(1-(1-p)z)^{r_1+\cdots+r_n}} \tag{13}$$
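The independent sum property can also be checked directly on the probabilities, without generating functions, by convolving two negative binomial probability functions that share the same $p$. This is a minimal sketch under assumed parameter values. The helper name `convolve` is hypothetical.

```python
from math import exp, isclose, lgamma, log


def nb_pmf(x: int, r: float, p: float) -> float:
    """Negative binomial probability of the count x (real r > 0, 0 < p < 1)."""
    log_coef = lgamma(x + r) - lgamma(x + 1) - lgamma(r)
    return exp(log_coef + r * log(p) + x * log(1 - p))


def convolve(a: list, b: list) -> list:
    """Probabilities of the sum of two independent counts, for totals 0..n-1."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(n)]


p = 0.4
r1, r2 = 1.5, 2.0  # illustrative values
size = 30

pmf1 = [nb_pmf(x, r1, p) for x in range(size)]
pmf2 = [nb_pmf(x, r2, p) for x in range(size)]
pmf_sum = convolve(pmf1, pmf2)

# The convolution should match the negative binomial with r = r1 + r2.
for x in range(size):
    assert isclose(pmf_sum[x], nb_pmf(x, r1 + r2, p), rel_tol=1e-9)
print(pmf_sum[:3])
```

The probability of a total of $m$ only involves terms with indices up to $m$, so truncating both lists at the same length does not affect the values compared.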

________________________________________________________________________

**Mean and Variance**

The mean and variance can be obtained from the generating function. From $E(X)=g'(1)$ and $E(X^2)=g'(1)+g''(1)$, we have:

$$E(X)=\frac{r(1-p)}{p}, \qquad Var(X)=\frac{r(1-p)}{p^2} \tag{14}$$

Note that $Var(X)=\frac{1}{p}E(X)>E(X)$. Thus when the sample data suggest that the variance is greater than the mean, the negative binomial distribution is a natural alternative to the Poisson distribution. For example, suppose that the sample mean and the sample variance are 3.6 and 7.1. In exploring the possibility of fitting the data using the negative binomial distribution, we would be interested in the negative binomial distribution with this mean and variance. Dividing the mean by the variance gives $p=3.6/7.1$, and then $r=3.6\,p/(1-p)$. Thus plugging these values into (14) produces the negative binomial distribution with $r=3.7029$ and $p=0.5070$.
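The moment matching in the example above can be written as a short Python sketch. The function name `nb_from_moments` is an assumption for illustration. It solves the mean and variance formulas for *r* and $p$ and requires the variance to exceed the mean.

```python
def nb_from_moments(mean: float, variance: float) -> tuple:
    """Solve mean = r(1-p)/p and variance = r(1-p)/p^2 for (r, p)."""
    if mean <= 0 or variance <= mean:
        raise ValueError("need 0 < mean < variance for a negative binomial fit")
    p = mean / variance           # because variance = mean / p
    r = mean * p / (1 - p)        # from mean = r(1-p)/p
    return r, p


r, p = nb_from_moments(3.6, 7.1)
print(round(r, 4), round(p, 4))  # 3.7029 0.507

# Plugging the parameters back in should recover the sample moments.
assert abs(r * (1 - p) / p - 3.6) < 1e-9
assert abs(r * (1 - p) / p**2 - 7.1) < 1e-9
```

This is moment matching only. A maximum likelihood fit would generally give somewhat different parameter values.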

________________________________________________________________________

**The Poisson-Gamma Mixture**

One important application of the negative binomial distribution is that it is a mixture of a family of Poisson distributions with Gamma mixing weights. Thus the negative binomial distribution can be viewed as a generalization of the Poisson distribution. The negative binomial distribution can be viewed as a Poisson distribution where the Poisson parameter is itself a random variable, distributed according to a Gamma distribution. Thus the negative binomial distribution is known as a Poisson-Gamma mixture.

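In an insurance application, the negative binomial distribution can be used as a model for claim frequency when the risks are not homogeneous. Let $N$, the number of claims in a fixed period of time from an insured in a large pool of insureds, have a Poisson distribution with parameter $\theta$. There is uncertainty in the parameter $\theta$, reflecting the risk characteristic of the insured. Some insureds are poor risks (with large $\theta$) and some are good risks (with small $\theta$). Thus the parameter $\theta$ should be regarded as a random variable $\Theta$. The following is the conditional distribution of the random variable $N$ (conditional on $\Theta=\theta$):

$$P(N=n \mid \Theta=\theta)=\frac{e^{-\theta}\,\theta^n}{n!}, \qquad n=0, 1, 2, \ldots \tag{15}$$

Suppose that $\Theta$ has a Gamma distribution with rate parameter $\alpha$ (that is, scale parameter $1/\alpha$) and shape parameter $\beta$. The following is the probability density function of $\Theta$.

$$g(\theta)=\frac{\alpha^\beta}{\Gamma(\beta)}\,\theta^{\beta-1} e^{-\alpha\theta}, \qquad \theta>0 \tag{16}$$

Then the joint density of $N$ and $\Theta$ is:

$$P(N=n \mid \Theta=\theta)\,g(\theta)=\frac{e^{-\theta}\,\theta^n}{n!}\,\frac{\alpha^\beta}{\Gamma(\beta)}\,\theta^{\beta-1} e^{-\alpha\theta} \tag{17}$$

The unconditional distribution of $N$ is obtained by integrating out $\theta$ in (17).

$$\begin{aligned} P(N=n)&=\int_0^\infty P(N=n \mid \Theta=\theta)\,g(\theta)\,d\theta \\ &=\int_0^\infty \frac{e^{-\theta}\,\theta^n}{n!}\,\frac{\alpha^\beta}{\Gamma(\beta)}\,\theta^{\beta-1} e^{-\alpha\theta}\,d\theta \\ &=\int_0^\infty \frac{\alpha^\beta}{n!\,\Gamma(\beta)}\,\theta^{n+\beta-1} e^{-(\alpha+1)\theta}\,d\theta \\ &=\frac{\alpha^\beta}{n!\,\Gamma(\beta)}\,\frac{\Gamma(n+\beta)}{(\alpha+1)^{n+\beta}} \int_0^\infty \frac{(\alpha+1)^{n+\beta}}{\Gamma(n+\beta)}\,\theta^{n+\beta-1} e^{-(\alpha+1)\theta}\,d\theta \\ &=\frac{\alpha^\beta}{n!\,\Gamma(\beta)}\,\frac{\Gamma(n+\beta)}{(\alpha+1)^{n+\beta}} \\ &=\frac{\Gamma(n+\beta)}{\Gamma(n+1)\,\Gamma(\beta)} \left(\frac{\alpha}{\alpha+1}\right)^\beta \left(\frac{1}{\alpha+1}\right)^n \\ &=\binom{n+\beta-1}{n} \left(\frac{\alpha}{\alpha+1}\right)^\beta \left(\frac{1}{\alpha+1}\right)^n, \qquad n=0, 1, 2, \ldots \end{aligned} \tag{18}$$

Note that the integral in the fourth step in (18) is 1.0 since the integrand is the pdf of a Gamma distribution. The above probability function is that of a negative binomial distribution. It is of the same form as (6). Equivalently, it is also of the form (5) with parameters $r=\beta$ and $p=\frac{\alpha}{\alpha+1}$.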

The variance of the negative binomial distribution is greater than the mean. In a Poisson distribution, the mean equals the variance. Thus the unconditional distribution of $N$ is more dispersed than its conditional distributions. This is a characteristic of mixture distributions. The uncertainty in the parameter variable $\Theta$ has the effect of increasing the unconditional variance of the mixture distribution of $N$. The variance of a mixture distribution has two components, the weighted average of the conditional variances and the variance of the conditional means, that is, $Var(N)=E[Var(N \mid \Theta)]+Var(E[N \mid \Theta])$. The second component represents the additional variance introduced by the uncertainty in the parameter $\Theta$ (see The variance of a mixture).
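The mixture result can be checked numerically by integrating the Poisson probability against the Gamma density and comparing with the negative binomial probability. The sketch below is a hedged illustration. It assumes the Gamma density is written with a rate parameter `alpha` and shape parameter `beta`, uses a simple midpoint rule in place of an exact integral, and the function names and parameter values are hypothetical.

```python
from math import exp, lgamma, log


def poisson_pmf(n: int, theta: float) -> float:
    """Poisson probability of n claims given mean theta > 0."""
    return exp(-theta + n * log(theta) - lgamma(n + 1))


def gamma_pdf(theta: float, alpha: float, beta: float) -> float:
    """Gamma density with rate alpha and shape beta, for theta > 0."""
    return exp(beta * log(alpha) + (beta - 1) * log(theta) - alpha * theta - lgamma(beta))


def mixture_pmf(n: int, alpha: float, beta: float,
                upper: float = 40.0, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of Poisson pmf times Gamma pdf."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h  # midpoint of the i-th subinterval, always > 0
        total += poisson_pmf(n, theta) * gamma_pdf(theta, alpha, beta)
    return total * h


def nb_pmf(x: int, r: float, p: float) -> float:
    """Negative binomial probability of the count x (real r > 0, 0 < p < 1)."""
    log_coef = lgamma(x + r) - lgamma(x + 1) - lgamma(r)
    return exp(log_coef + r * log(p) + x * log(1 - p))


alpha, beta = 2.0, 3.0  # illustrative values
for n in range(6):
    mixed = mixture_pmf(n, alpha, beta)
    closed = nb_pmf(n, beta, alpha / (alpha + 1))
    assert abs(mixed - closed) < 1e-6
    print(n, mixed, closed)
```

The upper integration limit of 40 is an assumption that suits these parameter values, where the Gamma density is negligible far below that point. Other parameters may need a larger range.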

________________________________________________________________________

**The Poisson Distribution as Limit of Negative Binomial**

There is another connection to the Poisson distribution: the Poisson distribution is a limiting case of the negative binomial distribution. We show that the generating function of the Poisson distribution can be obtained by taking the limit of the negative binomial generating function as $r \rightarrow \infty$. Interestingly, the Poisson distribution is also the limit of the binomial distribution.

In this section, we use the negative binomial parametrization of (6). By substituting $\frac{\alpha}{\alpha+1}$ for $p$, the following are the mean, variance, and the generating function for the probability function in (6):

$$E(X)=\frac{r}{\alpha}, \qquad Var(X)=\frac{r(\alpha+1)}{\alpha^2}, \qquad g(z)=\left[1-\frac{1}{\alpha}(z-1)\right]^{-r}, \quad z<\alpha+1 \tag{19}$$

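Let *r* go to infinity and $\frac{1}{\alpha}$ go to zero while keeping their product constant. Thus $\mu=\frac{r}{\alpha}$ is constant (this is the mean of the negative binomial distribution). We show the following:

$$\lim_{r \rightarrow \infty} \left[1-\frac{\mu}{r}(z-1)\right]^{-r}=e^{\mu(z-1)} \tag{20}$$

The right-hand side of (20) is the generating function of the Poisson distribution with mean $\mu$. The generating function on the left-hand side is that of a negative binomial distribution with mean $\frac{r}{\alpha}=\mu$. The following is the derivation of (20).

$$\begin{aligned} \lim_{r \rightarrow \infty} \left[1-\frac{\mu}{r}(z-1)\right]^{-r}&=\lim_{r \rightarrow \infty} e^{\ln\left(\left[1-\frac{\mu}{r}(z-1)\right]^{-r}\right)} \\ &=\lim_{r \rightarrow \infty} e^{-r \ln\left(1-\frac{\mu}{r}(z-1)\right)} \\ &=e^{\lim_{r \rightarrow \infty} \left[-r \ln\left(1-\frac{\mu}{r}(z-1)\right)\right]} \end{aligned} \tag{21}$$

We now focus on the limit in the exponent.

$$\begin{aligned} \lim_{r \rightarrow \infty} \left[-r \ln\left(1-\frac{\mu}{r}(z-1)\right)\right]&=\lim_{r \rightarrow \infty} \frac{-\ln\left(1-\frac{\mu}{r}(z-1)\right)}{r^{-1}} \\ &=\lim_{r \rightarrow \infty} \frac{\mu(z-1)}{1-\frac{\mu}{r}(z-1)} \\ &=\mu(z-1) \end{aligned} \tag{22}$$

The middle step in (22) uses L'Hôpital's rule. The result in (20) is obtained by combining (21) and (22).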
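The limit can also be seen numerically at the level of the probabilities. The following Python sketch holds the mean fixed and lets *r* grow, then measures the largest gap between the negative binomial and Poisson probabilities. The function name `max_gap`, the choice of mean, and the grid of *r* values are illustrative assumptions.

```python
from math import exp, lgamma, log


def nb_pmf(x: int, r: float, p: float) -> float:
    """Negative binomial probability of the count x (real r > 0, 0 < p < 1)."""
    log_coef = lgamma(x + r) - lgamma(x + 1) - lgamma(r)
    return exp(log_coef + r * log(p) + x * log(1 - p))


def poisson_pmf(x: int, mu: float) -> float:
    """Poisson probability of the count x with mean mu > 0."""
    return exp(-mu + x * log(mu) - lgamma(x + 1))


def max_gap(r: float, mu: float, x_max: int = 40) -> float:
    """Largest absolute difference between the two pmfs over x = 0..x_max."""
    alpha = r / mu           # keeps the negative binomial mean r / alpha equal to mu
    p = alpha / (alpha + 1)  # success probability in the (r, p) parametrization
    return max(abs(nb_pmf(x, r, p) - poisson_pmf(x, mu)) for x in range(x_max + 1))


mu = 3.6
gaps = [max_gap(r, mu) for r in (5, 50, 500, 5000)]
for r, gap in zip((5, 50, 500, 5000), gaps):
    print(r, gap)

# The gap should shrink as r grows with the mean held fixed.
assert all(later < earlier for earlier, later in zip(gaps, gaps[1:]))
```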

________________________________________________________________________

**Reference**

- Klugman S. A., Panjer H. H., Willmot G. E.,
*Loss Models: From Data to Decisions*, Second Edition, Wiley-Interscience, John Wiley & Sons, Inc., New York, 2004

________________________________________________________________________

Question: in your equation (7), you extract (-1) from the product and it gets the exponent x, as if there were x factors in your product. What allows you to conclude that there are x factors? How could you know that?

I understand you extracted the (-1) by the following process :

Product(c*x_i, i=1..n) = c^n * Product(x_i, i=1..n)

But if we write the product in the Pi notation, we get :

Product((-1)(-i),i=r..(x+r-1)), so the result would be :

(-1)^(x+r-1) * Product((-1),i=r..(x+r-1))

The only way I could imagine having a coefficient of (-1)^x is to extract it from x! = Product(i, i=1..x) = Product((-1)(-i), i=1..x) = (-1)^x Product((-i), i=1..x)

then we would get a coefficient of (-1)^(-x) = (-1)^x

but it would change all the equation

Thank you very much! This is the first time I really understand this distribution.