The capture-recapture method is a standard way of estimating the size of a wildlife population, and it is based on the hypergeometric distribution. Recall that the hypergeometric distribution is a three-parameter family of discrete distributions; one of the parameters, denoted by $N$ in this post, is the size of the population. We show that the estimate of $N$ obtained from the capture-recapture method is the value of the parameter that makes the observed data “more likely” than any other possible value of $N$. Thus, the capture-recapture method produces the maximum likelihood estimate of the population size parameter of the hypergeometric distribution.
Let’s start with an example. In order to estimate the size of the population of bluegills (a species of freshwater fish) in a small lake in Missouri, a total of $a$ bluegills are captured, tagged and then released. After allowing sufficient time for the tagged fish to disperse, a sample of $n$ bluegills is caught. It is found that $x$ bluegills in the sample are tagged. Estimate the size of the bluegill population in this lake.
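Let $N$ be the size of the bluegill population in this lake. With $a$ tagged fish in the lake, the population proportion of tagged bluegills is $\frac{a}{N}$. With $x$ tagged fish in a sample of size $n$, the sample proportion of tagged bluegills is $\frac{x}{n}$. In the capture-recapture method, the population proportion and the sample proportion are set equal:

$$\frac{a}{N} = \frac{x}{n}$$

Solving for $N$ gives the estimate $N = \frac{a n}{x}$.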
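The calculation above can be sketched in a few lines of Python. The function name and the numbers (200 tagged, a sample of 100, 15 tagged in the sample) are hypothetical values chosen only for illustration; they are not the figures from the bluegill example.

```python
def capture_recapture_estimate(tagged: int, sample_size: int, tagged_in_sample: int) -> float:
    """Solve tagged / N = tagged_in_sample / sample_size for N."""
    if tagged_in_sample <= 0:
        # With no tagged fish in the sample the proportion equation has no solution.
        raise ValueError("the sample must contain at least one tagged individual")
    return tagged * sample_size / tagged_in_sample


# Hypothetical inputs, for illustration only.
estimate = capture_recapture_estimate(tagged=200, sample_size=100, tagged_in_sample=15)
print(round(estimate, 2))
```

The result is generally not a whole number, so in practice it is rounded to an integer population size.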
Now, the connection to the hypergeometric distribution. After $a$ bluegills are captured, tagged and released, the population of size $N$ is separated into two distinct classes, tagged and non-tagged. When a sample of $n$ bluegills is selected without replacement, we let $X$ be the number of bluegills in the sample that are tagged. The distribution of $X$ is the hypergeometric distribution. The following is the probability function of $X$.

$$P(X = x) = \frac{\binom{a}{x} \binom{N-a}{n-x}}{\binom{N}{n}}, \qquad x = \max(0, \, n-(N-a)), \ldots, \min(a, n)$$
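As a minimal sketch, the hypergeometric probability function can be evaluated directly with binomial coefficients. The function name `hypergeom_pmf` and the argument names (`N` for the population size, `a` for the number tagged, `n` for the sample size, `x` for the number of tagged fish in the sample) are assumptions made for this illustration.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, a: int, n: int) -> float:
    """P(X = x) when n items are drawn without replacement from N items, a of them tagged."""
    # Outside the support the probability is zero.
    if x < max(0, n - (N - a)) or x > min(a, n):
        return 0.0
    # Integer arithmetic keeps the binomial coefficients exact; only the final division is a float.
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)


# Small sanity check: the probabilities over all x sum to 1.
total = sum(hypergeom_pmf(x, N=20, a=7, n=5) for x in range(0, 6))
print(total)
```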
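In the hypergeometric distribution described here, the parameters $a$ and $n$ are known (the number of tagged fish and the sample size, respectively). We now show that the capture-recapture estimate of $N$ is the estimate that makes the observed value $x$ of $X$ “most likely” (i.e. the estimate of $N$ is a maximum likelihood estimate of $N$). To show this, write $P(x; N)$ for the probability $P(X = x)$ viewed as a function of $N$, and consider the ratio of the hypergeometric probabilities for two successive values of $N$.

$$\frac{P(x; N)}{P(x; N-1)} = \frac{\binom{a}{x} \binom{N-a}{n-x} \Big/ \binom{N}{n}}{\binom{a}{x} \binom{N-1-a}{n-x} \Big/ \binom{N-1}{n}} = \frac{(N-a)(N-n)}{N (N-a-n+x)}$$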
Note that this ratio is at least $1$, or equivalently $P(x; N) \ge P(x; N-1)$, if and only if the following holds:

$$N \le \frac{a n}{x}$$
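The algebra behind this equivalence is short enough to write out. Here $N$ is the population size, $a$ the number of tagged fish, $n$ the sample size, $x$ the observed number of tagged fish in the sample, and $P(x; N)$ the hypergeometric probability of $x$ as a function of $N$. The steps assume $N - a - n + x > 0$, so that the observed data are possible for a population of size $N - 1$ and the ratio, which simplifies to $\frac{(N-a)(N-n)}{N(N-a-n+x)}$, is well defined.

```latex
\begin{align*}
\frac{P(x; N)}{P(x; N-1)} \ge 1
  &\iff (N-a)(N-n) \ge N(N-a-n+x) \\
  &\iff N^2 - aN - nN + an \ge N^2 - aN - nN + xN \\
  &\iff an \ge xN \\
  &\iff N \le \frac{an}{x}
\end{align*}
```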
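Note that $\frac{a n}{x}$ is the estimate from the capture-recapture method. It is also an upper bound for the population sizes $N$ such that the probability $P(x; N)$ is greater than or equal to $P(x; N-1)$. In other words, the likelihood increases as $N$ grows toward $\frac{a n}{x}$ and decreases beyond it. This implies that the maximum likelihood estimate of $N$ is achieved when the estimate is $\frac{a n}{x}$ (more precisely, its integer part, since $N$ must be a whole number).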
As an illustration, we compute the probabilities $P(x; N)$ for several values of $N$ above and below $\frac{a n}{x}$. The following matrix illustrates that the maximum likelihood is achieved at the integer part of $\frac{a n}{x}$.
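A comparison of this kind can be reproduced with a short script. The sketch below uses hypothetical inputs (200 tagged, a sample of 100, 15 tagged in the sample), not the values from the bluegill example, and the helper name `likelihood` is an assumption. Exact fractions are used so that the comparison between neighbouring values of $N$ is not affected by floating-point rounding.

```python
from fractions import Fraction
from math import comb


def likelihood(N: int, a: int, n: int, x: int) -> Fraction:
    """Exact hypergeometric probability of observing x tagged, as a function of N."""
    return Fraction(comb(a, x) * comb(N - a, n - x), comb(N, n))


# Hypothetical inputs, for illustration only.
a, n, x = 200, 100, 15
center = (a * n) // x  # integer part of the capture-recapture estimate

# Evaluate the likelihood for a few population sizes on either side of the estimate.
candidates = range(center - 3, center + 4)
for N in candidates:
    print(N, float(likelihood(N, a, n, x)))

# The candidate with the largest likelihood.
best = max(candidates, key=lambda N: likelihood(N, a, n, x))
print("most likely N among candidates:", best)
```

The probabilities for neighbouring values of $N$ differ only slightly, which is why the printed floats look nearly identical even though the exact comparison singles out one value.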