Finding the maximum likelihood estimate of N for a hypergeometric distribution


A census in the United States is an attempt to count everyone in the country. It is inevitable that many people are not counted. The U.S. Census Bureau proposed a way to estimate the number of people who were not counted by the latest census. Their proposal was as follows: In a given locality, let $N$ denote the actual number of people who live there. Assume that the census counted $n_1$ people living in this area. Now, another census was taken in the locality, and $n_2$ people were counted. In addition, $n_{12}$ people were counted both times.

(b) Now assume that $X = n_{12}$. Find the value of $N$ which maximises the expression in part (a). Hint: Consider the ratio of the expressions for successive values of $N$.

Given (from part a) $h(N,n_1,n_2,n_{12}) = \frac{\binom{n_1}{n_{12}}\binom{N-n_1}{n_2-n_{12}}}{\binom{N}{n_2}}$

I found the ratio $\frac{h(N+1,n_1,n_2,n_{12})}{h(N,n_1,n_2,n_{12})}$ and believe that the maximising $N$ is the smallest value for which that ratio is no longer greater than 1:

$\frac{\binom{N+1-n_1}{n_2-n_{12}}\binom{N}{n_2}}{\binom{N+1}{n_2}\binom{N-n_1}{n_2-n_{12}}} \leq 1$

$\frac{(N+1-n_1)(N+1-n_2)}{(N+1)(N+1-n_1-n_2+n_{12})} \leq 1$

I found from simplifying this expression that $N \geq \frac{n_1n_2-n_{12}}{n_{12}}$, so setting that inequality to an equality should give the result.
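
Spelled out, the simplification runs as follows. This is a sketch that assumes $n_{12} > 0$ and that the denominator $(N+1)(N+1-n_1-n_2+n_{12})$ is positive, so cross-multiplying preserves the direction of the inequality:

$$(N+1-n_1)(N+1-n_2) \leq (N+1)(N+1-n_1-n_2+n_{12})$$

$$(N+1)^2 - (n_1+n_2)(N+1) + n_1n_2 \leq (N+1)^2 - (n_1+n_2)(N+1) + n_{12}(N+1)$$

$$n_1n_2 \leq n_{12}(N+1) \quad\Longleftrightarrow\quad N \geq \frac{n_1n_2}{n_{12}} - 1 = \frac{n_1n_2-n_{12}}{n_{12}}$$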

However, the solution here has $N = \frac{n_1n_2}{n_{12}}$, and I'm not sure where I went wrong.

Best answer

Both answers are in a sense correct, taking account of rounding and the possibility that there may be two maximum likelihood estimates.

Taking your calculations, you have said that $\hat{N} \geq \frac{n_1n_2-n_{12}}{n_{12}}$, and the right-hand side equals $\frac{n_1n_2}{n_{12}}-1$:

  • so if the right-hand side is not an integer then you need to round up your answer, which is the same as rounding down the book's $\frac{n_1n_2}{n_{12}}$
  • while if the right-hand side is an integer then your likelihood ratio is equal to $1$, making both your $\hat{N}$ and $\hat{N}+1$ maximum likelihood estimates; that is, your answer and the book's answer are different but both give the same maximum likelihood
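
The two cases can be sketched in Python. The function name `mle_candidates` is illustrative and not from the question or the book. The sketch assumes $n_{12} > 0$, and it additionally discards any candidate below $n_1 + n_2 - n_{12}$ (the number of distinct people actually counted), since the likelihood is not defined for smaller $N$; that guard is an addition here and only matters when $n_{12}$ equals $n_1$ or $n_2$.

```python
from fractions import Fraction


def mle_candidates(n1, n2, n12):
    """Return the maximising value(s) of N implied by the ratio argument.

    Assumes n12 > 0. Uses exact rational arithmetic so the integer
    (tie) case is detected without floating-point error.
    """
    bound = Fraction(n1 * n2, n12)   # the book's value n1*n2/n12
    n_min = n1 + n2 - n12            # distinct people seen; N cannot be smaller

    if bound.denominator == 1:
        # n1*n2/n12 is an integer: the ratio h(N+1)/h(N) equals 1 at
        # N = bound - 1, so bound - 1 and bound tie for the maximum.
        candidates = [int(bound) - 1, int(bound)]
    else:
        # Otherwise round the book's value down
        # (equivalently, round the asker's bound up).
        candidates = [bound.numerator // bound.denominator]

    # Guard (an added assumption): drop values outside the valid range of N.
    return [N for N in candidates if N >= n_min]
```

For instance, `mle_candidates(3, 3, 2)` gives `[4]` and `mle_candidates(4, 3, 2)` gives `[5, 6]`.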

With $n_1=3, n_2=3, n_{12}=2$ the maximum likelihood happens with $N=4$, which is your answer ($3.5$) rounded up and the book's answer ($4.5$) rounded down.

With $n_1=4, n_2=3, n_{12}=2$ the maximum likelihood happens twice: with $N=5$, which is your answer, and with $N=6$, which is the book's answer.
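
These two examples can be checked by brute force. The following is a minimal sketch that evaluates $h$ exactly over a range of $N$ and returns every maximiser; the names `likelihood` and `argmax_N` and the search cap `N_max=200` are illustrative choices, not part of the original problem. The cap is adequate here only because the likelihood decreases once $N$ exceeds $\frac{n_1n_2}{n_{12}}$.

```python
from fractions import Fraction
from math import comb


def likelihood(N, n1, n2, n12):
    """Exact value of h(N, n1, n2, n12) from part (a)."""
    if N < n1 + n2 - n12:
        # Fewer people than were actually observed: impossible.
        return Fraction(0)
    return Fraction(comb(n1, n12) * comb(N - n1, n2 - n12), comb(N, n2))


def argmax_N(n1, n2, n12, N_max=200):
    """Return all N in [n1 + n2 - n12, N_max] that maximise the likelihood."""
    N_min = n1 + n2 - n12
    values = {N: likelihood(N, n1, n2, n12) for N in range(N_min, N_max + 1)}
    best = max(values.values())
    # Exact fractions make ties compare equal, so both maximisers are kept.
    return [N for N, v in values.items() if v == best]


print(argmax_N(3, 3, 2))  # [4]
print(argmax_N(4, 3, 2))  # [5, 6]
```

Using `Fraction` rather than floats matters in the second case, where $h(5)$ and $h(6)$ are both exactly $\frac{3}{5}$.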