The people in a country are partitioned into clans. To estimate the average size of a clan, a survey is conducted in which $1000$ randomly selected people are asked to state the size of the clan to which they belong. How does one compute an estimate of the average clan size from the data collected?
Source: puzzledquant.com
My approach: I am thinking of using $E[X]=E[E[X\mid N]]$, where $X$ is the clan size and $N$ is the clan I am currently in. But I am unsure how to proceed from here. Help.
Answer
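$$n/\sum_{j=1}^n \frac{1}{k_j}$$ where $n$ is the sample size (here $n=1000$) and $k_j$ is the clan size reported by the $j$-th respondent.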
How to figure this out
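The true average clan size is
$$\frac{\sum_{i=1}^M k_i}{M}=\frac{N}{M}$$ where $M$ is the number of clans, $k_i$ is the size of clan $i$, and $N=\sum_{i=1}^M k_i$ is the total population.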
If we just sum the survey answers for the whole population we will get
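$$\sum_{j=1}^N k_j=\sum_{i=1}^M k_i\cdot k_i = \sum_{i=1}^M k_i^2$$ On the left, $k_j$ is the answer of person $j$; on the right, $k_i$ is the size of clan $i$. Every clan of size $k_i$ is reported precisely $k_i$ times when the whole population answers, so the plain average of the answers, $\sum_{i=1}^M k_i^2/N$, overweights large clans.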
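A quick check of this identity on a made-up population may help. The clan sizes below are hypothetical and chosen only for illustration:

```python
# Hypothetical clan sizes k_i for M = 3 clans (illustrative values only).
clan_sizes = [1, 2, 3]

# Census answers: every member of a clan of size k reports k.
answers = [k for k in clan_sizes for _ in range(k)]

total_of_answers = sum(answers)                  # sum over all N people
sum_of_squares = sum(k * k for k in clan_sizes)  # sum over all M clans

true_mean = sum(clan_sizes) / len(clan_sizes)    # N / M
naive_mean = total_of_answers / len(answers)     # plain average of the answers
```

Here both sums agree, and the plain average of the answers comes out larger than the true average clan size.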
Instead of summing the answers themselves (the identity function, $k \mapsto k$), we check whether summing $f(k) = 1/k$ gives something more useful for recovering $\frac{N}{M}$:
$$\sum_{j=1}^N f(k_j)=\sum_{j=1}^N \frac{1}{k_j}=\sum_{i=1}^M k_i\cdot\frac{1}{k_i}=\sum_{i=1}^M 1 =M$$
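This is how we can estimate the denominator of the desired fraction: summed over the whole population, the reciprocals give $M$, so the sample mean $\frac{1}{n}\sum_{j=1}^n \frac{1}{k_j}$ estimates $\frac{M}{N}$. The numerator is estimated using the sample size as usual, and taking the reciprocal gives the estimate $n/\sum_{j=1}^n \frac{1}{k_j}$ of $\frac{N}{M}$.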
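The estimator can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `estimate_mean_clan_size` and `simulate_survey` are made up here, the population is invented, and respondents are drawn with replacement for simplicity.

```python
import random


def estimate_mean_clan_size(answers):
    """Return n / sum(1 / k_j), the harmonic mean of the reported sizes."""
    return len(answers) / sum(1.0 / k for k in answers)


def simulate_survey(clan_sizes, n, seed=0):
    """Draw n people uniformly at random (with replacement, an assumption)
    and return the clan size each one reports."""
    rng = random.Random(seed)
    # One entry per person, holding the size of that person's clan.
    population = [k for k in clan_sizes for _ in range(k)]
    return [rng.choice(population) for _ in range(n)]


# Hypothetical population: 50 clans of size 1, 30 of size 5, 10 of size 20.
clan_sizes = [1] * 50 + [5] * 30 + [20] * 10
true_mean = sum(clan_sizes) / len(clan_sizes)       # N / M

answers = simulate_survey(clan_sizes, n=1000)
naive_mean = sum(answers) / len(answers)            # size-biased average
harmonic_estimate = estimate_mean_clan_size(answers)

print(f"true {true_mean:.3f}  naive {naive_mean:.3f}  "
      f"harmonic {harmonic_estimate:.3f}")
```

In runs like this, the plain average of the answers should land well above the true mean, while the harmonic estimate should land close to it.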