Show weighted estimator is unbiased when sampling from a finite population

Given a finite population of $N$ units $\{(U_1, X_1),\ldots,(U_N, X_N)\}$, with $X_i$ being the observation associated with the $i$-th unit $U_i$, consider estimating $$\DeclareMathOperator{\Var}{Var} \mu=\frac 1N \sum_{i=1}^N X_i, $$ the population mean.

We can find the variance of the sample mean $\bar{X}$ based on a simple random sample of size $n$ from the population, given by $$ \Var\bar{X}=\frac{N-n}{N-1} \frac{\sigma^2}{n}, $$ where $\sigma^2=\frac 1N \sum_{i=1}^N (X_i -\mu)^2$ is the population variance.
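As a numerical sanity check (not part of the proof), this variance formula can be verified exactly by enumerating every possible sample of a small population; the population values below are arbitrary.

```python
from itertools import combinations
from statistics import fmean

# Hypothetical small population; any values work
X = [1.0, 2.0, 3.0, 4.0]
N, n = len(X), 2

mu = fmean(X)
sigma2 = fmean([(x - mu) ** 2 for x in X])  # population variance (divide by N)

# Enumerate all C(N, n) equally likely simple random samples
sample_means = [fmean(s) for s in combinations(X, n)]
var_xbar = fmean([(m - mu) ** 2 for m in sample_means])

# Finite-population formula: Var(Xbar) = (N - n)/(N - 1) * sigma2 / n
formula = (N - n) / (N - 1) * sigma2 / n
print(var_xbar, formula)  # both 5/12 ≈ 0.41667
```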

I know how to prove this. But I don't know how to solve the following problems:

For a sampling design where $\pi_i$ is the probability of the $i$-th unit being included in a sample, show the following:

  1. $\delta_S=\sum_{i \in S} X_i/\pi_i$ is unbiased for $N\mu$,

  2. The variance of $\delta_S$ has the following expression $$ \Var(\delta_S)=\sum_{i=1}^N X_i^2 \left(\frac{1}{\pi_i}-1\right) + \sum_{i=1}^N \sum_{j(\neq i)=1}^N X_i X_j \left(\frac{\pi_{ij}}{\pi_i \pi_j}-1\right), $$ where $\pi_{ij}$ is the probability of both $X_i$ and $X_j$ being included in a sample.

My try: I started to calculate the expectation of $\delta_S$ and I got the following result $$ \begin{split} E(\delta_S) & = E\left(\sum_{i \in S} X_i/\pi_i\right) =\sum_{i \in S} E(X_i/\pi_i) \\ & =\sum_{i \in S} \frac{1}{\pi_i} E(X_i) =\sum_{i \in S} \frac{1}{\pi_i} \mu =\mu \sum_{i \in S} \frac{1}{\pi_i}. \end{split} $$ Unfortunately, this is not $N\mu$.

Best answer:

Let $Z_i$ be $1$ if unit $i$ is included in the sample, and $0$ otherwise. Note that $P(Z_i=1)=E[Z_i]=\pi_i$.

A tricky point here is that the $X_i$ are considered non-random, since you are given a fixed population of size $N$ from which you are sampling; the randomness lies entirely in which units get selected. [For instance, note that the expression for the variance of $\delta_S$ is given in terms of the $X_i$.] This is where your attempt went wrong: you treated the $X_i$ as random and the index set $S$ as fixed, when it is the other way around.

The statistic $\delta_S$ can be rewritten as $\sum_{i=1}^N Z_i X_i / \pi_i$. Then $$E[\delta_S] = \sum_{i=1}^N E[Z_i X_i / \pi_i] = \sum_{i=1}^N E[Z_i] X_i/\pi_i = \sum_{i=1}^N X_i = N\mu.$$

Note that the last step comes directly from the definition of $\mu := \frac{1}{N} \sum_{i=1}^N X_i$.
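This argument can be checked numerically. The sketch below assumes simple random sampling without replacement (so $\pi_i = n/N$ for every unit, an assumption not required by the proof itself), enumerates all samples of a small hypothetical population, and averages $\delta_S$ over them:

```python
from itertools import combinations
from statistics import fmean

X = [1.0, 2.0, 3.0, 4.0]  # hypothetical population values
N, n = len(X), 2
pi = n / N                 # inclusion probability under SRSWOR

# delta_S = sum_{i in S} X_i / pi_i, over all C(N, n) equally likely samples
deltas = [sum(x / pi for x in s) for s in combinations(X, n)]

print(fmean(deltas), sum(X))  # E[delta_S] = N*mu = sum(X); both 10.0
```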


For random variables $U$ and $V$ we have $\text{Var}(U+V) = \text{Var}(U) + \text{Var}(V) + \text{Cov}(U, V) + \text{Cov}(V, U)$. Generalizing this to a sum of $N$ random variables, we have \begin{align} \text{Var}(\delta_S) &= \text{Var}\left(\sum_{i=1}^N Z_i X_i / \pi_i\right) \\ &= \sum_{i=1}^N \text{Var}(Z_i X_i / \pi_i) + \sum_{i=1}^N \sum_{j \ne i} \text{Cov}(Z_i X_i / \pi_i, Z_j X_j / \pi_j). \end{align}

We have $$\text{Var}(Z_i) = E[Z_i^2] - E[Z_i]^2 = \pi_i - \pi_i^2$$ (using $Z_i^2 = Z_i$, since $Z_i \in \{0, 1\}$) and $$\text{Cov}(Z_i, Z_j) = E[Z_i Z_j] - E[Z_i]E[Z_j] = \pi_{ij} - \pi_i \pi_j.$$ Thus, continuing from above, \begin{align} \text{Var}(\delta_S) &= \sum_{i=1}^N \frac{X_i^2}{\pi_i^2} (\pi_i - \pi_i^2) + \sum_{i=1}^N \sum_{j \ne i} \frac{X_i X_j}{\pi_i \pi_j}(\pi_{ij} - \pi_i \pi_j) \\ &= \sum_{i=1}^N X_i^2 \left(\frac{1}{\pi_i} - 1\right) + \sum_{i=1}^N \sum_{j \ne i} X_i X_j \left(\frac{\pi_{ij}}{\pi_i \pi_j} - 1\right), \end{align} where we have used the scaling rules $\text{Var}(cU) = c^2 \text{Var}(U)$ and $\text{Cov}(aU, bV) = ab\,\text{Cov}(U, V)$ for constants $a, b, c$, and in the last line divided each factor through by $\pi_i$ (resp. $\pi_i \pi_j$). This is exactly the stated expression for $\text{Var}(\delta_S)$.
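As with the mean, the closed-form variance can be checked by exhaustive enumeration. The sketch below assumes simple random sampling without replacement, for which $\pi_i = n/N$ and $\pi_{ij} = n(n-1)/(N(N-1))$ for $i \ne j$; these particular inclusion probabilities are assumptions of the check, not part of the derivation above, which holds for any design.

```python
from itertools import combinations
from statistics import fmean

X = [1.0, 2.0, 3.0, 4.0]             # hypothetical population
N, n = len(X), 2
pi = n / N                           # first-order inclusion probability
pi_ij = n * (n - 1) / (N * (N - 1))  # second-order inclusion prob., SRSWOR

# Exact variance of delta_S over all equally likely samples
deltas = [sum(x / pi for x in s) for s in combinations(X, n)]
mean_d = fmean(deltas)
exact_var = fmean([(d - mean_d) ** 2 for d in deltas])

# Closed-form expression from the derivation
term1 = sum(x ** 2 * (1 / pi - 1) for x in X)
term2 = sum(X[i] * X[j] * (pi_ij / (pi * pi) - 1)
            for i in range(N) for j in range(N) if j != i)
print(exact_var, term1 + term2)  # both 20/3 ≈ 6.6667
```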