Sample mean as an unbiased estimator


The question: Given a random sample $X_1,...,X_n$ show that $\frac{1}{n}\sum_{i=1}^n X_i$ is an unbiased estimator for $E(X_1)$.

My confusion: Given a statistical model $(\Omega,\Sigma,p_{\theta})$, where $\{p_{\theta}\}$ is a parameterized collection of probability measures, we define $E_{\theta}(X)=\int_{\Omega}X(\omega)p_{\theta}(d\omega)$, where $X:\Omega\to\mathbb{R}$. If $\frac{1}{n}\sum_{i=1}^nX_i$ is unbiased, we need $E_{\theta}(\frac{1}{n}\sum_{i=1}^nX_i)=\theta$ when $\theta=E(X_1)$. But what is $E(X_1)$? With respect to what measure are we integrating, if we're considering all these different measures on the space?

For any $\theta$, $E_{\theta}(\frac{1}{n}\sum_{i=1}^nX_i)=E_{\theta}(X_1)$, because the $X_i$ are identically distributed. My guess is that you would say something like, "for a fixed probability measure $p_{\varphi}$ on $(\Omega,\Sigma)$, $\frac{1}{n}\sum_{i=1}^nX_i$ is an unbiased estimator for $E_{\varphi}(X_1)$." Is this right?

Best answer:

Let $(\mathcal X,\mathcal F, \mathcal P)$ be a statistical model, i.e., $(\mathcal X,\mathcal F, P)$ is a probability space for every $P\in\mathcal P$. A statistic $T:\mathcal X\rightarrow \mathbb R$ is called $\mathcal P$-unbiased for $\vartheta$ iff $$\int_{\mathcal X}T(x)\,\mathrm dP(x) = \vartheta(P)$$ for each $P\in\mathcal P$; see Definition 1.1 in Chapter 2 of Theory of Point Estimation by Lehmann and Casella. Note that $\vartheta$ is a statistical functional, i.e. a function mapping $\mathcal P$ into $\mathbb R$.

For the parametric version, it is common to replace $\mathcal P$ by a parametric family indexed by $\Theta$ and to write $\theta$ instead of $P_\theta$ (on the right-hand side) for notational convenience.


Example: Consider the parametric statistical model $\left(\mathbb R^n, \mathcal B(\mathbb R^n), \{P_\theta : \theta\in\mathbb R\}\right)$, where $P_\theta:=\bigotimes_{i=1}^n\mathcal N(\theta, 1)$ denotes the product measure associated with $n$ iid $\mathcal N(\theta,1)$-distributed random variables. Our goal is to investigate $\{P_\theta : \theta\in\mathbb R\}$-unbiasedness of the statistic $T:\mathbb R^n\rightarrow\mathbb R$ given by $T(\boldsymbol x) = n^{-1}\sum_{i=1}^nx_i$ for the statistical functional $\vartheta$ given by $$\vartheta(Q) = \int_{\mathbb R^n} \pi_1(\boldsymbol x)\,\mathrm dQ(\boldsymbol x),$$ where $\pi_1$ is the projection onto the first coordinate of $\mathbb R^n$. Note that this is exactly the statistical functional you are interested in.

Then, applying properties of pushforward measures + properties of the product measure + the definition of expected value, it holds that $$\vartheta(P_\theta) = \int_{\mathbb R^n} \pi_1(\boldsymbol x)\,\mathrm dP_\theta(\boldsymbol x) = \int_{\mathbb R} x_1\,\mathrm d(\pi_1\#P_\theta)(x_1) = \int_{\mathbb R} x_1\,\mathrm d\mathcal N(\theta, 1)(x_1) = \theta$$ for each $\theta\in\mathbb R$.

On the other hand, using linearity of $\int$ + properties of pushforward measures + properties of the product measure (i.e., the iid assumption) + the definition of expected value, it follows that $$\int_{\mathbb R^n}T(\boldsymbol x)\,\mathrm dP_\theta(\boldsymbol x) = n^{-1}\sum_{i=1}^n\int_{\mathbb R^n}\pi_i(\boldsymbol x)\,\mathrm dP_\theta(\boldsymbol x) = \int_{\mathbb R}x\,\mathrm d\mathcal N(\theta, 1)(x) = \theta$$ for each $\theta\in\mathbb R$. That is, $$\int_{\mathbb R^n} T(\boldsymbol x)\,\mathrm dP_\theta(\boldsymbol x) = \vartheta(P_\theta)$$ for each $\theta\in\mathbb R$, i.e., $T$ is $\{P_\theta:\theta\in\mathbb R\}$-unbiased for $\vartheta$.