For which parameter is $T$ unbiased?


Let $(x_1,\dots,x_n)$ be a realisation of a random vector $(X_1,\dots ,X_n)$ with $\mathbb E[X_i]= \mu$ and $\mathbb V[X_i]= \sigma ^2$ for every $i$, and $\operatorname{Cov}(X_i,X_j)= p$ for every $i \neq j$.

Let $$T(x_1,\dots,x_n):=h\sum_{i=1}^n x_i^2+k\left(\sum_{i=1}^nx_i\right)^2$$ be an estimator for $\sigma^2$. Give values for $h \in \mathbb R$ and $k \in \mathbb R$ such that $T$ is unbiased.

I am not sure how to get started on this problem. I do not really understand how to apply the definition from https://en.wikipedia.org/wiki/Bias_of_an_estimator here. Help is much appreciated.

Best answer:

$\newcommand{\E}{\operatorname{E}}\newcommand{\v}{\operatorname{var}}\newcommand{\c}{\operatorname{cov}}$First, note that $h\sum_{i=1}^n x_i^2+k\left(\sum_{i=1}^nx_i\right)^2$ is merely a number, so it cannot be biased or unbiased, but $h\sum_{i=1}^n X_i^2 + k \left( \sum_{i=1}^n X_i\right)^2$ is a random variable, and it is an observable random variable, otherwise called a statistic, so it can be biased or unbiased for a particular quantity of interest.

\begin{align} & \E\left( h\sum_{i=1}^n X_i^2 + k\left( \sum_{i=1}^n X_i \right)^2 \right) \tag 1 \\[10pt] = {} & h\sum_{i=1}^n \E(X_i^2) + k \E\left(\left( \sum_{i=1}^n X_i \right)^2 \right) \text{ by linearity of expectation} \\[10pt] = {} & h \sum_{i=1}^n \left( \v(X_i) + \left( \E X_i \right)^2 \right) + k \left( \v\left( \sum_{i=1}^n X_i \right) + \left( \E\left( \sum_{i=1}^n X_i \right) \right)^2 \right) \\[10pt] = {} & hn(\sigma^2+\mu^2) + k\left( \v\left( \sum_{i=1}^n X_i \right) + (n\mu)^2 \right) \\[10pt] = {} & hn\sigma^2 + n(h + kn)\mu^2 + k\v\left( \sum_{i=1}^n X_i \right). \end{align}

Then we have \begin{align} & \v\left( \sum_{i=1}^n X_i \right) \\[10pt] = {} & \left( \sum_{i=1}^n \v(X_i) \right) + \sum_{i=1}^n \sum_{\substack{1\le j\le n \\ j\ne i}} \c(X_i,X_j) \\[10pt] = {} & n\sigma^2 + n(n-1)p. \end{align}
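As a quick numerical sanity check (a sketch, not part of the derivation; the parameter values are arbitrary choices of mine), one can simulate equicorrelated Gaussian vectors and compare the empirical variance of $\sum_{i=1}^n X_i$ with $n\sigma^2 + n(n-1)p$:

```python
import numpy as np

# arbitrary illustrative values; p must keep the covariance matrix
# positive semidefinite, i.e. -sigma2/(n-1) <= p <= sigma2
n, sigma2, p = 5, 2.0, 0.5

# equicorrelated covariance: sigma2 on the diagonal, p off the diagonal
cov = np.full((n, n), p) + (sigma2 - p) * np.eye(n)

rng = np.random.default_rng(42)
X = rng.multivariate_normal(np.zeros(n), cov, size=200_000)

var_sum = X.sum(axis=1).var()           # empirical variance of the sum
theory = n * sigma2 + n * (n - 1) * p   # 5*2 + 5*4*0.5 = 20
```

With 200,000 draws the empirical value lands within Monte Carlo error of the theoretical 20.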

So the expected value on line $(1)$ is $$ hn\sigma^2 + n(h + kn)\mu^2 + kn\sigma^2 + kn(n-1)p. \tag 2 $$ The problem now is to choose $h$ and $k$ so as to make line $(2)$ remain equal to $\sigma^2$ regardless of the values of $\sigma^2,$ $\mu,$ and $p.$

For this to work, line $(2)$ must be equal to $0$ when $\sigma^2=0$ and to $1$ when $\sigma^2=1.$ Thus we have \begin{align} 0 & = n(h + kn)\mu^2 + kn(n-1)p, \\ 1 & = hn + n(h + kn)\mu^2 + kn + kn(n-1)p. \end{align} Both of these are linear in $h$ and linear in $k,$ so they can be readily solved for $h$ and $k.$
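For concreteness, here is a symbolic sketch of that last step using sympy (the symbol names are mine). Note that the resulting $h$ and $k$ still involve $\mu$ and $p$:

```python
import sympy as sp

h, k = sp.symbols('h k')
n, mu, p = sp.symbols('n mu p', positive=True)

# the two conditions: line (2) equals 0 when sigma^2 = 0,
# and equals 1 when sigma^2 = 1
eq1 = sp.Eq(n*(h + k*n)*mu**2 + k*n*(n - 1)*p, 0)
eq2 = sp.Eq(h*n + n*(h + k*n)*mu**2 + k*n + k*n*(n - 1)*p, 1)

sol = sp.solve([eq1, eq2], [h, k], dict=True)[0]
# subtracting the equations by hand gives h + k = 1/n, and then
# k = -mu^2 / (n*(n-1)*(mu^2 + p)),  h = 1/n - k
```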

Second answer:

You need to take the expectation of the expression and compare it to $\sigma^2$. Writing $\rho = p/\sigma^2$ for the common correlation, we have $E(X_i^2) = \sigma^2+\mu^2$ and $E(X_i X_j) = \sigma^2\rho + \mu^2$ for $i\ne j.$ We also have $$ \left(\sum_i X_i\right)^2 = \sum_i X_i^2 + 2\sum_{i<j} X_i X_j.$$

Now it should be relatively straightforward to calculate $E(T)$ by linearity. (Note that the expectation values of the individual terms don't depend on the index so you can pull them out of the sums and then the sum just amounts to a factor that is the number of terms in the sum.) So do that and set $$E(T) = \sigma^2$$ and see if you can find $h$ and $k$ so that the equality holds.
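A sketch of that calculation in sympy (symbol names are my own; $\rho$ denotes the correlation, so $\operatorname{Cov}(X_i,X_j)=\sigma^2\rho$):

```python
import sympy as sp

h, k, rho = sp.symbols('h k rho')
n, mu, sigma2 = sp.symbols('n mu sigma2', positive=True)

E_Xi2 = sigma2 + mu**2         # E(X_i^2)
E_XiXj = sigma2*rho + mu**2    # E(X_i X_j) for i != j

# E[(sum X_i)^2] = sum E(X_i^2) + 2 * sum_{i<j} E(X_i X_j);
# there are n diagonal terms and n*(n-1)/2 pairs i < j
E_sum_sq = n*E_Xi2 + 2 * (n*(n - 1)/2) * E_XiXj

E_T = sp.expand(h*n*E_Xi2 + k*E_sum_sq)
# group by sigma2 and mu^2 to read off the unbiasedness conditions
print(sp.collect(E_T, [sigma2, mu**2]))
```

The coefficient of $\mu^2$ comes out as $n(h+kn)$ and the coefficient of $\sigma^2$ as $hn + kn + kn(n-1)\rho$, matching the expression derived in the best answer (with $p = \sigma^2\rho$).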

Unfortunately this problem seems to leave something ambiguous: you have one equation to solve and two degrees of freedom with which to solve it, so many pairs $(h, k)$ will work. What is probably intended is that you choose $h$ and $k$ with no explicit dependence on $\mu,$ so that $T$ reduces to the traditional unbiased estimator for $\sigma^2$ when you plug in $\rho=0$.
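For instance (a numerical sketch, assuming the classical choice $h = \tfrac{1}{n-1},\ k = -\tfrac{1}{n(n-1)}$): with these values $T$ coincides exactly with the usual unbiased sample variance, and a simulation with i.i.d. data ($\rho = 0$) shows $E(T) \approx \sigma^2$:

```python
import numpy as np

n = 8
h = 1.0 / (n - 1)
k = -1.0 / (n * (n - 1))

def T(x):
    # T = h * sum(x_i^2) + k * (sum x_i)^2
    return h * np.sum(x**2) + k * np.sum(x)**2

rng = np.random.default_rng(0)

# algebraic identity: T is exactly the usual unbiased sample variance
x = rng.normal(size=n)
assert np.isclose(T(x), np.var(x, ddof=1))

# Monte Carlo with i.i.d. draws (rho = 0): mean of T over trials ~ sigma^2
mu_, sigma = 3.0, 2.0
samples = rng.normal(mu_, sigma, size=(200_000, n))
vals = h * (samples**2).sum(axis=1) + k * samples.sum(axis=1)**2
est = vals.mean()  # close to sigma**2
```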

This leaves $\rho$ as the only "nuisance parameter" that you either have to know or estimate in order to compute the estimator for $\sigma^2.$ And if you don't know it, estimating $\rho$ and plugging it into the formula you derived adds extra noise, and can even make the estimator biased after all.

Or perhaps the question included something in the fine print about $\rho$ being known and $\mu$ being unknown.