Chebyshev inequality but with normalized maximum estimation error


Let $X_1,X_2,\ldots$ be independent and identically distributed copies of $X$, with partial sums $S_n=\sum_{i=1}^n X_i$. The mean of $X$ is $\mu$ and its variance is $\sigma^2<\infty$.

To measure the error of estimating the population mean $\mu$ by the sample mean $\bar X_n=\frac{1}{n}S_n$, instead of the usual Chebyshev bound $\Pr\{\frac{1}{n}|S_n-n\mu|\ge\epsilon\}\le\frac{\sigma^2}{n\epsilon^2}$, we consider the normalized maximum estimation error: $$\Pr\Big\{\frac{1}{n}\max_{1\le k\le n}|S_n-n\mu|\ge\epsilon\Big\}$$ Let $A$ denote the event $\{\max_{1\le k\le n}|S_n-n\mu|\ge n\epsilon\}$. Consider a partition of $A$: $$A=\bigcup_{k=1}^{n}A_k, \;A_k=\{|S_k-k\mu|\ge n\epsilon,\ |S_i-i\mu|< n\epsilon\ \ \forall\,1\le i< k\}$$ Prove that: $$Var[S_n]\ge\sum_{k=1}^{n}E[(S_n-n\mu)^2\,\mathbf{1}\{A_k\}]\ge\sum_{k=1}^{n}E[(S_k-k\mu)^2\,\mathbf{1}\{A_k\}]$$

First of all, I don't understand what the $\max_{1\le k\le n}$ implies. I also don't understand how the $A_k$ can fit into the inequality.

Edit: After researching this further, I realized I can write: $$Var[S_n]=E[(S_n-\mu_{S_n})^2]=E[(S_n-n\mu)^2]$$ If I partition this according to $A$, I get $Var[S_n]=\sum_{k=1}^{n}E[(S_n-n\mu)^2\,\mathbf{1}\{A_k\}]$, and since the maximum is taken over $1\le k\le n$, $$Var[S_n]=\sum_{k=1}^{n}E[(S_n-n\mu)^2\,\mathbf{1}\{A_k\}]\ge \sum_{k=1}^{n}E[(S_k-k\mu)^2\,\mathbf{1}\{A_k\}]$$

My remaining problem is: how do I show that $Var[S_n]\ge\sum_{k=1}^{n}E[(S_n-n\mu)^2\,\mathbf{1}\{A_k\}]$?

On BEST ANSWER

$\max_{1\leq k\leq n}$ simply means the largest value among the $n$ quantities indexed by $k$. If $X_1, \ldots , X_n$ are random variables, then $M= \max_{1\leq k\leq n} X_k$ is again a random variable. (Why?) The max in your write-up gives stronger information than what you’ll end up with from a straightforward application of Chebyshev. (PS: I would suggest double-checking before posting a question; there are too many typos.)

I guess your definition of the event $A$ is incorrect. It should have been $A=\{\max_{1\leq k \leq n} |S_k-k\mu |\geq n\epsilon \}$, because this is the event whose probability you’re trying to bound. Also, there is a small typo in the definition of $A_k$; it should have been $1\leq i < k$. You can think of $A_k$ as the event that $|S_i-i\mu|$ reaches a pre-assigned threshold value (here $n\epsilon$) for the first time at $i=k$. (If you know about stopping times, the concept used here is pretty much the same.) This way you partition the event $A$ into disjoint parts.
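To make the "first crossing" picture concrete, here is a minimal simulation sketch. It assumes standard normal steps with $\mu=0$ and uses the corrected definition of $A$; the function names `first_crossing`, `in_A` and `count_first_crossings` are made up for this illustration. Each simulated path is assigned to the $A_k$ where it first crosses, and the counts over all $k$ should add up to the number of paths in $A$.

```python
import random


def first_crossing(steps, mu, threshold):
    """Return the smallest k (1-based) with |S_k - k*mu| >= threshold, else None."""
    centered = 0.0
    for k, x in enumerate(steps, start=1):
        centered += x - mu  # running value of S_k - k*mu
        if abs(centered) >= threshold:
            return k
    return None


def in_A(steps, mu, threshold):
    """True when max_k |S_k - k*mu| >= threshold, i.e. the path lies in A."""
    centered, worst = 0.0, 0.0
    for x in steps:
        centered += x - mu
        worst = max(worst, abs(centered))
    return worst >= threshold


def count_first_crossings(n=10, eps=0.3, trials=2000, seed=0):
    """Count how many simulated paths fall in each A_k and how many fall in A."""
    rng = random.Random(seed)
    mu = 0.0                 # assumption: standard normal steps
    threshold = n * eps
    counts = [0] * (n + 1)   # counts[k] = number of paths in A_k; index 0 unused
    paths_in_A = 0
    for _ in range(trials):
        steps = [rng.gauss(mu, 1.0) for _ in range(n)]
        k = first_crossing(steps, mu, threshold)
        if k is not None:
            counts[k] += 1   # a path has one first crossing, so the A_k are disjoint
        if in_A(steps, mu, threshold):
            paths_in_A += 1
    return counts, paths_in_A


if __name__ == "__main__":
    counts, paths_in_A = count_first_crossings()
    print("paths per A_k:", counts[1:])
    print("sum over k:", sum(counts), "paths in A:", paths_in_A)
```

Because every path in $A$ has exactly one first-crossing index, the two printed totals agree, which is the identity $1_A=\sum_{k=1}^{n}1_{A_k}$ in counting form.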

Regarding your last question, observe that, simply because $E[(S_n-n\mu)^21_{A^{c}} ]\geq 0$, $$Var[S_n]=E[(S_n-n\mu)^2] \geq E[(S_n-n\mu)^21_{A}],$$ and then use $$1_A=\sum_{k=1}^{n} 1_{A_k},$$ which holds because the $A_k$ are disjoint and their union is $A$.
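For completeness, here is a sketch of the other inequality in the question (replacing $S_n$ by $S_k$ inside each term), following the standard argument used for Kolmogorov's maximal inequality. It assumes the corrected definition of $A_k$; the shorthand $Y_k=S_k-k\mu$ is introduced here and is not in the original post. Expanding the square around $Y_k$ gives

$$\begin{aligned}
E[Y_n^2\,1_{A_k}] &= E[Y_k^2\,1_{A_k}] + 2\,E[Y_k\,1_{A_k}\,(Y_n-Y_k)] + E[(Y_n-Y_k)^2\,1_{A_k}] \\
&\ge E[Y_k^2\,1_{A_k}] + 2\,E[Y_k\,1_{A_k}]\;E[Y_n-Y_k] \\
&= E[Y_k^2\,1_{A_k}].
\end{aligned}$$

The second line drops the nonnegative last term and uses independence: $Y_k\,1_{A_k}$ depends only on $X_1,\ldots,X_k$, while $Y_n-Y_k$ depends only on $X_{k+1},\ldots,X_n$ and has mean zero. Summing over $k$ gives the second inequality. Since $Y_k^2\ge n^2\epsilon^2$ on $A_k$ and $Var[S_n]=n\sigma^2$, the whole chain yields

$$n\sigma^2 \;\ge\; \sum_{k=1}^{n}E[Y_k^2\,1_{A_k}] \;\ge\; n^2\epsilon^2\sum_{k=1}^{n}\Pr(A_k) = n^2\epsilon^2\,\Pr(A), \qquad\text{so}\qquad \Pr(A)\le\frac{\sigma^2}{n\epsilon^2}.$$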