Problem: (Page 19 in "Vershynin, Roman (2018). High-Dimensional Probability. Cambridge University Press. ISBN 9781108415194.")
Suppose we want to estimate the mean µ of a random variable $X$ from a sample $X_1 , \dots , X_N$ drawn independently from the distribution of $X$. We want an $\varepsilon$-accurate estimate, i.e. one that falls in the interval $(\mu − \varepsilon, \mu + \varepsilon)$.
Show that a sample of size $N = O( \log (\delta^{−1} )\, \sigma^2 / \varepsilon^2 )$ is sufficient to compute an $\varepsilon$-accurate estimate with probability at least $1 −\delta$.
Hint: Use the median of $O(\log(\delta^{−1}))$ weak estimates.
It is easy to use Chebyshev's inequality to find a weak estimate: $N = O( \sigma^2 / (\delta \varepsilon^2) )$ samples suffice for an $\varepsilon$-accurate estimate with probability $1-\delta$.
However, I do not know how to derive a corresponding inequality for their median. The Wikipedia article on the median (https://en.wikipedia.org/wiki/Median#The_sample_median) says the sample median is asymptotically normal, but this does not give a bound for a specific $N$. Any suggestion is welcome.
As you mentioned, using Chebyshev's inequality, we can easily prove that $N = O(\frac{\sigma^2}{\epsilon^2})$ samples are enough for an $\epsilon$-accurate estimate of the mean with probability at least $\frac{3}{4}$.
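For completeness, here is the Chebyshev step, written out with $\bar X_N$ denoting the sample mean of one batch of $N$ draws:

```latex
% Sample mean of N i.i.d. draws has variance sigma^2/N, so Chebyshev gives:
\[
  \Pr\bigl(|\bar X_N - \mu| > \epsilon\bigr)
  \;\le\; \frac{\operatorname{Var}(\bar X_N)}{\epsilon^2}
  \;=\; \frac{\sigma^2}{N\epsilon^2}
  \;\le\; \frac{1}{4}
  \qquad \text{whenever } N \ge \frac{4\sigma^2}{\epsilon^2}.
\]
```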
Now assume that we have $k$ estimates $\hat{\mu}_1, \hat{\mu}_2, \dots, \hat{\mu}_k$, each computed from its own batch of $N = O(\frac{\sigma^2}{\epsilon^2})$ samples, so each is $\epsilon$-accurate with probability at least $\frac{3}{4}$. Let $X_i = \mathbf{1}(|\hat{\mu}_i - \mu|>\epsilon)$, i.e. the indicator of a wrong answer. The $X_i$ are i.i.d. Bernoulli random variables with $p \le \frac{1}{4}$.
Let $\hat{\mu} = \mathrm{med}(\hat{\mu}_1, \hat{\mu}_2, ..., \hat{\mu}_k)$:
If the median is more than $\epsilon$ away from $\mu$, then at least half of the individual estimates must be, so
\begin{equation} P(|\hat\mu - \mu|>\epsilon) \le P\Big(\sum_{i = 1}^{k}X_i\ge\frac{k}{2}\Big) = P\Big(\sum_{i = 1}^{k}\Big(X_i-\frac{1}{4}\Big)\ge\frac{k}{4}\Big). \end{equation}
Using the general Hoeffding inequality for bounded random variables (Theorem 2.2.6 in Vershynin, Roman (2018), High-Dimensional Probability), we can write
$$P\Big(\sum_{i = 1}^{k}\Big(X_i-\frac{1}{4}\Big)\ge\frac{k}{4}\Big) \le \exp\Big(-\frac{2(k/4)^2}{k}\Big) = \exp\Big(-\frac{k}{8}\Big),$$ since each $X_i$ takes values in $[0,1]$. Setting $\exp(-k/8) \le \delta$ gives $$P(|\hat\mu - \mu|>\epsilon) \le \delta \quad\text{for}\quad k \ge 8\log(\delta^{-1}), \text{ i.e. } k = O(\log(\delta^{-1})).$$
So with $N = O\big(\log(\delta^{-1})\,\frac{\sigma^2}{\epsilon^2}\big)$ samples in total, we obtain an $\epsilon$-accurate estimate with probability at least $1-\delta$.
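The argument above is the median-of-means estimator. A minimal sketch in Python (the function name `median_of_means` and the group count `k` are my own choices, not from the text): split the sample into $k$ equal groups, average each group, and return the median of the group means.

```python
import random
import statistics


def median_of_means(samples, k):
    """Median-of-means estimator: split `samples` into k equal groups,
    average each group, and return the median of the k group means."""
    group = len(samples) // k  # samples beyond k * group are discarded
    means = [sum(samples[i * group:(i + 1) * group]) / group
             for i in range(k)]
    return statistics.median(means)


# Hypothetical demo: estimate mu = 2 from noisy draws (sigma = 5).
random.seed(0)
data = [random.gauss(2.0, 5.0) for _ in range(8000)]
print(median_of_means(data, k=16))  # should land close to 2
```

Each group plays the role of one weak estimate $\hat\mu_i$, and taking the median boosts the constant success probability of a single group to the $1-\delta$ guarantee derived above.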