Question
I am not from a statistics background. I came across the following corollary of Hoeffding's Inequality and couldn't find a derivation or proof for it. Could anyone please share a reference for the proof, or prove it here?
Hoeffding’s Inequality
Let $Y_1, Y_2, \cdots, Y_n$ be i.i.d. observations such that $\mathbb{E}(Y_i) = \mu$ and $a \leq Y_i \leq b$. Then for any $\epsilon > 0$; $$ \mathbb{P}( | \bar{Y_n} - \mu | \geq \epsilon) \leq 2 e^{-2 n \epsilon^2 / (b - a)^2} $$
Corollary of Hoeffding’s Inequality
Let $Y_1, Y_2, \cdots, Y_n$ be independent with $\mathbb{P}( a \leq Y_i \leq b ) = 1$ and common mean $\mu$. Then, with probability at least $1 - \delta$,
$$ | \bar{Y_n} - \mu | \leq \sqrt{ \frac{c}{2n} \log{\left(\frac{2}{\delta}\right)}}, $$
where $c = (b - a)^2$.
References
- Theorem 6 and Corollary 7 in this pdf
Notice that the inequality below upper-bounds the two-sided tail probability that the sample mean $\bar{Y_n}$ deviates from the theoretical mean $\mu$ by more than $\epsilon$, in terms of an exponential function:
$$\mathbb{P}( | \bar{Y_n} - \mu | \geq \epsilon) \leq 2 e^{-2 n \epsilon^2 / (b - a)^2}.$$
Passing to the complementary event, this means that
$$\mathbb{P}( | \bar{Y_n} - \mu | \leq \epsilon) \geq 1 - 2 e^{-2 n \epsilon^2 / (b - a)^2}.$$
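As a quick numerical sanity check (not part of the derivation), here is a minimal Python sketch of the tail bound, assuming Bernoulli(0.5) samples so that $a = 0$, $b = 1$, $\mu = 0.5$; the parameter choices are illustrative:

```python
import numpy as np

# Monte Carlo check of the two-sided Hoeffding tail bound for
# Bernoulli(0.5) samples: a = 0, b = 1, mu = 0.5 (illustrative choices).
rng = np.random.default_rng(0)
n, mu, eps, trials = 100, 0.5, 0.1, 100_000

samples = rng.integers(0, 2, size=(trials, n))        # Y_i in {0, 1}
deviations = np.abs(samples.mean(axis=1) - mu)        # |Ybar_n - mu| per trial

empirical = (deviations >= eps).mean()                # estimated tail probability
bound = 2 * np.exp(-2 * n * eps**2 / (1 - 0) ** 2)    # Hoeffding upper bound

print(f"empirical tail probability: {empirical:.4f}")
print(f"Hoeffding bound:            {bound:.4f}")     # 2e^{-2} ≈ 0.2707
```

For these parameters the empirical tail probability (roughly 0.06) sits well below the bound (about 0.271), which is expected: Hoeffding's inequality is distribution-free and therefore often loose for any particular distribution.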
We now want to set the right-hand side equal to $1 - \delta$ and find the value $\epsilon = \epsilon(\delta)$ that makes this hold. So setting
$$\delta:= 2 e^{-2 n \epsilon^2 / (b - a)^2},$$
and solving for $\epsilon$, we have
$$\log \left( \frac{\delta}{2} \right) = \frac{-2n\epsilon^2}{(b-a)^2} \implies \epsilon^2 = \frac{(b-a)^2}{2n} \log{\left(\frac{2}{\delta} \right)} \implies \epsilon = \sqrt{ \frac{c}{2n} \log{\left(\frac{2}{\delta} \right)}}$$
where $c = (b-a)^2$. This gives the result
$$\mathbb{P} \left( | \bar{Y_n} - \mu | \leq \sqrt{ \frac{c}{2n} \log{\left(\frac{2}{\delta} \right)}} \right) \geq 1 - \delta.$$
In statistical learning theory, $\epsilon$ is often interpreted as the accuracy and $1 - \delta$ as the confidence.
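The final inequality translates directly into a confidence-interval half-width for the sample mean. A minimal sketch (the function name `hoeffding_epsilon` is mine, not from the reference):

```python
import math

def hoeffding_epsilon(n: int, delta: float, a: float = 0.0, b: float = 1.0) -> float:
    """Accuracy epsilon such that |Ybar_n - mu| <= epsilon with probability >= 1 - delta."""
    c = (b - a) ** 2
    return math.sqrt(c / (2 * n) * math.log(2 / delta))

# e.g. n = 1000 observations bounded in [0, 1], at 95% confidence (delta = 0.05):
eps = hoeffding_epsilon(n=1000, delta=0.05)
print(f"{eps:.4f}")  # ≈ 0.0429
```

Note how the half-width shrinks at rate $O(1/\sqrt{n})$ and grows only logarithmically as the confidence level $1 - \delta$ increases, which is what makes this bound useful for sample-size calculations.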