In the paper Information-Theoretic Determination of Minimax Rates of Convergence the authors present Theorem 3 as follows:
If $M_2(\epsilon)$ is the $\ell_2$ packing entropy of a density class $\mathscr{F}$ and if $\epsilon_n$ satisfies the condition $M_2(\sqrt{2}\epsilon_n) = n\epsilon_n^2$, then \begin{equation} \min_{\hat{f}}\max_{f\in\mathscr{F}} \mathbb{E}_{X^n\sim f}\left[\big\|f - \hat{f}\big\|_1\right] \leq 8\sqrt{8}\,\epsilon_n, \end{equation} where $\hat{f}$ is an estimator of $f$ based on data $X^n$ drawn from the true density $f$.
I omit the remainder of the theorem since I am particularly interested in the $\ell_1$ case. In other words, Theorem 3 upper-bounds the minimax risk in terms of the metric entropy: the critical radius $\epsilon_n$ is determined by balancing the packing entropy against $n\epsilon_n^2$.
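To illustrate how the entropy condition pins down the rate, here is a standard worked example (the entropy scaling below is an assumption for illustration, not part of the quoted theorem): suppose the packing entropy scales as $M_2(\epsilon) \asymp \epsilon^{-1/\alpha}$, as is typical for $\alpha$-smooth density classes. Then solving the balance equation gives

```latex
% Illustrative computation under the assumed scaling
% M_2(\epsilon) \asymp \epsilon^{-1/\alpha} for some smoothness \alpha > 0:
\begin{align*}
M_2(\sqrt{2}\,\epsilon_n) = n\,\epsilon_n^2
  &\;\Longrightarrow\; (\sqrt{2}\,\epsilon_n)^{-1/\alpha} \asymp n\,\epsilon_n^2 \\
  &\;\Longrightarrow\; \epsilon_n^{-1/\alpha - 2} \asymp n \\
  &\;\Longrightarrow\; \epsilon_n \asymp n^{-\alpha/(2\alpha + 1)},
\end{align*}
```

which recovers the familiar nonparametric rate $n^{-\alpha/(2\alpha+1)}$, so the theorem reproduces known minimax rates once the entropy of the class is known.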
Unfortunately, Theorem 3 is presented without proof. So my question is: does anyone know whether this is a well-known result (I have been unable to find it elsewhere so far)? If not, does anyone see a way to prove the claim?
Thanks.
You can find the proof in the paper itself!