I am trying to understand the proof of Theorem 3 in the paper "A Universal Law of Robustness via Isoperimetry" by Bubeck and Sellke.
Basically, there exists at least one $w_{L,\epsilon} \in \mathcal{W}_{L,\epsilon}$ and a $w_{L} \in \mathcal{W}_{L}$ with $\|w_{L,\epsilon} - w_{L}\| \leq \frac{\epsilon}{6J}$.
This gives $$\|f_{w_{L,\epsilon}} - f_{w_{L}}\|_{\infty} \leq J \cdot \frac{\epsilon}{6J} = \frac{\epsilon}{6} \tag{a}$$
by Assumption $1$:
$$\boxed{ \left\|f_{\boldsymbol{w}_{1}}-f_{\boldsymbol{w}_{2}}\right\|_{\infty} \leq J\left\|\boldsymbol{w}_{1}-\boldsymbol{w}_{2}\right\|} $$
Inequality (a) says that the two functions are close in sup norm.
How can I prove that the empirical losses of $f_{w_{L,\epsilon}}$ and $f_{w_{L}}$ are also close? Here the label is $y$.
My thoughts: let
$p_{1} = y_{i} - f_{w_{L,\epsilon}}(x_{i})$ and $p_{2} = y_{i} - f_{w_{L}}(x_{i})$.
I need to show: $p_{2}^2 \leq p_{1}^2 + C$ for some constant $C$.
Then I can conclude that $EL(f_{w_{L}}) \leq EL(f_{w_{L,\epsilon}}) + C$.
If the same bound also holds with $p_{1}$ and $p_{2}$ swapped (the argument should be symmetric), then $EL(f_{w_{L}})$ being small implies $EL(f_{w_{L,\epsilon}})$ can't be too big.
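For reference, here is the averaging step written out. This assumes $EL$ denotes the mean squared error over the $n$ samples (my reading of the notation), and $p_{1,i}$, $p_{2,i}$ are just the residuals $p_1$, $p_2$ at sample $i$ (indices added here for illustration). If $p_{2,i}^2 \leq p_{1,i}^2 + C$ holds for every $i$, then

$$EL(f_{w_{L}}) = \frac{1}{n}\sum_{i=1}^{n} p_{2,i}^2 \leq \frac{1}{n}\sum_{i=1}^{n} \left(p_{1,i}^2 + C\right) = EL(f_{w_{L,\epsilon}}) + C.$$

So a per-sample bound with a constant $C$ that does not depend on $i$ is enough.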
But how can I prove
$$\boxed{p_{2}^2 \leq p_{1}^2 + C }$$
My attempt, with
$$p_{1} = y_{i} - f_{w_{L,\epsilon}}(x_{i}) \quad \text{and} \quad p_{2} = y_{i} - f_{w_{L}}(x_{i}):$$
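$$\begin{aligned}
p_{2}^2 - p_{1}^2 &= \left(f_{w_{L}}(x_{i})^2 - f_{w_{L,\epsilon}}(x_{i})^2\right) + 2 y_{i} \left(f_{w_{L,\epsilon}}(x_{i}) - f_{w_{L}}(x_{i})\right) \\
&\leq \left|f_{w_{L,\epsilon}}(x_{i}) + f_{w_{L}}(x_{i})\right| \left|f_{w_{L,\epsilon}}(x_{i}) - f_{w_{L}}(x_{i})\right| + 2 |y_{i}| \left|f_{w_{L,\epsilon}}(x_{i}) - f_{w_{L}}(x_{i})\right| \\
&\leq \left(\left|f_{w_{L,\epsilon}}(x_{i}) + f_{w_{L}}(x_{i})\right| + 2 |y_{i}|\right) \frac{\epsilon}{6}
\end{aligned}$$
The last step uses (a). The right-hand side is a constant only if the labels and the functions are bounded: if $|y_{i}| \leq B$ and $|f_{w}(x_{i})| \leq B$ for every $w$ (an assumption this step needs), it is at most $4B \cdot \frac{\epsilon}{6} = \frac{2B\epsilon}{3}$, so $C = \frac{2B\epsilon}{3}$ works.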
Hence Proved.
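As a numerical sanity check (not a proof), here is a minimal Python sketch of the bound. Everything in it is an assumption made for illustration and is not from the paper: the toy model $f_w(x) = \tanh(\langle w, x\rangle)$ with $\|x_i\| = 1$ (so $J = 1$ and $|f_w| \leq B = 1$), labels drawn uniformly from $[-B, B]$, and the helper names `f`, `empirical_loss`, `random_unit_vector`, `check_bound`. With these choices the per-sample gap $|p_2^2 - p_1^2|$ should be at most $\frac{2B\epsilon}{3}$, and the gap in empirical loss should be no larger than the worst per-sample gap.

```python
import math
import random


def f(w, x):
    """Toy model f_w(x) = tanh(<w, x>); 1-Lipschitz in w when ||x|| <= 1."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))


def empirical_loss(w, xs, ys):
    """Mean squared error (1/n) * sum_i (y_i - f_w(x_i))^2."""
    return sum((y - f(w, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def random_unit_vector(rng, dim):
    """Random direction on the unit sphere (normalized Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def check_bound(eps=0.3, dim=5, n=200, seed=0):
    """Return (loss gap, worst per-sample gap, theoretical bound 2*B*eps/3)."""
    rng = random.Random(seed)
    J, B = 1.0, 1.0  # Lipschitz constant and range bound of the toy model
    xs = [random_unit_vector(rng, dim) for _ in range(n)]  # ||x_i|| = 1
    ys = [rng.uniform(-B, B) for _ in range(n)]            # |y_i| <= B
    w_net = [rng.gauss(0.0, 1.0) for _ in range(dim)]      # plays the role of w_{L,eps}
    # Perturb w_net by a vector of norm exactly eps / (6 J) to get w_L.
    direction = random_unit_vector(rng, dim)
    w = [a + (eps / (6 * J)) * d for a, d in zip(w_net, direction)]
    gap = abs(empirical_loss(w, xs, ys) - empirical_loss(w_net, xs, ys))
    worst = max(
        abs((y - f(w, x)) ** 2 - (y - f(w_net, x)) ** 2)
        for x, y in zip(xs, ys)
    )
    bound = 2 * B * eps / 3
    return gap, worst, bound


gap, worst, bound = check_bound()
print(gap <= worst <= bound)  # prints True
```

In practice the observed gaps are much smaller than the bound, since the worst case needs $|p_1| + |p_2|$ close to $4B$ at the same time as the functions differ by the full $\frac{\epsilon}{6}$.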