I am reading *Understanding Machine Learning: From Theory to Algorithms* and I don't know how to prove the second part of their Lemma B.10 (page 427).
Specifically, let $S=\{(x_i,y_i)\}_{i=1}^m$ be a sample drawn i.i.d. from some distribution $D$ over $\mathcal{X}\times\mathcal{Y}.$ Let $l:\mathcal{Y}\times\mathcal{Y}\to[0,1]$ be a loss function. Let $L_D(h)=\mathbb{E}_{(x,y)\sim D}[l(h(x),y)]$ be the risk of $h$ over $D,$ and let $L_S(h)=\frac{1}{m}\sum_{i=1}^m l(h(x_i),y_i)$ be the empirical error of $h$ over $S.$
My question is: for a fixed $h,$ how can one use Bernstein's inequality to derive a bound of the following form, for some constants $C_1,C_2>0$? $$\mathbb{P}\left\{|L_D(h)-L_S(h)|\geq C_1\sqrt{\frac{L_S(h)\ln{(1/\delta)}}{m}}+C_2\frac{\ln{(1/\delta)}}{m}\right\}<\delta.$$
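For intuition, here is a small Monte Carlo sanity check of a bound of this form. This is only an illustration, not a proof: I model the losses $l(h(x_i),y_i)$ as i.i.d. Bernoulli$(p)$ (so $L_D(h)=p$ and $L_S(h)$ is the sample mean), and the constants $C_1=2,$ $C_2=4$ are guesses of mine, not the ones from the lemma.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not from the book): each loss l(h(x_i), y_i) is an
# i.i.d. Bernoulli(p) draw, so L_D(h) = p and L_S(h) is the sample mean.
p = 0.1            # true risk L_D(h)
m = 1000           # sample size
delta = 0.05       # target failure probability
C1, C2 = 2.0, 4.0  # placeholder constants; the actual ones come from the proof

trials = 10_000
losses = rng.binomial(1, p, size=(trials, m))
L_S = losses.mean(axis=1)  # empirical error in each trial

# Right-hand side of the claimed bound, computed from the empirical error.
bound = C1 * np.sqrt(L_S * np.log(1 / delta) / m) + C2 * np.log(1 / delta) / m

# Fraction of trials where |L_D - L_S| exceeds the bound; should be < delta.
failure_rate = np.mean(np.abs(p - L_S) >= bound)
print(f"empirical failure rate: {failure_rate:.4f} (target: < {delta})")
```

With these (rather generous) constants the empirical failure rate comes out far below $\delta,$ which is consistent with the claimed inequality; of course this says nothing about the correct values of $C_1$ and $C_2.$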