I am going through some lecture notes on Learning Theory here: http://ttic.uchicago.edu/~tewari/LT_SP2008.html trying to learn about Rademacher complexities. I'm getting confused about the Rademacher complexity of linear function classes. Consider $F=\{f_w(x)=w^Tx|w\in R^d,\|w\|_2\le 1\}; \|x\|_2\le 1$. Denote the empirical Rademacher complexity of $F$ by $R_N(F)$. (N is the number of points from which this is estimated.)
Using the Dudley theorem, as in Lecture 15, Sec. 2.1, we get that $R_N(F)\le \sqrt{\frac{d}{N}}$.
However, using some elementary calculations as in Lecture 17, sec. 2.1, we get $R_N(F)\le \frac{1}{\sqrt{N}}$.
My question is, should $R_N(F)$ depend on the dimension $d$ or not? Is $\sqrt{d}$ just an artefact of the proof technique in the first case? (First I thought that the application of Dudley theorem as above was not assuming that $\|x\|_2\le 1$, but it must do because otherwise we would not have $|w^Tx|\le 1$ and so the upper limit of the integral would not be 1.)
I think the representation in Lecture 15 is a little inaccurate but the one in Lec 17 seems correct. Here is the original paper if you wanna check: http://research.microsoft.com/en-us/um/people/skakade/papers/ml/rad_paper.pdf (See section 3)