I'm currently reading a book on Fourier analysis, and I have learned that every continuous function on the circle can be uniformly approximated by trigonometric polynomials, using the Fejér kernel.
After that, I also read that there is a continuous function on the circle whose Fourier series diverges at some point.
What confuses me is this: if trigonometric polynomials uniformly approximate the given function, then it seems the Fourier series must converge to it as well, since the Fourier series is the expansion in an orthonormal basis of the given Hilbert space and is therefore the best approximation in the mean-square sense. Where am I wrong? Is the Fourier series only the best approximation in the mean-square sense, and not in the uniform sense?
There is no contradiction between the Fourier series diverging and the Fejér means converging. One needs a summability kernel (Fejér, Gaussian, etc.) precisely because the Fourier series of a continuous function can diverge. In fact, there is a non-meager set of continuous functions whose Fourier series diverge on a dense set of points. So yes, the partial sums of the Fourier series are the best approximation only in the mean-square sense. (Approximation in the $L^2$ sense and pointwise approximation are different notions.)
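To make this concrete, here is a short sketch of how the two approximations relate. The notation is my own choice and not taken from your book: $\hat f(n)$ are the Fourier coefficients, $S_N f$ the partial sums, $\sigma_N f$ the Fejér means, and all norms use the normalized measure $dx/2\pi$ on the circle.

$$S_N f(x) = \sum_{|n| \le N} \hat f(n)\, e^{inx}, \qquad \sigma_N f(x) = \frac{1}{N+1} \sum_{k=0}^{N} S_k f(x) = \sum_{|n| \le N} \left(1 - \frac{|n|}{N+1}\right) \hat f(n)\, e^{inx}.$$

Both are trigonometric polynomials of degree at most $N$, but with different coefficients, so $\sigma_N f$ is not a partial sum of the Fourier series. Since $S_N f$ is the orthogonal projection of $f$ onto the trigonometric polynomials of degree at most $N$,

$$\|f - S_N f\|_2 \le \|f - \sigma_N f\|_2 \le \|f - \sigma_N f\|_\infty \longrightarrow 0.$$

So uniform convergence of the Fejér means does force the Fourier series to converge, but only in $L^2$. It gives no control over $\|f - S_N f\|_\infty$ or over $S_N f(x)$ at an individual point.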
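The difference between a summability kernel and the Dirichlet kernel (convolution with which gives the partial sums of the Fourier series) is that the former forms an approximate identity (approximate unit) in the group algebra $L^1(\mathbb{T})$, while the latter does not: the $L^1$ norms of the Dirichlet kernels are unbounded, growing like $\log N$.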
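If it helps to see the difference numerically, the following is a minimal sketch that estimates the $L^1$ norms of both kernels with respect to the normalized measure $dx/2\pi$. The function name `kernel_l1_norms` and the grid size `M` are my own illustrative choices. It assumes the standard closed forms $D_N(x) = \sin((N+\tfrac12)x)/\sin(x/2)$ and $F_N(x) = \frac{1}{N+1}\left(\sin((N+1)x/2)/\sin(x/2)\right)^2$.

```python
import math


def kernel_l1_norms(N, M=20000):
    """Estimate (||D_N||_1, ||F_N||_1) with respect to dx / (2*pi).

    Uses a midpoint rule on M equal subintervals of [-pi, pi].
    M is assumed even and larger than N, so the grid never hits x = 0.
    """
    h = 2 * math.pi / M
    dirichlet_sum = 0.0
    fejer_sum = 0.0
    for k in range(M):
        x = -math.pi + (k + 0.5) * h      # midpoint of the k-th subinterval
        s = math.sin(x / 2)               # nonzero on this grid
        d = math.sin((N + 0.5) * x) / s   # Dirichlet kernel D_N(x)
        f = (math.sin((N + 1) * x / 2) / s) ** 2 / (N + 1)  # Fejer kernel F_N(x)
        dirichlet_sum += abs(d)
        fejer_sum += f                    # F_N >= 0, so no absolute value needed
    return dirichlet_sum / M, fejer_sum / M


if __name__ == "__main__":
    for N in (1, 10, 100):
        d_norm, f_norm = kernel_l1_norms(N)
        print(f"N={N:4d}  ||D_N||_1 ~ {d_norm:.4f}   ||F_N||_1 ~ {f_norm:.6f}")
```

The Fejér column should stay at 1 while the Dirichlet column keeps growing slowly with $N$. That uniform $L^1$ bound is what the approximate-identity argument needs, and it is exactly what the Dirichlet kernel lacks.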