How did Fourier come up with the following equation?
$$f(x) = \frac{a_0}{2}+\sum_{n=1}^\infty \left(a_n \cos(\frac{2\pi}{T}nx) + b_n \sin(\frac{2\pi}{T}nx)\right)$$
Did he use the Gram-Schmidt process, which says that any function can be expressed as a linear combination of orthonormal basis functions?
So Gram-Schmidt is just a way to take a non-orthogonal basis and construct an orthogonal basis. This does still work in infinite dimensions (because the process constructs the $k$th orthogonal basis vector using only the first $k$ original basis vectors). However, there is no reason to do this with sinusoids because they are already orthogonal. With modern tools in hand, the first significant difficulty in the theory of Fourier series is showing that the sinusoids are complete. This can be stated in a few equivalent ways; one is that the set of finite linear combinations of sinusoids (of appropriate frequency for the domain) are dense in $L^2$. Once you have orthogonality and this density property, the series expansion follows pretty much immediately.
I am not sure how Fourier guessed that this should be the case, but these days it is not hard to look up a proof.
A related place where Gram-Schmidt is useful is in constructing the Legendre polynomials, which are a different orthogonal system on $L^2([-1,1])$. They can then be slightly modified to make an orthogonal system on $L^2([a,b])$.