I know that I can find a polynomial regression's coefficients by computing $(X'X)^{-1}X'y$ (where $X'$ is the transpose of the design matrix $X$).
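As a concrete illustration of that formula, here is a minimal NumPy sketch (the toy data and degree are my own choices, not from the question): fitting a degree-2 polynomial by evaluating $(X'X)^{-1}X'y$ literally.

```python
import numpy as np

# Toy data drawn from y = 1 + 2x + 3x^2, so the recovered
# coefficients should come out as [1, 2, 3].
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1 + 2 * x + 3 * x**2

# Design matrix with one column per power of x: [1, x, x^2].
X = np.vander(x, N=3, increasing=True)

# Literal form of the normal-equation formula (X'X)^{-1} X'y.
# In practice np.linalg.solve(X.T @ X, X.T @ y) or np.linalg.lstsq
# is preferred over forming the inverse explicitly.
coeffs = np.linalg.inv(X.T @ X) @ X.T @ y
print(coeffs)  # ≈ [1. 2. 3.]
```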
This is one way of finding them; as far as I know, there is at least one other: minimizing a cost function using gradient descent. The former seems the easier to implement (I did it in C++; I have the latter in Matlab).
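For comparison, a sketch of the gradient-descent route on the same toy problem, minimizing the mean squared error $J(w) = \frac{1}{2n}\lVert Xw - y\rVert^2$. The column scaling and learning rate below are illustrative choices of mine, not anything from the question:

```python
import numpy as np

# Same toy fit as above, now by gradient descent on the squared-error cost.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1 + 2 * x + 3 * x**2
X = np.vander(x, N=3, increasing=True)

# Scale each column to [0, 1] so a single learning rate works for
# all coefficients (an illustrative preconditioning choice).
scale = X.max(axis=0)
Xs = X / scale

w = np.zeros(3)
lr = 0.1
for _ in range(200_000):
    grad = Xs.T @ (Xs @ w - y) / len(y)  # gradient of J(w)
    w -= lr * grad

coeffs = w / scale  # undo the column scaling
print(coeffs)  # ≈ [1. 2. 3.]
```

Without the column scaling, the higher powers of $x$ dominate the gradient and a learning rate small enough to be stable becomes painfully slow, which is one practical difference between the two approaches.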
What I want to know is the advantage of one of these methods over the other. On a particular dataset with very few points, I couldn't find a satisfactory solution using $(X'X)^{-1}X'y$, but gradient descent worked fine and gave me an estimation function that made sense. It was a 6th-degree polynomial, though; but in this particular case I don't mind overfitting the data.
So what's wrong with the matrix solution compared to gradient descent?