I'm working on exercise 1.2 (Curve Fitting Problem) of Bishop's Pattern Recognition and Machine Learning book.
The task is to write down the set of linear equations, satisfied by the coefficients, that minimize the regularized sum-of-squares error function $\tilde E(w) = \frac{1}{2}\sum_{n = 1}^N (y(x_n, w) - t_n)^2 + \frac{\lambda}{2}\|w\|^2$ with $y(x, w) = \sum_{j = 0}^M w_jx^j$ and $\|w\|^2 = w^Tw$ for given data $(x_n, t_n)$.
Similar to exercise 1.1, I started with the partial derivative with respect to the weight $w_i$, using $A_{ij} = \sum_{n = 1}^N (x_n)^{i + j}$ and $T_i = \sum_{n = 1}^N (x_n)^it_n$ from the first exercise: $$ \frac{\partial \tilde E}{\partial w_i} = \frac{\partial}{\partial w_i}\left(\frac{1}{2} \sum_{n = 1}^N (y(x_n, w) - t_n)^2\right) + \frac{\partial}{\partial w_i}\left(\frac{\lambda}{2} \|w\|^2\right) \\ = \sum_{j = 0}^M A_{ij}w_j - T_i + \frac{\lambda}{2} \frac{\partial}{\partial w_i}\left(\|w\|^2\right) \\ = \sum_{j = 0}^M A_{ij}w_j - T_i + \frac{\lambda}{2} \cdot 2w_i = \sum_{j = 0}^M A_{ij}w_j - T_i + \lambda w_i $$
Now I could set this derivative to zero and get $$ \sum_{j = 0}^M (A_{ij}w_j) + \lambda w_i = T_i $$
At this point I have no idea how to bring this equation into the form of a linear system, so that I could apply, for example, Gaussian elimination.
Thanks for any help.
I think you are there. Maybe slightly rewriting your last equation helps:
$$ A_{i0} w_0 + A_{i1} w_1 + \dots + (A_{ii} + \lambda) w_i + \dots + A_{iM} w_M = T_i $$
and you have $M + 1$ of these equations, one for each $i = 0, \dots, M$. In matrix form this is simply $(A + \lambda I)w = T$, which is exactly the kind of system Gaussian elimination handles.
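As a numerical sanity check, here is a minimal NumPy sketch that builds $A$ and $T$ from some hypothetical toy data (the data, degree $M$, and $\lambda$ below are illustrative assumptions, not part of the exercise) and solves the system $(A + \lambda I)w = T$:

```python
import numpy as np

# Hypothetical toy data: these values are assumptions for illustration only.
rng = np.random.default_rng(0)
M = 3                                   # polynomial degree
lam = 1e-3                              # regularization strength lambda
x = rng.uniform(-1.0, 1.0, size=20)
t = np.sin(np.pi * x) + 0.1 * rng.normal(size=x.size)

# Column j of `powers` holds x_n^j, so powers has shape (N, M+1).
powers = x[:, None] ** np.arange(M + 1)

# A[i, j] = sum_n x_n^{i+j} and T[i] = sum_n x_n^i t_n,
# both obtained from the design matrix `powers`.
A = powers.T @ powers
T = powers.T @ t

# Regularized normal equations: (A + lambda * I) w = T.
w = np.linalg.solve(A + lam * np.eye(M + 1), T)
```

`np.linalg.solve` uses an LU factorization internally, i.e. essentially Gaussian elimination with pivoting, so this mirrors the hand calculation directly.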