I'm working on exercise 1.2 (Curve Fitting Problem) of Bishop's Pattern Recognition and Machine Learning book.
The task is to write down the set of linear equations, satisfied by the coefficients, that minimize the regularized sum-of-squares error function $\tilde E(w) = \frac{1}{2}\sum_{n = 1}^N (y(x_n, w) - t_n)^2 + \frac{\lambda}{2}\|w\|^2$ with $y(x, w) = \sum_{j = 0}^M w_jx^j$ and $\|w\|^2 = w^Tw$ for given data $(x_n, t_n)$.
Similar to exercise 1.1, I started with the partial derivative with respect to the weight $w_i$, using $A_{ij} = \sum_{n = 1}^N (x_n)^{i + j}$ and $T_i = \sum_{n = 1}^N (x_n)^it_n$ from the first exercise: $$ \frac{\partial \tilde E}{\partial w_i} = \frac{\partial}{\partial w_i}\left(\frac{1}{2} \sum_{n = 1}^N (y(x_n, w) - t_n)^2\right) + \frac{\partial}{\partial w_i}\left(\frac{\lambda}{2} \|w\|^2\right) \\ = \sum_{j = 0}^M A_{ij}w_j - T_i + \frac{\lambda}{2} \frac{\partial}{\partial w_i}\left(\|w\|^2\right) \\ = \sum_{j = 0}^M A_{ij}w_j - T_i + \frac{\lambda}{2} \cdot 2w_i = \sum_{j = 0}^M A_{ij}w_j - T_i + \lambda w_i $$
Now I could set this derivative to zero and get $$ \sum_{j = 0}^M (A_{ij}w_j) + \lambda w_i = T_i $$
At this point I have no idea how to bring this equation into the form of a linear system, so that I could apply, for example, Gaussian elimination.
Thanks for any help.
I think you are there. Maybe slightly rewriting your last equation helps:
$$ A_{i0} w_0 + A_{i1} w_1 + \dots + (A_{ii} + \lambda) w_i + \dots + A_{iM} w_M = T_i $$
and you have $M + 1$ of these equations, one for each $i = 0, \dots, M$. In matrix form this is simply $(A + \lambda I)w = T$, which is exactly the kind of system Gaussian elimination handles.
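As a numerical sanity check, here is a minimal NumPy sketch that builds $A$ and $T$ from some hypothetical toy data (the data, degree $M$, and $\lambda$ below are illustrative assumptions, not part of the exercise) and solves the system $(A + \lambda I)w = T$:

```python
import numpy as np

# Hypothetical toy data: these values are assumptions for illustration only.
rng = np.random.default_rng(0)
M = 3                                   # polynomial degree
lam = 1e-3                              # regularization strength lambda
x = rng.uniform(-1.0, 1.0, size=20)
t = np.sin(np.pi * x) + 0.1 * rng.normal(size=x.size)

# Column j of `powers` holds x_n^j, so powers has shape (N, M+1).
powers = x[:, None] ** np.arange(M + 1)

# A[i, j] = sum_n x_n^{i+j} and T[i] = sum_n x_n^i t_n,
# both obtained from the design matrix `powers`.
A = powers.T @ powers
T = powers.T @ t

# Regularized normal equations: (A + lambda * I) w = T.
w = np.linalg.solve(A + lam * np.eye(M + 1), T)
```

`np.linalg.solve` uses an LU factorization internally, i.e. essentially Gaussian elimination with pivoting, so this mirrors the hand calculation directly.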