How do I simplify this equation further, in order to get the parameter I'm looking for?

I have a machine learning model. I want to find the critical point at which the cost function is minimized, so I took its derivative and set it equal to zero. Now I want to simplify the result and solve for the optimal $w$.

This is what I've done so far.

The equation is: $$\frac{\partial L}{\partial w} = -\frac{2}{n}\sum_{i=1}^n x_i (y_i - x_i^T w) = 0$$
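(Not part of the original post: as a sanity check that this gradient formula is right before doing the algebra, here is a small NumPy sketch, assuming the loss is the mean squared error $L(w) = \frac{1}{n}\sum_i (y_i - x_i^T w)^2$ and comparing the analytic gradient against finite differences on synthetic data.)

```python
import numpy as np

# Small synthetic problem; the sizes are arbitrary assumptions.
rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))   # rows of X are the x_i
y = rng.normal(size=n)
w = rng.normal(size=d)

def loss(w):
    # L(w) = (1/n) * sum_i (y_i - x_i^T w)^2  (mean squared error)
    r = y - X @ w
    return (r @ r) / n

# Analytic gradient from the post: dL/dw = -(2/n) * sum_i x_i (y_i - x_i^T w)
grad = -(2.0 / n) * X.T @ (y - X @ w)

# Central finite differences, one coordinate at a time.
eps = 1e-6
fd = np.array([(loss(w + eps * e) - loss(w - eps * e)) / (2 * eps)
               for e in np.eye(d)])

print(np.max(np.abs(grad - fd)))  # should be tiny (~1e-9 or less)
```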

Solving step by step like this:

Multiplying both sides by $-\frac{n}{2}$ clears the constant factor:

$$\sum_{i=1}^n x_i (y_i - x_i^T w) = 0$$

$$\sum_{i=1}^n x_i y_i - \sum_{i=1}^n x_i x_i^T w= 0$$

$$\sum_{i=1}^n x_i y_i = \sum_{i=1}^n x_i x_i^T w$$

Transposing both sides so that $w$ becomes a row vector $w^T$ (note that $(x_i x_i^T)^T = x_i x_i^T$, since each $x_i x_i^T$ is symmetric):

$$\sum_{i=1}^n y_i x_i^T = w^T\left(\sum_{i=1}^n x_i x_i^T\right)$$

And this is where I stopped. How do I simplify it further in order to isolate $w$? Please also explain the steps you provide, so that I can understand and put the pieces together.

EDIT:

I multiplied both sides on the right by the matrix $(\sum_{i=1}^n x_i x_i^T)^{-1}$ (assuming it is invertible), so we get $$\left(\sum_{i=1}^n y_i x_i^T\right)\left(\sum_{i=1}^n x_i x_i^T\right)^{-1} = w^T\left(\sum_{i=1}^n x_i x_i^T\right)\left(\sum_{i=1}^n x_i x_i^T\right)^{-1}$$

So now how do I get $w$? Thanks!
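(Not part of the original post: a short NumPy sketch of where this derivation is heading, assuming the same MSE loss. Writing $A = \sum_i x_i x_i^T = X^T X$ and $b = \sum_i x_i y_i = X^T y$, solving $Aw = b$ should make the gradient vanish, which the code checks numerically.)

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 3
X = rng.normal(size=(n, d))   # rows of X are the x_i
y = rng.normal(size=n)

A = X.T @ X          # = sum_i x_i x_i^T  (d x d)
b = X.T @ y          # = sum_i x_i y_i    (d,)

# Solve A w = b directly; preferable to forming the inverse explicitly.
w = np.linalg.solve(A, b)

# The gradient -(2/n) * sum_i x_i (y_i - x_i^T w) should now be ~0.
grad = -(2.0 / n) * X.T @ (y - X @ w)
print(np.max(np.abs(grad)))
```

Using `np.linalg.solve` instead of `np.linalg.inv` followed by a matrix product is the standard choice here: it is faster and numerically more stable, though both compute the same $w$ in exact arithmetic.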