$ \min_w \frac{1}{2N} (y_n - x_n^Tw)^2 + \lambda ||w||^2 $
$ \frac{d}{dw} = -\frac{1}{N} \sum_{n=1}^N (y_n - x_n^Tw)x_n + 2\lambda w $
$ w = (X^TX + \lambda 2N I)^{-1} X^Ty $
How do I go from line 2 to line 3? How do I change from a vector to a matrix? I can only derive up to line 2.
It might help to view the original objective function in terms of matrices and vectors.
Your original objective function is missing a summation; it should read:
$$ \frac{1}{2N} \sum_{n=1}^{N} (y_n - \mathbf{x_n}^T \mathbf{w})^2 + \lambda ||\mathbf{w}||^2 $$ I've also made the vectors bold in the above expression.
This expression can be written using vectors and matrices. Let $\mathbf{y}$ be the vector with components $y_{n}$, $1 \le n \le N$.
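Let $X$ be the $d \times N$ data matrix whose $n^{th}$ column is $\mathbf{x_{n}}$, where $d$ is the number of features. The vector of coefficients is $\mathbf{w}$.
The summation in the objective function is the squared Euclidean norm of the vector $\mathbf{y} - X^{T} \mathbf{w}$, whose $n^{th}$ component is $y_n - \mathbf{x_n}^T \mathbf{w}$, so we can write the function as follows: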
$$ \frac{1}{2N} \left\lVert \mathbf{y} - X^{T} \mathbf{w} \right\rVert^{2} + \lambda ||\mathbf{w}||^2 $$
This can be written as $$ \frac{1}{2N} \left( \mathbf{y} - X^{T} \mathbf{w} \right)^{T} \left( \mathbf{y} - X^{T} \mathbf{w} \right) + \lambda \mathbf{w}^{T} \mathbf{w} $$
Expanding: $$ \frac{1}{2N} \left[\; \mathbf{y}^{T} \mathbf{y} - \mathbf{y}^{T} X^{T} \mathbf{w} - \mathbf{w}^{T} X \mathbf{y} + \mathbf{w}^{T} X X^{T} \mathbf{w} \;\right] + \lambda \mathbf{w}^{T} \mathbf{w} $$
Since $\mathbf{y}^{T} X^{T} \mathbf{w}$ is a scalar, it equals its own transpose, $\mathbf{y}^{T} X^{T} \mathbf{w} = \mathbf{w}^{T} X \mathbf{y}$, so we can write: $$ \frac{1}{2N} \left[\; \mathbf{y}^{T} \mathbf{y} - 2 \mathbf{w}^{T} X \mathbf{y} + \mathbf{w}^{T} X X^{T} \mathbf{w} \;\right] + \lambda \mathbf{w}^{T} \mathbf{w} $$
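If you want to convince yourself that the per-sample sum and the matrix forms above really are the same function, a quick numerical check can help. The following is only a minimal NumPy sketch: the sizes `d` and `N`, the value of `lam`, and the random data are arbitrary assumptions chosen for illustration, and `X` stores the samples as columns, as defined above.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, lam = 3, 50, 0.1          # hypothetical sizes and penalty weight
X = rng.normal(size=(d, N))     # columns of X are the samples x_n
y = rng.normal(size=N)
w = rng.normal(size=d)

# Per-sample form: (1/2N) * sum_n (y_n - x_n^T w)^2 + lam * ||w||^2
loss_sum = sum((y[n] - X[:, n] @ w) ** 2 for n in range(N)) / (2 * N) + lam * (w @ w)

# Norm form: (1/2N) * ||y - X^T w||^2 + lam * w^T w
r = y - X.T @ w
loss_norm = (r @ r) / (2 * N) + lam * (w @ w)

# Expanded form: (1/2N) * [y^T y - 2 w^T X y + w^T X X^T w] + lam * w^T w
loss_expanded = (y @ y - 2 * w @ X @ y + w @ X @ X.T @ w) / (2 * N) + lam * (w @ w)

print(np.isclose(loss_sum, loss_norm), np.isclose(loss_sum, loss_expanded))
```

All three values should agree up to floating-point rounding for any choice of `w`.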
Differentiating with respect to $\mathbf{w}$ gives $$ \begin{aligned} & \frac{1}{2N} \left[\; - 2 X \mathbf{y} + 2 X X^{T} \mathbf{w} \;\right] + 2 \lambda \mathbf{w} \\ =& - \frac{1}{N} X \mathbf{y} + \frac{1}{N} X X^{T} \mathbf{w} + 2 \lambda \mathbf{w} \end{aligned} $$
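For reference, this differentiation step relies on two standard matrix-calculus identities. They are stated here for a constant vector $\mathbf{a}$ and a constant symmetric matrix $A$, symbols introduced only for this note:
$$ \nabla_{\mathbf{w}} \left( \mathbf{w}^{T} \mathbf{a} \right) = \mathbf{a}, \qquad \nabla_{\mathbf{w}} \left( \mathbf{w}^{T} A \mathbf{w} \right) = 2 A \mathbf{w} $$
They are applied with $\mathbf{a} = X \mathbf{y}$ and $A = X X^{T}$, and with $A = I$ for the penalty term. The same gradient can also be reached from a per-sample sum, because a sum of the columns of $X$ weighted by the residuals is a matrix-vector product:
$$ \sum_{n=1}^{N} \left( y_n - \mathbf{x_n}^T \mathbf{w} \right) \mathbf{x_n} = X \left( \mathbf{y} - X^{T} \mathbf{w} \right) = X \mathbf{y} - X X^{T} \mathbf{w} $$
This is the step that turns the vector sum into a matrix expression, so the gradient is $-\frac{1}{N} X \left( \mathbf{y} - X^{T} \mathbf{w} \right) + 2 \lambda \mathbf{w}$, the same expression as above.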
Setting this to zero $$ \begin{aligned} - \frac{1}{N} X \mathbf{y} + \frac{1}{N} X X^{T} \mathbf{w} + 2 \lambda \mathbf{w} &= 0 \\ X X^{T} \mathbf{w} + 2 N \lambda \mathbf{w} &= X \mathbf{y} \\ \left( X X^{T} + 2 N \lambda I \right) \mathbf{w} &= X \mathbf{y} \end{aligned} $$
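Since $X X^{T}$ is positive semidefinite, the matrix $X X^{T} + 2 N \lambda I$ is invertible whenever $\lambda > 0$, which finally gives $$ \mathbf{w} = \left( X X^{T} + 2 N \lambda I \right)^{-1} X \mathbf{y} $$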
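This differs from your expression only in how $X$ is laid out. Here the $\mathbf{x_n}$ are the columns of $X$. If they are instead the rows of $X$ (the usual design-matrix convention), replace $X$ by $X^{T}$ throughout and you recover $\mathbf{w} = \left( X^{T} X + 2 N \lambda I \right)^{-1} X^{T} \mathbf{y}$, which is your line 3.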
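As a numerical sanity check of the closed form, here is a minimal NumPy sketch. It solves the linear system with the samples stored as columns, confirms that the gradient vanishes at the solution, and compares the result with the $X^{T}X$ form from the question. It assumes the question stores the samples as rows of $X$. The function names, sizes, and random data are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
d, N, lam = 4, 30, 0.5             # hypothetical sizes and penalty weight
X_cols = rng.normal(size=(d, N))   # this answer's convention: columns are x_n
X_rows = X_cols.T                  # assumed convention of the question: rows are x_n^T
y = rng.normal(size=N)

def ridge_cols(X, y, lam):
    """Solve (X X^T + 2 N lam I) w = X y, with samples as columns of X."""
    d, N = X.shape
    return np.linalg.solve(X @ X.T + 2 * N * lam * np.eye(d), X @ y)

def ridge_rows(X, y, lam):
    """Solve (X^T X + 2 N lam I) w = X^T y, with samples as rows of X."""
    N, d = X.shape
    return np.linalg.solve(X.T @ X + 2 * N * lam * np.eye(d), X.T @ y)

def gradient(X, y, w, lam):
    """Gradient -(1/N) X y + (1/N) X X^T w + 2 lam w, with samples as columns."""
    N = X.shape[1]
    return -(X @ y) / N + (X @ X.T @ w) / N + 2 * lam * w

w_cols = ridge_cols(X_cols, y, lam)
w_rows = ridge_rows(X_rows, y, lam)

print(np.allclose(w_cols, w_rows))                          # both conventions agree
print(np.allclose(gradient(X_cols, y, w_cols, lam), 0.0))   # stationary point
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a common choice for numerical stability; the mathematics is the same.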