I have a problem of the following kind: $J(\mathbf{x}) = \|\mathbf{y} - \mathbf{C}\mathbf{x}^2\|^2 + \lambda \mathbf{x}^T\mathbf{x}$ has to be minimized. So it is like ridge regression, but the least-squares term contains $\mathbf{x}^2$ instead of $\mathbf{x}$ (the penalty is still on $\mathbf{x}$). Can anybody help me derive the solution? I'm having trouble with the matrix derivatives. Note that in the expansion below I write the quadratic term as $\mathbf{x}^T\mathbf{C}\mathbf{x}$; since that is a scalar, $\mathbf{y}$ is a scalar there as well.
I'm getting here:
$J(\mathbf{x}) = (\mathbf{y}-\mathbf{x}^T\mathbf{C}\mathbf{x})^T(\mathbf{y}-\mathbf{x}^T\mathbf{C}\mathbf{x}) + \lambda \mathbf{x}^T\mathbf{x}$
$J(\mathbf{x}) = \mathbf{y}^T\mathbf{y} - \mathbf{x}^T\mathbf{C}^T\mathbf{x}\mathbf{y} - \mathbf{y}^T\mathbf{x}^T\mathbf{C}\mathbf{x} + \mathbf{x}^T\mathbf{C}^T\mathbf{x}\mathbf{x}^T\mathbf{C}\mathbf{x} + \lambda\mathbf{x}^T\mathbf{x}$
$J(\mathbf{x}) = \mathbf{y}^T\mathbf{y} - \mathbf{y}^T\mathbf{x}^T\mathbf{C}\mathbf{x} - \mathbf{y}^T\mathbf{x}^T\mathbf{C}\mathbf{x} + \mathbf{x}^T\mathbf{C}^T\mathbf{x}\mathbf{x}^T\mathbf{C}\mathbf{x} + \lambda\mathbf{x}^T\mathbf{x}$
$J(\mathbf{x}) = \mathbf{y}^T\mathbf{y} - 2\mathbf{y}^T\mathbf{x}^T\mathbf{C}\mathbf{x} + \mathbf{x}^T\mathbf{C}^T\mathbf{x}\mathbf{x}^T\mathbf{C}\mathbf{x} + \lambda\mathbf{x}^T\mathbf{x}$
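As a sanity check, the expansion above can be verified numerically. This is just an illustrative sketch under the reading where $\mathbf{x}^T\mathbf{C}\mathbf{x}$ (and hence $\mathbf{y}$) is a scalar; the names `C`, `lam`, etc. are mine:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
C = rng.normal(size=(n, n))
x = rng.normal(size=n)
y = float(rng.normal())   # scalar, since x^T C x is scalar
lam = 0.1

q = x @ C @ x             # the quadratic form x^T C x

J_original = (y - q) ** 2 + lam * x @ x
J_expanded = y * y - 2 * y * q + q * q + lam * x @ x

print(np.isclose(J_original, J_expanded))
```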
Minimizing means setting the gradient to zero: $\frac{\partial J(\mathbf{x})}{\partial \mathbf{x}} = \mathbf{0}$
And that is where I'm stuck... Anyone familiar with matrix derivatives who can help me?
edit:
Using $\frac{\partial}{\partial\mathbf{x}}\left(\mathbf{x}^T\mathbf{C}\mathbf{x}\right) = \mathbf{x}^T(\mathbf{C}+\mathbf{C}^T)$ (row-vector convention; since $\mathbf{x}^T\mathbf{C}\mathbf{x}$ is a scalar, I write the scalar $\mathbf{y}$ as $y$), differentiating each term gives:
$\frac{\partial J(\mathbf{x})}{\partial \mathbf{x}} = -2y\,\mathbf{x}^T(\mathbf{C}+\mathbf{C}^T) + 2(\mathbf{x}^T\mathbf{C}\mathbf{x})\,\mathbf{x}^T(\mathbf{C}+\mathbf{C}^T) + 2\lambda\mathbf{x}^T = \mathbf{0}^T$
Dividing by $2$ and transposing:
$(\mathbf{x}^T\mathbf{C}\mathbf{x} - y)(\mathbf{C}+\mathbf{C}^T)\mathbf{x} + \lambda\mathbf{x} = \mathbf{0}$
which rearranges to
$(\mathbf{C}+\mathbf{C}^T)\mathbf{x} = \frac{\lambda}{y - \mathbf{x}^T\mathbf{C}\mathbf{x}}\,\mathbf{x}$
so $\mathbf{x}$ has to be an eigenvector of $\mathbf{C}+\mathbf{C}^T$, with an eigenvalue that itself depends on $\mathbf{x}$.
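A finite-difference comparison is a good way to validate an analytic gradient before trusting it. A sketch under the same scalar-$y$ assumption, using the gradient $2(\mathbf{x}^T\mathbf{C}\mathbf{x}-y)(\mathbf{C}+\mathbf{C}^T)\mathbf{x}+2\lambda\mathbf{x}$ in column form (all variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
C = rng.normal(size=(n, n))
x = rng.normal(size=n)
y = float(rng.normal())
lam = 0.1

def J(x):
    q = x @ C @ x
    return (y - q) ** 2 + lam * x @ x

# analytic gradient (column form): 2*(x^T C x - y)*(C + C^T) x + 2*lambda*x
q = x @ C @ x
grad = 2 * (q - y) * (C + C.T) @ x + 2 * lam * x

# central finite differences along each coordinate direction
eps = 1e-6
fd = np.array([(J(x + eps * e) - J(x - eps * e)) / (2 * eps)
               for e in np.eye(n)])

print(np.allclose(grad, fd, atol=1e-4))
```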
Aaaand... I'm stuck again: the stationarity condition is nonlinear in $\mathbf{x}$, so I can't just solve a linear system the way I would for ordinary ridge regression. And I'm not even sure my derivation is correct...
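For what it's worth, since the stationarity condition is nonlinear in $\mathbf{x}$, one practical fallback is to minimize $J$ numerically. A minimal gradient-descent sketch with a backtracking (Armijo) line search, under the same scalar-$y$ assumption (all names are mine; this finds a local minimum only, as the objective is quartic and nonconvex):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
C = rng.normal(size=(n, n))
y = float(rng.normal())
lam = 0.1

def J(x):
    q = x @ C @ x                      # the scalar x^T C x
    return (y - q) ** 2 + lam * x @ x

def grad(x):
    q = x @ C @ x
    return 2 * (q - y) * (C + C.T) @ x + 2 * lam * x

x = rng.normal(size=n)
x0 = x.copy()
for _ in range(5000):
    g = grad(x)
    t = 1.0
    # backtracking: shrink the step until J decreases enough (Armijo rule)
    while J(x - t * g) > J(x) - 0.5 * t * (g @ g):
        t *= 0.5
        if t < 1e-16:
            break
    x = x - t * g

print("J went from", J(x0), "to", J(x),
      "with gradient norm", np.linalg.norm(grad(x)))
```

The line search guarantees $J$ decreases at every step, so the iterates settle at a stationary point; the resulting $\mathbf{x}$ can then be checked against the eigenvector condition.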