Derivative of Least Squares Matrix Term


We have $\frac{\partial}{\partial \theta} (y - B\theta)^T(y-B\theta) = -2B^T(y-B\theta)$, where $y,\theta$ are vectors and $B$ is a matrix. How is this derived?


Hint:

Write out $(\mathbf{y}-B\boldsymbol{\theta})^T(\mathbf{y}-B\boldsymbol{\theta})=(\mathbf{y}^T-(B\boldsymbol{\theta})^T)(\mathbf{y}-B\boldsymbol{\theta}) = (\mathbf{y}^T-\boldsymbol{\theta}^TB^T)(\mathbf{y}-B\boldsymbol{\theta})$, where I've used the fact that transpose is linear and $(AB)^T=B^TA^T$.

Then, use the distributive law: $(\mathbf{y}^T-\boldsymbol{\theta}^TB^T)(\mathbf{y}-B\boldsymbol{\theta})=\mathbf{y}^T\mathbf{y}-\mathbf{y}^TB\boldsymbol{\theta}-\boldsymbol{\theta}^TB^T\mathbf{y}+\boldsymbol{\theta}^TB^TB\boldsymbol{\theta}$. Note the plus sign on the last term, which is the product of two negative terms.

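To calculate the derivative of the above expression w.r.t. $\boldsymbol{\theta}$, use the following identities, with the derivative of a scalar written as a column vector (the convention used by the result in the question):

$$\frac{\partial(\mathbf{b}^TA\mathbf{x})}{\partial \mathbf{x}}= A^T\mathbf{b} $$

$$ \frac{\partial (\mathbf{x}^TA^T\mathbf{b})}{\partial \mathbf{x}}=A^T\mathbf{b} $$

The two agree because $\mathbf{b}^TA\mathbf{x}$ is a scalar and therefore equals its own transpose, $\mathbf{x}^TA^T\mathbf{b}$.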

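Also, you'll have to make use of the product rule for the last term. For vector-valued $\mathbf{u}(\mathbf{x})$ and $\mathbf{v}(\mathbf{x})$, with $\frac{\partial \mathbf{u}}{\partial \mathbf{x}}$ denoting the Jacobian matrix, it reads as follows (apply it with $\mathbf{u}=\mathbf{v}=B\boldsymbol{\theta}$):

$$\frac{\partial(\mathbf{u}(\mathbf{x})^T\mathbf{v}(\mathbf{x}))}{\partial \mathbf{x}} = \left(\frac{\partial \mathbf{u}(\mathbf{x})}{\partial\mathbf{x}}\right)^T\mathbf{v}(\mathbf{x})+\left(\frac{\partial \mathbf{v}(\mathbf{x})}{\partial \mathbf{x}}\right)^T\mathbf{u}(\mathbf{x}) $$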
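
As a sketch of where this leads for the last term, with the derivative written as a column vector: each of the two occurrences of $\boldsymbol{\theta}$ is differentiated in turn while the other is held fixed, and the two contributions coincide because $B^TB$ is symmetric:

$$\frac{\partial(\boldsymbol{\theta}^TB^TB\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} = B^TB\boldsymbol{\theta}+(B^TB)^T\boldsymbol{\theta} = 2B^TB\boldsymbol{\theta} $$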

Of course, the differentiation here is w.r.t. $\boldsymbol{\theta}$, which plays the role of $\mathbf{x}$ in the identities above. Can you take it from here?
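
If you want to check your final answer numerically, here is a minimal sketch in plain Python that compares the closed form $-2B^T(\mathbf{y}-B\boldsymbol{\theta})$ from the question against a central finite-difference approximation. The values of `B`, `y` and `theta`, the step size `h`, and all function names are made up for illustration; nothing here comes from the original question.

```python
# Finite-difference check of
#   d/dtheta (y - B theta)^T (y - B theta) = -2 B^T (y - B theta).
# B, y, theta are arbitrary example values (assumptions, not from the question).

def residual(B, y, theta):
    """Return the vector y - B theta as a list."""
    return [y_i - sum(b_ij * t_j for b_ij, t_j in zip(row, theta))
            for row, y_i in zip(B, y)]

def loss(B, y, theta):
    """Squared norm (y - B theta)^T (y - B theta)."""
    return sum(r * r for r in residual(B, y, theta))

def analytic_grad(B, y, theta):
    """Closed-form gradient -2 B^T (y - B theta), as a list."""
    r = residual(B, y, theta)
    return [-2.0 * sum(B[i][j] * r[i] for i in range(len(B)))
            for j in range(len(theta))]

def numeric_grad(B, y, theta, h=1e-6):
    """Central-difference approximation of the gradient, one coordinate at a time."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        up[j] += h
        down = list(theta)
        down[j] -= h
        grad.append((loss(B, y, up) - loss(B, y, down)) / (2 * h))
    return grad

B = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]  # 3x2 example matrix
y = [1.0, 2.0, 3.0]
theta = [0.5, -1.0]

g_exact = analytic_grad(B, y, theta)
g_approx = numeric_grad(B, y, theta)
max_err = max(abs(a - b) for a, b in zip(g_exact, g_approx))
print("analytic:", g_exact)
print("numeric: ", g_approx)
print("max abs difference:", max_err)
```

Since the expression is quadratic in $\boldsymbol{\theta}$, the central difference has no truncation error, so the two gradients should agree up to floating-point rounding.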