Gradient of $ \left( \mathbf{w}^T \mathbf{x} \right)^2 - 2\mathbf{w}^T \mathbf{x}$ w.r.t $\mathbf{w}$


That's it. So far I've tried the following, but I'm not certain whether I'm allowed to do it. \begin{align} \frac{d\left( \left( \mathbf{w}^T \mathbf{x} \right)^2 - 2\mathbf{w}^T \mathbf{x} \right)}{d\mathbf{w}} &= \frac{d\left( \left( \mathbf{w}^T \mathbf{x} \right)^2 - 2\mathbf{w}^T \mathbf{x} \right)}{d(\mathbf{w}^T\mathbf{x})}\cdot \frac{d(\mathbf{w}^T\mathbf{x})}{d\mathbf{w}} \\ &= \left(2\mathbf{w}^T\mathbf{x} - 2\right) \cdot \mathbf{x} \quad \text{(assuming denominator layout)} \end{align} My question is: can I use the chain rule to go from a derivative w.r.t. a vector to a derivative w.r.t. a scalar quantity (i.e. $\mathbf{w}^T \mathbf{x}$)?

Best answer

Your answer checks out. The rigorous way is to do it componentwise: since $\frac{\partial (w^Tx)}{\partial w_i} = x_i$, the ordinary scalar chain rule gives \begin{align*} \frac{\partial\left((w^Tx)^2-2w^Tx\right)}{\partial w_i} = (2w^Tx-2)x_i \end{align*} and thus the gradient is \begin{align*} \nabla_w\left((w^Tx)^2-2w^Tx\right) = (2w^Tx-2)x \end{align*} as you stated correctly.
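If you want a quick sanity check on top of the componentwise argument, you can compare the closed-form gradient $(2w^Tx-2)x$ with a finite-difference estimate. The following is only a minimal sketch using NumPy; the function names `f`, `grad_f`, and `numerical_grad`, the dimension 4, and the random test vectors are illustrative assumptions, not part of the question.

```python
import numpy as np

def f(w, x):
    # f(w) = (w^T x)^2 - 2 w^T x
    s = w @ x
    return s**2 - 2 * s

def grad_f(w, x):
    # Closed-form gradient from the answer: (2 w^T x - 2) x
    return (2 * (w @ x) - 2) * x

def numerical_grad(w, x, eps=1e-6):
    # Central differences, one component w_i at a time
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e, x) - f(w - e, x)) / (2 * eps)
    return g

# Arbitrary test vectors (assumed for illustration only)
rng = np.random.default_rng(0)
w = rng.normal(size=4)
x = rng.normal(size=4)

analytic = grad_f(w, x)
numeric = numerical_grad(w, x)
print(np.allclose(analytic, numeric, atol=1e-6))
```

The loop perturbs one $w_i$ at a time, which mirrors the componentwise derivation: each entry of the numerical gradient approximates $\partial f / \partial w_i$.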