Gradient of $x \mapsto (w^tx+b)^2$


I have a non-zero column vector $w \in \mathbb{R}^n$ and a scalar $b \in \mathbb{R}$, which define a function $f: \mathbb{R}^n \to \mathbb{R}$ by $f(x) = (w^tx + b)^2$, where $x \in \mathbb{R}^n$ is a column vector. I want to calculate the gradient $\nabla_x f(x)$ of $f$ at $x$. This is what I have tried:

$$ f(x) = (w^tx+b)^2 = (w^tx)^2 + 2(w^txb) + b^2$$ $$ \nabla_x f(x) = 2(w^tx) + 2(w^tb)$$ $$ \nabla_x f(x) = 2w^t (x + b) $$

I'm not sure this is correct: since $w$ is a column vector, I think a factor like $w^tw$ could be missing. Could you please point out what I am doing wrong?


There are 2 best solutions below

Best answer

Here is my attempt, using differentials:

\begin{align} f(x+dx) &= (w^t(x+dx)+b)^2 \\ &= (w^tx)^2 + (w^tdx)^2 + b^2 + 2(w^tx)(w^tdx) + 2b\,w^tx + 2b\,w^tdx \\ &= (w^tx + b)^2 + 2(w^tx)\,w^tdx + 2b\,w^tdx + \mathcal{O}(dx^t dx) \\ &\simeq f(x) + \nabla_xf(x)^t dx \end{align}

So the coefficient of the increment $dx$ in the linear term is your gradient (transposed). In this case:

\begin{align} \nabla_xf(x)= 2(x^t w + b)w \end{align}
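A quick numerical sanity check of this formula is possible with central finite differences. The sketch below is only illustrative: the helper names (`dot`, `grad_f`, `numerical_grad`) and the sample values of `w`, `x` and `b` are assumptions chosen for the example, not part of the question.

```python
def dot(u, v):
    """Inner product u^t v of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def f(x, w, b):
    """f(x) = (w^t x + b)^2."""
    return (dot(w, x) + b) ** 2

def grad_f(x, w, b):
    """Analytic gradient from the answer: 2 (w^t x + b) w."""
    scale = 2.0 * (dot(w, x) + b)
    return [scale * wi for wi in w]

def numerical_grad(x, w, b, eps=1e-6):
    """Central-difference approximation of the gradient, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((f(x_plus, w, b) - f(x_minus, w, b)) / (2 * eps))
    return grad

# Hypothetical sample values, chosen only for illustration.
w = [1.0, -2.0, 0.5]
x = [0.3, 0.1, -1.2]
b = 0.7

analytic = grad_f(x, w, b)
numeric = numerical_grad(x, w, b)
max_abs_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print(analytic, numeric, max_abs_diff)
```

With these sample values $w^tx + b = 0.2$, so the analytic gradient is approximately $(0.4, -0.8, 0.2)$, and the finite-difference estimate should agree with it up to rounding error.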


As @VanBaffo answered this question, this is my attempt to validate the answer provided:

We can apply the chain rule, which states that $h'(x)=f'(g(x))\,g'(x)$. For $f(x) = (w^tx + b)^2$, set $u = w^tx + b$, so that $f = u^2$ and $$\frac{df}{du} = 2u = 2(w^tx + b).$$ The gradient of $u$ with respect to $x$ is the column vector $w$. So the result is $$\nabla_x f(x) = 2(w^tx + b)\,w$$
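As a cross-check, the same result should follow from differentiating the expanded form $(w^tx)^2 + 2b\,(w^tx) + b^2$ from the question term by term, using $\partial (w^tx)/\partial x_i = w_i$, i.e. $\nabla_x (w^tx) = w$:

\begin{align} \nabla_x (w^tx)^2 &= 2(w^tx)\,w \\ \nabla_x \big(2b\,(w^tx)\big) &= 2b\,w \\ \nabla_x\, b^2 &= 0 \end{align}

Summing the three lines gives $2(w^tx + b)\,w$ again. Compared with the attempt in the question, the first term there, $2(w^tx)$, is a scalar because the factor $w$ from the inner derivative is missing, and the second term should be the column vector $2b\,w$ rather than $2(w^tb)$.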