Why $ \nabla \|g(b)\|_2=\frac{\langle g(b),\nabla g(b)\rangle}{\|g(b)\|_2} $ for any differentiable g?


I'm trying to understand this answer to a question I asked before. In that question I wrote the following:

The gradient of $f = \lVert a \times b \rVert_2$ with respect to $b$ is apparently equivalent to $$\frac{(a \times b) \times a}{\lVert a \times b \rVert_2}$$

This is precisely my problem: I don't understand why that is the case. User LutzL said that I had made the observation:

$$ \nabla \|g(b)\|_2=\frac{\langle g(b),\nabla g(b)\rangle}{\|g(b)\|_2} $$

which is actually different from my observation. He tries to show that the two are equivalent, but I can't follow his explanation.

Can someone explain to me why the two are equivalent? Why does the equality $ \nabla \|g(b)\|_2=\frac{\langle g(b),\nabla g(b)\rangle}{\|g(b)\|_2} $ hold?

Notation: $g: \mathbb{R}^3 \to \mathbb{R}^3$, and $\nabla g$ denotes the (total) derivative of $g$, i.e. a matrix (sometimes denoted $Dg$ in other sources). The brackets $\langle \cdot , \cdot \rangle$ refer to matrix multiplication, not the inner product; in other notation, $\langle g(b), \nabla g(b) \rangle := [g(b)]^T Dg(b)$.
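To make the notation concrete, here is a minimal NumPy sketch of the product $[g(b)]^T Dg(b)$. The numbers are arbitrary placeholders (not taken from any particular $g$); the point is only that a length-3 row vector times a $3 \times 3$ matrix gives a length-3 vector, which is the shape a gradient should have.

```python
import numpy as np

# Placeholder values standing in for g(b) and Dg(b) at some point b.
g_b = np.array([1.0, 2.0, 3.0])
Dg_b = np.array([
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 0.0],
    [3.0, 0.0, 1.0],
])

# <g(b), Dg(b)> in the question's notation: row vector times matrix.
row_times_matrix = g_b @ Dg_b

# The same numbers, written as a column vector: Dg(b)^T g(b).
column_form = Dg_b.T @ g_b

print(row_times_matrix.shape, np.allclose(row_times_matrix, column_form))
```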


There are 3 best solutions below

Best answer

Set $g(b) = (g_1(b),g_2(b),g_3(b))$ and $b=(b_1,b_2,b_3)$. Then, wherever $g(b) \neq 0$, the first component of $\nabla \|g(b)\|_2$ is

$$\frac{\partial}{\partial b_1} \sqrt{g_1(b)^2+g_2(b)^2+g_3(b)^2} = \frac{g_1(b) \dfrac{\partial g_1(b)}{\partial b_1}+g_2(b) \dfrac{\partial g_2(b)}{\partial b_1}+g_3(b) \dfrac{\partial g_3(b)}{\partial b_1}}{\sqrt{g_1(b)^2+g_2(b)^2+g_3(b)^2}} $$ $$= \frac{\langle g(b), (\nabla g(b))_1\rangle}{\|g(b)\|_2}. $$

Similarly you obtain the other components, and the formula

$$\nabla \|g(b)\|_2 = \frac{\langle g(b), \nabla g(b)\rangle}{\|g(b)\|_2} $$ is proven. For $g(b) = a \times b$ you have

$$g(b) = (a_2b_3-a_3b_2,a_3b_1-a_1b_3,a_1b_2-a_2b_1) $$

so that

$$g_1(b) \dfrac{\partial g_1(b)}{\partial b_1}+g_2(b) \dfrac{\partial g_2(b)}{\partial b_1}+g_3(b) \dfrac{\partial g_3(b)}{\partial b_1} = (a_3b_1-a_1b_3)a_3+(a_1b_2-a_2b_1)(-a_2) $$

$$ = b_1(a_1^2+a_2^2+a_3^2)-a_1(a_1b_1+a_2b_2+a_3b_3) = b_1 (a \cdot a)-a_1 (a \cdot b)$$ $$=(b(a\cdot a)-a(a\cdot b))_1 = ((a\times b) \times a)_1. $$

The other two components follow in the same way.
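As a sanity check on the component computation above, the following sketch compares the closed form $\frac{(a \times b) \times a}{\|a \times b\|_2}$ with a central finite-difference gradient. The test vectors and the helper names are my own choices for illustration; the check only makes sense when $a \times b \neq 0$.

```python
import numpy as np

def norm_cross(a, b):
    """f(b) = ||a x b||_2 for a fixed vector a."""
    return np.linalg.norm(np.cross(a, b))

def closed_form_grad(a, b):
    """Claimed gradient ((a x b) x a) / ||a x b||_2; requires a x b != 0."""
    c = np.cross(a, b)
    return np.cross(c, a) / np.linalg.norm(c)

def central_diff_grad(f, b, h=1e-6):
    """Central finite-difference approximation of the gradient of f at b."""
    grad = np.zeros_like(b)
    for i in range(b.size):
        step = np.zeros_like(b)
        step[i] = h
        grad[i] = (f(b + step) - f(b - step)) / (2 * h)
    return grad

# Arbitrary, non-parallel test vectors (assumed values, not from the post).
a = np.array([1.0, 2.0, 3.0])
b = np.array([-2.0, 0.5, 1.0])

numeric = central_diff_grad(lambda v: norm_cross(a, v), b)
exact = closed_form_grad(a, b)
print(np.allclose(numeric, exact, atol=1e-6))
```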

Answer

The gradient of $\| g(b) \|^2 = \left< g(b), g(b) \right>$ can be computed by the product rule as

$$ \nabla \left( \| g(b) \|^2 \right) = \left< \nabla g(b), g(b) \right> + \left< g(b), \nabla g(b) \right> = 2 \left< g(b), \nabla g (b) \right>. $$

The gradient of $\| g(b) \| = \sqrt{ \left< g(b), g(b) \right>}$ can then be computed using the chain rule as

$$ \nabla \| g(b) \| = \frac{2 \left< g(b), \nabla g(b) \right>}{2 \sqrt{\left<g(b), g(b) \right>}} = \frac{ \left< g(b), \nabla g(b) \right>}{\| g(b) \|}. $$

Now apply this formula to $g(b) = a \times b$, noting that this is a linear map.
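To spell out that last step: since $g(b) = a \times b$ is linear in $b$, its derivative is the constant matrix of the map, written here as $[a]_\times$ (a symbol that does not appear in the original post). A sketch of the computation, reading the row vector $\langle g(b), \nabla g(b) \rangle$ as a column:

$$ \nabla g(b) = [a]_\times = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}, \qquad [a]_\times^T = -[a]_\times, $$

$$ \left< g(b), \nabla g(b) \right>^T = [a]_\times^T (a \times b) = -a \times (a \times b) = (a \times b) \times a, $$

so that, wherever $a \times b \neq 0$,

$$ \nabla \| a \times b \| = \frac{(a \times b) \times a}{\| a \times b \|}. $$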

Answer

This is overly-detailed spoon-feeding, and is honestly as much for my own benefit as that of the OP.

I will write $$g(\cdot)= (g_1(\cdot), g_2(\cdot), g_3(\cdot))^T. $$

Then using the chain rule and some grit and determination $$||g(b) ||_2 := \sqrt{g_1(b)^2 + g_2(b)^2 +g_3(b)^2 } \\ \implies \nabla ||g(b)||_2 = \left( \frac{2g_1(b) \frac{\partial g_1}{\partial x_1}(b) +2g_2(b)\frac{\partial g_2}{\partial x_1}(b) +2g_3(b)\frac{\partial g_3}{\partial x_1}(b)}{2\sqrt{g_1(b)^2 + g_2(b)^2 +g_3(b)^2 }}, \frac{2g_1(b)\frac{\partial g_1}{\partial x_2}(b)+2g_2(b)\frac{\partial g_2}{\partial x_2}(b) + 2g_3(b)\frac{\partial g_3}{\partial x_2}(b)}{2\sqrt{g_1(b)^2 + g_2(b)^2 +g_3(b)^2 }}, \frac{2g_1(b)\frac{\partial g_1}{\partial x_3}(b)+2g_2(b)\frac{\partial g_2}{\partial x_3}(b)+2g_3(b)\frac{\partial g_3}{\partial x_3}(b)}{2\sqrt{g_1(b)^2 + g_2(b)^2 +g_3(b)^2 }}\right) \\ = \frac{1}{\sqrt{g_1(b)^2+g_2(b)^2+g_3(b)^2}}\left(g_1(b)\frac{\partial g_1}{\partial x_1}(b)+g_2(b)\frac{\partial g_2}{\partial x_1}(b) + g_3(b)\frac{\partial g_3}{\partial x_1}(b), \quad g_1(b)\frac{\partial g_1}{\partial x_2}(b)+g_2(b)\frac{\partial g_2}{\partial x_2}(b)+g_3(b)\frac{\partial g_3}{\partial x_2}(b), \quad g_1(b)\frac{\partial g_1}{\partial x_3}(b) + g_2(b)\frac{\partial g_2}{\partial x_3}(b)+g_3(b)\frac{\partial g_3}{\partial x_3}(b)\right)$$

On the other hand, we have that $$\nabla g(b) = \begin{bmatrix} \frac{\partial g_1}{\partial x_1}(b) & \frac{\partial g_1}{\partial x_2}(b) & \frac{\partial g_1}{\partial x_3}(b) \\ \frac{\partial g_2}{\partial x_1}(b) & \frac{\partial g_2}{\partial x_2}(b) & \frac{\partial g_2}{\partial x_3}(b) \\ \frac{\partial g_3}{\partial x_1}(b) & \frac{\partial g_3}{\partial x_2}(b) & \frac{\partial g_3}{\partial x_3}(b) \end{bmatrix} $$

Thus, we get upon matrix-multiplication: $$\langle g(b) , \nabla g(b) \rangle = \begin{bmatrix} g_1(b) & g_2(b) & g_3(b) \end{bmatrix} \begin{bmatrix} \frac{\partial g_1}{\partial x_1}(b) & \frac{\partial g_1}{\partial x_2}(b) & \frac{\partial g_1}{\partial x_3}(b) \\ \frac{\partial g_2}{\partial x_1}(b) & \frac{\partial g_2}{\partial x_2}(b) & \frac{\partial g_2}{\partial x_3}(b) \\ \frac{\partial g_3}{\partial x_1}(b) & \frac{\partial g_3}{\partial x_2}(b) & \frac{\partial g_3}{\partial x_3}(b) \end{bmatrix} = \left(g_1(b)\frac{\partial g_1}{\partial x_1}(b)+g_2(b)\frac{\partial g_2}{\partial x_1}(b) + g_3(b)\frac{\partial g_3}{\partial x_1}(b), \quad g_1(b)\frac{\partial g_1}{\partial x_2}(b)+g_2(b)\frac{\partial g_2}{\partial x_2}(b)+g_3(b)\frac{\partial g_3}{\partial x_2}(b), \quad g_1(b)\frac{\partial g_1}{\partial x_3}(b) + g_2(b)\frac{\partial g_2}{\partial x_3}(b)+g_3(b)\frac{\partial g_3}{\partial x_3}(b)\right)$$

The claimed equality then follows from the fact that $$\frac{1}{||g(b)||_2} = \frac{1}{\sqrt{g_1(b)^2+g_2(b)^2+g_3(b)^2}}, $$ so that $$ \frac{\langle g(b), \nabla g(b) \rangle}{||g(b)||_2} \\ = \frac{1}{\sqrt{g_1(b)^2+g_2(b)^2+g_3(b)^2}} \left(g_1(b)\frac{\partial g_1}{\partial x_1}(b)+g_2(b)\frac{\partial g_2}{\partial x_1}(b) + g_3(b)\frac{\partial g_3}{\partial x_1}(b), \quad g_1(b)\frac{\partial g_1}{\partial x_2}(b)+g_2(b)\frac{\partial g_2}{\partial x_2}(b)+g_3(b)\frac{\partial g_3}{\partial x_2}(b), \quad g_1(b)\frac{\partial g_1}{\partial x_3}(b) + g_2(b)\frac{\partial g_2}{\partial x_3}(b)+g_3(b)\frac{\partial g_3}{\partial x_3}(b)\right) \\ = \nabla || g(b)||_2$$
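For anyone who prefers a numerical check over the componentwise algebra, here is a small sketch that tests the formula on a nonlinear map. The particular $g$ below and the evaluation point are hypothetical, chosen only so that $g(b) \neq 0$; the Jacobian is written out by hand with row $i$ holding the partials of $g_i$, matching the matrix above.

```python
import numpy as np

def g(b):
    """Hypothetical smooth map R^3 -> R^3, used only for illustration."""
    x, y, z = b
    return np.array([x * y, np.sin(z) + x, y ** 2 - z])

def jacobian_g(b):
    """Analytic Jacobian Dg(b); row i holds the partial derivatives of g_i."""
    x, y, z = b
    return np.array([
        [y,   x,     0.0],
        [1.0, 0.0,   np.cos(z)],
        [0.0, 2 * y, -1.0],
    ])

def formula_grad(b):
    """g(b)^T Dg(b) / ||g(b)||_2, valid only where g(b) != 0."""
    g_b = g(b)
    return g_b @ jacobian_g(b) / np.linalg.norm(g_b)

def central_diff_grad(f, b, h=1e-6):
    """Central finite-difference approximation of the gradient of f at b."""
    grad = np.zeros_like(b)
    for i in range(b.size):
        step = np.zeros_like(b)
        step[i] = h
        grad[i] = (f(b + step) - f(b - step)) / (2 * h)
    return grad

# Arbitrary evaluation point (an assumption; any point with g(b) != 0 works).
b0 = np.array([0.7, -1.3, 0.4])

numeric = central_diff_grad(lambda v: np.linalg.norm(g(v)), b0)
print(np.allclose(numeric, formula_grad(b0), atol=1e-6))
```

The finite-difference gradient is only an approximation, so the comparison uses a tolerance rather than exact equality.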