Proof of $\nabla\times (\nabla\times \mathbf f)=\nabla(\nabla\cdot \mathbf f)-\nabla^2\mathbf f$

There are several very important vector identities involving $\nabla$ that I struggle to understand. To give an example, in the derivation of the wave equation from Maxwell's equations, the following identity is used: $$ \nabla\times (\nabla\times \mathbf f)=\nabla(\nabla\cdot \mathbf f)-\nabla^2\mathbf f. $$ I can prove it by direct calculation, but that would be very boring and mechanical. To get a more intuitive proof, I tried this: we have $$ \nabla\times (\mathbf a\times \mathbf b)=\mathbf a(\nabla\cdot\mathbf b)-\mathbf b(\nabla\cdot\mathbf a)+(\mathbf b\cdot\nabla)\mathbf a-(\mathbf a\cdot\nabla)\mathbf b. $$ Plugging in $\mathbf a=\nabla$ and $\mathbf b=\mathbf f$, we get $$ \nabla\times (\nabla\times \mathbf f)=\nabla(\nabla\cdot \mathbf f)-\mathbf f (\nabla\cdot \nabla)+(\mathbf f\cdot \nabla)\nabla-(\nabla\cdot\nabla) \mathbf f. $$ There are four terms on the right-hand side, and there is something strange going on: the first and fourth terms are vectors, whilst the second and third terms are linear operators. Such blatant inhomogeneity shocks me.

It turns out that the 2nd and 3rd term cancel out, so we finally "prove" the result. But I still cannot see what's going on - if those two terms did not cancel out, then the entire expression would be meaningless.

What's really going on? Is this the right thing to do?

There are 2 best solutions below

You mustn't think of the del operator as an ordinary vector. In particular, the proof of the identity you tried to use requires $a,\,b$ to have components that commute with anything, just as the usual result for $c\times(a\times b)$ makes assumptions that prevent us from using $c=\nabla$ to recover the very identity you've tried to use.

The best approach with identities such as these is to calculate the $i$th component as $\epsilon_{ijk}\partial_j\epsilon_{klm}\partial_lf_m$, with implicit summation over repeated indices. Since $\epsilon_{ijk}\epsilon_{klm}=\delta_{il}\delta_{jm}-\delta_{im}\delta_{jl}$, this becomes$$(\delta_{il}\delta_{jm}-\delta_{im}\delta_{jl})\partial_j\partial_lf_m=\partial_i\partial_jf_j-\partial_j\partial_jf_i=(\nabla(\nabla\cdot f)-\nabla^2f)_i,$$as expected.
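As a sanity check on the index calculation above, here is a minimal Python sketch. It is not part of the original answer, and all helper names (`levi_civita`, `add`, `diff`, `curl`, and so on) are made up for illustration. It checks the $\epsilon$-$\delta$ contraction for every index choice, and then checks the curl-of-curl identity on polynomial fields, stored as dictionaries from exponent triples to integer coefficients. Because the identity is linear in $f$, testing every monomial in every component covers all polynomial fields up to the chosen power, but it is a check, not a proof for general $f$.

```python
from itertools import product

def levi_civita(i, j, k):
    """epsilon_ijk for indices in {0, 1, 2}: +1, -1, or 0 if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    return 1 if i == j else 0

def epsilon_delta_holds():
    """Check eps_ijk eps_klm = d_il d_jm - d_im d_jl for all 81 index choices."""
    for i, j, l, m in product(range(3), repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(k, l, m) for k in range(3))
        if lhs != delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l):
            return False
    return True

# A polynomial in (x, y, z) is a dict {(ex, ey, ez): integer coefficient}.
def add(p, q, sign=1):
    """Return p + sign * q, dropping zero coefficients."""
    out = dict(p)
    for exps, c in q.items():
        out[exps] = out.get(exps, 0) + sign * c
    return {exps: c for exps, c in out.items() if c != 0}

def diff(p, axis):
    """Partial derivative of p with respect to coordinate number `axis`."""
    out = {}
    for exps, c in p.items():
        if exps[axis] > 0:
            lowered = exps[:axis] + (exps[axis] - 1,) + exps[axis + 1:]
            out[lowered] = c * exps[axis]
    return out

def curl(v):
    """(curl v)_i = eps_ijk d_j v_k."""
    comps = []
    for i in range(3):
        comp = {}
        for j, k in product(range(3), repeat=2):
            comp = add(comp, diff(v[k], j), levi_civita(i, j, k))
        comps.append(comp)
    return comps

def div(v):
    total = {}
    for j in range(3):
        total = add(total, diff(v[j], j))
    return total

def grad(s):
    return [diff(s, i) for i in range(3)]

def laplacian(s):
    total = {}
    for j in range(3):
        total = add(total, diff(diff(s, j), j))
    return total

def curl_curl_identity_holds(max_power=3):
    """Test curl(curl f) = grad(div f) - laplacian(f) on every monomial field."""
    for m in range(3):
        for exps in product(range(max_power + 1), repeat=3):
            f = [{}, {}, {}]
            f[m] = {exps: 1}  # f has a single monomial in component m
            lhs = curl(curl(f))
            g = grad(div(f))
            rhs = [add(g[i], laplacian(f[i]), -1) for i in range(3)]
            if lhs != rhs:
                return False
    return True
```

Calling `epsilon_delta_holds()` and `curl_curl_identity_holds()` should both return `True`. The second relies on the fact that `diff` along different axes commutes, which is exactly the property the index proof uses when it treats $\partial_j\partial_l$ as symmetric.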

Contrary to the other answer, you can treat $\nabla$ like a vector; what you have to do, however, is keep track of what you are differentiating. This can be achieved in a compact way by using diacritics; for example, using the basic identity $$ a\times(b\times c) = (a\cdot c)b - (a\cdot b)c \tag{$*$} $$ we can prove your first identity (the curl of a cross product) using a generalized product rule: $$\begin{aligned} \nabla\times(a\times b) &= \dot\nabla\times(\dot a\times b) + \dot\nabla\times(a\times\dot b) \\ &= (\dot\nabla\cdot b)\dot a - (\dot\nabla\cdot\dot a)b + (\dot\nabla\cdot\dot b)a - (\dot\nabla\cdot a)\dot b \\ &= (b\cdot\nabla)a - (\nabla\cdot a)b + (\nabla\cdot b)a - (a\cdot\nabla)b. \end{aligned}$$ In the first and second lines, the overdots in each term specify what $\nabla$ is differentiating. In the second line we have applied ($*$) to each term, and in the last line we have returned to the "traditional" notation where $\nabla$ differentiates what is immediately to its right.

What goes wrong in your derivation is that you're trying to "differentiate a derivative"; specifically, using the second line above with $a=\nabla$ and $b=f$, you're saying that $$ \nabla\times(\nabla\times f) = (\dot\nabla\cdot f)\dot\nabla - (\dot\nabla\cdot\dot\nabla)f + (\dot\nabla\cdot\dot f)\nabla - (\dot\nabla\cdot\nabla)\dot f. $$ A term like $\dot\nabla\cdot\dot\nabla$ is "the divergence of the gradient operator itself"; this is nonsensical. Similarly, $(\dot\nabla\cdot f)\dot\nabla$ is nonsensical. (The term $(\dot\nabla\cdot\dot f)\nabla$ does make sense in and of itself, but it is a differential operator rather than a vector.) What we can do is simply return to ($*$) and see that $$ \nabla\times(\nabla\times f) = \hat\nabla\times(\dot\nabla\times\hat{\dot f}) = (\hat\nabla\cdot\hat{\dot f})\dot\nabla - (\hat\nabla\cdot\dot\nabla)\hat{\dot f} = \nabla(\nabla\cdot f) - \nabla^2 f. $$ Let's break this down a bit.

  • The term $\hat\nabla\times(\dot\nabla\times\hat{\dot f})$ is saying "differentiate $f$ with $\dot\nabla$ first, giving $\nabla\times f$; then differentiate that with $\hat\nabla$", giving in traditional notation $\nabla\times(\nabla\times f)$.
  • Then we apply ($*$). Notice how we don't get a product rule split like previously since there is only one thing to differentiate.
  • The term $(\hat\nabla\cdot\hat{\dot f})\dot\nabla$ is something that cannot be expressed in traditional notation; expressions like this are likely where the idea that "$\nabla$ can't be treated like a vector" comes from. We are differentiating $f$ with $\dot\nabla$ first, then with $\hat\nabla$. However, because partial derivatives commute we get $$ (\hat\nabla\cdot\hat{\dot f})\dot\nabla = (\hat\nabla\cdot\dot{\hat f})\dot\nabla = \nabla(\nabla\cdot f). $$ In the first equality, we swap which $\nabla$ is differentiating $f$ first. So then $\hat\nabla$ is differentiating $f$ first, yielding $\nabla\cdot f$ in traditional notation, and then $\dot\nabla$ is differentiating this, yielding $\nabla(\nabla\cdot f)$ in traditional notation.
  • Finally, $(\hat\nabla\cdot\dot\nabla)\hat{\dot f}$ is just $(\nabla\cdot\nabla)f$ in traditional notation, which is $\nabla^2 f$.
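For readers who prefer indices, the same bookkeeping can be sketched in components. The symbols $\dot\partial_l$ and $\hat\partial_j$ below are labels introduced only for this sketch: they denote the components of $\dot\nabla$ and $\hat\nabla$, and both act on $f$. Using $\epsilon_{ijk}\epsilon_{klm}=\delta_{il}\delta_{jm}-\delta_{im}\delta_{jl}$, $$ \big[\hat\nabla\times(\dot\nabla\times\hat{\dot f})\big]_i = \epsilon_{ijk}\epsilon_{klm}\,\hat\partial_j\dot\partial_l f_m = \dot\partial_i\hat\partial_j f_j - \hat\partial_j\dot\partial_j f_i = \partial_i\partial_j f_j - \partial_j\partial_j f_i. $$ The first term is the $i$th component of $(\hat\nabla\cdot\hat{\dot f})\dot\nabla$ and the second is that of $(\hat\nabla\cdot\dot\nabla)\hat{\dot f}$. Dropping the labels in the last step is allowed precisely because partial derivatives commute, so the order in which the two derivatives hit $f$ does not matter.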