There are several very important vector identities involving $\nabla$ that I struggle to understand. To give an example, in the derivation of the wave equation from maxwell's equations, the following identity is used: $$ \nabla\times (\nabla\times \mathbf f)=\nabla(\nabla\cdot \mathbf f)-\nabla^2\mathbf f. \label{1} $$ I can prove it by direct calculation, but that would be very boring and mechanical. To get a more intuitive proof, I tried this: we have $$ \nabla\times (\mathbf a\times \mathbf b)=\mathbf a(\nabla\cdot\mathbf b)-\mathbf b(\nabla\cdot\mathbf a)+(\mathbf b\cdot\nabla)\mathbf a-(\mathbf a\cdot\nabla)\mathbf b $$ Plugging in $\mathbf a=\nabla$ and $\mathbf b=\mathbf f$, we get $$ \nabla\times (\nabla\times \mathbf f)=\nabla(\nabla\cdot \mathbf f)-\mathbf f (\nabla\cdot \nabla)+(\mathbf f\cdot \nabla)\nabla-(\nabla\cdot\nabla) \mathbf f. $$ There are four terms on the right hand side, and there is something strange going on: the first and the fourth term are vectors, whilst the second and third term are linear operators. Such blatent inhomogeneity shocks me.
It turns out that the 2nd and 3rd term cancel out, so we finally "prove" the result. But I still cannot see what's going on - if those two terms did not cancel out, then the entire expression would be meaningless.
What's really going on? Is it the right things to do?
You mustn't think of the del operator as an ordinary vector. In particular, the proof of the identity you tried to use requires $a,\,b$ to have components that commute with anything, just as the usual result for $c\times(a\times b)$ makes assumptions that prevents us using $c=\nabla$ to recover the very identity you've tried to use.
The best approach with identities such as these is to calculate the $i$th component as $\epsilon_{ijk}\partial_j\epsilon_{klm}\partial_lf_m$, with implicit summation over repeated indices. Since $\epsilon_{ijk}\epsilon_{klm}=\delta_{il}\delta_{jm}-\delta_{im}\delta_{jl}$, this becomes$$(\delta_{il}\delta_{jm}-\delta_{im}\delta_{jl})\partial_j\partial_lf_m=\partial_i\partial_jf_j-\partial_j\partial_jf_i=(\nabla(\nabla\cdot f)-\nabla^2f)_i,$$as expected.