I am applying the mean value theorem to an analysis, but I am running into problems computing the second derivative.
Let:
$$ J = d^Tf(Wx) $$
where $f$ is a generic continuous, differentiable function applied element-wise, $W$ is a matrix, and $x$ and $d$ are vectors.
Applying the mean value theorem gives $$ \nabla_WJ - \nabla_{W^\prime}J = [\nabla^2_Z J] [W - W^\prime] $$ with $Z = \eta W + (1 - \eta) W^\prime$ for some $0 \leq \eta \leq 1$, so I need to know $\nabla^2_Z J$.
So the question is, how do I compute $\nabla_Z^2J$? For other reasons I happen to know the first derivative:
$$\eqalign{ \nabla^2_Z J &= \frac{\partial}{\partial Z}\frac{\partial}{\partial Z} J \cr &= \frac{\partial}{\partial Z}\left[(d \odot f^\prime(Zx))x^T\right] \cr }$$
Although I am familiar with the $\text{vec}$ operator/notation for higher order derivatives, I got stuck with the new Hadamard product in there.
(As a side note - I am only 60% sure my application of the mean value theorem in this way is valid, so any guidance you can offer about that would also be really appreciated!)
Define the variables $$\eqalign{ Z &= \eta W + (1-\eta)U \cr y &= Zx &\implies dy&=dZ\,x\cr f &= f(y) \cr f' &= f'(y) &\implies df&=f'\odot dy \cr f'' &= f''(y) &\implies df'&=f''\odot dy \cr \Gamma &= b^Tf = b:f \cr }$$ where $dy$ is the differential of $y$ and a colon represents the trace/Frobenius product, i.e. $$A:B={\rm tr}(A^TB)$$ Relative to the question, $U$ stands in for $W^\prime$, $b$ for the vector $d$ (which avoids a clash with the differential symbol), and $\Gamma$ for $J$.

Some further notational conventions: uppercase Latin letters stand for matrices, lowercase Latin letters for vectors, and Greek letters for scalars. For convenience, the lowercase and uppercase versions of the same letter stand for a vector and its corresponding diagonal matrix, i.e. $$B = {\rm Diag}(b)$$
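As a quick sanity check on the colon notation, here is a minimal sketch in plain Python (standard library only). It compares the definition $A:B={\rm tr}(A^TB)$ with a sum of element-wise products, and checks two rearrangement rules that hold for this product: $a:(c\odot v)=(a\odot c):v$ and $a:(Mx)=(ax^T):M$. The helper names (`frob`, `matmul`, and so on) and the random test data are illustrative assumptions, not part of the derivation.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def frob(A, B):
    """A:B as the sum of element-wise products of two same-shaped matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def dot(u, v):
    """For vectors, the colon product reduces to the ordinary dot product."""
    return sum(a * b for a, b in zip(u, v))

random.seed(1)
m, n = 3, 4
rnd = lambda: random.uniform(-1.0, 1.0)
A = [[rnd() for _ in range(n)] for _ in range(m)]
M = [[rnd() for _ in range(n)] for _ in range(m)]
a = [rnd() for _ in range(m)]
c = [rnd() for _ in range(m)]
v = [rnd() for _ in range(m)]
x = [rnd() for _ in range(n)]

# Definition: A:M = tr(A^T M)
err_def = abs(frob(A, M) - trace(matmul(transpose(A), M)))

# Hadamard factors can move across the colon: a:(c*v) = (a*c):v
lhs_had = dot(a, [ci * vi for ci, vi in zip(c, v)])
rhs_had = dot([ai * ci for ai, ci in zip(a, c)], v)
err_had = abs(lhs_had - rhs_had)

# A trailing vector can move across the colon: a:(M x) = (a x^T):M
Mx = [dot(row, x) for row in M]
outer_ax = [[ai * xj for xj in x] for ai in a]
err_outer = abs(dot(a, Mx) - frob(outer_ax, M))

print(err_def, err_had, err_outer)
```

All three printed differences should be at the level of floating-point round-off.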
First, let's find the differential and gradient of that last variable $$\eqalign{ d\Gamma &= b:df \cr &= b:f'\odot dy \cr &= b\odot f':dy \cr &= b\odot f':dZ\,x \cr &= (b\odot f')x^T:dZ \cr G = \nabla_Z\Gamma &= (b\odot f')x^T \cr }$$

Now let's find the differential of $G$ $$\eqalign{ dG &= (b\odot df')x^T \cr &= (b\odot f''\odot dy)x^T \cr &= F''B\,dy\,x^T \cr &= F''B\,dZ\,xx^T \cr }$$ This differential is the quantity that we are interested in, so I won't go on to rearrange it into a Hessian (which in this case is a fourth-order tensor).
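The two results above, $G=(b\odot f')x^T$ and $dG=F''B\,dZ\,xx^T$, can be checked against central finite differences. The following is a minimal sketch in plain Python, assuming $f=\tanh$ as a concrete element-wise function (the derivation itself does not fix $f$); the function names, dimensions, and random data are hypothetical choices for illustration.

```python
import math
import random

def f(t):
    return math.tanh(t)

def f1(t):
    return 1.0 - math.tanh(t) ** 2                           # f'(t) for tanh

def f2(t):
    return -2.0 * math.tanh(t) * (1.0 - math.tanh(t) ** 2)   # f''(t) for tanh

def matvec(Z, x):
    return [sum(z * xj for z, xj in zip(row, x)) for row in Z]

def gamma(Z, x, b):
    """Gamma = b^T f(Zx)."""
    return sum(bi * f(yi) for bi, yi in zip(b, matvec(Z, x)))

def grad(Z, x, b):
    """G = (b * f'(Zx)) x^T, with * taken element-wise."""
    return [[bi * f1(yi) * xj for xj in x] for bi, yi in zip(b, matvec(Z, x))]

def hess_dir(Z, x, b, dZ):
    """dG = Diag(f''(Zx) * b) dZ x x^T, the Hessian applied to the direction dZ."""
    y, dy = matvec(Z, x), matvec(dZ, x)
    return [[f2(yi) * bi * dyi * xj for xj in x] for yi, bi, dyi in zip(y, b, dy)]

def shifted(Z, dZ, t):
    """Return Z + t * dZ."""
    return [[z + t * d for z, d in zip(rz, rd)] for rz, rd in zip(Z, dZ)]

random.seed(0)
m, n = 3, 4
rnd = lambda: random.uniform(-1.0, 1.0)
Z = [[rnd() for _ in range(n)] for _ in range(m)]
dZ = [[rnd() for _ in range(n)] for _ in range(m)]
x = [rnd() for _ in range(n)]
b = [rnd() for _ in range(m)]
h = 1e-5

# First-order check: directional derivative of Gamma along dZ versus G:dZ
fd_dGamma = (gamma(shifted(Z, dZ, h), x, b) - gamma(shifted(Z, dZ, -h), x, b)) / (2 * h)
G = grad(Z, x, b)
an_dGamma = sum(g * d for rg, rd in zip(G, dZ) for g, d in zip(rg, rd))
err_dGamma = abs(fd_dGamma - an_dGamma)

# Second-order check: directional derivative of G along dZ versus the dG formula
Gp, Gm = grad(shifted(Z, dZ, h), x, b), grad(shifted(Z, dZ, -h), x, b)
fd_dG = [[(p - q) / (2 * h) for p, q in zip(rp, rq)] for rp, rq in zip(Gp, Gm)]
an_dG = hess_dir(Z, x, b, dZ)
err_dG = max(abs(s - t) for rs, rt in zip(an_dG, fd_dG) for s, t in zip(rs, rt))

print(err_dGamma, err_dG)
```

Both errors should be small compared with the entries of $G$ and $dG$, limited only by the finite-difference step `h`.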
Now evaluate the differential for $dZ=(W-U)$ $$\eqalign{ \nabla^2_Z\Gamma : (W-U) &= F''B\,(W-U)\,xx^T \cr &= {\rm Diag}(f''\odot b)\,(W-U)\,xx^T \cr }$$

We also want to evaluate $d\Gamma$ for $dZ=\eta\,dW$ (holding $U$ fixed) $$\eqalign{ d\Gamma &= G:\eta\,dW \cr &= \eta G:dW \cr \nabla_W\Gamma &= \eta G \cr }$$ and for $dZ=(1-\eta)\,dU$ (holding $W$ fixed) $$\eqalign{ d\Gamma &= G:(1-\eta)\,dU \cr &= (1-\eta)G:dU \cr \nabla_U\Gamma &= (1-\eta)G \cr }$$

I'll leave it to you to determine how these quantities can be related through a mean value theorem.
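The scaling factors $\eta$ and $(1-\eta)$ in $\nabla_W\Gamma$ and $\nabla_U\Gamma$ can be confirmed numerically. The sketch below, again in plain Python and again assuming $f=\tanh$ purely for illustration, treats $\Gamma$ as a function of $W$ and $U$ through $Z=\eta W+(1-\eta)U$ and compares entry-wise central finite differences with $\eta G$ and $(1-\eta)G$. The specific matrices, vectors, and the value of `eta` are made-up test data.

```python
import math

def matvec(Z, x):
    return [sum(z * xj for z, xj in zip(row, x)) for row in Z]

def mix(W, U, eta):
    """Z = eta * W + (1 - eta) * U."""
    return [[eta * w + (1 - eta) * u for w, u in zip(rw, ru)] for rw, ru in zip(W, U)]

def gamma(W, U, eta, x, b):
    """Gamma = b^T tanh(Zx) with Z = eta * W + (1 - eta) * U."""
    return sum(bi * math.tanh(yi) for bi, yi in zip(b, matvec(mix(W, U, eta), x)))

def G_at(Z, x, b):
    """G = (b * f'(Zx)) x^T for f = tanh."""
    return [[bi * (1 - math.tanh(yi) ** 2) * xj for xj in x]
            for bi, yi in zip(b, matvec(Z, x))]

W = [[0.3, -0.2, 0.5], [0.1, 0.4, -0.6]]
U = [[-0.4, 0.7, 0.2], [0.5, -0.1, 0.3]]
x = [0.9, -0.5, 0.2]
b = [1.5, -0.8]
eta, h = 0.3, 1e-6

def fd_grad(wrt):
    """Central finite-difference gradient of Gamma w.r.t. W (wrt='W') or U (wrt='U')."""
    out = []
    for i in range(len(W)):
        row = []
        for j in range(len(W[0])):
            vals = []
            for step in (h, -h):
                Wp = [r[:] for r in W]
                Up = [r[:] for r in U]
                target = Wp if wrt == "W" else Up
                target[i][j] += step          # perturb a single entry
                vals.append(gamma(Wp, Up, eta, x, b))
            row.append((vals[0] - vals[1]) / (2 * h))
        out.append(row)
    return out

G = G_at(mix(W, U, eta), x, b)
gW, gU = fd_grad("W"), fd_grad("U")

# Compare with the chain-rule results: grad_W = eta * G and grad_U = (1 - eta) * G
err_W = max(abs(fd - eta * g) for rf, rg in zip(gW, G) for fd, g in zip(rf, rg))
err_U = max(abs(fd - (1 - eta) * g) for rf, rg in zip(gU, G) for fd, g in zip(rf, rg))

print(err_W, err_U)
```

Note that both gradients are evaluated at the same intermediate point $Z$; only the scalar prefactor differs.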
Note that $\{f'(y),\,f''(y)\}$ are the first and second derivatives of a scalar function $f(\gamma)$, applied element-wise to the vector argument $y$. They are not equal to $\{\nabla_yf,\,\nabla^2_yf\}$, as I had in my initial post.