My question is related to eq. 10 in the paper https://aip.scitation.org/doi/10.1063/1.5098051.
I have been struggling with eq. 10 and cannot see how the author arrived at it from eq. 9. I am also not sure how to apply the chain rule for the gradient here. Please help.
Eq. 9 is given as:
$$f^\nu(\tau,\gamma) = \frac{1}{2}\sum_{n=1}^N|g^\nu(\tau, \gamma, t_n)-d(t_n)|^2$$
Eq. 10 is given by:
$$\nabla f^\nu(\delta \tau,\delta \gamma) = \sum_{n=1}^N\langle\nabla g^\nu(\delta \tau, \delta \gamma, t_n),\, g^\nu(\tau, \gamma, t_n)-d(t_n)\rangle$$
Where
$$\nabla g^\nu(\delta \tau, \delta \gamma, t_n)$$
is the directional derivative of $g^\nu$ in the direction of ($\delta\tau, \delta \gamma$).
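For reference, I am assuming the paper means the usual definition of a directional derivative at the point $(\tau,\gamma)$; the paper may state it differently, so this is my reading rather than a quote:
$$\nabla g^\nu(\delta \tau, \delta \gamma, t_n) = \lim_{\epsilon \to 0}\frac{g^\nu(\tau+\epsilon\,\delta\tau,\ \gamma+\epsilon\,\delta\gamma,\ t_n)-g^\nu(\tau, \gamma, t_n)}{\epsilon}$$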
My Attempt
Considering eq.9
Writing the squared norm as an inner product:
$$f^\nu(\tau,\gamma)=\frac{1}{2}\sum_{n=1}^N|g^\nu(\tau, \gamma, t_n)-d(t_n)|^2=\frac{1}{2}\sum_{n=1}^N\langle g^\nu(\tau, \gamma, t_n)-d(t_n),\, g^\nu(\tau, \gamma, t_n)-d(t_n)\rangle$$
Taking the gradient of $f^\nu(\tau,\gamma)$:
$$\nabla f^\nu(\tau,\gamma) = \frac{1}{2}\sum_{n=1}^N\nabla\langle g^\nu(\tau, \gamma, t_n)-d(t_n),\, g^\nu(\tau, \gamma, t_n)-d(t_n)\rangle$$
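Applying the chain rule (assuming a real inner product, both arguments contribute the same term, so the resulting factor of 2 cancels the $\frac{1}{2}$; $d(t_n)$ does not depend on $\tau$ or $\gamma$):
$$\nabla f^\nu(\tau,\gamma) = \sum_{n=1}^N\langle\nabla g^\nu(\tau, \gamma, t_n),\, g^\nu(\tau, \gamma, t_n)-d(t_n)\rangle$$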
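To sanity-check this step numerically, here is a minimal sketch. The model `g`, the sample times `t`, and the data `d` are all made up for illustration and are not taken from the paper. In this toy each $g(t_n)$ is a real scalar, so the inner product reduces to an ordinary product, which may not match the paper's setting. The sketch compares my chain-rule gradient, the directional form as I read eq. 10, and a finite-difference estimate.

```python
import numpy as np

# Hypothetical sample times and data; none of these values come from the paper.
t = np.linspace(0.0, 1.0, 50)
d = np.cos(3.0 * t)

def g(tau, gamma):
    """Toy stand-in for g(tau, gamma, t_n), evaluated at all t_n (real-valued)."""
    return gamma * np.exp(-tau * t)

def grad_g(tau, gamma):
    """Partial derivatives of the toy g with respect to (tau, gamma); shape (2, N)."""
    return np.stack([-gamma * t * np.exp(-tau * t), np.exp(-tau * t)])

def f(tau, gamma):
    """Misfit as in eq. 9: half the sum of squared residuals."""
    return 0.5 * np.sum((g(tau, gamma) - d) ** 2)

def grad_f(tau, gamma):
    """Chain-rule gradient: sum over n of grad g(t_n) * (g(t_n) - d(t_n))."""
    return grad_g(tau, gamma) @ (g(tau, gamma) - d)

def directional_derivative_f(tau, gamma, dtau, dgamma):
    """Eq. 10 read as a directional derivative along (dtau, dgamma)."""
    # Directional derivative of g at every t_n; shape (N,)
    dg = np.array([dtau, dgamma]) @ grad_g(tau, gamma)
    return np.sum(dg * (g(tau, gamma) - d))

def finite_difference(tau, gamma, dtau, dgamma, eps=1e-6):
    """Central finite-difference estimate of the same directional derivative."""
    forward = f(tau + eps * dtau, gamma + eps * dgamma)
    backward = f(tau - eps * dtau, gamma - eps * dgamma)
    return (forward - backward) / (2.0 * eps)
```

In this toy, `directional_derivative_f(tau, gamma, dtau, dgamma)` agrees with `np.dot(grad_f(tau, gamma), [dtau, dgamma])`, and both agree with the finite-difference estimate up to numerical error. That suggests the two expressions differ only by a contraction with the direction $(\delta\tau, \delta\gamma)$, at least under these assumptions.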
My questions are:
- I don't arrive at the directional derivative of $g^\nu$ which is present in eq.10 of the paper. I'm not sure where I'm going wrong.
- In equation 10, the directional derivative of $g^\nu$ is a scalar, while $g^\nu(\tau,\gamma, t_n)-d(t_n)$ is a vector (am I correct in saying this is a vector?). How do you take the dot product of a scalar and a vector?