Gradient of a Misfit Function


My question is related to eq. 10 in the paper https://aip.scitation.org/doi/10.1063/1.5098051.

I am unable to understand how the author arrived at eq. 10 from eq. 9, and I am not sure how to apply the chain rule for the gradient here. I have been struggling with eq. 10 for a while. Please help.

Eq. 9 is given as:

$$f^\nu(\tau,\gamma) = \frac{1}{2}\sum_{n=1}^N|g^\nu(\tau, \gamma, t_n)-d(t_n)|^2$$

Eq. 10 is given by:

$$\nabla f^\nu(\delta \tau,\delta \gamma) = \sum_{n=1}^N\left\langle\nabla g^\nu(\delta \tau, \delta \gamma, t_n),\; g^\nu(\tau, \gamma, t_n)-d(t_n)\right\rangle$$

Where

$$\nabla g^\nu(\delta \tau, \delta \gamma, t_n)$$

is the directional derivative of $g^\nu$ in the direction of ($\delta\tau, \delta \gamma$).
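For reference, this is the standard definition of a directional derivative that I am assuming; the paper's own definition is not quoted here. The second equality additionally assumes that $\tau$ and $\gamma$ are real scalar parameters and that $g^\nu$ is differentiable in them:

$$\nabla g^\nu(\delta \tau, \delta \gamma, t_n) = \lim_{\epsilon \to 0}\frac{g^\nu(\tau+\epsilon\,\delta\tau,\; \gamma+\epsilon\,\delta\gamma,\; t_n)-g^\nu(\tau, \gamma, t_n)}{\epsilon} = \frac{\partial g^\nu}{\partial \tau}\,\delta\tau + \frac{\partial g^\nu}{\partial \gamma}\,\delta\gamma$$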

My Attempt

Starting from eq. 9

and writing the squared norm as an inner product:

$$f^\nu(\tau,\gamma)=\frac{1}{2}\sum_{n=1}^N|g^\nu(\tau, \gamma, t_n)-d(t_n)|^2=\frac{1}{2}\sum_{n=1}^N\left\langle g^\nu(\tau, \gamma, t_n)-d(t_n),\; g^\nu(\tau, \gamma, t_n)-d(t_n)\right\rangle$$

Taking the gradient of $f^\nu(\tau,\gamma)$:

$$\nabla f^\nu(\tau,\gamma) = \frac{1}{2}\sum_{n=1}^N\nabla\left\langle g^\nu(\tau, \gamma, t_n)-d(t_n),\; g^\nu(\tau, \gamma, t_n)-d(t_n)\right\rangle$$

Applying the chain rule:

$$\nabla f^\nu(\tau,\gamma) = \sum_{n=1}^N\left\langle\nabla g^\nu(\tau, \gamma, t_n),\; g^\nu(\tau, \gamma, t_n)-d(t_n)\right\rangle$$
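As a sanity check on the last formula, here is a minimal numerical sketch. It does not use the paper's $g^\nu$: the function `g` below is a hypothetical real-valued toy model (a damped cosine), and $\tau$, $\gamma$ are assumed to be real scalars. It compares the gradient formula, contracted with a direction $(\delta\tau, \delta\gamma)$, against a central finite difference of the misfit along that direction.

```python
import numpy as np

# Hypothetical toy model, NOT the paper's g^nu: a damped cosine in t.
def g(tau, gamma, t):
    return np.exp(-gamma * t) * np.cos(tau * t)

def grad_g(tau, gamma, t):
    # Partial derivatives of the toy g with respect to tau and gamma.
    dg_dtau = -t * np.exp(-gamma * t) * np.sin(tau * t)
    dg_dgamma = -t * np.exp(-gamma * t) * np.cos(tau * t)
    return np.stack([dg_dtau, dg_dgamma])  # shape (2, N)

def misfit(tau, gamma, t, d):
    # f = 1/2 * sum_n |g(t_n) - d(t_n)|^2  (eq. 9 for real-valued data)
    r = g(tau, gamma, t) - d
    return 0.5 * np.sum(r ** 2)

def misfit_grad(tau, gamma, t, d):
    # sum_n grad g(t_n) * (g(t_n) - d(t_n)), the formula from my attempt
    r = g(tau, gamma, t) - d
    return grad_g(tau, gamma, t) @ r  # shape (2,)

t = np.linspace(0.0, 2.0, 50)
d = g(1.3, 0.4, t)                    # synthetic "data" from other parameters
tau, gamma = 1.0, 0.7                 # point where the gradient is evaluated
direction = np.array([0.3, -0.2])     # (delta tau, delta gamma), arbitrary

analytic_grad = misfit_grad(tau, gamma, t, d)
# Directional derivative of f = gradient contracted with the direction.
analytic_dir = analytic_grad @ direction

# Central finite difference of f along the same direction.
h = 1e-6
fd_dir = (
    misfit(tau + h * direction[0], gamma + h * direction[1], t, d)
    - misfit(tau - h * direction[0], gamma - h * direction[1], t, d)
) / (2.0 * h)

print("analytic directional derivative:", analytic_dir)
print("finite-difference estimate:     ", fd_dir)
```

In this toy setting the two numbers agree closely, which suggests that the directional derivative of $f^\nu$ is the gradient from my attempt contracted with $(\delta\tau, \delta\gamma)$. Whether this carries over to the paper depends on how $g^\nu$, $\tau$ and $\gamma$ are actually defined there.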

My questions are:

  1. I don't arrive at the directional derivative of $g^\nu$ which is present in eq.10 of the paper. I'm not sure where I'm going wrong.
  2. In equation 10, the directional derivative of $g^\nu$ is a scalar, while $g^\nu(\tau,\gamma, t_n)-d(t_n)$ is a vector (am I correct in saying this is a vector?). How do you take the dot product of a scalar and a vector?