Let $g(w)= \|Y_n - f(w,X_n) \|^2$, where $f:\Bbb R^d \times \Bbb R^m \to \Bbb R^k$ and $w \in \Bbb R^d$. Here $X_n$ and $Y_n$ are random vectors taking values in $\Bbb R^m$ and $\Bbb R^k$, respectively. What is the gradient of $g$ with respect to $w$?
Basically, I want to find the gradient of a function of the form $\phi(x) = \|g(x)\|^2$,
where $g$ is a vector-valued function.
$$\phi(x) = \|g(x)\|^2$$
Now, to first order,
$$\phi(x+h) = \|g(x+h)\|^2 \approx \|g(x) + Q^\top h\|^2 = \|g(x)\|^2 + \|Q^\top h\|^2 + 2\langle g(x), Q^\top h\rangle.$$
How do I find the gradient from this? Here $Q$ is the matrix whose $i$-th column is the gradient of the $i$-th coordinate function $g_i$, so $Q^\top$ is the Jacobian of $g$.
I don't see how to extract from this expansion a gradient that is independent of $h$.
The answer is $\nabla\phi(x) = 2J(x)^\top g(x)$, where $J$ is the Jacobian of $g$. This comes directly from the first-order Taylor expansion: in your notation, the term $\|Q^\top h\|^2$ is $O(\|h\|^2)$ and can be dropped, and the linear term satisfies
$$2\langle g(x), Q^\top h\rangle = \langle 2Q\,g(x), h\rangle,$$
so the gradient is $2Q\,g(x) = 2J(x)^\top g(x)$ (recall $Q = J^\top$). Applied to your original question, with residual $r(w) = Y_n - f(w,X_n)$, the Jacobian of $r$ in $w$ is $-J_f(w)$, which gives $\nabla g(w) = -2J_f(w)^\top\bigl(Y_n - f(w,X_n)\bigr)$.
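A quick numerical sanity check of $\nabla\phi(x) = 2J(x)^\top g(x)$, comparing the formula against central finite differences for an arbitrary smooth $g:\Bbb R^3 \to \Bbb R^2$ (the particular $g$ below is my own choice for illustration, not from the question):

```python
import numpy as np

def g(x):
    # An arbitrary smooth vector-valued function R^3 -> R^2.
    return np.array([x[0] * x[1] - x[2],
                     np.sin(x[0]) + x[2] ** 2])

def jacobian(x):
    # Analytic Jacobian of g: rows = components of g, columns = inputs.
    return np.array([[x[1],         x[0], -1.0],
                     [np.cos(x[0]), 0.0,   2.0 * x[2]]])

def phi(x):
    # phi(x) = ||g(x)||^2
    return np.dot(g(x), g(x))

x = np.array([0.3, -1.2, 0.7])

# Closed-form gradient: 2 J(x)^T g(x).
grad_formula = 2.0 * jacobian(x).T @ g(x)

# Central finite differences as an independent check.
eps = 1e-6
grad_fd = np.array([
    (phi(x + eps * e) - phi(x - eps * e)) / (2 * eps)
    for e in np.eye(3)
])

print(np.max(np.abs(grad_formula - grad_fd)))  # agreement up to FD error
```

The two gradients agree to within finite-difference error, which would not happen if the factor of 2 or the transpose were dropped.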