I think I understand the simple case of the chain rule, where we differentiate scalar functions with scalar domains. Now I am trying to apply this rule to compute the derivatives
$$\frac{\partial g}{\partial t} = y^T\nabla f(x + ty)$$ and $$\frac{\partial \phi}{ \partial t}= y^T\nabla^2 f(x + ty),$$
where $g(t) = f(x + ty)$ and $\phi(t) = \nabla f(x + ty)$. Could someone please provide an educational (simple) proof of how to compute these derivatives and how the transpose appears?
For the first part, apply the chain rule to each coordinate $x_i + ty_i$:
$$ \begin{align} \frac{d}{dt}f(x+ty) &= \sum_i D_if(x+ty)\frac{d}{dt}(x_i+ty_i) \\ &= \sum_i D_if(x+ty)y_i \\ &= \sum_i \phi_i(t)y_i \end{align} $$
since you defined $\phi(t)=\nabla f(x+ty)$, which in components is $\phi_i(t)=D_if(x+ty)$. The last sum is the inner product of $y$ and $\phi(t)$, that is, $y^T\nabla f(x+ty)$; this is where the transpose comes from.
For the second part, apply $\frac{d}{dt}$ to each component $\phi_i$. Using the same formula again, you will get
$$ \begin{align} \frac{d}{dt}\phi_i(t) &= \frac{d}{dt}D_if(x+ty) \\ &= \sum_j D_jD_if(x+ty) \frac{d}{dt}(x_j+ty_j) \\ &= \sum_j D_jD_if(x+ty)\, y_j. \end{align} $$
Can you write this in matrix form? The Hessian appears because $D_j\phi_i=D_jD_if$.
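If it helps to see the two formulas hold numerically, here is a minimal sketch that compares them against central finite differences. The test function $f(x) = \tfrac14\sum_i x_i^4 + \tfrac12 x^TAx$, the matrix `A`, the points `x`, `y`, and the values of `t` and `h` are all assumptions made up for illustration; none of them come from the question. In matrix form the second result reads $\phi'(t) = \nabla^2 f(x+ty)\,y$ as a column vector, and for a twice continuously differentiable $f$ the Hessian is symmetric, so its transpose is the row vector $y^T\nabla^2 f(x+ty)$.

```python
# Hypothetical test function (an assumption, not from the question):
# f(x) = 1/4 * sum_i x_i^4 + 1/2 * x^T A x, with A symmetric.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
n = 3

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

def dot(u, v):
    """Inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))

def f(x):
    return 0.25 * sum(xi**4 for xi in x) + 0.5 * dot(x, matvec(A, x))

def grad_f(x):
    # Gradient: x_i^3 + (A x)_i, valid because A is symmetric.
    Ax = matvec(A, x)
    return [x[i]**3 + Ax[i] for i in range(n)]

def hess_f(x):
    # Hessian: diag(3 x_i^2) + A.
    return [[A[i][j] + (3.0 * x[i]**2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

# Arbitrary base point, direction, parameter value, and step size.
x = [1.0, -2.0, 0.5]
y = [0.3, 1.0, -1.0]
t, h = 0.7, 1e-5

def point(s):
    return [xi + s * yi for xi, yi in zip(x, y)]

def g(s):
    return f(point(s))

def phi(s):
    return grad_f(point(s))

# Central finite differences in t.
dg_fd = (g(t + h) - g(t - h)) / (2 * h)
dphi_fd = [(a - b) / (2 * h) for a, b in zip(phi(t + h), phi(t - h))]

# Chain-rule formulas: g'(t) = y^T grad f(x + t y), phi'(t) = Hess f(x + t y) y.
dg_exact = dot(y, grad_f(point(t)))
dphi_exact = matvec(hess_f(point(t)), y)

err_g = abs(dg_fd - dg_exact)
err_phi = max(abs(a - b) for a, b in zip(dphi_fd, dphi_exact))
print("error in g'(t):  ", err_g)
print("error in phi'(t):", err_phi)
```

Both printed errors should be small compared with the derivative values themselves, which is consistent with the component-wise derivation above. This is a sanity check rather than a proof.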