Let $f:\mathbb R^m\rightarrow \mathbb R$ be a real-valued function. Let $A\in\mathbb R^{m\times n}$ and $x\in\mathbb R^{n\times 1}$, and let $g(x)=f(Ax)$.
Find the gradient and Hessian of $g(x)$ in terms of $A$, $\nabla f(x)$, and the Hessian $H(f(x))$.
I tried using the chain rule: $\underbrace{\nabla g(x)}_{n\times 1}=\underbrace{A}_{m\times n} \underbrace{\nabla f(Ax)}_{n\times 1}$.
But it seems as if the dimensions don't work out. Is the dimension of $\nabla g(x)$ actually $m\times 1$? I thought it should be $n\times 1$ because there are $n$ elements of $x$. If that is the case, my application of the chain rule is probably wrong.
Let $h : \mathbb{R}^n \to \mathbb{R}^m$ be the map $h(x)=Ax$. Then $h$ is linear and $\mathrm{d}h(x) = A$. Note that $g = f\circ h$. The chain rule then says

$$ \forall x \in \mathbb{R}^n,~ \mathrm{d}g(x) = \mathrm{d}f(h(x))\circ \mathrm{d}h(x) $$

The gradients are defined through the Euclidean inner product:

\begin{align} \forall x\in \mathbb{R}^n, \forall v \in \mathbb{R}^n,~ \langle\nabla g(x),v \rangle_{\mathbb{R}^n} &=\mathrm{d}g(x)v \\ &= \mathrm{d}f(h(x))\circ \mathrm{d}h(x)v \\ &= \mathrm{d}f(Ax) Av \\ &= \langle \nabla f(Ax),Av\rangle_{\mathbb{R}^m} \\ &= \langle A^T\nabla f(Ax),v\rangle_{\mathbb{R}^n} \end{align}

The last equality follows from the fact that for all $u\in \mathbb{R}^n$ and all $v\in \mathbb{R}^m$, $\langle Au,v \rangle_{\mathbb{R}^m} = \langle u,A^Tv\rangle_{\mathbb{R}^n}$. It follows that $\nabla g(x) = A^T\nabla f(Ax) \in \mathbb{R}^n$. So $\nabla g(x)$ is indeed $n\times 1$: here $\nabla f(Ax)$ is $m\times 1$, and the matrix multiplying it is $A^T$ ($n\times m$), not $A$.
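As a sanity check, here is a minimal numerical sketch comparing $\nabla g(x) = A^T\nabla f(Ax)$ with central finite differences of $g$. The function `f`, the matrix `A`, and the point `x` are hypothetical choices made only for illustration. The sketch also checks the Hessian formula $H_g(x) = A^T H_f(Ax) A$, which is not derived above; it comes from applying the same chain-rule argument to $x \mapsto \nabla g(x)$ and is treated here as an assumption to be tested.

```python
import math

# Hypothetical test function f: R^3 -> R (not from the question), with its
# gradient and Hessian written out by hand.
def f(z):
    return math.sin(z[0]) + z[1] ** 2 + z[0] * z[2]

def grad_f(z):
    return [math.cos(z[0]) + z[2], 2.0 * z[1], z[0]]

def hess_f(z):
    return [[-math.sin(z[0]), 0.0, 1.0],
            [0.0, 2.0, 0.0],
            [1.0, 0.0, 0.0]]

# Small dense linear algebra helpers on lists of lists.
def transpose(M):
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def matmul(M, N):
    cols = transpose(N)
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]

# Arbitrary example data: A is m x n with m = 3, n = 2.
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
x = [0.3, -0.7]

def g(x):
    return f(matvec(A, x))

def grad_g(x):
    # Claimed formula: A^T grad f(Ax), a vector of length n.
    return matvec(transpose(A), grad_f(matvec(A, x)))

def hess_g(x):
    # Assumed formula: A^T H_f(Ax) A, an n x n matrix.
    return matmul(matmul(transpose(A), hess_f(matvec(A, x))), A)

def shifted(x, i, h):
    y = list(x)
    y[i] += h
    return y

def numeric_grad(x, h=1e-6):
    # Central differences of g, one coordinate at a time.
    return [(g(shifted(x, i, h)) - g(shifted(x, i, -h))) / (2 * h)
            for i in range(len(x))]

def numeric_hess(x, h=1e-6):
    # Column j is the central difference of grad_g in direction j.
    cols = []
    for j in range(len(x)):
        gp, gm = grad_g(shifted(x, j, h)), grad_g(shifted(x, j, -h))
        cols.append([(a - b) / (2 * h) for a, b in zip(gp, gm)])
    return transpose(cols)

grad_err = max(abs(a - b) for a, b in zip(grad_g(x), numeric_grad(x)))
hess_err = max(abs(a - b)
               for ra, rb in zip(hess_g(x), numeric_hess(x))
               for a, b in zip(ra, rb))
print("gradient length:", len(grad_g(x)))
print("max gradient error:", grad_err)
print("max Hessian error:", hess_err)
```

Both errors should be at the level of finite-difference noise, and `grad_g(x)` has length $n = 2$, matching the dimension count in the derivation. Replacing `transpose(A)` by `A` in `grad_g` fails immediately, since `A` has rows of length 2 while `grad_f` returns 3 entries.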