Let $g(y) = f(D^{\frac{1}{2}}y)$, where $D^{\frac{1}{2}}$ is the square root of the matrix $D$, and write $x = D^{\frac{1}{2}}y$.
Then $\nabla g(y) = D^{\frac{1}{2}} \nabla f(D^{\frac{1}{2}}y) = D^{\frac{1}{2}} \nabla f(x)$
What I struggle with is the form of the Hessian given in Amir Beck's Introduction to Nonlinear Optimization on page 63. Why is $D^{\frac{1}{2}}$ also multiplied on the right-hand side in the Hessian?
$\nabla^2 g(y) = D^{\frac{1}{2}} \nabla^2 f(D^{\frac{1}{2}}y) D^{\frac{1}{2}} = D^{\frac{1}{2}} \nabla^2 f(x) D^{\frac{1}{2}}$
Suppose $A\in\mathbb R^{m\times n}$, $y\in\mathbb R^{n}$, and $f:\mathbb R^m\rightarrow\mathbb R$. Let $g(y)=f(Ay)=f(x)$ with $x=Ay\in\mathbb R^{m}$.
By the chain rule, the gradient of $g$ at $y$ is given by
$$\begin{split}\underbrace{\nabla g(y)}_{\in\mathbb R^n}&=A^T\nabla f(Ay)\\ &=\underbrace{A^T}_{\in\mathbb R^{n\times m}}\underbrace{\nabla f(x)}_{\in\mathbb R^{m\times 1}}\end{split}$$
Differentiating the gradient once more with respect to $y$, the Hessian is given by
$$\begin{split}\underbrace{\nabla^2 g(y)}_{\in\mathbb R^{n\times n}}&=\nabla\left[A^T\nabla f(Ay)\right]\\ &=A^T\nabla^2f(Ay)A\\ &=\underbrace{A^T}_{\in\mathbb R^{n\times m}}\underbrace{\nabla^2f(x)}_{\in\mathbb R^{m\times m}}\underbrace{A}_{\in\mathbb R^{m\times n}}\end{split}$$
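To see where the factor on the right comes from, it may help to write the chain rule componentwise. This is only a sketch, assuming $f$ is twice continuously differentiable; the indices $i,j,k,l$ are introduced here for illustration. Each entry of $\nabla f(Ay)$ still depends on $y$ through $Ay$, so differentiating it produces a second copy of $A$:

$$\begin{split}\frac{\partial g}{\partial y_j}(y)&=\sum_{k=1}^m A_{kj}\,\frac{\partial f}{\partial x_k}(Ay)\\ \frac{\partial^2 g}{\partial y_i\,\partial y_j}(y)&=\sum_{k=1}^m A_{kj}\sum_{l=1}^m\frac{\partial^2 f}{\partial x_l\,\partial x_k}(Ay)\,A_{li}\\ &=\sum_{k=1}^m\sum_{l=1}^m (A^T)_{il}\left[\nabla^2 f(Ay)\right]_{lk}A_{kj}=\left[A^T\nabla^2 f(Ay)\,A\right]_{ij}\end{split}$$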
Taking $A=D^{\frac 1 2}$, which is symmetric so that $(D^{\frac 1 2})^T=D^{\frac 1 2}$, the Hessian formula has $D^{\frac 1 2}$ multiplied on the left and $D^{\frac 1 2}$ multiplied on the right.
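If it helps, the identity can also be checked numerically. The sketch below is not from Beck's book: the test function `f`, the matrix `Q`, the diagonal `D` and the helper `numerical_hessian` are all assumptions chosen only for illustration. It compares a finite-difference Hessian of $g$ with $D^{\frac 1 2}\nabla^2 f(x)D^{\frac 1 2}$.

```python
import numpy as np

# Hypothetical test function: f(x) = 1/4 * sum(x_i^4) + 1/2 * x^T Q x
def f(x, Q):
    return 0.25 * np.sum(x**4) + 0.5 * x @ Q @ x

# Its exact Hessian (valid for symmetric Q): diag(3 x_i^2) + Q
def hess_f(x, Q):
    return np.diag(3.0 * x**2) + Q

# Central finite-difference approximation of the Hessian of a scalar function
def numerical_hessian(func, y, h=1e-4):
    n = y.size
    E = np.eye(n)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            H[i, j] = (
                func(y + h * E[i] + h * E[j])
                - func(y + h * E[i] - h * E[j])
                - func(y - h * E[i] + h * E[j])
                + func(y - h * E[i] - h * E[j])
            ) / (4.0 * h * h)
    return H

rng = np.random.default_rng(0)
n = 3
M = rng.standard_normal((n, n))
Q = M + M.T                          # symmetric matrix for the quadratic term
d = rng.uniform(0.5, 2.0, size=n)    # assumed: D is diagonal with positive entries
D_half = np.diag(np.sqrt(d))         # D^{1/2}, symmetric by construction

y = rng.standard_normal(n)
x = D_half @ y

def g(y_):
    return f(D_half @ y_, Q)         # g(y) = f(D^{1/2} y)

H_numeric = numerical_hessian(g, y)
H_formula = D_half @ hess_f(x, Q) @ D_half   # D^{1/2} on both sides

max_err = np.max(np.abs(H_numeric - H_formula))
print("max abs difference:", max_err)
```

The printed difference should be small (finite-difference error only), which is consistent with $D^{\frac 1 2}$ appearing on both sides of the Hessian.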