Consider a probability density $p_x(x)$ defined over a continuous variable $x$, and suppose that we make a non-linear change of variable using:
$$x=g(y)$$
so that the density transforms according to:
$$p_y(y) = p_x[g(y)]~|g'(y)|$$
(1) Non-linear case: By differentiating this equation, show that the location $\hat{y}$ of the maximum of the density in $y$ is not in general related to the location $\hat{x}$ of the maximum of the density over $x$ by the simple functional relation $\hat{x} = g(\hat{y})$, as a consequence of the Jacobian factor.
(2) Linear case: Show that in the case of a linear transformation, the location of the maximum transforms in the same way as the variable itself.
(From the textbook *Pattern Recognition and Machine Learning*, Christopher M. Bishop, 2006, exercise 1.4, page 58.)
Differentiating using the product and chain rules gives: $$ p'_y(y)=p'_x(g(y))\, g'(y)\, |g'(y)| + p_x(g(y))\,\frac{\mathrm{d} |g'(y)|}{\mathrm{d} y}, $$ where, wherever $g'(y)\neq 0$, $$ \frac{\mathrm{d} |g'(y)|}{\mathrm{d} y} = \operatorname{sign}(g'(y))\, g''(y). $$
At the point $\hat{y}$ satisfying $g(\hat{y})=\hat{x}$, the first term vanishes, because $p'_x(g(\hat{y}))=p'_x(\hat{x})=0$. However, the second term is in general nonzero in the non-linear case, since $g''$ need not vanish there. Hence $p'_y(\hat{y})\neq 0$ in general, so the maximum of $p_y$ does not occur at the point $y$ with $g(y)=\hat{x}$.
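A small numerical check makes the mode shift concrete. The specific choices below are assumptions for illustration, not from the text: take $p_x = \mathcal{N}(x;\,2,\,1)$ and the non-linear map $x = g(y) = e^y$. Then $p_y(y) = p_x(e^y)\,e^y$, and setting $p'_y(y)=0$ gives $e^y = 1+\sqrt{2}$, so the mode of $p_y$ sits at $\ln(1+\sqrt{2}) \approx 0.881$, not at $g^{-1}(\hat{x}) = \ln 2 \approx 0.693$:

```python
import numpy as np

# Assumed example: p_x is a Gaussian N(x; mu=2, sigma=1) and x = g(y) = exp(y).
def p_x(x):
    return np.exp(-0.5 * (x - 2.0) ** 2) / np.sqrt(2.0 * np.pi)

def g(y):
    return np.exp(y)  # non-linear change of variable

# Transformed density: p_y(y) = p_x(g(y)) * |g'(y)|, with |g'(y)| = exp(y).
y = np.linspace(-2.0, 2.0, 200001)
p_y = p_x(g(y)) * np.abs(np.exp(y))

y_hat = y[np.argmax(p_y)]        # numerical mode of p_y
naive = np.log(2.0)              # g^{-1}(mode of p_x) = ln 2 ~ 0.693

# Analytically, p_y'(y) = 0  <=>  -(e^y - 2) e^y + 1 = 0  <=>  e^y = 1 + sqrt(2),
# so the true mode is ln(1 + sqrt(2)) ~ 0.881: the Jacobian shifts the maximum.
print(y_hat, naive)
```

The gap between `y_hat` and `naive` is exactly the Jacobian effect the exercise asks about.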
In the linear case, $g(y)=ay+b$ with $a\neq 0$, we have $|g'(y)|=|a|$ constant, so $\frac{\mathrm{d} |g'(y)|}{\mathrm{d} y}=0$ and the second term vanishes. Then $p'_y(y)=0$ exactly when $p'_x(g(y))=0$, and hence $\hat{x}=g(\hat{y})$.