I am following Pattern Recognition and Machine Learning by Bishop, and in problem 1.20 we try to find the probability contained in a thin shell of radius $r$ for a multidimensional Gaussian, and from it the density over $r$.
The probability density of the multidimensional Gaussian is given by:
$p(\mathbf{x}) = \frac{1}{(2\pi\sigma^2)^{D/2}}\exp\left(-\frac{\left \| \mathbf{x} \right \|^2}{2\sigma^2}\right)$
And we need to show that the integral of this density over a thin shell of radius $r$ and thickness $\epsilon \ll 1$ is given by $p(r)\epsilon$, where:
$p(r) = \frac{S_{D}r^{D-1}}{(2\pi\sigma^2)^{D/2}}\exp\left(-\frac{r^2}{2\sigma^2}\right)$
and $S_{D}$ is the surface area of the unit sphere in $D$ dimensions.
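To convince myself that the stated $p(r)$ is at least a properly normalized density over $r$, here is a minimal sketch of a numerical check. It is not from the book's solution: it assumes the standard closed form $S_D = 2\pi^{D/2}/\Gamma(D/2)$ for the unit-sphere surface area, and the function names, the integration cutoff, and the step count are my own choices.

```python
import math

def unit_sphere_area(D):
    """Surface area S_D of the unit sphere in D dimensions.

    Assumes the standard closed form S_D = 2 * pi^(D/2) / Gamma(D/2).
    """
    return 2.0 * math.pi ** (D / 2) / math.gamma(D / 2)

def radial_density(r, D, sigma):
    """p(r) = S_D r^(D-1) / (2 pi sigma^2)^(D/2) * exp(-r^2 / (2 sigma^2))."""
    norm = (2.0 * math.pi * sigma ** 2) ** (D / 2)
    return unit_sphere_area(D) * r ** (D - 1) / norm * math.exp(-r ** 2 / (2.0 * sigma ** 2))

def integrate_radial_density(D, sigma, n_steps=100_000):
    """Midpoint-rule integral of p(r) over [0, r_max].

    r_max = (sqrt(D) + 10) * sigma is an assumed cutoff, chosen so that the
    neglected tail is negligible for moderate D.
    """
    r_max = (math.sqrt(D) + 10.0) * sigma
    h = r_max / n_steps
    return h * sum(radial_density((i + 0.5) * h, D, sigma) for i in range(n_steps))

if __name__ == "__main__":
    for D in (1, 2, 3, 10):
        print(D, integrate_radial_density(D, sigma=1.0))
```

If the formula is right, each printed integral should be close to 1, which is what a density over $r$ alone has to satisfy.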
The author suggests converting to polar coordinates and "integrating out the direction variables" (not sure what this means). However, the solution does not seem to use polar coordinates at all, only the fact that $\left \| \mathbf{x} \right \|^2 = r^2$ on the shell. The solution is:
We assume the density is constant over the thin shell of thickness $\epsilon$, so we can treat $p(\mathbf{x})$ as a constant there. Additionally, we use the fact that the surface area of the $D$-dimensional sphere of radius $r$ is $S_{D}r^{D-1}$, and I assume we can also treat the surface area as constant across the shell, so that $\text{Vol}(\text{shell}) \approx S_{D}r^{D-1}\epsilon$:
$\int_{\text{shell}} p(\mathbf{x}) \,d\mathbf{x} \approx p(\mathbf{x})\int_{\text{shell}}\,d\mathbf{x} = p(\mathbf{x}) \cdot \text{Vol}(\text{shell}) \approx \frac{1}{(2\pi\sigma^2)^{D/2}}\exp\left(-\frac{r^2}{2\sigma^2}\right) S_{D}r^{D-1}\epsilon$
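As a sanity check on the thin-shell approximation itself, here is a rough Monte Carlo sketch: draw samples from the $D$-dimensional Gaussian, count the fraction whose norm falls in $[r, r+\epsilon)$, and compare it with $p(r)\epsilon$. This is my own illustration rather than part of the book's solution. It assumes $S_D = 2\pi^{D/2}/\Gamma(D/2)$, and the function names, sample size, seed, and the values of $r$, $\epsilon$, $D$ are arbitrary choices.

```python
import math
import random

def radial_density(r, D, sigma):
    """p(r) = S_D r^(D-1) / (2 pi sigma^2)^(D/2) * exp(-r^2 / (2 sigma^2)).

    Assumes the closed form S_D = 2 * pi^(D/2) / Gamma(D/2).
    """
    s_d = 2.0 * math.pi ** (D / 2) / math.gamma(D / 2)
    norm = (2.0 * math.pi * sigma ** 2) ** (D / 2)
    return s_d * r ** (D - 1) / norm * math.exp(-r ** 2 / (2.0 * sigma ** 2))

def shell_fraction(r, eps, D, sigma, n_samples=200_000, seed=0):
    """Fraction of isotropic Gaussian samples with r <= ||x|| < r + eps."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(n_samples):
        # Squared norm of one D-dimensional sample with independent N(0, sigma^2) coordinates.
        sq_norm = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(D))
        if r <= math.sqrt(sq_norm) < r + eps:
            hits += 1
    return hits / n_samples

if __name__ == "__main__":
    r, eps, D, sigma = 1.5, 0.05, 3, 1.0
    print("Monte Carlo shell mass:", shell_fraction(r, eps, D, sigma))
    print("p(r) * eps            :", radial_density(r, D, sigma) * eps)
```

The two numbers should agree up to sampling noise plus an error that shrinks as $\epsilon$ gets smaller, since treating the density and the surface area as constant across the shell is only a first-order approximation.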
Hence we find $p(r)$ as stated above. However, it seems as if we never changed to polar coordinates. If I were to change $p(\mathbf{x})$ to polar coordinates, I would expect something such as
$\frac{1}{(2\pi\sigma^2)^{D/2}}\exp(-\frac{r^2}{2\sigma^2})r dr d\theta \ldots$
where the $\ldots$ are $D-1$ angles.
So my questions are: Are we even changing to polar coordinates? If yes, what do the manipulations look like? How do we end up getting rid of the $r$ that arises from the Jacobian? Lastly, what is meant by "integrating out the direction variables"?