I'm trying to understand the origin of a certain formula used in the solution to the following question:
This question relates to the position probability density for a classical particle undergoing simple harmonic motion. The particle can be considered to be moving according to the classical trajectory $x = x_0 \sin \omega t$. A measurement is made of the position of the particle at a random time such that the value of the phase $\alpha = \omega t$ can be considered to take any value between $0$ and $2\pi$ with equal probability.
Considering $\alpha$ as a random variable, what is its probability density $\rho_\alpha(\alpha)$? Find $\rho_x(x)$ in terms of $x$ only.
End of question
The probability density for $\alpha$ is uniform between $0$ and $2\pi$ so $\rho_\alpha(\alpha)=\cfrac{1}{2\pi}$ within the allowed range.
Finding the derivative of $x$ gives
$\cfrac{\mathrm{d}x}{\mathrm{d}\alpha}=x_0 \cos \alpha$
Converting between random variables gives
$\color{red}{\rho_x (x)=\left|\cfrac{\mathrm{d}x}{\mathrm{d}\alpha}\right|^{-1}\rho_\alpha(\alpha)}=\cfrac{1}{2\pi x_0 \color{blue}{| \cos \alpha |}}=\cfrac{1}{2\pi x_0 \sqrt{1-\sin^2 \alpha}}=\cfrac{1}{2\pi \sqrt{x_0^2-x^2}}$
End of answer
I have two simple questions about the answer above:
- I have never seen the formula (marked red) before and was wondering if someone could explain its origin and what it means. Below is the formula from Wikipedia, although I'm not sure what the inverse phi means or represents.
- For the part marked blue: it's obvious that $\cos \alpha = \sqrt{1-\sin^2 \alpha}$. So why are we considering the absolute value of $\cos \alpha$?

This is not a probability specific concept. Rather, it has to do with the concept of densities and their transformation laws.
First, let us write a 1d probability integral in the following equivalent form:
$$P_\alpha(b) - P_\alpha(a) = \int_a^b p_\alpha(\alpha) e^\alpha \cdot e_\alpha \, d\alpha$$
where $e^\alpha \cdot e_\alpha = 1$. From a geometric standpoint, $e_\alpha$ is the vector tangent to the region of integration: this is an important concept in integrals, both 1d and multidimensional. Every integral contributes this part, this tangent to the region of integration.
So what is $e^\alpha$? It is the dual vector to $e_\alpha$. In 1d, it is parallel to $e_\alpha$ but reciprocal in magnitude. In multiple dimensions, it would be formed from the normal to a hypersurface of constant $\alpha$.
It is $p_\alpha(\alpha) e^\alpha$ that is a fundamental, geometric quantity in probability space.
That may seem rather arbitrary, but it is exactly this matter that is cropping up in your problem. Let's transform to the new coordinate $x$ instead of $\alpha$:
$$P_\alpha(b) - P_\alpha(a) = \int_{a}^{b} p_\alpha(\alpha) e^\alpha \cdot e_\alpha \, d\alpha$$
What happens when we change variable of integration? Here's the rule: the $e_\alpha \, d\alpha$ changes to $e_x \, dx$. You might think, "Wait, that's arbitrary, that doesn't make sense! There should be a correction. I've changed variables in a hundred 1d integrals, and there's always a factor of the Jacobian." There is and there will be, just not here. What stays fixed is the $e^\alpha$ from the probability density. That's where our Jacobian factor will come from: when we rewrite $e^\alpha$ in terms of $e^x$.
Geometrically, what's going on is this: go back to the idea of an integral as the limit of a Riemann sum. When we change variables from $\alpha$ to $x$, we're picking up different amounts of length (generally: volume or hypervolume) in $x$ space compared to when we were in $\alpha$ space, but this is entirely captured by the length of $e_x$ vs. $e_\alpha$.
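The Riemann-sum picture can be checked numerically. The sketch below is only an illustration under stated assumptions: the amplitude `X0`, the interval endpoints, and the function names are hypothetical, and the interval is restricted to $[0, \pi/2)$, where $x = x_0 \sin \alpha$ is increasing. It sums the same probability once in equal slices of $\alpha$ and once in equal slices of $x$, weighting the latter by $\left(\frac{dx}{d\alpha}\right)^{-1}$.

```python
import math

X0 = 1.5  # amplitude x_0; a hypothetical value chosen for illustration


def rho_alpha(alpha):
    """Uniform density of the phase on [0, 2*pi)."""
    return 1.0 / (2.0 * math.pi)


def riemann_sum_alpha(a, b, n=10_000):
    """Midpoint Riemann sum of rho_alpha over [a, b] in equal slices of alpha."""
    d_alpha = (b - a) / n
    return sum(rho_alpha(a + (i + 0.5) * d_alpha) * d_alpha for i in range(n))


def riemann_sum_x(a, b, n=10_000):
    """Same probability, summed in equal slices of x = X0 * sin(alpha)."""
    xa, xb = X0 * math.sin(a), X0 * math.sin(b)
    dx = (xb - xa) / n
    total = 0.0
    for i in range(n):
        x = xa + (i + 0.5) * dx
        alpha = math.asin(x / X0)                  # inverse map on this branch
        d_alpha_dx = 1.0 / (X0 * math.cos(alpha))  # (dx/dalpha)^(-1)
        total += rho_alpha(alpha) * d_alpha_dx * dx
    return total


a, b = 0.2, 1.2  # both inside [0, pi/2), where sin is increasing
p_alpha = riemann_sum_alpha(a, b)
p_x = riemann_sum_x(a, b)
print(p_alpha, p_x)
```

The equal slices of $x$ correspond to unequal slices of $\alpha$, and the factor $\left(\frac{dx}{d\alpha}\right)^{-1}$ is what compensates for that, so the two sums should agree up to discretization error.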
So let's continue:
$$P_x (x(b)) - P_x(x(a)) = \int_{x(a)}^{x(b)} (p_\alpha \circ \alpha)(x) e^\alpha \cdot e_x \, dx$$
Now we can plug in the transformation law, which is well known from the transformation of tensors: $e^\alpha = \frac{d\alpha}{dx} e^x$. This treats $\alpha$ as a function of $x$, but you can rewrite $\frac{d\alpha}{dx}$ as $\left(\frac{dx}{d\alpha}\right)^{-1}$ by the inverse function theorem. Then just use $e^x \cdot e_x = 1$, and you're done.
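Written out, the last two steps might look as follows. This is a sketch on a single branch where $x(\alpha)$ is monotonic. The absolute value and the sum over branches are not spelled out above; they are added here on the assumptions that a density must be non-negative and that $x = x_0 \sin \alpha$ takes each value in $(-x_0, x_0)$ twice per period.

$$
\begin{aligned}
(p_\alpha \circ \alpha)(x)\, e^\alpha \cdot e_x \, dx
  &= (p_\alpha \circ \alpha)(x)\, \frac{d\alpha}{dx}\, e^x \cdot e_x \, dx
   = (p_\alpha \circ \alpha)(x) \left(\frac{dx}{d\alpha}\right)^{-1} dx, \\
p_x(x) &= p_\alpha(\alpha(x)) \left|\frac{dx}{d\alpha}\right|^{-1}
  \quad \text{(one branch)}.
\end{aligned}
$$

The absolute value appears because, where $\frac{dx}{d\alpha} < 0$, the limits satisfy $x(b) < x(a)$; swapping them so that the integral runs in the direction of increasing $x$ flips the sign of the integrand. For $x = x_0 \sin \alpha$ this gives $\frac{1}{2\pi x_0 |\cos\alpha|} = \frac{1}{2\pi\sqrt{x_0^2 - x^2}}$ per branch. This is also why $|\cos\alpha|$ is needed rather than $\cos\alpha$: on half of $[0, 2\pi)$ the cosine is negative, while $\sqrt{1 - \sin^2\alpha}$ is never negative. Adding the two branches that map to the same $x$ gives

$$
p_x(x) = \frac{1}{\pi \sqrt{x_0^2 - x^2}}, \qquad -x_0 < x < x_0,
$$

which integrates to $1$ over $(-x_0, x_0)$.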
I've given a somewhat involved argument for the change of variables in 1d integration, but I think it's important to emphasize that the rule is not arbitrary, nor is it achieved by wanton mathematical voodoo: it can be derived by basic reasoning about Riemann integrals in a geometric picture, and these concepts apply more generally to multivariable calculus. This should also emphasize the connection between tensor densities and probability densities.
Finally, I think it is very useful to think of $p_\alpha e^\alpha$ as the actual geometric quantity, at least in some situations, as this makes the transformation laws to other coordinates (other random variables) follow automatically.
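As a sanity check on the original problem, here is a minimal Monte Carlo sketch using only the Python standard library. The amplitude, sample count, seed, intervals, and function names are all hypothetical choices for illustration. It assumes the density of $x$ is $\frac{1}{\pi\sqrt{x_0^2 - x^2}}$, that is, the single-branch expression $\frac{1}{2\pi\sqrt{x_0^2 - x^2}}$ summed over the two values of $\alpha$ in $[0, 2\pi)$ that give the same $x$.

```python
import math
import random


def sample_positions(x0, n, seed=0):
    """Draw alpha uniformly on [0, 2*pi) and return x = x0 * sin(alpha)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [x0 * math.sin(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]


def exact_probability(a, b, x0):
    """Integral of 1 / (pi * sqrt(x0^2 - x^2)) from a to b (both branches summed)."""
    return (math.asin(b / x0) - math.asin(a / x0)) / math.pi


def empirical_probability(samples, a, b):
    """Fraction of samples that fall in [a, b)."""
    return sum(1 for x in samples if a <= x < b) / len(samples)


x0 = 2.0  # hypothetical amplitude
samples = sample_positions(x0, 200_000)
INTERVALS = [(-2.0, -1.0), (-1.0, 0.0), (0.0, 1.0), (1.0, 2.0)]

for a, b in INTERVALS:
    print(
        f"[{a:+.1f}, {b:+.1f}): "
        f"empirical={empirical_probability(samples, a, b):.4f} "
        f"exact={exact_probability(a, b, x0):.4f}"
    )
```

The outer intervals should collect more samples than the inner ones, reflecting the fact that the particle moves slowest, and so spends the most time, near the turning points $x = \pm x_0$.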