Simulated data not matching with calculated probability function

I love the FiveThirtyEight Riddler puzzle. I try to do it as often as possible, but I'm not really trained in math or stats. This week's puzzle has you calculating the expected distance after two random "jumps" in a 2D plane. Each jump can be in any direction (0-360 degrees) and always has the same length (let's say r). The problem asks for the expected distance from the starting point.

I simplified this problem to start with the second jump (since the first will always be distance r). I tried to perform a change of variable to the probability function:

$$P[\Theta] = \frac{\Theta}{2\pi}$$

using the function

$$L = 2r\sin\left(\frac{\theta}{2}\right) \longrightarrow \theta = 2\arcsin\left(\frac{L}{2r}\right)$$

where L is the distance from the starting point (see figure below).

[Figure: distance jumped]

After I do the rearranging and take the Jacobian of the new function, the probability function I end up with is: $$P(L) = \frac{2\arcsin\left(\frac{L}{2r}\right)}{\pi\sqrt{1-\frac{L^2}{4r^2}}}$$

I multiply by 2 since the function is symmetric about 180-degrees.

However, if I integrate over this function from 0 to 2r, the result is pi/2, not 1 as I would expect...

If I simulate the jumps and plot the simulations over this probability density function, they are "similar" but far from identical:

[Figure: simulation, setting r=1]

Where am I going wrong with calculating the probability density function?

Thanks!

Best answer

As established in the comments, the proper probability density as a function of $\theta$ is $p(\theta)=1/(2\pi)$, not $p(\theta)=\theta/(2\pi)$. The latter is instead the cumulative distribution function.
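
As a sketch of how the change of variables would go with the uniform density, assume $\theta$ is the angle appearing in the question's formula $L = 2r\sin(\theta/2)$ and that it is uniform on $[0, 2\pi)$. Each value of $L$ in $(0, 2r)$ is then reached by two angles, $\theta$ and $2\pi-\theta$, which accounts for the factor of 2:

$$p(L) = 2\cdot\frac{1}{2\pi}\left|\frac{d\theta}{dL}\right| = \frac{1}{\pi}\cdot\frac{1}{r\sqrt{1-\frac{L^2}{4r^2}}} = \frac{2}{\pi\sqrt{4r^2-L^2}}, \qquad 0 \le L < 2r$$

Under these assumptions the density is properly normalized:

$$\int_0^{2r}\frac{2}{\pi\sqrt{4r^2-L^2}}\,dL = \frac{2}{\pi}\left[\arcsin\left(\frac{L}{2r}\right)\right]_0^{2r} = \frac{2}{\pi}\cdot\frac{\pi}{2} = 1$$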

The answer is that, for a step of one unit, the expected distance from the starting point after two steps is $4/\pi \approx 1.27 > 1$. Similarly, planets in roughly circular orbits coplanar with Earth's are on average farther from Earth than one would calculate by just averaging the nearest and farthest distances.
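
A quick numerical check of the $4/\pi$ value can be sketched as below. It assumes both jump directions are independent and uniform on $[0, 2\pi)$. The names `simulate_two_jumps` and `cdf` are illustrative only, and the closed form used in `cdf`, $P(L \le \ell) = \frac{2}{\pi}\arcsin\left(\frac{\ell}{2r}\right)$, is what one gets from $L = 2r\sin(\theta/2)$ with uniform $\theta$, so treat it as a consequence of that assumption rather than something stated in the question.

```python
import math
import random


def simulate_two_jumps(n, r=1.0, seed=0):
    """Return the final distance from the origin for n simulated pairs of jumps."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    distances = []
    for _ in range(n):
        # Two independent directions, each uniform on [0, 2*pi)
        a1 = rng.uniform(0.0, 2.0 * math.pi)
        a2 = rng.uniform(0.0, 2.0 * math.pi)
        x = r * (math.cos(a1) + math.cos(a2))
        y = r * (math.sin(a1) + math.sin(a2))
        distances.append(math.hypot(x, y))
    return distances


def cdf(length, r=1.0):
    """Assumed closed form for P(L <= length), valid for 0 <= length <= 2r."""
    return (2.0 / math.pi) * math.asin(length / (2.0 * r))


distances = simulate_two_jumps(200_000)
mean_distance = sum(distances) / len(distances)
frac_below_1 = sum(d <= 1.0 for d in distances) / len(distances)

print("simulated mean:", mean_distance, "vs 4/pi:", 4.0 / math.pi)
print("simulated P(L <= 1):", frac_below_1, "vs closed form:", cdf(1.0))
```

With a few hundred thousand samples, the simulated mean and the simulated fraction of distances below 1 should both land close to the closed-form values, with small differences from sampling noise.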