Two generators of random points in a disk

104 Views Asked by At

I have two ways to generate points in a disk:

The first is: $ a, b \sim U[0, 1]. $ Point is generating the next way: $(a \cos(2\pi b), a \sin(2 \pi b))$.

The second is: $a, b \sim U[-1, 1]$ and point generating only if $a^2 + b^2 \le 1$.

Suppose I've given points already distributed. I want to derive by what method (first or second) it was produced. By the way graph of this two method is: enter image description here

It's easy to see, that for the first method points are more 'squeezed' to zero than for the second method.

My approach: is very clumsy. I'm finding probability that points are more concentrated to the center in first method, i.e. I'm finding $\mathbb{P}(-0.2 < x < 0.2)$ for each method. I'm doing it using Cumulative Distribution Function estimation: $$ \tilde F(t)=\frac{1}{n + 1} \left(\frac{1}{2} + \sum_{i: x_i < t} 1 \right) $$ For the first it is near the $0.4$, while for the second it is more near the $0.25.$

Any ideas how to solve it more mathematical way?

1

There are 1 best solutions below

5
On BEST ANSWER

The test you suggest is reasonable; let's spice it up a bit.

The second sampling scheme you mention is called acceptance-rejection sampling and produces data that are uniformly distributed on the unit disk $B=B(0,1)$. So we can test the hypothesis that our data is produced from the distribution $X \sim U(B)$; i.e.

$$\begin{align*} H_0&: X \sim U(B) \\ H_1&: X \not\sim U(B) \end{align*}$$

A general statistical test for the hypothesis that data arises from a certain distribution is the $\chi^2$ test. To run it in this case, pick radii $0 = r_0 < r_1 < r_2 < \ldots < r_m = 1$. Now, we let $d_i$ be the number of points contained between the circles of radius $r_{i-1}$ and $r_i$. Under the null hypothesis, you expect $d_i$ to be close to $E_i = (r_i^2 - r_{i-1}^2) n$, where $n$ is the total number of data points in your sample.

We can then find the observed $\chi^2$ statistic for this test by computing: $$L^2 = \sum_{i=1}^m \frac{(d_i - E_i)^2}{E_i} $$ Under the null hypothesis $L^2$ is approximately $\chi^2$-distributed with $m-1$ degrees of freedom, so you may use the probability $P(\chi_{m-1}^2 > L^2)$ as a decision tool to accept or reject the null.