I was investigating the distribution of the numbers in a Farey sequence and found some pattern.
It is known that the number of elements in Farey sequence can be found using Euler totient function. So let the number of elements in the Farey sequence of order $n$ be $\left | F_n \right |$.
Also let's divide the [0, 1] number line into intervals: $(0; \frac{1}{n}]$, $(\frac{1}{n}, \frac{1}{n - 1}]$, $(\frac{1}{n-1}, \frac{1}{n - 2}]$, ..., $(\frac{1}{3}, \frac{1}{2}]$, $(\frac{1}{2}, 1]$ and count the number of elements in each interval.
If one will take intervals in a reverse order ($(\frac{1}{2}, 1]$ is the first one, $(\frac{1}{3}, \frac{1}{2}]$ is the second and so on) I found that number of elements in the first interval is exactly half of the $\left | F_n \right |$. This is not really exciting and easy to prove seeing that elements in the sequence are symmetrical around $1/2$.
What I found surprising is that the number of elements in each other interval is approximately equal to $\frac{\left | F_n \right |}{6}$, $\frac{\left | F_n \right |}{12}$, $\frac{\left | F_n \right |}{20}$, $\frac{\left | F_n \right |}{30}$, $\frac{\left | F_n \right |}{42}$.
Which brings me to a hypothesis that the number of elements in the i-th interval from the right is approximately equal to:
$$\frac{\left | F_n \right |}{i \cdot (i + 1)}$$
Which sounds kind of correct, because if I will add the elements in all intervals, I will get:
$$\sum_{i=1}^{n }\frac{\left | F_n \right |}{i \cdot (i + 1)} = \left | F_n \right |\sum_{i=1}^{n }\frac{1}{i \cdot (i + 1)}=\left | F_n \right |\cdot (1 - \frac{1}{n + 1})\approx \left | F_n \right |$$
I have a couple of questions:
- where can I read more information about this approximation (I assume that someone has studied it)?
- is there a way to calculate precisely* the number of elements in each interval (or is there a way to prove "my" approximation)?
P.S.* I know that I can calculate the numbers by generating all the elements in the sequence and then going through each of them and checking which interval it belongs to (this is how I found the hypothesis). The problem is that the sequence grows quadratically and takes to much time. So I am looking for a way to calculate the numbers without generating the sequence.