Why is the distribution bimodal for the random sum $\sum_{n \geq 1} \frac{1}{\text{lcm}^2 (p_n,r_n)}$?

Related to my previous question, but here it's much more apparent numerically that the distribution of this sum is really bimodal:

$$S=\sum_{n \geq 1} \frac{1}{\text{lcm}^2 (p_n,r_n)}$$

Here $p_n,r_n$ are random integers uniformly distributed from $1$ to $n$. I wager that the sum converges, because the expected value of the denominator grows approximately like $n^2$.

Here is the smoothed distribution of two runs with 100 000 trials each and summing up to 25 000 terms:

[Figure: smoothed distribution of $S$, two runs]
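
For readers who want to reproduce the experiment, a minimal sketch of the simulation might look like the following. The names `sample_sum` and `simulate`, the seed, and the reduced default sizes are assumptions made for illustration (the runs above used 100 000 trials and 25 000 terms, which would be slow in pure Python); $p_n$ and $r_n$ are assumed to be drawn independently.

```python
import math
import random

def sample_sum(n_terms, rng):
    """Draw one realization of the partial sum of 1 / lcm(p_n, r_n)^2."""
    total = 0.0
    for n in range(1, n_terms + 1):
        p = rng.randint(1, n)  # uniform on {1, ..., n}, endpoints included
        r = rng.randint(1, n)
        lcm = p * r // math.gcd(p, r)
        total += 1.0 / lcm ** 2
    return total

def simulate(n_trials=1000, n_terms=300, seed=0):
    """Return a list of independent realizations of the truncated sum."""
    rng = random.Random(seed)
    return [sample_sum(n_terms, rng) for _ in range(n_trials)]

samples = simulate()
print(min(samples), max(samples))
```

A histogram or kernel density estimate of `samples` should then show the peaks discussed here, although with far fewer trials the curve will be noisier than in the figure.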

It's actually multimodal (or at least trimodal), but I'm interested in the origin of the largest two maxima. Why do the two peaks appear?

I'm not specifically asking for an estimate of the expected value, but one would be welcome if it's possible.

Also the distribution for the similar sum looks almost the same:

$$S=\sum_{n \geq 1} \frac{1}{\text{lcm} (p_n^2,r_n^2)}$$

Here is the above distribution (two runs with 100 000 trials each and summing up to 25 000 terms):

[Figure: smoothed distribution of the second sum, two runs]

And here are the two distributions plotted together (they seem to be the same):

[Figure: the two distributions plotted together]

Best answer

This is not a complete answer, only a note on why the final distribution is bimodal (or, better, multimodal with two predominant peaks). Because of the combined effect of the square in the denominator and the growth of the expected value of the lcm with increasing $n$, the magnitude of the terms decreases rapidly, starting from the very first values of $n$. Thus, most of the shape of the curves shown in the OP is determined by the terms for the initial values of $n$, which give the dominant contribution to the final sum.
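
The rapid decay can be checked exactly for small $n$. The sketch below computes the expected value of the $n$-th term by brute-force enumeration, assuming $p_n$ and $r_n$ are independent; the helper name `expected_term` is introduced here only for illustration.

```python
from fractions import Fraction
from math import gcd

def expected_term(n):
    """Exact E[1 / lcm(p, r)^2] for p, r independent and uniform on {1, ..., n}."""
    total = Fraction(0)
    for p in range(1, n + 1):
        for r in range(1, n + 1):
            lcm = p * r // gcd(p, r)
            total += Fraction(1, lcm * lcm)
    # Each of the n^2 pairs is equally likely.
    return total / (n * n)

for n in (1, 2, 3, 5, 10, 20):
    print(n, float(expected_term(n)))
```

The first three values are exactly $1$, $\dfrac{7}{16}$ and $\dfrac{77}{324}$, so the expected term has already dropped to under a quarter by $n=3$.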

This behaviour plays an important role in determining the shape of the distribution. To see this, first note that the partial sum for $n=1$ is trivially equal to $1$. For $n=2$, the lcm is $2$ with probability $\dfrac{3}{4}$ (if one of the pairs $(1,2)$, $(2,1)$, $(2,2)$ is chosen), so the second term is $\dfrac{1}{4}$; the lcm is $1$ with probability $\dfrac{1}{4}$ (if the pair $(1,1)$ is chosen), so the second term is $1$. Hence the partial sum of the first two terms is $\dfrac{5}{4}$ in $75\%$ of cases and $2$ in the remaining $25\%$.
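
The two-term case is small enough to enumerate directly. The following sketch lists the four equally likely pairs for $n=2$ and tallies the resulting partial sums with exact fractions; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from math import gcd

# All equally likely pairs (p, r) for n = 2.
pairs = [(p, r) for p in (1, 2) for r in (1, 2)]

# The first term is always 1; the second term is 1 / lcm(p, r)^2.
partial_sums = Counter()
for p, r in pairs:
    lcm = p * r // gcd(p, r)
    partial_sums[1 + Fraction(1, lcm * lcm)] += Fraction(1, len(pairs))

for value, prob in sorted(partial_sums.items()):
    print(value, prob)
```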

Now consider the sum of the remaining terms for $n \geq 3$ (call it $S_{3}^{\infty}$). Its distribution is independent of what was chosen for $n=2$, i.e. it is the same whether the first two terms have given a partial sum of $\dfrac{5}{4}$ or $2$. In other words, the overall distribution is a superposition of two copies of the distribution of $S_{3}^{\infty}$, shifted relative to each other along the $x$-axis: one starting at $x=\dfrac{5}{4}$ and the other at $x=2$. The two copies also have different heights, weighted by the probabilities $\dfrac{3}{4}$ and $\dfrac{1}{4}$ of the two possible sums of the first two terms. This leads to two main peaks separated by $0.75$, with the higher one on the left. This predicted pattern seems to agree well with the first figure of the OP. The two main peaks are also favored by the fact that, as stated above, the magnitude of the terms decreases rapidly with increasing $n$. As a result, the lower of the two copies (the right-hand one, starting at $2$) begins on the descending branch of the left-hand one, creating a well-defined peak.
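
In symbols, writing $T = S_{3}^{\infty}$ (a symbol introduced here only for brevity) and using the independence noted above, the decomposition can be stated in terms of distribution functions:

$$P(S \le x) = \frac{3}{4}\,P\!\left(T \le x - \frac{5}{4}\right) + \frac{1}{4}\,P\!\left(T \le x - 2\right)$$

so the law of $S$ is a mixture of two shifted copies of the law of $T$, with weights $\dfrac{3}{4}$ and $\dfrac{1}{4}$.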

These considerations can be extended to successive values of $n$ to further understand the shape of the overall distribution. Repeating the procedure for higher values of $n$, the overall distribution results from the overlap of multiple shifted copies with different starting points. However, because the terms decrease so rapidly, the resulting peaks quickly become smaller, smoother, and no longer visible. A third small peak that is still identifiable in the first figure of the OP depends on the choice of the two integers for $n=3$. In this case, the lcm is $1$ with probability $\dfrac{1}{9}$ (if the pair $(1,1)$ is chosen) and one of the values $2$, $3$, $6$ with probability $\dfrac{8}{9}$ (if any other pair is chosen). So the third term is $1$ in one case out of nine, and one of the quantities $\dfrac{1}{4}$, $\dfrac{1}{9}$, or $\dfrac{1}{36}$ in the remaining cases. Combining this with the two cases described for $n=2$, and calling $k$ the value of the third term when it differs from $1$ (so $\dfrac{1}{36} \leq k \leq \dfrac{1}{4}$), the partial sum of the first three terms is:

  • $\dfrac{5}{4}+k$ in $\dfrac{3}{4} \cdot \dfrac{8}{9} = \dfrac{2}{3}$ of cases;

  • $2+k$ in $\dfrac{1}{4} \cdot \dfrac{8}{9} = \dfrac{2}{9}$ of cases;

  • $\dfrac{5}{4}+1 = \dfrac{9}{4}$ in $\dfrac{3}{4} \cdot \dfrac{1}{9} = \dfrac{1}{12}$ of cases;

  • $2+1 = 3$ in $\dfrac{1}{4} \cdot \dfrac{1}{9} = \dfrac{1}{36}$ of cases.
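
The four probabilities above can be confirmed by enumeration. This is a minimal sketch assuming independent draws for each $n$; the helper `term_law` and the grouping key are illustrative names, not part of the original argument.

```python
from fractions import Fraction
from itertools import product
from math import gcd

def term_law(n):
    """Exact law of 1 / lcm(p, r)^2 for p, r independent and uniform on {1, ..., n}."""
    law = {}
    for p, r in product(range(1, n + 1), repeat=2):
        lcm = p * r // gcd(p, r)
        term = Fraction(1, lcm * lcm)
        law[term] = law.get(term, Fraction(0)) + Fraction(1, n * n)
    return law

# Group the outcomes of the second and third terms into the four cases:
# key = (sum of the first two terms, whether the third term equals 1).
cases = {}
for (t2, q2), (t3, q3) in product(term_law(2).items(), term_law(3).items()):
    key = (1 + t2, t3 == 1)
    cases[key] = cases.get(key, Fraction(0)) + q2 * q3

for (first_two, third_is_one), prob in sorted(cases.items()):
    print(first_two, third_is_one, prob)
```

The cases are grouped by which terms occurred, not by the numeric value of the sum: the value $\dfrac{9}{4}$ arises both as $\dfrac{5}{4}+1$ and as $2+k$ with $k=\dfrac{1}{4}$.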

By the same reasoning as above (the sum $S_{4}^{\infty}$ of the remaining terms for $n \geq 4$ is independent of the terms already considered), the overall distribution can be seen as a superposition of four overlapping distributions: the first starting between $\dfrac{5}{4}$ and $\dfrac{3}{2}$, the second starting between $2$ and $\dfrac{9}{4}$ (each of these two is itself the overlap of several copies of the $S_{4}^{\infty}$ distribution starting within these ranges, according to the probability distribution of $k$), the third starting at $\dfrac{9}{4}$, and the fourth starting at $3$. The peaks also have different heights (the leftmost is the highest and the others are progressively lower), reflecting the different probabilities of the possible sums of the first three terms. Importantly, because the second and third distributions are very close, they tend to merge. This leads to three peaks of decreasing height. The distance between the first and the second is $\approx 0.75$: the starting points of the first two distributions within their respective $0.25$-wide ranges are determined by the same value $k$, although the merging of the second distribution with the third might shift the second peak slightly to the right, slightly increasing the distance. The distance between the second peak and the third is $1-k$, thus ranging between $0.75$ and $\dfrac{35}{36}$. Again, this predicted pattern seems to agree well with the first figure of the OP, where a third low peak is clearly visible on the right at a distance of about $0.85$ from the second one.

Similar considerations can also explain the multimodal distributions in the second and third figures of the OP.
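
As a side check rather than part of the argument above: the second sum in the question appears to be the same random variable as the first, since $\text{lcm}(p^2, r^2) = \text{lcm}^2(p, r)$ (compare prime exponents: $\max(2a, 2b) = 2\max(a, b)$). The sketch below searches for counterexamples over a small range; the helper `lcm` and the range limit of 200 are arbitrary choices made here.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

# Collect every pair in the range where the two denominators would differ.
mismatches = [
    (p, r)
    for p in range(1, 201)
    for r in range(1, 201)
    if lcm(p * p, r * r) != lcm(p, r) ** 2
]
print(mismatches)
```

An empty list is consistent with the two distributions in the third figure coinciding up to sampling noise.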