Average number of steps in the Collatz conjecture

718 Views Asked by At

I was playing with the Collatz conjecture and decided to check it for the first 100,000,000 natural numbers. I divided this range in equal intervals of 100,000 and calculated the average number of steps to reach 1 in each one of these intervals. By graphing these averages, I got the following result enter image description here

Which seems to show a very clear logarithmic behavior. By doing a least squares regression (shown in blue), I derived that the average number of steps ($S$) for a number $n$ to reach 1 is $$S\approx10.45\ln(n)-3$$ Is there a way to derive this logarithmic relation?

2

There are 2 best solutions below

0
On BEST ANSWER

The function $S(n)$ can have its value estimated by considering the expected value $s(n)$, which corresponds to the average number of steps near $n$. For some "random" $n$, there is a $1/2$ probability that it is even or odd, therefore, $$s(n)=\frac{1}{2}\left(1+s\left(\frac{n}{2}\right)\right)+\frac{1}{2}\left(2+s\left(\frac{3n+1}{2}\right)\right)$$ If $s(n)=b\ln(n)$, then $$b\ln(n)=\frac{1}{2}\left(1+b\ln\left(\frac{n}{2}\right)\right)+\frac{1}{2}\left(2+b\ln\left(\frac{3n+1}{2}\right)\right)$$ Solving for $b$, $$b=\frac{3}{\ln\left(\frac{4}{3+1/n}\right)}$$ which, for $n\rightarrow \infty$, $$b=\frac{3}{\ln\left(4/3\right)}\approx10.43$$ Which matches the value found by the regression.

Note that $s(n)=b\ln(n)+c$ is also a solution to the first equation for any constant $c$, and the value of $b$ does not change as all terms with $c$ are cancelled out. Determining its value is addressed in another post I made.

2
On

This is not an answer but an extended comment

As @Peter already observed, the fluctuations are heavy. So I tried to reproduce your statistic, hoping that the $S=$ number-of-even-steps in the transformation rule "$T(x):$ $3x+1$ when $x$ is odd, $x/2$ when $x$ is even" is what you've evaluated.

I only could go up to $10 \; 000 \; 000$ values at all, so I show the scattering in bins of size=1000 (blue), size=10000 (green), size=100000 (red). (The trendline is in red, based only on the 1000-size bins).
Different from your picture I scale the x-axis logarithmically and get a near linear trend. However, the fluctuations increase with the index of the bins - independently, how large the bins are: an interesting observation on its own.

image1

After the fluctuations increase with the index-number of the bins, I think, a variable/an increasing binsize might be appropriate. Here I show the average number of even steps, when the size of bins are doubled with the index. Moreover, I took as x-coordinate the geometric mean value of the bin-bounds. See this:

image2

Here we get a nice nearly linear trend. The slope $7.216$ divided by $\ln(2)$ gives about $10.4104$ which agrees roughly with the slope that has been found in the OP.
Again: the increasing fluctuations over the bins of equal size seem a remarkable effect to me.