Note: Posted on MO since (2) is still open on MSE.
Let $p_n$ be the $n$-th prime and let $c_n$, with $p_n < c_n < p_{n+1}$, be the composite number in this prime gap whose largest prime factor $l_n$ is the largest among all composites in the gap. If the largest prime factor occurs in two or more composites, then we take $c_n$ to be the smallest among them.
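To make the definition concrete, here is a minimal Python sketch that computes $c_n$ and $l_n$ directly from the definition by trial division. The function names (`largest_prime_factor`, `is_prime`, `gap_data`) are my own and only meant as an illustration; this is far too slow for the large $n$ used in the experiments, but it is handy for checking small cases by hand.

```python
def largest_prime_factor(k):
    """Largest prime factor of k >= 2, found by trial division."""
    largest = 1
    d = 2
    while d * d <= k:
        while k % d == 0:
            largest = d
            k //= d
        d += 1
    # Whatever remains (> 1) is a prime larger than every divisor found so far.
    return k if k > 1 else largest


def is_prime(k):
    return k >= 2 and largest_prime_factor(k) == k


def gap_data(count):
    """Return [(n, p_n, c_n, l_n)] for n = 2, ..., count + 1."""
    rows = []
    n, p = 2, 3  # p_1 = 2 is skipped: the gap (2, 3) contains no composite
    while len(rows) < count:
        c_best, l_best = 0, 0
        k = p + 1
        while not is_prime(k):
            l = largest_prime_factor(k)
            if l > l_best:  # strict '>' keeps the smallest c when l ties
                c_best, l_best = k, l
            k += 1
        rows.append((n, p, c_best, l_best))
        n, p = n + 1, k  # k is now the next prime p_{n+1}
    return rows


for row in gap_data(5):
    print(row)
# (2, 3, 4, 2)
# (3, 5, 6, 3)
# (4, 7, 10, 5)
# (5, 11, 12, 3)
# (6, 13, 14, 7)
```

For example, the gap between $7$ and $11$ contains $8, 9, 10$ with largest prime factors $2, 3, 5$, so $c_4 = 10$ and $l_4 = 5$.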
Question 1: Is it true that
$$ \lim_{n \to \infty}\frac{1}{n} \left(\frac{c_2 + c_3 + \cdots + c_n}{l_2 + l_3 + \cdots + l_n}\right)\left(\frac{l_2}{p_2} + \frac{l_3}{p_3} + \cdots + \frac{l_n}{p_n}\right) = 1 \tag 1 $$
Moreover, if we look at the individual component $$ \lim_{n \to \infty}\frac{1}{n} \left(\frac{l_2}{p_2} + \frac{l_3}{p_3} + \cdots + \frac{l_n}{p_n}\right) \tag 2 $$ we observe that its value is close to $0.2614$.
Experimental data for $(2)$:
- $n = 10^8$, mean $\approx 0.27815$
- $n = 10^9$, mean $\approx 0.27578$
- $n = 3.5 \times 10^9$, mean $\approx 0.27470$
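For readers who want to reproduce this kind of data on a small scale, here is a rough Python sketch that evaluates the expressions in $(1)$ and $(2)$ for all prime gaps below a bound. It is not the program used for the figures above; the function names and the sieve-based approach are my own assumptions, and at such small $n$ the values are still far from the ones listed.

```python
def largest_prime_factor_sieve(limit):
    """lpf[k] = largest prime factor of k for 2 <= k <= limit (0 for k < 2)."""
    lpf = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if lpf[p] == 0:  # untouched by any smaller prime, so p is prime
            for multiple in range(p, limit + 1, p):
                lpf[multiple] = p  # larger primes overwrite smaller ones
    return lpf


def gap_statistics(limit):
    """Evaluate the expressions in (1) and (2) over all prime gaps below `limit`.

    Returns (n, value_1, value_2), where p_n is the last prime whose
    gap (p_n, p_{n+1}) lies entirely at or below `limit`.
    """
    lpf = largest_prime_factor_sieve(limit)
    primes = [k for k in range(2, limit + 1) if lpf[k] == k]
    sum_c = sum_l = 0
    sum_ratio = 0.0
    n = 1  # index of the current prime; the loop starts at p_2 = 3
    for p, q in zip(primes[1:], primes[2:]):
        n += 1
        # max() returns the first maximiser, i.e. the smallest c on ties.
        c = max(range(p + 1, q), key=lpf.__getitem__)
        sum_c += c
        sum_l += lpf[c]
        sum_ratio += lpf[c] / p
    if n == 1:
        raise ValueError("limit must be at least 5 to contain one gap")
    value_2 = sum_ratio / n                # expression (2), with the 1/n factor
    value_1 = (sum_c / sum_l) * value_2    # expression (1)
    return n, value_1, value_2


print(gap_statistics(10**5))
```

The sieve stores the largest prime factor of every integer up to the bound, so each gap only needs a lookup per composite. Memory grows linearly with the bound, so this does not scale to $n = 10^8$ without a segmented sieve.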
Update (14-Dec-2023): I evaluated the above ratio over consecutive prime gaps for $p_n > 10^{40}$, $p_n > 10^{50}$ and $p_n > 10^{60}$, respectively. The values obtained in these tests were $0.2592$, $0.2551$ and $0.2511$, which show a very slowly decreasing trend.
Question: Does the above limit exist? If yes, what does it converge to?
Disclaimer: this is not a proof or an answer, just a long thought, because it was fun to think about this problem. I'm no mathematician, so the things below are sloppy and need much better definitions. Have a shot at it...
When using a probabilistic approach I would think that we can use:
$$ \lim_{n \to \infty}\frac{1}{n} \left(\frac{l_2}{p_2} + \frac{l_3}{p_3} + \cdots + \frac{l_n}{p_n}\right)=\lim_{n \to \infty}\frac{l_n}{p_n}. $$
where the RHS must be read in a probabilistic sense...
So expanding on that, the average prime gap corresponding to $p_n$ equals $G_n \approx \log(p_n)$. Now let $m$ be an integer multiplier so that $c_n = m\cdot l_{m,n}$. The prime $l_{m,n}$ must then come from a run of consecutive integers of length $N_{m,n}=G_n/m$ around $p_n/m$. This is a rough formulation, but acceptable since the gap $G_n$ is small compared to $p_n$.
The average prime gap at $p_n/m$ equals $G_{m,n}=\log(p_n/m)$. We can estimate the probability that we indeed find our $l_{m,n}$ in the run of length $N_{m,n}$ around $p_n/m$: assume three consecutive primes around $p_n/m$ such that the run of length $N_{m,n}$ lies between the two outer primes. The middle prime, $l_{m,n}$, can now be anywhere between the two outer primes, which are on average separated by $2G_{m,n}$. We then get:
$$ P_{m,n}=\frac{N_{m,n}}{2G_{m,n}}=\frac{\log(p_n)/m}{2\log(p_n/m)}=\frac{\log(p_n)}{2m(\log(p_n)-\log(m))}. $$
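As a quick sanity check of this formula, here is a worked value. The choice $p_n \approx 10^6$ and $m = 2$ is only an illustrative assumption, not a case taken from the computations above:

$$ P_{2,n}=\frac{\log(10^6)}{2\cdot 2\,(\log(10^6)-\log 2)}\approx\frac{13.8155}{4\cdot 13.1224}\approx 0.2632, $$

which is already fairly close to $\frac{1}{2m}=0.25$; the difference comes entirely from the $\log(m)$ term in the denominator.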
We can now calculate the expected value of $l_n$:
$$ l_n=\sum_{m=2}^\infty l_{m,n} P_{m,n}\prod_{k=2}^{m-1}(1-P_{k,n}) $$
where the summation should be limited to $m_{max}=\sqrt{p_n}$, but we don't care about that for now. The product term is an additional weight for each $m$: it represents the probability that no solution is found for any lower value of $m$, which matters because we want the highest $l_{m,n}$ (this was an intuitive move on my part, maybe plain wrong...). With $l_{m,n} \approx p_n/m$:
$$ l_n=\sum_{m=2}^\infty \frac{p_n}{m} P_{m,n}\prod_{k=2}^{m-1}(1-P_{k,n}) $$ $$ \frac{l_n}{p_n}=\sum_{m=2}^\infty \frac{1}{m} P_{m,n}\prod_{k=2}^{m-1}(1-P_{k,n}) $$
For large values of $p_n$, we have $\log(p_n) \gg \log(m)$ for every $m$ that contributes noticeably to the sum, since we can take $p_n$ arbitrarily large and the sum converges well before $m$ becomes comparable to $p_n$. We then get:
$$ P_{m,n}\approx \frac{1}{2m} $$
Using these probabilities we get:
$$ \lim_{n\to\infty} \frac{l_n}{p_n} = 0.22734 $$
Convergence seems to be stable to 4 digits after $m_{max}=1000$ or so.
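The sum can be evaluated with a few lines of Python. This is a minimal sketch, not the original script: the function name `heuristic_ratio` and the optional `log_p` argument (which switches from the limiting $P_{m,n} \approx \frac{1}{2m}$ back to the full expression for $P_{m,n}$ at a finite $p_n$) are my own assumptions.

```python
import math


def heuristic_ratio(m_max, log_p=None):
    """Sum_{m=2}^{m_max} (1/m) * P_m * prod_{k=2}^{m-1} (1 - P_k).

    log_p=None uses the limiting P_m = 1/(2m); otherwise
    P_m = log_p / (2 m (log_p - log m)) with log_p = log(p_n).
    """
    if log_p is not None and 2 * math.log(m_max) > log_p:
        raise ValueError("m_max must not exceed sqrt(p_n)")
    total = 0.0
    none_found = 1.0  # running product: no solution for any smaller multiplier
    for m in range(2, m_max + 1):
        if log_p is None:
            p_m = 1.0 / (2 * m)
        else:
            p_m = log_p / (2 * m * (log_p - math.log(m)))
        total += p_m * none_found / m
        none_found *= 1.0 - p_m
    return total


# Limiting case: the partial sums should settle close to the value quoted above.
for m_max in (10, 100, 1000, 10000):
    print(m_max, heuristic_ratio(m_max))

# Finite p_n, here p_n = 10^60 as a purely illustrative choice.
print(heuristic_ratio(1000, log_p=60 * math.log(10)))
```

Keeping the product as a running variable avoids recomputing it for every $m$, so the cost is linear in $m_{max}$. Since the full $P_{m,n}$ is always larger than $\frac{1}{2m}$, the finite-$p_n$ value from this model lies above the limiting one.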
The next thing we can do is to abandon purely average prime gaps and add some variation to see what it does numerically. In the above calculation of $\frac{l_n}{p_n}$, I multiplied the gap width by a modifier from a beta distribution with $(\alpha,\beta)=(1,5)$, and multiplied the horizontal axis by 4 so that the mean value of the modifier is 1 (one) and the average prime gap is still correct. Nothing theoretically supported, just to see what happens... The above calculation was wrapped in two for-loops, one over the modifier for $G_n$ and one over the modifier for $G_{m,n}$, which is rather simplistic, but it gives an idea of the numerical impact. Now I get:
$$ \lim_{n\to\infty} \frac{l_n}{p_n} = 0.24737 $$
which is within about 10% of the value $0.27$ obtained from the exact numerical data.
That's as far as I got.... So some questions then: