It is an open question as to whether there are infinite Mersenne primes. However, by the prime number theorem the probability that a large random number $z$ is prime is approximately 1/ln($z$). We can apply this to the Mersenne numbers ($z$ = 2^n-1), to figure out the expected number of Mersenne primes up to (2^n -1) (let's call this function E(n)).
E(n) = 1/ln(2^2-1) + 1/ln(2^3-1) + 1/ln(2^4-1) + ... + 1/ln(2^n-1)
E(n) > 1/ln(2^2) + 1/ln(2^3) + 1/ln(2^4) + ... + 1/ln(2^n)
E(n) > 1/(2ln2) + 1/(3ln2) + 1/(4ln2) + ... 1/(nln2)
E(n) > 1/ln(2) * (1/2 + 1/3 + 1/4 + ... + 1/n)
E(n) > ln(n)/ln(2)
Thus, as n approaches infinity, E(n) approaches infinity. But does this mean that the actual number of Mersenne primes is infinite?
The question: can this sort of probabilistic reasoning be used to prove the infinitude of Mersenne primes?
The only way there could not be infinite Mersenne primes is if the Mersenne numbers were systematically less likely to be prime than other numbers in their vicinity (to use a trivial example, the even numbers are less likely to be prime).
We can test how well this model matches reality using the list of 50 known Mersenne primes. This is the result.
The x-axis is on a logarithmic scale, such that the y-axis is the number of Mersenne primes up until 2^(2^x)-1. The orange line gives the approximation E(n) described above. As E(n) = approx ln(n)/ln(2); E(2^x) = x.
As you can see, the blue line (actual number of Mersenne primes) is consistently and increasingly above the orange line (expected number). I realised that this may be because Mersenne numbers cannot be even, and hence are twice as likely to be prime as a random number of that size. In other words, the formula for expected number of Mersenne primes should perhaps be E(2^x) = 2*ln(2^x)/ln(2) = 2x. This is represented by the yellow line, which is consistently above the actual number of Mersenne primes.
However, there is no particular reason to stop there. Compared to a random integer, Mersenne numbers are more likely to be multiples of 3. 2^(2n)-1 is a multiple of 3, meaning that a random Mersenne number has a 1/2 chance of being a multiple of 3 (and thus is less likely to be prime). Taking this into account, our formula becomes E(2^x)=1.5x. This seems to be a better approximation (green line).
There is no limit to how many prime factors we could take into account. For any prime except 2, a Mersenne number is more likely (compared to a random integer) to be a multiple of it, as 2^(p-1)-1 = pk. The more prime factors we take into account, the more we reduce our estimate of how many Mersenne primes there are.
Question: It seems that the actual number of Mersennes exceeds what this probabilistic model suggests. Why?
