I wrote a math problem that went like this:
Alice writes out all integers from 1 to $n$ on a blackboard. Each round, if there are still numbers on the board, Alice chooses a number on the board at random and erases that number and all multiples of that number. What is the expected number of rounds until there are no numbers left on the blackboard.
I had a somewhat closed form solution of:
We do a classical double counting argument, we calculate the probability that any integer $k$ is chosen. The probability that $k$ is chosen is $\frac{1}{d(k)}$ where $d(k)$ is the number of divisors of $k$. This is true because it is equally likely that $k$ or any of it's divisors is chosen. By linearity of expectation, we can take the sum of the expected values that each individual integer is chosen. This evaluates to $\sum_{k=1}^{n} \frac{1}{d(k)}$
Now, I'm curious if there's a way to further condense/bound this final sum. We can use HM-AM to bound it since the sum of divisors from $1$ to $n$ is well known, however I ran a program and the ratio between the HM and AM is ~$1.9$ for $n=100,000$.
Any help is appreciated.
Thanks !
This response can in no way be construed as an answer. It is posted as such, simply for legibility.
If by chosen, you mean erased on the first round, then I disagree. On the first round, the more divisors that $k$ has, the greater the probability that $k$ is erased.
In my opinion, asking whether a specific number $k$ will be erased on a specific round $r$, where $r > 1$ is a very complicated question. Certainly, this would require that the number $k$ not be erased on any of the prior rounds. And certainly, the chance of $k$ being erased on one of the prior rounds would somehow increase as $d(k)$ increases. By "somehow", I intend that expressing the probability as a formula involving $d(k)$ may not be easy.
Further, under the assumption that $k$ has not been erased prior to round $r$, computing the probability that $k$ will be erased on round $r$ could be a nightmare. On the $r$-th round, you have to ask how many other numbers there are expected to be that are not divisors of $k$.
The actual question that you are asking is
Suppose that you start with a set like $\{1, 2, \cdots, 100\}$ and you rank each number $k$ by its value $d(k)$. The more top-heavy the set is, the greater the expected number of numbers will be erased on a single round.
However, if you are asserting that the set $\{1,2, \cdots, 100\}$ is expected to require $\left[\sum_{k=1}^{100} ~\frac{1}{d(k)}\right]~$ rounds, I would like to see a proof of this.
Start of Edit
It just occurred to me. Perhaps what the OP was referring to by
is that whenever a number $k$ is erased, the chance that the erasure occurred because the number $k$ itself was chosen, rather than one of its smaller divisors is
$$\frac{1}{d(k)}.$$
This is certainly true. Further, perhaps I have a blind spot here. However, I am having trouble seeing how you can use this fact to directly compute the expected number of rounds that will be required to erase all numbers.
It also just occurred to me is that the question of how many rounds will be required is equivalent to asking what is the expected number of rounds that it will take before the number 1 is chosen. The erasure completes when and only when the number 1 is chosen.
The problem is that not only is this sampling without replacement, but the sampling erases a variable amount of numbers on each round.
That is, if the number $k$ is chosen on a round, then
$$\left\lfloor \frac{n}{k} \right\rfloor$$
is the maximum amount of numbers that could be erased on the round. The reason that it is the maximum is that some of the multiples of $k$ might already have been erased.
Even with this new insight, I still regard this problem as extraordinarily complicated.
End of Edit
Assuming that you agree that there is no obvious formula for the expected number of rounds that will be required, one approach to attacking this question is to start by doing simulations on a computer. Let $n$ vary from $10$ through $100$. For each value of $n$, assume that you start with the set $\{1,2,\cdots, n\}.$ For each value of $n$, run the simulation 1000 times. This should be safe, since a single simulation can not take more than $n$ rounds. Naturally, you would need some sort of random number generator, to randomly pick 1 number out of a group of numbers.
Have the computer provide both the mean number of rounds of the 1000 simulations, and some sort of display of the distribution of the number of rounds needed for each simulation.
Then, you can look for patterns, and try to reverse engineer formulas based on these patterns.
Obviously, even here, you could still be faced with a nightmare.