Problem:
Given a positive integer $N$, we perform the following algorithm:
Repeat till $N > 1$:
$\ \ \ N=N-\pi(N)$;
where $\pi(N)$ is number of prime numbers less than or equal to $N$.
As a second part to this question I also need sum of all N's generated at each iteration.
My thoughts: I am sensing something similar to harmonic like $N+N/2+N/3+N/4+...$ but decay is slower.
I wrote a script for various N, here are the results:
N=10 ,iterations=3 ,sum=20
N=100 ,iterations=9 ,sum=330
N=1000 ,iterations=19 ,sum=5141
N=10000 ,iterations=34 ,sum=72539
N=100000 ,iterations=55 ,sum=949457
N=1000000 ,iterations=80 ,sum=11778194
N=10000000, iterations=111, sum=140785844
N=100000000, iterations=147, sum=1638425639
N=1000000000 iterations=189 sum=18690825960
N=10000000000, iterations=236, sum=209971947776
N=100000000000, iterations=288, sum=2330297134561
N=1000000000000, iterations=345, sum=25608180074716
From data as I can see number of iterations is relatively small but having said that the difference in number of iterations between $10^7$ and $10^{12}$ is not much,
Summarise:
Not looking for closed form, looking for asymptotics(with detailed derivation)
Let's try iterating $$ f(y) = y - \frac{y}{\log y} $$ instead. Because $\pi(y)=y/\log y + O(y/\log^2y)$ this should be a reasonable first order estimate.
The basic idea is to see how many iterations of $f$ it takes to get from $x$ to $x/2$. Once we have this, just sum and you're done.
But once you're in $(x/2, x]$ everything is easier because $\frac{y}{\log y}$ is between $\frac{x}{2\log x}$ and $\frac{x}{\log(x/2)}$, so the number of iterations is between $$ \frac{x}{2}\cdot\frac{\log(x/2)}{x} = \frac{\log(x/2)}{2} = \frac12\log x - \frac12\log2 $$ and $$ \frac{x}{2}\cdot\frac{2\log x}{x} = \log x. $$
So the number of iterations is (with $\ell = \lfloor\operatorname{lg}x\rfloor$ the binary logarithm rounded down) between $$ \sum_{k=1}^{\ell} \frac12\log(2^k)-\frac12\log2 = -\frac{\log2}{2}\ell + \frac{\log2}{2}\sum_{k=1}^{\ell} k = \frac{\log2}{4}\ell(\ell+1)-\frac{\log2}{2}\ell = \frac{\log2}{4}\ell(\ell-1) $$ and $$ \sum_{k=1}^{\ell} \log(2^k) = \log2\sum_{k=1}^{\ell} k = \frac{\log2}{2}\ell(\ell+1) $$ modulo a bit of fiddling around the endpoints.
To get a better constant, ro-do this argument with a smaller number than 2, or even use $1+\varepsilon$ and use limits.
For the original problem, then, the number of iterations should then be around $\asymp \log^2x$ with a proportionality constant perhaps around a third.