I'm reading the paper Secure High-Rate Transaction Processing in Bitcoin (https://eprint.iacr.org/2013/881.pdf), which describes the Greedy Heaviest-Observed Sub-Tree (GHOST) rule for main-chain selection.
The paper starts by explaining that a 50% attack is possible even if an attacker possesses less than 50% of the network hash-rate, given that the longest-chain rule is used for main-chain selection and the block size and/or the block creation frequency is increased (the block tree extension rate $\lambda_h$ remains constant, but the main chain extension rate $\beta$ decreases because the number of forks increases).
The paper then introduces the GHOST rule and states:
The advantage of this suggested change to the protocol is that it maintains the security threshold for successful 50% attacks at 1 (rather than $\frac{\beta}{\lambda_{h}}$), even if the network suffers from extreme delays and the attacker does not. This allows the protocol designer to set high block creation rates and large block sizes without the fear of approaching the 50%-attack cliff edge, which in turn implies that a high transaction throughput can be securely maintained.
I'm struggling to understand the proof for the following proposition. I've put in boldface what I don't understand in the proof.
It is imperative to first show that all nodes eventually adopt the same history when following GHOST. For every block $B$ define by $\psi_B$ the earliest moment at which it was either abandoned by all nodes, or adopted by them all. We call the adoption of a block by all nodes the collapse of the fork.
Proposition 2 (The Convergence of History). $Pr(\psi_{B} < \infty) = 1$. In other words, every block is eventually either fully abandoned or fully adopted. Moreover, $E[\psi_{B}] < \infty$.
Proof. Let $D$ be the delay diameter of the network. Assume that at time $t > time(B)$ block $B$ is neither adopted by all nodes nor abandoned by all of them. Denote by $\epsilon_t$ the event in which the next block creation in the system occurs between times $t+D$ and $t+2D$, and then no other block is produced until time $t+3D$. We argue that once such an event occurs, block $B$ is either adopted or abandoned by all nodes. Indeed, between time $t$ and $t+D$ all nodes learn of all existing blocks (as no new ones are manufactured), and therefore each pair of leaves (of the block tree) that have nodes actively trying to extend them must have equal weight subtrees rooted at some common ancestor. A single block is then created which breaks these ties, and another $D$ time units allow it to propagate to all nodes, which causes them to switch to a single shared history. **Notice that $Pr(\epsilon_t)$ is uniformly (in $t$) lower bounded by a positive number, as it doesn't depend on $t$ (as the exponential distribution is memoryless).** Hence the expected waiting time for the first $\epsilon_t$ event is finite (see "Awaiting the almost inevitable" in [19], Chapter 10.11). Finally, the stopping time $\psi_B$ is upper bounded, by definition, by the waiting time for the first $\epsilon_t$, implying $E[\psi_B] < \infty$.
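For concreteness, the lower bound on $Pr(\epsilon_t)$ can be computed explicitly. Assuming, as in the paper's model, that block creations across the whole network form a Poisson process with some total rate $\lambda$ (a modelling assumption; the symbol $\lambda$ here denotes the total network block creation rate), and measuring time relative to $t$:

$$
Pr(\epsilon_t) = \int_{D}^{2D} \underbrace{\lambda e^{-\lambda s}}_{\text{next block at time } s} \cdot \underbrace{e^{-\lambda(3D-s)}}_{\text{no further block until } 3D}\, ds = \lambda e^{-3\lambda D} \int_{D}^{2D} ds = \lambda D\, e^{-3\lambda D} > 0.
$$

The first factor under the integral is the density of the next block arriving $s$ time units after $t$; the second, by memorylessness, is the probability that no further block arrives in the remaining $3D - s$ time units. The result does not depend on $t$ and is strictly positive for every $\lambda$ and $D$, although it becomes exponentially small as $\lambda D$ grows.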
In the introduction it is said that
The advantage of this suggested change to the protocol is that it maintains the security threshold for successful 50% attacks at 1 (rather than $\frac{\beta}{\lambda_{h}}$), even if the network suffers from extreme delays and the attacker does not.
What if I increase the block creation frequency to the point where the average period between blocks is shorter than $D$? Then surely the next block will be created before $t+D$, so the event $\epsilon_t$ need never happen. Why, then, do the authors claim that its probability is lower bounded by a positive number?
The next block will almost always be created before $t+D$, but not always. The block interval is only an average: creation times are exponentially distributed, so there is always a positive probability that all miners get unlucky and no block appears for a full $D$ time units or more. Concretely, if the total network block creation rate is $\lambda$, the probability that no block is created during any given interval of length $D$ is $e^{-\lambda D} > 0$, no matter how large $\lambda$ is, and by memorylessness this does not depend on $t$. So $Pr(\epsilon_t)$ is tiny when blocks come fast, but it is bounded below by a positive constant, which is all the proof needs for the expected waiting time to be finite.
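A quick Monte Carlo check makes this tangible. The sketch below (my own illustration, not from the paper) models network-wide block creation as a Poisson process with rate `lam`, deliberately chosen so that the mean block interval is *shorter* than the delay diameter `D`, and estimates $Pr(\epsilon_t)$: the next block arrives in $[t+D, t+2D]$ and no further block arrives before $t+3D$.

```python
import math
import random

def epsilon_event_once(lam: float, D: float, rng: random.Random) -> bool:
    """One trial: does the event eps_t occur for a window starting at t?

    eps_t requires the next block to arrive in [t+D, t+2D], and then
    no other block to be produced until t+3D.
    """
    s = rng.expovariate(lam)       # time (after t) of the next block
    if not (D <= s <= 2 * D):
        return False
    s2 = rng.expovariate(lam)      # gap until the block after that
    return s + s2 > 3 * D          # silence lasts until t + 3D

def estimate_prob(lam: float, D: float, trials: int = 200_000,
                  seed: int = 1) -> float:
    """Monte Carlo estimate of Pr(eps_t)."""
    rng = random.Random(seed)
    hits = sum(epsilon_event_once(lam, D, rng) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    lam, D = 2.0, 1.0              # mean block interval 0.5 < D = 1
    est = estimate_prob(lam, D)
    exact = lam * D * math.exp(-3 * lam * D)   # closed form: lambda*D*e^(-3*lambda*D)
    print(f"simulated Pr(eps_t) = {est:.5f}, analytic = {exact:.5f}")
```

Even with blocks arriving twice as fast as the network can propagate them, the estimate comes out positive and matches the closed form $\lambda D\, e^{-3\lambda D}$, so the expected number of $3D$-windows one must wait for a collapse is finite (roughly geometric with this success probability), exactly as the proof claims.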