There are some steps that I don't understand in the following problem. It would be great if someone could clear them up; I've been stuck on this for days:
1. How is the first inequality in (1.15) derived?
2. What do they mean by taking $b \searrow 1$ along a sequence? How is that any different from just taking a "normal" limit $\lim_{b\to 1}$?
3. How is (**) derived?
4. It looks like there's a contradiction in the conclusions drawn from the Borel-Cantelli lemma. In (*) they say "with probability one, $R_{m^r}\le \ell_m$ for large enough $m$", which means infinitely often, correct? And I agree that this is the result, by looking at (1.15) and realizing that summing over $r$ yields a convergent geometric series. But in (***) they again use Borel-Cantelli to say the opposite thing: "by the Borel-Cantelli lemma, with probability one, $R_{n}< \ell_n$ happens only finitely often". Is this second statement wrong?


For (1): As user51547 mentioned, this is just a union bound. If $R_{m^r} \ge \ell_m$, so that $X_1, \dots, X_{m^r}$ contains a run of length $\ell_m$, then there must exist some starting value $i \le m^r$ such that $X_i = \dots = X_{i+\ell_m - 1} = 1$. (Call this statement S.)
As you mention in comments, there will in fact be an $i$ which satisfies not only $i \le m^r$ but indeed $i \le m^r - \ell_m + 1$. But that doesn't make our claim any less true: it's claimed to be an implication, not an equivalence. In working with estimates, it's very common to be "sloppy" and use a fact that is weaker than something else that is also clearly true, if it is still strong enough to get the desired conclusion and makes the subsequent steps simpler. If we had to work with $m^r - \ell_m + 1$ instead of just $m^r$ in all the later steps, the algebra would get more complicated.
So restating the above statement S in terms of events, we have $$\{ R_{m^r} \ge \ell_m\} \subset \bigcup_{i=1}^{m^r} \{X_i = \dots = X_{i+\ell_m-1} = 1\}.$$ (Remember that a logical statement like "if A then B" translates into the subset relation $A \subset B$ on the corresponding events, and "there exists" translates into a union.) So by the union bound $$P(R_{m^r} \ge \ell_m) \le \sum_{i=1}^{m^r} P(X_i = \dots = X_{i+\ell_m-1} = 1).$$ Since the $X_n$ are iid, for any $i$ we have $P(X_i = \dots = X_{i+\ell_m-1} = 1) = p^{\ell_m}$, and so the sum on the RHS simply equals $m^r p^{\ell_m}$.
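(Not part of the book's argument, but if you want a quick numerical sanity check of the bound $P(R_n \ge \ell) \le n p^{\ell}$, here is a small Monte Carlo sketch. The helper `longest_run` and the values of `p`, `n`, `ell`, and `trials` are arbitrary choices of mine, not from the book.)

```python
import random

def longest_run(xs):
    """Length of the longest run of 1s in a 0/1 sequence."""
    best = cur = 0
    for x in xs:
        cur = cur + 1 if x == 1 else 0
        best = max(best, cur)
    return best

# Arbitrary illustrative parameters, not from the book.
p, n, ell, trials = 0.5, 200, 10, 20000
random.seed(0)

hits = sum(
    longest_run([1 if random.random() < p else 0 for _ in range(n)]) >= ell
    for _ in range(trials)
)

print(f"estimated P(R_n >= ell) ~ {hits / trials:.4f}")  # typically around 0.09
print(f"union bound  n * p**ell = {n * p**ell:.4f}")     # 0.1953
```

The bound is visibly loose, but loose is fine: all the proof needs is that it's small enough to be summable.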
For (2): Note that the preceding inequality $\limsup_{n \to \infty} \frac{R_n}{\log n} \le -\frac{b}{\log p}$ has an implicit "almost surely". In other words, if we let $E_b = \{ \limsup_{n \to \infty} \frac{R_n}{\log n} \le -\frac{b}{\log p} \}$, we have shown that for every $b$ we have $P(E_b) = 1$.
Now your "normal limit" strategy would translate to the following. We know from elementary arithmetic that for a given real number $x$, if for every $b > 1$ we have $x \le -\frac{b}{\log p}$, then we have $x \le -\frac{1}{\log p}$. (Keep in mind here that $\log p < 0$.) As such, for every $\omega \in \bigcap_{b > 1} E_b$, we have $\limsup \frac{R_n(\omega)}{\log n} \le -\frac{b}{\log p}$ for all $b > 1$, and consequently $\limsup \frac{R_n(\omega)}{\log n} \le -\frac{1}{\log p}$. But unfortunately $\bigcap_{b > 1} E_b$ is an uncountable intersection, so it's not even obvious that it is measurable, let alone that $P(\bigcap_{b > 1} E_b) = 1$ which is what we would need in order to conclude.
To fix this, the standard trick is to "take the limit along a sequence", which is what the book is suggesting. Fix any sequence $b_k \searrow 1$, i.e. such that $b_k > 1$, $b_k \to 1$ as $k \to \infty$, and $b_k$ is monotone decreasing (we don't actually need the last condition here). For example, we could take $b_k = 1+1/k$. Now we can use countable additivity to say that $E := \bigcap_{k=1}^\infty E_{b_k}$ is measurable and that $P(E) = 1$. And for $\omega \in E$ we have $$\limsup \frac{R_n(\omega)}{\log n} \le -\frac{b_k}{\log p} \quad \forall k$$ which by elementary calculus implies $$\limsup \frac{R_n(\omega)}{\log n} \le -\frac{1}{\log p}.$$ So we have the desired conclusion that $$P\left(\limsup \frac{R_n(\omega)}{\log n} \le -\frac{1}{\log p}\right) = 1.$$
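Spelled out, the countability step is just (sub)additivity applied to the complements: $$P(E^c) = P\left(\bigcup_{k=1}^\infty E_{b_k}^c\right) \le \sum_{k=1}^\infty P(E_{b_k}^c) = 0,$$ and hence $P(E) = 1$.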
For (3): let's just translate the set statement back into logic. If $R_n < \ell_n$, then $X_1, \dots, X_n$ does not contain any success run of length $\ell_n$. In particular, for any $0 \le i \le \lfloor n/\ell_n\rfloor - 1$, the $\ell_n$ trials numbered $i \ell_n + 1, \dots, (i+1)\ell_n$ do not form a success run, which is to say that we do not have $X_{i \ell_n + 1} = \dots = X_{(i+1)\ell_n} = 1$; so the event $A_i$ has not happened. In set language, this says that $\{R_n < \ell_n\} \subset A_i^c$. Since $i$ was arbitrary, we in fact have $\{R_n < \ell_n\} \subset \bigcap_{i=0}^{\lfloor n/\ell_n \rfloor - 1} A_i^c$, which is (**).
As before, the "in particular" points out where we use a statement weaker than another one that is obviously true, but which is still enough for our purposes and more convenient. Again, we only need implication and not equivalence.
For (4): No, "for large enough $m$, XXX" means "there exists $M$ such that for all $m \ge M$ we have XXX". That's equivalent to saying "for all but finitely many $m$, we have XXX", or as some people would say, "XXX almost always". This is strictly stronger than "XXX infinitely often". (To keep them straight, it's nice to think about the statement "$m$ is even": it is true infinitely often, but not true for all large enough $m$. It's the difference between saying a subset of $\mathbb{N}$ is infinite, and saying that it is cofinite.)
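In set notation, for a generic sequence of events $B_m$: $$\{B_m \text{ i.o.}\} = \bigcap_{M=1}^\infty \bigcup_{m \ge M} B_m, \qquad \{B_m \text{ a.a.}\} = \bigcup_{M=1}^\infty \bigcap_{m \ge M} B_m,$$ and the a.a. event is contained in the i.o. event, matching "cofinite implies infinite".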
And this is in fact what the proof gives us. Applying Borel-Cantelli to (1.15), we have $P(R_{m^r} \ge \ell_m \text{ i.o.}) = 0$. De-Morganizing this gives us $P(R_{m^r} < \ell_m \text{ a.a.}) = 1$. (If you want to convince yourself, rewrite in terms of unions and intersections, as below.) That is actually stronger than the underlined statement (*), which amounts to $P(R_{m^r} \le \ell_m \text{ a.a.}) = 1$, i.e. with $\le$ instead of $<$. (Again, no harm in being sloppy if you end up where you want to be.)
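Concretely: taking complements in the displays above swaps unions with intersections, so $$\{B_m \text{ i.o.}\}^c = \bigcup_{M=1}^\infty \bigcap_{m \ge M} B_m^c = \{B_m^c \text{ a.a.}\};$$ applying this with $B_m$ the event $\{R_{m^r} \ge \ell_m\}$ from (1.15) turns $P(B_m \text{ i.o.}) = 0$ into $P(B_m^c \text{ a.a.}) = 1$, which is exactly the De-Morganized statement.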
The statement (***) is also fine. Borel-Cantelli shows that the probability of $R_n < \ell_n$ infinitely often is 0, which is to say that the probability of $R_n < \ell_n$ only finitely often is 1. We could also, as before, phrase this as "with probability 1, $R_n \ge \ell_n$ almost always". And so we do get the desired statement about the liminf.
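And just to see the two halves meet numerically: if (as it appears) the book is heading for $\lim_{n} R_n / \log n = -\frac{1}{\log p}$ almost surely, i.e. $R_n \approx \log_{1/p} n$, a quick simulation shows the ratio settling near 1. (An illustrative sketch only; $p$ and the sample sizes are arbitrary choices, and `longest_run` is the same hypothetical helper as in the earlier sketch.)

```python
import math
import random

def longest_run(xs):
    # Same helper as in the earlier sketch.
    best = cur = 0
    for x in xs:
        cur = cur + 1 if x == 1 else 0
        best = max(best, cur)
    return best

p = 0.5  # arbitrary choice
random.seed(1)

for n in (10**3, 10**4, 10**5, 10**6):
    xs = [1 if random.random() < p else 0 for _ in range(n)]
    print(f"n = {n:>7}:  R_n / log_(1/p) n = {longest_run(xs) / math.log(n, 1/p):.3f}")
```

Don't expect the ratio to be exactly 1: the fluctuations of $R_n$ around $\log_{1/p} n$ stay of constant order, so the convergence of the ratio is slow.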