The paper is the following:
https://arxiv.org/pdf/1608.06412.pdf
Right above (5) on page 5, in the first inequality of the line where they invoke Lemma 8, I don't quite see how Lemma 8 is being used.
Lemma 8: Given a $d$-dimensional square diagonal matrix $D$ and an arbitrary $d \times d$ matrix $M$, $\max_i \min_j |\sigma_i(D) - \sigma_j(M)| \leq ||D-M||_{op}$.
The inequality in question is
$||(I + B_j(\Sigma + \lambda I)^{-1})^{-1}||_{op} \leq \frac{1}{1 - \frac{||B_j||_{op}}{\lambda}}$
where $B_j = -(X_j X_j^T + \lambda I)/n$ and $\Sigma = X^T X /n $ where $X$ is just some arbitrary $n \times d$ data matrix.
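Not part of the question per se, but I sanity-checked the inequality numerically (treating $X_j$ as a single $d$-dimensional column vector, which may not exactly match the paper's setup, and with arbitrary values of $n$, $d$, $\lambda$); it held in every random trial:

```python
import numpy as np

# Numerical sanity check of the inequality in question (my own check,
# not from the paper). Assumption: X_j is one d-dimensional column vector.
rng = np.random.default_rng(0)
n, d, lam = 50, 5, 2.0

def opnorm(A):
    # operator norm = largest singular value
    return np.linalg.norm(A, 2)

all_hold = True
for _ in range(200):
    X = rng.standard_normal((n, d))
    Sigma = X.T @ X / n                              # d x d, PSD
    xj = rng.standard_normal(d)
    Bj = -(np.outer(xj, xj) + lam * np.eye(d)) / n   # B_j as defined above
    if opnorm(Bj) >= lam:
        continue  # the RHS is only meaningful when ||B_j||_op < lambda
    lhs = opnorm(np.linalg.inv(
        np.eye(d) + Bj @ np.linalg.inv(Sigma + lam * np.eye(d))))
    rhs = 1.0 / (1.0 - opnorm(Bj) / lam)
    all_hold = all_hold and lhs <= rhs + 1e-12

print(all_hold)  # True in every trial I ran
```

So the bound itself seems fine; my question is only about how Lemma 8 justifies it.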
I assume Lemma 7 is also relevant: for positive semidefinite $M$,
$||(M + \lambda I)^{-1}||_{op} \leq 1/\lambda$
My guess is that they apply Lemma 8 to lower-bound the smallest singular value of $I + B_j(\Sigma + \lambda I)^{-1}$ and then invert, but that runs into the usual issue that the norm of an inverse is not the inverse of the norm (the inequality between the two goes the wrong direction). They presumably also use submultiplicativity of the operator norm together with Lemma 7 to get the $||B_j||_{op}/\lambda$ term on the RHS. I'm just not sure how exactly Lemma 8 is applied because of this issue. Am I missing some matrix identity, possibly Woodbury?
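To make the gap concrete, here is the only derivation I can reconstruct myself (my reading, not necessarily the paper's argument). Write $A = B_j(\Sigma + \lambda I)^{-1}$; submultiplicativity and Lemma 7 give $||A||_{op} \leq ||B_j||_{op} \, ||(\Sigma + \lambda I)^{-1}||_{op} \leq ||B_j||_{op}/\lambda$. Then, assuming $||A||_{op} < 1$,

$$||(I + A)x|| \geq ||x|| - ||Ax|| \geq (1 - ||A||_{op})\,||x|| \;\Longrightarrow\; \sigma_{\min}(I + A) \geq 1 - ||A||_{op},$$

$$||(I + A)^{-1}||_{op} = \frac{1}{\sigma_{\min}(I + A)} \leq \frac{1}{1 - ||A||_{op}} \leq \frac{1}{1 - \frac{||B_j||_{op}}{\lambda}}.$$

If this is right, the singular-value step is what I would expect Lemma 8 (applied with $D = I$ and $M = I + A$) to stand in for, although as stated Lemma 8 only seems to give $\min_j |1 - \sigma_j(I + A)| \leq ||A||_{op}$, i.e. that *some* singular value is close to $1$, not that the smallest one is.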