I'm getting stuck on what is perhaps a simple step in the proof of the Hille-Yosida theorem, 13.37 in Rudin's Functional Analysis. I wonder if someone has had this same difficulty before or knows how to get around it.
Setup: $A$ is a densely defined operator with domain $\mathcal{D}(A)$ in a Banach space $X$ and there are constants $C, \gamma >0$ such that for every $\lambda > \gamma$ and $m\in \mathbb{N}$, $$ \| (\lambda I - A)^{-m}\| \leq C(\lambda - \gamma)^{-m}. $$ The claim is then that $A$ is the infinitesimal generator of a semi-group of operators. For small $\varepsilon$, the bounded operator $S(\varepsilon)$ is defined to be $(I-\varepsilon A)^{-1}:X \to \mathcal{D}(A)$. It follows from the definitions that $AS(\varepsilon) = \varepsilon^{-1}(S(\varepsilon)-I)$ from which one can show that $e^{tAS(\varepsilon)}$ converges weakly to a bounded $Q(t)$. Moreover, $\{Q(t) \}$ gives a semi-group. Thus it has an infinitesimal generator $\tilde{A}$. Using the resolvent formulas for $AS(\varepsilon)$ and $\tilde{A}$, we have for all $x$ and $\lambda$ sufficiently large, $$ (\lambda I - \tilde{A})^{-1}x = \int_0^\infty e^{-\lambda t} Q(t)x dt $$ and $$ (\lambda I - AS(\varepsilon))^{-1}x = \int_0^\infty e^{-\lambda t} e^{tAS(\varepsilon)}x dt. $$ One can easily justify the limit $$ \lim_{\varepsilon \to 0} \int_0^\infty e^{-\lambda t} e^{tAS(\varepsilon)}x dt = \int_0^\infty e^{-\lambda t} Q(t)x dt. $$
Question: In order to compare $\tilde{A}$ and $A$, how does one see that $$ \lim_{\varepsilon \to 0} (\lambda I - AS(\varepsilon))^{-1}x = (\lambda I - A)^{-1}x? $$
THANK YOU!
First, multiply your condition on $A$ by $\lambda^m$ and substitute $\lambda=\frac{1}{\epsilon}$; we have $$\|(I-\epsilon A)^{-m}\|\leq C\left(1-\epsilon\gamma\right)^{-m}\to C \tag{1}$$ as $\epsilon\to0$. In particular, when $m=1$, we see that $S(\epsilon)$ is uniformly bounded for all sufficiently small $\epsilon$.
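To spell out that substitution, here is the short worked step; it assumes $0<\epsilon<1/\gamma$, so that $\lambda=1/\epsilon>\gamma$ and the hypothesis applies: $$(\lambda-A)^{-m}=\lambda^{-m}\left(I-\lambda^{-1}A\right)^{-m}\quad\Longrightarrow\quad\left\|\left(I-\lambda^{-1}A\right)^{-m}\right\|\leq C\lambda^{m}(\lambda-\gamma)^{-m}=C\left(1-\frac{\gamma}{\lambda}\right)^{-m}.$$ Putting $\lambda=1/\epsilon$ gives $\|S(\epsilon)^m\|\leq C(1-\epsilon\gamma)^{-m}$. For instance, restricting to $0<\epsilon\leq\frac{1}{2\gamma}$ gives $\|S(\epsilon)\|\leq 2C$.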
Second, note that the argument to show that $$\epsilon AS(\epsilon)=S(\epsilon)-1$$ also shows $$\epsilon AS(\epsilon)=S(\epsilon)-1=\epsilon S(\epsilon)A \tag{2}$$ on $\mathcal{D}(A)$, and so $S(\epsilon)$ commutes with $A$ there.
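For completeness, here is a sketch of why both equalities in (2) hold, writing $I$ for the identity (denoted $1$ above). It uses only that $S(\epsilon)$ is the two-sided inverse of $I-\epsilon A:\mathcal{D}(A)\to X$: $$(I-\epsilon A)S(\epsilon)y=y\quad(y\in X),\qquad S(\epsilon)(I-\epsilon A)x=x\quad(x\in\mathcal{D}(A)).$$ Rearranging the first identity gives $\epsilon AS(\epsilon)y=S(\epsilon)y-y$ for every $y\in X$, and rearranging the second gives $\epsilon S(\epsilon)Ax=S(\epsilon)x-x$ for every $x\in\mathcal{D}(A)$.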
Third, note that $$X(X^{-1}-Y^{-1})Y=Y-X,\quad\text{i.e.}\quad X^{-1}-Y^{-1}=X^{-1}(Y-X)Y^{-1}.\tag{3}$$ Then use (3) with $X=\lambda-AS(\epsilon)$ and $Y=\lambda-A$ to compute the difference between your left and right side: \begin{align*} \require{cancel} (\lambda-AS(\epsilon))^{-1}x-(\lambda-A)^{-1}x&=(\lambda-AS(\epsilon))^{-1}\cdot(\cancel{\lambda}-A-(\cancel{\lambda}-AS(\epsilon)))\cdot{}\\ &\phantom{{}={}}\quad\quad\quad(\lambda-A)^{-1}x \\ &=(\lambda-AS(\epsilon))^{-1}\cdot A(S(\epsilon)-1)\cdot(\lambda-A)^{-1}x \\ &=(\lambda-AS(\epsilon))^{-1}\cdot(A\cdot\epsilon AS(\epsilon))\cdot(\lambda-A)^{-1}x \\ &=(\lambda-AS(\epsilon))^{-1}\cdot\epsilon S(\epsilon)\cdot A^2(\lambda-A)^{-1}x \tag{4} \end{align*} where (4) applies the commutation relation from (2).
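The first factor is a bounded operator whose norm is bounded uniformly in small $\epsilon$ (for $\lambda$ sufficiently large), and the last factor is fixed in $\epsilon$; it makes sense for $x\in\mathcal{D}(A)$, since then $(\lambda-A)^{-1}x\in\mathcal{D}(A^2)$. By (1) with $m=1$, the middle factor satisfies $\|\epsilon S(\epsilon)\|\leq\epsilon C(1-\epsilon\gamma)^{-1}\to0$, and so (4) as a whole tends to $0$. For general $x\in X$, the limit then follows from the density of $\mathcal{D}(A)$ and the uniform boundedness of the resolvents involved.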
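The conclusion needs the bound on the first factor to be uniform in $\epsilon$. One way to see this is sketched here, assuming the Laplace-transform formula for $(\lambda-AS(\epsilon))^{-1}$ quoted in the question holds for $\lambda>\gamma/(1-\epsilon\gamma)$. Since $AS(\epsilon)=\epsilon^{-1}(S(\epsilon)-I)$, expanding the exponential as a power series and applying (1) to each term (with $C\geq1$ covering the $m=0$ term) gives $$\left\|e^{tAS(\epsilon)}\right\|=e^{-t/\epsilon}\left\|e^{(t/\epsilon)S(\epsilon)}\right\|\leq e^{-t/\epsilon}\sum_{m=0}^{\infty}\frac{(t/\epsilon)^m}{m!}\,C(1-\epsilon\gamma)^{-m}=C\exp\left(\frac{\gamma t}{1-\epsilon\gamma}\right),$$ and therefore $$\left\|(\lambda-AS(\epsilon))^{-1}\right\|\leq\int_0^\infty e^{-\lambda t}\,C\exp\left(\frac{\gamma t}{1-\epsilon\gamma}\right)dt=\frac{C}{\lambda-\frac{\gamma}{1-\epsilon\gamma}}.$$ For $0<\epsilon\leq\frac{1}{2\gamma}$ we have $\frac{\gamma}{1-\epsilon\gamma}\leq2\gamma$, so for any fixed $\lambda>2\gamma$ the right-hand side is at most $\frac{C}{\lambda-2\gamma}$, independently of $\epsilon$.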