This is from Klenke's Probability Theory, Section 7.1. Suppose that $f\in L^p(\mathbb{R})$ for $p\in (1,\infty)$. Show that if $T:\mathbb{R}\rightarrow \mathbb{R}$ is the map $x\mapsto x+1$, then $$\frac{1}{n}\sum_{k=0}^{n-1} f\circ T^k\rightarrow 0$$
in $L^p(\mathbb{R})$. My first idea was to check that this works for some more specific subspaces of $L^p$ and then appeal to density in some way. I was able to see that it works for the indicator function of a closed interval. Then I tried to look at a continuous, compactly supported function, say with support in $[-M,M]$ for some $M\in \mathbb{N}$. Now I think that if you break up $\mathbb{R}$ into intervals like $[-M,M],[-3M,-M],\ldots$ then, depending on where $x$ lives, you can bound how many of the terms $f(x+k)$, $0\le k\le n-1$, are nonzero in the sum $\frac{1}{n}\sum_{k=0}^{n-1} f(x+k)$. There is some translation structure happening here, I think, but I don't know how to turn this into a precise argument. Furthermore, supposing that this line of reasoning can be followed through, how can I pass from a continuous, compactly supported function to a generic $L^p$ function? If this is not possible, then please give me a hint towards another strategy.
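For what it's worth, here is a quick numerical sanity check of the indicator case (a sketch, not a proof; it assumes $p=2$ and $f=\chi_{[0,1)}$, and compares the computed norm to the candidate rate $n^{(1-p)/p}$):

```python
import numpy as np

# Sanity check: for f = indicator of [0, 1), the Cesaro average
# (1/n) * sum_{k<n} f(x + k) equals 1/n on [1-n, 1) and 0 elsewhere,
# so its L^p norm should be n^{(1-p)/p}, which tends to 0 for p > 1.
p = 2.0
for n in [10, 100]:
    dx = 1 / 400
    x = np.arange(-n, 1, dx)  # grid covering the support [1-n, 1)
    # (1/n) * sum of shifted indicators chi_{[0,1)}(x + k)
    avg = sum(((x + k) >= 0) & ((x + k) < 1) for k in range(n)) / n
    # Riemann-sum approximation of the L^p norm
    norm = (np.sum(avg ** p) * dx) ** (1 / p)
    print(n, norm, n ** ((1 - p) / p))
```

The two printed columns agree up to grid error, consistent with the norm decaying to $0$.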
Hint: $$\left\|\frac{1}{n}\sum_{k=0}^{n-1} (f\circ T^k-g\circ T^k)\right\|_p \leq \|f-g\|_p,$$ since Lebesgue measure is translation invariant, so $\|f\circ T^k-g\circ T^k\|_p = \|f-g\|_p$ for every $k$. This shows that it is enough to prove the result for $f$ in a dense subset of $L^{p}$. If $f=\chi _{(a,b)}$ with $b-a\leq 1$, then the intervals $(a-k,b-k)$, $k=0,\dots,n-1$, are pairwise disjoint, and you can compute the $L^{p}$ norm of $\frac{1}{n}\sum_{k=0}^{n-1} f\circ T^k$ explicitly and see that this norm tends to $0$. Since step functions (finite linear combinations of such indicators; a longer interval splits into finitely many of length at most $1$) are dense in $L^{p}(\mathbb R)$, we are done.
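To spell out the explicit computation (a sketch, assuming $b-a\le 1$ so the translates are disjoint):

```latex
\left\|\frac{1}{n}\sum_{k=0}^{n-1} f\circ T^{k}\right\|_{p}^{p}
  = \int_{\mathbb{R}} \left(\frac{1}{n}\sum_{k=0}^{n-1}
      \chi_{(a-k,\,b-k)}(x)\right)^{p} dx
  = \sum_{k=0}^{n-1} \int_{a-k}^{b-k} \frac{1}{n^{p}}\,dx
  = \frac{b-a}{n^{p-1}} \xrightarrow[n\to\infty]{} 0,
```

since $p>1$. The middle equality uses disjointness: on each interval $(a-k,b-k)$ the average equals $1/n$, and it vanishes elsewhere. Note this is exactly where $p>1$ is needed; for $p=1$ the norm is constant in $n$.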