First let me state the Von Neumann ergodic theorem.
Suppose $U$ is a unitary operator on a Hilbert space $H$ and let $P$ be the orthogonal projection onto $\ker (U- I)$. Then we have for all $f \in H$ $$ \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} U^n f = Pf.$$
This theorem I could prove with relative ease, but I found the following related exercise in my notes. Suppose we write $U = e^{iH}$ for some self-adjoint operator $H$ and assume $\text{dist}( \sigma(H), 2\pi \mathbb{Z}) = \epsilon >0,$ where $\sigma(H)$ is the spectrum of $H$. What can we then say about the rate of convergence of the limit in the Von Neumann ergodic theorem?
A little note on $\text{dist}( \sigma(H), 2\pi \mathbb{Z})$: It isn't explained in the notes but I assume this is some kind of Hausdorff distance. You simply only consider $2 \pi \mathbb{Z} \cap [ - \lVert H \rVert, +\lVert H\rVert]$, making the set compact and thus the Hausdorff distance well-defined. But I could be wrong, please tell me if you know the correct (or better) definition. I also suspect that due to the spectral mapping theorem, we would find that $\text{dist}(\sigma(U), 1) = \delta$ for some $\delta > 0$ but I'm not entirely sure.
Any help is much appreciated.
UPDATE:
I think I may have found a simple counterexample, in the sense that there's an operator $U_1$ without a spectral gap and $U_2$ with a spectral gap, but in both cases the averages converge at the same rate. Consider on $H = \mathbb{C}^2$ the following two unitary operators $$U_1 = \left( \begin{matrix} 1 & 0 \\ 0 & -1 \end{matrix} \right), \quad U_2 = e^{i\alpha} I,$$ where $\alpha/2\pi$ is irrational, ensuring that $e^{ni\alpha}$ never becomes 1 for $n \geq 1$. $U_1$ has an eigenvalue 1, so $\text{dist}(\sigma(U_1),1) = 0$, but $U_2$ only has eigenvalue $e^{i\alpha}$, so $\text{dist}(\sigma(U_2),1) >0$. Since the space is finite dimensional, we might as well look at convergence in operator norm. Note that $\ker (U_1- I) =\text{span} \{ e_1 \}$ where $e_1,e_2$ are the standard basis. Let $P$ be the projection onto this subspace. Then we see for odd $N$ $$ \left\lVert \frac{1}{N} \sum_{n=0}^{N-1} U_1^n - P \right\rVert = \left\lVert \left( \begin{matrix} 0 & 0 \\ 0 & \frac{1}{N}\sum_{n=0}^{N-1}(-1)^n \end{matrix} \right)\right\rVert = \frac{1}{N} $$ but $0$ for even $N$. Similarly, since $\ker (U_2 - I) = \{0\}$, we have for every $N$ $$ \left\lVert \frac{1}{N} \sum_{n=0}^{N-1} U_2^n \right\rVert= \left\lVert \left( \begin{matrix} \frac{e^{iN\alpha} - 1}{N(e^{i\alpha} -1)} & 0 \\ 0 & \frac{e^{iN\alpha} - 1}{N(e^{i\alpha} -1)} \end{matrix} \right) \right\rVert = \left\lvert \frac{e^{iN\alpha} - 1}{N(e^{i\alpha} -1)} \right\rvert = O\left(\frac{1}{N}\right).$$ So they both converge just as fast, despite one having a spectral gap and the other not. Please let me know if there's anything wrong with this argument.
You are right about the distance, all is as you say.
On the rest, I suggest that you write $U$ in the form $$ U=\int_{S^1}\lambda\,d\mu(\lambda), $$ where $\mu$ is the projection-valued spectral measure of $U$, and do the computations using this form. Since $\mu(\{1\}) = P$, splitting off the possible atom at $\lambda = 1$ gives for the average $$ \frac{1}{N} \sum_{n=0}^{N-1} U^n f=Pf+\int_{S^1 \setminus \{1\}} \frac{1}{N} \sum_{n=0}^{N-1} \lambda^n\,d\mu(\lambda)\, f. $$ In view of the existence of $\delta$, the second term converges to zero with a speed related to the spectral gap with respect to $1$. That gives the speed that you want.
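To make the speed explicit, here is a sketch of the estimate, under the assumption that $\lvert \lambda - 1 \rvert \geq \delta$ for every $\lambda \in \sigma(U) \setminus \{1\}$ (the symbol $g_N$ is introduced here only for illustration). Summing the geometric series for $\lambda \neq 1$ on the unit circle, $$ g_N(\lambda) := \frac{1}{N}\sum_{n=0}^{N-1}\lambda^n = \frac{\lambda^N - 1}{N(\lambda - 1)}, \qquad \lvert g_N(\lambda) \rvert \leq \frac{2}{N \lvert \lambda - 1 \rvert} \leq \frac{2}{N\delta}. $$ Setting $g_N(1) := 0$, the functional calculus bounds the operator norm by the supremum of $\lvert g_N \rvert$ over the spectrum, so $$ \left\lVert \frac{1}{N}\sum_{n=0}^{N-1} U^n f - Pf \right\rVert = \lVert g_N(U) f \rVert \leq \frac{2}{N\delta}\, \lVert f \rVert , $$ that is, convergence at rate $O(1/N)$, uniformly over the unit ball. In the setting of the exercise, $\lambda = e^{i\theta}$ with $\theta \in \sigma(H)$ satisfies $\lvert \lambda - 1 \rvert = 2 \lvert \sin(\theta/2) \rvert \geq 2\sin(\epsilon/2)$, so one may take $\delta = 2\sin(\epsilon/2) \geq 2\epsilon/\pi$; in that case $1 \notin \sigma(U)$ and $P = 0$.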
Notice that without the information about $\delta$ the second term still converges to zero (use dominated convergence), but you don't get anything about the speed.
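A small numerical sanity check may help here. The following is only a sketch for diagonal unitaries on $\mathbb{C}^k$, and the helper name `ergodic_error` is made up for this illustration. It computes the operator norm of $\frac{1}{N}\sum_{n=0}^{N-1} U^n - P$ and compares it with the geometric-sum bound $2/(N\delta)$, where $\delta$ is the smallest value of $\lvert \lambda - 1 \rvert$ over the eigenvalues $\lambda \neq 1$. A finite matrix always has some gap, so the second example only imitates the gapless case by letting eigenvalues crowd towards $1$.

```python
import numpy as np

def ergodic_error(phases, N):
    """Operator norm of (1/N) * sum_{n<N} U^n - P for U = diag(exp(i*phases)).

    P is the projection onto ker(U - I), i.e. onto the coordinates whose
    eigenvalue equals 1 (up to floating-point tolerance).
    """
    lam = np.exp(1j * np.asarray(phases, dtype=float))
    fixed = np.isclose(lam, 1.0)               # eigenvalues equal to 1
    n = np.arange(N)[:, None]                  # exponents 0..N-1 as a column
    avg = (lam[None, :] ** n).mean(axis=0)     # Cesaro average per eigenvalue
    err = avg - fixed.astype(float)            # subtract P (diagonal of 0s and 1s)
    return np.abs(err).max()                   # norm of a diagonal matrix

# Gap case: all eigenvalues stay at distance >= delta from 1.
gap_phases = [1.0, 2.0, -2.5]
delta = np.abs(np.exp(1j * np.array(gap_phases)) - 1).min()
for N in (10, 100, 1000):
    print(N, ergodic_error(gap_phases, N), 2 / (N * delta))

# Crowding case: phases 1/k approach 0, so the gap is tiny and the
# error at N = 100 is still of order 1.
crowded = 1.0 / np.arange(1, 201)
print(ergodic_error(crowded, 100))
```

In the first loop the computed error should stay below the printed bound for every $N$, while in the second example the error has not yet started to decay at $N = 100$, which is the finite-dimensional shadow of having no rate without $\delta$.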