The Milnor-Wolf theorem says that a finitely generated solvable group that doesn't have exponential growth is virtually nilpotent.
The proof I've seen is divided into two pieces:
- Prove that such a solvable group is virtually polycyclic. This is proved in Milnor's paper, which is only a few pages, and (unsurprisingly) extremely readable.
- Prove that a polycyclic group that doesn't have exponential growth is virtually nilpotent. The only reference I can find for this is in Wolf's original paper [Theorem 4.3], but it's proved in the larger context of what Wolf is doing, so it makes use of some things I'm not super familiar with (simply connected Lie groups, Mostow's results, etc.).
Is there a proof of the second part that is more in the spirit of Milnor's? That is, purely group-theoretic, and relatively easy to follow? I think (though I could be wrong) that by induction on Hirsch length this can be reduced to the problem of showing:
If $G=N\rtimes\mathbb{Z}$ with $N$ finitely generated nilpotent and $G$ not of exponential growth, then $[G:C_G(N)]<\infty$.
Here is a proof based on Proposition 14.28 in Geometric Group Theory by Drutu and Kapovich.
Appetizer
First, we need some lemmas about a particular case of the question, when $G=\mathbb{Z}^n\rtimes_A\mathbb{Z}$; here $\mathbb{Z}=\langle t\rangle$ is acting by the matrix $A\in GL(n,\mathbb{Z})$: $t^{-1}vt=Av$.
Lemma 1: If $A\in GL(n,\mathbb{Z})$ has every eigenvalue of norm $1$, then every eigenvalue is a root of unity.
Proof: Let $\chi(x)=\sum_{j=0}^n a_{j}x^j$ be the characteristic polynomial of $A$, and let $\lambda_1,\ldots,\lambda_n$ be the eigenvalues of $A$ (listed with multiplicity), so that by assumption $\lvert\lambda_i\rvert=1$ for all $i$. Up to sign, $a_j$ is the sum of the $\binom{n}{j}$ products of $n-j$ of the roots, so $\lvert a_j\rvert\le\binom{n}{j}$ for all $j$. The eigenvalues of $A^k$ are $\lambda_1^k,\ldots,\lambda_n^k$, and since they also all have norm $1$, the characteristic polynomial $\chi_k(x)=\sum_{j=0}^n a_{k,j}x^j$ of $A^k$ likewise has $\lvert a_{k,j}\rvert\le\binom{n}{j}$. Since these coefficients are integers, there are only finitely many such polynomials. This means there are $1\le k<m$ with $\chi_k(x)=\chi_m(x)$. Comparing roots, there is a permutation $\sigma$ of $\{1,\ldots,n\}$ with $\lambda_i^m=\lambda_{\sigma(i)}^k$ for all $i$. Iterating this relation gives $\lambda_i^{m^r}=\lambda_{\sigma^r(i)}^{k^r}$ for every $r\ge1$. Since $\sigma$ is a permutation of a finite set, $\sigma^r$ is the identity for some $r\ge1$, and for this $r$ we get $\lambda_i^{m^r}=\lambda_i^{k^r}$. But then $\lambda_i^{m^r-k^r}=1$ for all $i$, with $m^r-k^r>0$, so that each $\lambda_i$ is a root of unity.$\square$
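The pigeonhole step in this proof can be watched on a small example. The sketch below is only an illustration, not part of the source argument: the matrix $A$ (with characteristic polynomial $x^2-x+1$) and the helper names `mat_mul`, `char_poly_2x2`, `first_repeat` and `matrix_order` are my own choices. It lists the characteristic polynomials of $A, A^2, \ldots$ and reports the first pair $k<m$ with $\chi_k=\chi_m$.

```python
def mat_mul(X, Y):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_2x2(M):
    """Coefficients (a_0, a_1, a_2) of x^2 - tr(M) x + det(M)."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return (det, -trace, 1)


def first_repeat(A, max_power):
    """First pair (k, m) with k < m <= max_power and chi_k == chi_m, else None."""
    seen = {}
    power = A
    for m in range(1, max_power + 1):
        chi = char_poly_2x2(power)
        if chi in seen:
            return (seen[chi], m)
        seen[chi] = m
        power = mat_mul(power, A)
    return None


def matrix_order(A, max_power):
    """Smallest k <= max_power with A^k = I, else None."""
    identity = [[1, 0], [0, 1]]
    power = A
    for k in range(1, max_power + 1):
        if power == identity:
            return k
        power = mat_mul(power, A)
    return None


# Hypothetical example: eigenvalues are the primitive 6th roots of unity.
A = [[0, -1], [1, 1]]
print(first_repeat(A, 12))
print(matrix_order(A, 12))
```

In this particular example the matrix itself has finite order, which is stronger than what Lemma 1 claims; the lemma only concerns the eigenvalues, and a non-diagonalizable matrix such as a unipotent one need not have finite order.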
Lemma 2: If $A\in GL(n,\mathbb{Z})$ has all eigenvalues equal to $1$, then there is a normal series of subgroups \begin{equation} \{1\}=N_0<N_1<\ldots<N_k=\mathbb{Z}^n \end{equation} where $A(N_i)=N_i$, $A$ acts as the identity on $N_i/N_{i-1}$, and each $N_i/N_{i-1}$ is torsion-free.
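Proof: Take $N_i=\ker(A-I)^i$. Since every eigenvalue of $A$ is $1$, the Cayley-Hamilton theorem gives $(A-I)^n=0$, so the chain reaches $\mathbb{Z}^n$ after at most $n$ steps. If $v\in N_i$, then $(A-I)v\in N_{i-1}$, so that $Av=v$ in $N_i/N_{i-1}$. Of course $(A-I)^iAv = A(A-I)^iv=0$, so $A(N_i)\le N_i$, and the same argument applied to $A^{-1}$ gives $A(N_i)=N_i$. Finally, if $v\in N_i$ and $mv\in N_{i-1}$ for some integer $m\neq0$, then $0=(A-I)^{i-1}mv=m(A-I)^{i-1}v$ shows $v\in N_{i-1}$, so that $N_i/N_{i-1}$ is torsion-free.$\square$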
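To see the filtration $N_i=\ker(A-I)^i$ grow, here is a minimal sketch that computes the rank of each $N_i$ for a sample unipotent matrix. The matrix and the helper names (`mat_mul`, `rank`, `kernel_ranks`) are assumptions made for illustration only. It uses the fact that the rank of the integer kernel equals the nullity over $\mathbb{Q}$, so exact Gaussian elimination with `fractions.Fraction` suffices.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def rank(M):
    """Rank of M over the rationals, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def kernel_ranks(A):
    """Ranks of ker(A - I)^i for i = 1, ..., n."""
    n = len(A)
    B = [[A[i][j] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    ranks = []
    power = B
    for _ in range(n):
        ranks.append(n - rank(power))
        power = mat_mul(power, B)
    return ranks


# Hypothetical example: a single 3x3 unipotent Jordan block.
A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
print(kernel_ranks(A))
```

For this block the ranks increase by one at each step, so every factor $N_i/N_{i-1}$ is infinite cyclic.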
Lemma 3: If $A\in GL(n,\mathbb{Z})$ has an eigenvalue $\lambda$ with $\lvert\lambda\rvert\ge2$, then the semidirect product $G=\mathbb{Z}^n\rtimes_A\mathbb{Z}$ has exponential growth.
Proof: Choose a nonzero vector $w\in\mathbb{C}^n$ that is an eigenvector of $A^\ast$ with eigenvalue $\overline{\lambda}$. Then choose $v\in\mathbb{Z}^n$ with $\langle v,w\rangle\neq0$. Identify each $p(x)\in\mathbb{F}_2[x]$ with the integer polynomial having the same coefficients in $\{0,1\}$. We now prove that the map \begin{equation} \Phi:\mathbb{F}_2[x]\rightarrow\mathbb{Z}^n \end{equation} given by $\Phi(p(x))=p(A)v$ is injective. If not, there are distinct $p_1,p_2$ with $\Phi(p_1)=\Phi(p_2)$. Let $q=p_1-p_2$, computed in $\mathbb{Z}[x]$. Then $q(x)=\sum_{i=0}^m a_ix^i$ is a nonzero polynomial with $a_i\in\{-1,0,1\}$, after swapping $p_1$ and $p_2$ if necessary we have $a_m=1$, and $q(A)v=0$. Then \begin{align} 0 &= \langle q(A)v,w\rangle\\ &= \langle v,q(A^\ast)w\rangle\\ &= q(\lambda)\langle v,w\rangle \end{align} So $q(\lambda)=0$. But this is impossible, since \begin{align} \lvert\lambda^m\rvert &= \lvert\lambda^m-q(\lambda)\rvert\\ &\le\sum_{i=0}^{m-1}\lvert a_i\lambda^i\rvert\\ &\le \frac{\lvert\lambda\rvert^m-1}{\lvert\lambda\rvert-1}\\ &\le \lvert\lambda\rvert^m-1 \end{align} which is absurd.
Now if $P_k=\{p(x)\in\mathbb{F}_2[x]: \deg(p)\le k\}$, then $\lvert\Phi(P_k)\rvert=\lvert P_k\rvert=2^{k+1}$. In $G$, for $p(x)=\sum_{i=0}^k a_ix^i$ we can write \begin{equation} \Phi(p(x)) = \prod_{i=0}^k (t^{-i}vt^i)^{a_i} \end{equation} This product telescopes to $v^{a_0}t^{-1}v^{a_1}t^{-1}\cdots t^{-1}v^{a_k}t^k$, so with respect to any generating set of $G$ containing $\{v,t\}$ it has word length at most $3k+1$. So $\lvert B_{3k+1}(1)\rvert\ge2^{k+1}$, and $G$ has exponential growth.$\square$
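The injectivity of $\Phi$ can be checked numerically on a concrete matrix. The following is a small sketch under assumptions of my own: the matrix $A=\begin{pmatrix}2&1\\1&1\end{pmatrix}$ (largest eigenvalue $(3+\sqrt5)/2>2$), the vector $v=(1,0)$, and the helper names `mat_vec` and `images` do not come from the source. It enumerates all $0/1$ polynomials of degree at most $k$ and counts the distinct vectors $p(A)v$.

```python
from itertools import product


def mat_vec(A, v):
    """Apply the integer matrix A (list of rows) to the vector v."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)


def images(A, v, k):
    """Set of vectors p(A)v over all p with 0/1 coefficients and degree <= k."""
    powers = [tuple(v)]            # powers[i] = A^i v
    for _ in range(k):
        powers.append(mat_vec(A, powers[-1]))
    result = set()
    for coeffs in product((0, 1), repeat=k + 1):
        total = tuple(sum(c * p[j] for c, p in zip(coeffs, powers))
                      for j in range(len(v)))
        result.add(total)
    return result


# Hypothetical example with an eigenvalue of norm at least 2.
A = [[2, 1], [1, 1]]
v = (1, 0)
for k in range(6):
    print(k, len(images(A, v, k)))
```

Running the same count with a unipotent matrix such as $\begin{pmatrix}1&1\\0&1\end{pmatrix}$ produces collisions once $k$ is large enough, which is consistent with the hypothesis on $\lvert\lambda\rvert$ being needed.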
Main Course
Now we can prove Wolf's theorem! Let $G$ be a polycyclic group without exponential growth; recall that being polycyclic means there's a subnormal series \begin{equation} 1=N_0<N_1<\ldots<N_d=G \end{equation} (each $N_{i-1}$ normal in $N_i$) where each factor is cyclic. Since cyclic groups are nilpotent, we can proceed by induction on $d$. The subgroup $N_{d-1}$ is finitely generated, so it does not have exponential growth either, and by induction it is virtually nilpotent. If $G/N_{d-1}$ is finite, then $G$ is virtually nilpotent as well. So we can assume that $G/N_{d-1}\cong\mathbb{Z}$. Since $\mathbb{Z}$ is free, we have $G=N_{d-1}\rtimes\mathbb{Z}$, with $N_{d-1}$ virtually nilpotent. If $N\le N_{d-1}$ is nilpotent with $[N_{d-1}:N]<\infty$, then since $N_{d-1}$ is finitely generated, it has only finitely many subgroups of that index, and we can intersect all of them to get a characteristic nilpotent subgroup $K\le N_{d-1}$ such that $[N_{d-1}:K]$ is still finite. Being characteristic in the normal subgroup $N_{d-1}$, $K$ is normal in $G$, and $\langle K,t\rangle$ has finite index in $G$. So there's no loss of generality in assuming $G=K\rtimes H$, where $K$ is nilpotent and $H$ is infinite cyclic (generated as above by $t$).
$K$ has a lower central series \begin{equation} 1 = K_c < K_{c-1} < \ldots < K_1=K \end{equation} where each factor group is finitely generated Abelian. Each factor has a characteristic torsion subgroup, so that we can refine this series to \begin{equation} 1 = K_c < T_{c-1} < K_{c-1} < \ldots < T_1 < K_1=K \end{equation} with $T_i/K_{i+1}$ a finite Abelian group, and $K_i/T_i$ a finitely generated torsion-free Abelian group of rank $r_i$.
First we show that if the matrix $A_i\in GL(r_i,\mathbb{Z})$ represents the action of $H$ on $K_i/T_i\cong\mathbb{Z}^{r_i}$, then all eigenvalues of $A_i$ have norm $1$. If not, then there's a (possibly negative) power $A_i^k$, represented by $t^k$, that has an eigenvalue with norm at least $2$. By Lemma 3 above, the subquotient $\langle K_i,t^k\rangle/T_i\cong\mathbb{Z}^{r_i}\rtimes_{A_i^k}\mathbb{Z}$ has exponential growth, which implies $G$ does too (since the growth of a finitely generated subgroup or of a quotient is a lower bound for the growth of the group).
So all the eigenvalues of all the $A_i$ have norm $1$. Since each $T_i/K_{i+1}$ is finite, we can choose an appropriate power of $t$ (say $t^m$) such that its action on each $T_i/K_{i+1}$ is trivial. By Lemma 1, we can also pick that power of $t$ such that each $A_i^m$ has all eigenvalues equal to $1$.
By Lemma 2 then, by working with each $K_i/T_i$, we can refine the central series of $K$ to look like \begin{equation} 1 = K_c < T_{c-1} < K_{c-1,1} < K_{c-1,2} < \ldots < K_{c-1} < \ldots < T_1 < K_{1,1} < \ldots < K_1=K \end{equation} where $t^m$ acts trivially on each factor. Relabel the above central series as $\{K_j\}$, still with $K_1=K$. If $L=\langle K,t^m\rangle$, then $L$ is a finite index subgroup of $G$, and we can prove that $L$ is nilpotent. Since $L/K$ is Abelian, $[L,L]\le K_1$. If $K_j$ is one of the groups in the central series above, with $g\in K_j$ and $kt^{im}$ a generic element of $L$, then we have \begin{equation} [g,kt^{im}] = [g,t^{im}](t^{-im}[g,k]t^{im})\in K_{j+1} \end{equation} This is because $[g,t^{im}]\in K_{j+1}$ since $t^m$ acts trivially on $K_j/K_{j+1}$, and $[g,k]\in K_{j+1}$ since $\{K_j\}$ is a central series for $K$. By induction then, for the lower central series of $L$ (with $L_1=L$), $L_r\le K_{r-1}$ for all $r\ge2$, and in particular there exists an $r$ with $L_r=1$, so $L$ is nilpotent, and $G$ is virtually nilpotent.
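As a sanity check of this last step in the smallest interesting case, take $K=\mathbb{Z}^2$ and let $t$ act by the unipotent matrix $\begin{pmatrix}1&1\\0&1\end{pmatrix}$, so that $m=1$ and $L=G$. The sketch below is only an illustration under these assumptions; the encoding of $vt^s$ as a pair `((a, b), s)` and the names `mul`, `inv`, `comm`, `sample_elements` and `is_class_two_on` are hypothetical. It implements the group law coming from $t^{-1}vt=Av$ and checks that triple commutators vanish on a sample of elements.

```python
from itertools import product

# Element v t^s of Z^2 x| Z is stored as ((a, b), s) with v = (a, b).
# The relation t^{-1} v t = A v gives t^s w t^{-s} = A^{-s} w, and for
# A = [[1, 1], [0, 1]] we have A^{-s} (c, d) = (c - s*d, d).
IDENTITY = ((0, 0), 0)
E1 = ((1, 0), 0)
E2 = ((0, 1), 0)
T = ((0, 0), 1)


def mul(x, y):
    """Product (v t^s)(w t^r) = (v + A^{-s} w) t^{s+r}."""
    (a, b), s = x
    (c, d), r = y
    return ((a + c - s * d, b + d), s + r)


def inv(x):
    """Inverse of v t^s, namely (-A^s v) t^{-s}."""
    (a, b), s = x
    return ((-(a + s * b), -b), -s)


def comm(x, y):
    """Commutator [x, y] = x^{-1} y^{-1} x y."""
    return mul(mul(inv(x), inv(y)), mul(x, y))


def sample_elements(bound):
    """All elements whose three coordinates lie in [-bound, bound]."""
    rng = range(-bound, bound + 1)
    return [((a, b), s) for a, b, s in product(rng, repeat=3)]


def is_class_two_on(elements):
    """Check [[x, y], z] = 1 for all x, y, z in the given sample."""
    return all(comm(comm(x, y), z) == IDENTITY
               for x, y, z in product(elements, repeat=3))


print(comm(E2, T))
print(is_class_two_on(sample_elements(1)))
```

Here the series $1<\langle e_1\rangle<\mathbb{Z}^2$ plays the role of $\{K_j\}$: $t$ acts trivially on both factors, and the commutator $[e_2,t]$ drops one step down the series, as in the displayed identity.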