TLDR: I don't fully understand the bits in bold. I have attempted to answer the first bold part in an answer below.
I'm reading Lectures on Sieve Methods by Richert. He defines the following
Let $\gamma$, be a set of$|\gamma|$ integers in an interval of length $N.$ $\gamma \subset (M, M+N), M \in \mathbb{Z}, N \in \mathbb{N}.$
Define the following:
- $S:= |\gamma|$
- $S(q,l):= \sum_{\substack{n \in \gamma}\\ n \equiv l\mod q}1$
- $\sum_{l=1}^{q}S(q,l) = S$
- $\sum_{l=1}^{q}(S(q,l) - \frac{S}{q})=0$
So we have that if one counts the points in gamma mod q and subtracts out the entire size of gamma divided by q we obtain 0. This makes sense.
It then restricts to the case $q=p$. Using the final bullet point it goes on to define the following quantity:
$D(p):=\sum_{l=1}^{p}(S(p,l)- \frac{S}{p})^{2}$
Richert then states "$D(p)$ measures how uniformly $\gamma$ is distributed among the residue classes mod p"
I don't understand how the above statement follows, or why the square factor is required in the definition of $D(p).$
Defining the following uniform bound $\sum_{p \leq Q}pD(p) \leq K(N,Q,S).$ Where uniform here means "While $K$ may depend on $N,Q,S$ it should be independent of the particular structure of the set $\gamma$." Richert makes the following remark
"Let $N,Q,S$ be given. If every set $\gamma$ is so uniformly distributed over the residue class mod p as expressed by the given uniform bound (he calls it D), then for most of the p's $D(p)$ must be small. This remark tells us since $D(p)$ is bound to be large if many residue classes mod p do not contain any element of $\gamma$, the statement that $S(p,l)=0$ for 'many' l's mod p can be true for only 'few' ('exceptional') primes $p \leq Q$."
Thinking about this more, we are counting all the elements of gamma which are congruent to modulo p. Since $1 \leq l \leq p$ we look at all the residue classes mod p. This gives a count of elements in each residue class mod p.
The $\frac{S}{p}$ term gives the proportion of values in gamma that are divisible by the prime $p$. In other words, it is giving the number of composite values in $\gamma$ that contain $p$ as a factor.
I guess the square is there to ensure the result is positive.
I am slightly confused in the case where say we look at $D(3)$ with, say $\gamma = (1,12)$ i.e. $|\gamma|=10$. Then we have
$$D(3) = \sum_{l=1}^{3}\Big(S(3,l)- \frac{10}{3}\Big)^{2}$$ $$ = \Big(S(3,1) - \frac{10}{3}\Big)^{2} +\Big(S(3,2) - \frac{10}{3}\Big)^{2} + \Big(+S(3,3) - \frac{10}{3}\Big)^{2}$$ $$ = \Big(3 - \frac{10}{3}\Big)^{2} + \Big(4 - \frac{10}{3}\Big)^{2} + \Big(3 - \frac{10}{3}\Big)^{2}$$ $$ = \frac{2}{3}$$
I'm not sure what the interpretation of this number is though.
The smaller the value of $D(p)$ the better an approximation $\frac{S}{p}$ is for $S(p,l).$ I just don't quite see the connection of how uniformity fits into this.
For $D(p)$ being small [second bold part of original post], is this because the sum is bounded by $K$. So there cannot be too many of the $p$'s that are very big, otherwise the bound wouldn't be true?
For the $D(p)$ being large if many residue classes mod p don't contain elements of $\gamma$ is this because in that case you are simply squaring the quantity $\frac{S}{p}$. Therefore if $S$ is large relative to $p$ then $D(p)$ is large? Therefore $S(p,l)=0$ for 'many' $l$'s mod p can only be true for 'few' primes $p \leq Q$ since, the bound controls the overall size of the sum?