If we let $\pi_{a,d}(x) = \{p \leq x: p \mbox{ prime, } p \equiv a \bmod d\}$ then it is a well known result that if $(a,d)=1$ then
$$\lim_{x \to \infty} \frac{\pi_{a,d}(x)}{\pi(x)} = \frac{1}{\phi(d)}$$
That is, the primes fall evenly into each congruence class mod $d$ that infinitely many of them can fall into. My question is, aside from the proof of this fact, why would one think this should be true? Are there any simple heuristics that would get this? Or, is there a heuristic that would suggest primes favour one congruence class over another but fails one way or another?
Any reference or answer would be appreciated.
An heuristic thinking is to model the primes as a "convenient random sequence". It is an essentially random sequence, except that you remove obvious non-features of it, like infinitely many even primes. For example, there is a very obvious reason sequences like $4+32k$ or $7+56k$ cannot generate infinitely many primes. This already excludes all progressions $a+kd$ with $(a,d)>1$ from our attention.
If you see the sequence of primes $\pmod d$, morally it should fall randomly in every class coprime to $d$. Primes should have no preferences to a point sequences like $8k+3$ or $8k+5$ have finitely many primes. There are $\varphi (d)$ such classes, so the random model idea naturally expects a $1/\varphi(d)$ density.
It is not hard to show that there are infinitely many $4k\pm1$ primes. Many other special cases are easy to settle way before Dirichlet's proof, and these little clues do attract attention and raise suspicion. Of course, you cannot be certain you are right until you get your hands on a proof. However, these heuristic ideas aren't worhthless and can point directions on phenomena we do not yet fully understand, such as square roots in certain $O$ bounds on analytic number theory.