What I understand so far
Easy cases: I understand we only care about odd $n$ since even $n$ is never prime unless $n=2$.
Basic setup: So if $n$ is odd, then $n-1$ is even, so we can write $n - 1 = 2^r d$ for positive $r, d$ (where $d$ is odd).
Fermat's little theorem: I understand that for some integer $a$ that isn't divisible by $p$, if $p$ is prime then $a^{p-1} \equiv 1 \pmod p$ must be true.
Euclid's lemma and modular squares: If we have $x^2 \equiv 1 \pmod p$ for some prime $p>2$, this rearranges to $(x-1)(x+1) \equiv 0 \pmod p$, meaning either $x \equiv 1 \pmod p$ is true or $x \equiv -1 \pmod p$ is true (but not both at the same time).
What I don't understand
The rest of it. Let's assume we have odd candidate $n > 2$, and let's also assume it happens to be prime, $n = p$.
Then $a^{p-1} \equiv 1 \pmod p$ can be rewritten as $a^{2^r d} \equiv 1 \pmod p$ by substituting the basic setup equation.
If I set $x^2 \equiv a^{p-1} \pmod p$, then $x \equiv \sqrt{a^{p-1}} \equiv a^{\frac{p-1}{2}} \equiv a^{\frac{2^rd}{2}} \equiv a^{2^{r-1}d} \pmod p$ as long as $r>0$.
Once $r=0$, I don't think we can go any further since $d$ is odd, but I don't know if it's still possible somehow to keep going, i.e. can I do $x \equiv \sqrt{a^{2k+1}} \pmod p$?
Anyways, I know from the Euclid / modular square step that this means $a^{2^{r-1}d} \equiv 1 \pmod p$ or $a^{2^{r-1}d} \equiv -1 \pmod p$ (but I don't know if this also needs to apply to the original $a^{2^rd}$ as well).
My questions:
Did I make any mistakes so far in any of this?
Where do we go from here? Is the idea to start off with $a^{2^{r-1}d} \bmod p$ and make sure it always equals either $1$ or $-1$ modulo $p$ every time we decrease the exponent of $r$? If it ever equals something else, $n$ isn't prime?
What do we do if we hit $-1$? Where can we go from there, since we no longer have our $x^2 \equiv 1 \pmod p$ prerequisite?
Even if we choose a random $a < p$ and it passes all tests, do we just keep picking $a$ terms until something fails? Do we have to exhaustively tests all $1 < a < p$?
What about $a = 0$? I know this violates the Fermat's little theorem assumption, but is it ever possible that we have to try it to ensure the input is prime or not somehow if everything else fails?
The key in my opinion is understanding Miller-Rabin that for $p$ prime there are only two values that square to $1 \bmod p$ - $-1$ and $1$. We know that if $p$ is indeed prime then $a^{p-1}\equiv 1 \bmod p$ which means that we should find either that as we halve the exponent repeatedly, we either get $a^k\equiv 1$ all the way down until $k$ is odd, or the first variation we get is $a^k\equiv -1\bmod p$. These two outcomes are captured in the test in reverse direction by starting with the odd $k$; either that $a^k \equiv 1$ or $-1$, otherwise squaring we should reach $a^k \equiv -1 \bmod p$ at some point before $1$.
By contrast $p$ not prime can fail the test in various ways. Most obviously we might never reach either $-1$ or $1$, but also if we have some other number $q$ such that $q^2 \equiv 1 \bmod p$ then that also shows us that $p$ is not prime.
So, in direct response to part 2, you start from $a^d\bmod p$. Unless this value is $\equiv 1$ or $-1$, you can continue squaring for up to $r-1$ steps and the only value which witnesses a prime is $-1$. If you do not reach $a^{d\cdot 2^k}\equiv -1\bmod p$ at some point, $p$ is not prime.
By contrast if you reach $a^{d\cdot 2^k}\equiv -1\bmod p$, there is no value in going further, since subsequent values will automatically be $1$.
The idea is to pick enough values of $a$ to be sufficiently confident that $p$ is prime. Any one value only gives a limited assurance - that is why it is a witness, not a proof. In Miller Rabin the witness is true with at least $75\%$ probability (usually higher, but $75\%$ is guaranteed even for deliberately misleading choices of the test value). So using 10 choices of $a$ - which all witness to prime $p$ - will in principle give an assurance better than one in a million that $p$ is indeed prime.
$a=0$ (or $a=p$ etc.) doesn't give any useful information, since $a^k\equiv 0 \bmod p$ for all $k\in \Bbb N$.