Somewhere on the internet (I don't quite remember where) I came across the following formulation for finding the four parameters $a$, $b$, $c$, and $d$ of a Möbius transformation given three points $w_1$, $w_2$, $w_3$ and their images $z_1$, $z_2$, $z_3$:
$$\left[\matrix{a & b \\ c & d}\right] = \left[\matrix{z_2-z_3 & z_1z_3-z_1z_2 \\ z_2-z_1 & z_1z_3-z_2z_3}\right]^{-1}\cdot\left[\matrix{w_2-w_3 & w_1w_3-w_1w_2 \\ w_2-w_1 & w_1w_3-w_2w_3}\right]$$
I am interested in learning how this expression was derived (if you have any proper references for this or the starting equations I used below, I would greatly appreciate it). I am starting from the following equation on Wikipedia:
$$f_1(z)=\frac{(z-z_1)(z_2-z_3)}{(z-z_3)(z_2-z_1)}$$
which is a Möbius transformation that maps $z_1\rightarrow0$, $z_2 \rightarrow1$, and $z_3\rightarrow\infty$. That's relatively easy to verify. Again according to Wikipedia, the coefficients of a Möbius transformation can be expressed as a matrix:
$$\mathcal{H}=\left[\matrix{a & b \\c & d}\right]$$
from which we can extract the Möbius transformation as:
$$f(z)=[z,1]\left[\matrix{a & b \\c & d}\right]^{T}=[az+b,cz+d]=\left[\frac{az+b}{cz+d},1\right]$$
First question: I'm trying to make sense of this matrix representation. What exactly is going on here? The matrix product makes sense, but what is that last operation? I could interpret it as some form of normalization by the second component, but if that were the case, why is there an equals sign between the unnormalized and normalized versions? And why did we start from $[z,1]$?
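To make the question concrete, here is a minimal sketch of how I currently read that step. The interpretation is my own assumption, not something taken from a reference: a pair $[p,q]$ stands for the complex number $p/q$, so $[p,q]$ and $[p/q,1]$ would label the same point. The helper names `apply_matrix` and `normalize` and the sample values are made up for illustration, and I use the column-vector form of the product, which gives the same pair $[az+b,cz+d]$.

```python
def apply_matrix(H, z):
    """Multiply H = ((a, b), (c, d)) by the column vector (z, 1)."""
    (a, b), (c, d) = H
    return (a * z + b, c * z + d)

def normalize(pair):
    """Rescale [p, q] to [p/q, 1]; assumes q != 0."""
    p, q = pair
    return (p / q, 1)

# Arbitrary example coefficients and input (hypothetical values).
H = ((2, 1j), (1, 3))
z = 0.5 - 1j

pair = apply_matrix(H, z)                 # the "unnormalized" pair [az+b, cz+d]
scaled = tuple(5j * x for x in pair)      # same pair times an arbitrary nonzero factor
direct = (2 * z + 1j) / (1 * z + 3)       # the Möbius formula evaluated directly

print(normalize(pair)[0], normalize(scaled)[0], direct)
```

Under this reading, rescaling the pair by any nonzero factor leaves the normalized value unchanged, which may be what the equals sign is meant to express.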
Accepting this representation for now, it makes sense that the coefficients of the map $f_1$ above can also be collected in such a $2$-by-$2$ matrix (I can see how the matrix product above returns the function $f_1$):
$$\mathcal{H}_1=\left[\matrix{z_2-z_3 & -z_1(z_2-z_3) \\ z_2-z_1 & -z_3(z_2-z_1)}\right]=\left[\matrix{z_2-z_3 & z_1z_3-z_1z_2 \\ z_2-z_1 & z_1z_3-z_2z_3}\right]$$
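As a quick numerical check of this matrix, the sketch below compares $(az+b)/(cz+d)$ built from the entries of $\mathcal{H}_1$ against $f_1$ evaluated directly. The sample points and the function names (`f1`, `h1_matrix`, `mobius`) are my own choices for illustration.

```python
def f1(z, z1, z2, z3):
    """The map from the text: z1 -> 0, z2 -> 1, z3 -> infinity."""
    return ((z - z1) * (z2 - z3)) / ((z - z3) * (z2 - z1))

def h1_matrix(z1, z2, z3):
    """Coefficient matrix ((a, b), (c, d)) read off from f1."""
    return ((z2 - z3, -z1 * (z2 - z3)),
            (z2 - z1, -z3 * (z2 - z1)))

def mobius(H, z):
    """Evaluate (a z + b) / (c z + d) for H = ((a, b), (c, d))."""
    (a, b), (c, d) = H
    return (a * z + b) / (c * z + d)

# Arbitrary, pairwise distinct sample points (assumed for illustration).
z1, z2, z3 = 1 + 2j, -3 + 0.5j, 2 - 1j
H1 = h1_matrix(z1, z2, z3)

z = 0.25 + 0.75j                      # a generic test point
print(mobius(H1, z), f1(z, z1, z2, z3))
print(mobius(H1, z1), mobius(H1, z2))
```

The first line prints two matching values, and the second shows $z_1$ and $z_2$ landing on $0$ and $1$ up to rounding. I left out $z_3$ because the denominator vanishes there.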
If we then do the same trick for a second set of points $w_1$, $w_2$, $w_3$, we obtain the analogous matrix:
$$\mathcal{H}_2=\left[\matrix{w_2-w_3 & -w_1(w_2-w_3) \\ w_2-w_1 & -w_3(w_2-w_1)}\right]=\left[\matrix{w_2-w_3 & w_1w_3-w_1w_2 \\ w_2-w_1 & w_1w_3-w_2w_3}\right]$$
I assume that the initial equation connects these two Möbius transformations through the intermediate points $0$, $1$, and $\infty$: the forward map $f_2(w)$ sends $w_1$, $w_2$, $w_3$ to $0$, $1$, and $\infty$, and the inverse $f_1^{-1}$ sends $0$, $1$, and $\infty$ on to $z_1$, $z_2$, $z_3$. Apparently, these two separate operations can be combined by taking the matrix product of the coefficient matrices:
$$\mathcal{H}=\left[\matrix{a & b \\c & d}\right]=\mathcal{H}_1^{-1}\cdot\mathcal{H}_2=\left[\matrix{z_2-z_3 & z_1z_3-z_1z_2 \\ z_2-z_1 & z_1z_3-z_2z_3}\right]^{-1}\cdot\left[\matrix{w_2-w_3 & w_1w_3-w_1w_2 \\ w_2-w_1 & w_1w_3-w_2w_3}\right]$$
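To convince myself that the formula at least works numerically, I put together the following sketch. It assumes the column-vector convention $f(w)=(aw+b)/(cw+d)$, uses arbitrary sample points, and replaces the true inverse of $\mathcal{H}_1$ by its adjugate. The adjugate differs from the inverse only by the factor $\det\mathcal{H}_1$, which cancels in the fraction. All function names are my own.

```python
def point_matrix(p1, p2, p3):
    """Matrix of the map p1 -> 0, p2 -> 1, p3 -> infinity (pattern of H_1 and H_2)."""
    return ((p2 - p3, -p1 * (p2 - p3)),
            (p2 - p1, -p3 * (p2 - p1)))

def inverse_up_to_scale(M):
    """Adjugate of a 2x2 matrix, i.e. det(M) times the inverse of M."""
    (a, b), (c, d) = M
    return ((d, -b), (-c, a))

def matmul(A, B):
    """Plain 2x2 matrix product A * B."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def mobius(H, z):
    """Evaluate (a z + b) / (c z + d) for H = ((a, b), (c, d))."""
    (a, b), (c, d) = H
    return (a * z + b) / (c * z + d)

# Arbitrary, pairwise distinct sample points (assumed for illustration).
zs = (1 + 2j, -3 + 0.5j, 2 - 1j)
ws = (0.5j, 4 + 1j, -1 - 2j)

H1 = point_matrix(*zs)
H2 = point_matrix(*ws)
H = matmul(inverse_up_to_scale(H1), H2)   # proportional to H1^{-1} * H2

images = [mobius(H, w) for w in ws]
for w, image, z in zip(ws, images, zs):
    print(w, "->", image, "expected", z)
```

With these sample points each $w_i$ is sent to the corresponding $z_i$ up to rounding, so the product does seem to behave like "first $\mathcal{H}_2$, then $\mathcal{H}_1^{-1}$".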
Second question: Why is this possible? A guess: Linear transformations can be expressed as matrices, and chained linear transformations can be simplified by chaining the matrices corresponding to these operations?
I would greatly appreciate it if someone with a more thorough background in complex analysis could double-check my reasoning and fill in any gaps I have left. Since the Wikipedia article doesn't really give any references, I would also appreciate any references that explain where these expressions come from.