I am having trouble proving the Cauchy-Binet Theorem. I jotted down how far I got in the proof, but I just find myself stuck. Any guidance would be greatly appreciated!
I understand that
$\begin{align*}\det(AB) &=\sum_{j_1,j_2, ...,j_k=1}^n b_{j_1,1}b_{j_2,2}...b_{j_k,k}\det(A(J)) \\ &=\sum_{j_1,j_2,...,j_k\in \{1, 2, ..., n\} \text{ and all distinct}} b_{j_1,1}b_{j_2,2}...b_{j_k,k}\det(A(J)) \\ \end{align*}$
The last equality holds because if two of the $j$'s coincide, then $A(J)$ has two equal columns and $\det(A(J))=0$, so only tuples of distinct integers between $1$ and $n$ contribute. Now, fix $J'=(j_1', j_2', ..., j_k')$, the tuple obtained by arranging these $j$'s from least to greatest. Let $\sigma\in S_k$ be the permutation with $j'_i=j_{\sigma(i)}$ for $i=1, 2, ...,k$.
I'm not sure why $\sigma$ is a permutation of $[n]$ here rather than an element of $S_k$, as I defined it above. I thought $\sigma$ was defined by the indices of the $j$'s and not by the values of the $j$'s themselves (so it isn't associated with $n$).
So, I then get $\operatorname{sgn}(\sigma)\det(A(J'))=\det(A(J))$. Also, $j_i=j_{\sigma(\underbrace{\sigma^{-1}(i)}_{\in \{ 1, 2, ..., k\}})}=j'_{\sigma^{-1}(i)}$.
Thus, continuing our equation where we left off, we know $\begin{align*} \det(AB)&=\sum_{j_1,j_2,...,j_k\in \{1, 2, ..., n\} \text{ and all distinct}} b_{j_1,1}b_{j_2,2}...b_{j_k,k}\det(A(J)) \\ &=\sum_{j_1,j_2,...,j_k\in \{1, 2, ..., n\} \text{ and all distinct}} b_{j_1,1}b_{j_2,2}...b_{j_k,k}\operatorname{sgn}(\sigma)\det(A(J'))\\ &=\sum_{j_1,j_2,...,j_k\in \{1, 2, ..., n\} \text{ and all distinct}}\operatorname{sgn}(\sigma^{-1}) b_{j_1,1}b_{j_2,2}...b_{j_k,k}\det(A(J'))\\ &=\sum_{j_1,j_2,...,j_k\in \{1, 2, ..., n\} \text{ and all distinct}}\operatorname{sgn}(\sigma^{-1}) b_{j'_{\sigma^{-1}(1)},1}b_{j'_{\sigma^{-1}(2)},2}...b_{j'_{\sigma^{-1}(k)},k}\det(A(J'))\\ &= \text{(and then I get confused here trying to show)} = \sum_{J'}\det(A(J'))\det(B(J')) \end{align*}$
As the determinant is a multilinear function of its columns, we know
$\begin{align*}\det(AB)&=\det((AB)_1, (AB)_2, ..., (AB)_k) \text{ where } (AB)_l \text{ denotes the } l^{\text{th}} \text{ column of } AB\\ &=\det(\sum_{i=1}^k\sum_{j_1=1}^na_{i,j_1}b_{j_1,1}\cdot \hat{e}_i,\sum_{i=1}^k\sum_{j_2=1}^na_{i,j_2}b_{j_2,2}\cdot \hat{e}_i , ..., \sum_{i=1}^k\sum_{j_k=1}^na_{i,j_k}b_{j_k,k}\cdot \hat{e}_i ) \\ &=\sum_{i_1,i_2, ..., i_k=1}^k\det(\sum_{j_1=1}^na_{i_1,j_1}b_{j_1,1}\cdot \hat{e}_{i_1},\sum_{j_2=1}^na_{i_2,j_2}b_{j_2,2}\cdot \hat{e}_{i_2} , ..., \sum_{j_k=1}^na_{i_k,j_k}b_{j_k,k}\cdot \hat{e}_{i_k} ) \\ &=\sum_{i_1,i_2, ..., i_k=1}^k\sum_{j_1,j_2, ...,j_k=1}^n\det(a_{i_1,j_1}b_{j_1,1}\cdot \hat{e}_{i_1},a_{i_2,j_2}b_{j_2,2}\cdot \hat{e}_{i_2} , ..., a_{i_k,j_k}b_{j_k,k}\cdot \hat{e}_{i_k} ) \\ &=\sum_{j_1,j_2, ...,j_k=1}^n\sum_{i_1,i_2, ..., i_k=1}^k\det(a_{i_1,j_1}b_{j_1,1}\cdot \hat{e}_{i_1},a_{i_2,j_2}b_{j_2,2}\cdot \hat{e}_{i_2} , ..., a_{i_k,j_k}b_{j_k,k}\cdot \hat{e}_{i_k} ) \\ &=\sum_{j_1,j_2, ...,j_k=1}^n b_{j_1,1}b_{j_2,2}...b_{j_k,k}\sum_{i_1,i_2, ..., i_k=1}^k\det(a_{i_1,j_1}\cdot \hat{e}_{i_1},a_{i_2,j_2}\cdot \hat{e}_{i_2} , ..., a_{i_k,j_k}\cdot \hat{e}_{i_k} ) \\ &=\sum_{j_1,j_2, ...,j_k=1}^n b_{j_1,1}b_{j_2,2}...b_{j_k,k}\sum_{i_1,i_2, ..., i_k=1}^k\det(A(J)_{i_1,1}\cdot\hat e_{i_1},A(J)_{i_2,2}\cdot\hat e_{i_2},\dots,A(J)_{i_k,k}\cdot\hat e_{i_k})\\ &=\sum_{j_1,j_2, ...,j_k=1}^n b_{j_1,1}b_{j_2,2}...b_{j_k,k}\det(A(j_1, j_2, ..., j_k)) \\ &=\sum_{j_1,j_2, ...,j_k=1}^n b_{j_1,1}b_{j_2,2}...b_{j_k,k}\det(A(J)). \end{align*}$
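The expansion above can be sanity-checked numerically on a small case. The following is only a sketch in plain Python: the matrices `A` and `B` and the helper names `det` and `columns` are illustrative assumptions, with `A` of size $k\times n$ and `B` of size $n\times k$ for $k=2$, $n=3$, and indices 0-based.

```python
from itertools import permutations, product

def det(M):
    """Determinant of a square matrix (list of rows) via the Leibniz formula."""
    k = len(M)
    total = 0
    for p in permutations(range(k)):
        # Sign of p from its number of inversions.
        inversions = sum(p[a] > p[b] for a in range(k) for b in range(a + 1, k))
        term = -1 if inversions % 2 else 1
        for col in range(k):
            term *= M[p[col]][col]
        total += term
    return total

def columns(A, J):
    """A(J): the matrix whose columns are columns J[0], J[1], ... of A (0-based)."""
    return [[row[j] for j in J] for row in A]

A = [[1, 2, 3], [4, 5, 6]]        # k x n, illustrative values
B = [[7, 8], [9, 10], [11, 12]]   # n x k, illustrative values
k, n = len(A), len(B)

AB = [[sum(A[i][j] * B[j][l] for j in range(n)) for l in range(k)] for i in range(k)]
lhs = det(AB)

# Sum over ALL tuples (j_1, ..., j_k), repeated indices included.
rhs = 0
for J in product(range(n), repeat=k):
    coeff = 1
    for l in range(k):
        coeff *= B[J[l]][l]       # the factor b_{j_l, l}
    rhs += coeff * det(columns(A, J))

print(lhs, rhs)  # 36 36
```

Tuples with a repeated index contribute nothing here, since `columns(A, J)` then has two equal columns.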
Next, note that the following holds, as explained below:
$\begin{align*}\det(AB) &=\sum_{j_1,j_2, ...,j_k=1}^n b_{j_1,1}b_{j_2,2}...b_{j_k,k}\det(A(J)) \\ &=\sum_{j_1,j_2,...,j_k\in \{1, 2, ..., n\} \text{ and all distinct}} b_{j_1,1}b_{j_2,2}...b_{j_k,k}\det(A(J)). \end{align*}$
The last equality holds because if two of the $j$'s coincide, then $A(J)$ has two equal columns and $\det(A(J))=0$, so only tuples of distinct integers between $1$ and $n$ contribute. Now, fix $J'=(j_1', j_2', ..., j_k')$, the tuple obtained by arranging these $j$'s from least to greatest, and consider the permutation $\sigma=\begin{pmatrix} j_1' & j_2' & \cdots & j_k' \\ j_1 & j_2 & \cdots & j_k \end{pmatrix}$. Writing $\epsilon(j_1, j_2, ..., j_k)$ for the sign of $\sigma$, we get $\epsilon(j_1, j_2, ..., j_k)\det(A(J'))=\det(A(J)).$
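The sign relation between $\det(A(J))$ and $\det(A(J'))$ can also be checked by brute force. This is a minimal sketch under stated assumptions: the $3\times 4$ matrix `A` is an arbitrary illustrative choice, `epsilon` is a hypothetical helper that computes the sign by counting inversions of the tuple $J$, and indices are 0-based.

```python
from itertools import permutations

def det(M):
    """Determinant of a square matrix (list of rows) via the Leibniz formula."""
    k = len(M)
    total = 0
    for p in permutations(range(k)):
        inversions = sum(p[a] > p[b] for a in range(k) for b in range(a + 1, k))
        term = -1 if inversions % 2 else 1
        for col in range(k):
            term *= M[p[col]][col]
        total += term
    return total

def columns(A, J):
    """A(J): the matrix whose columns are columns J[0], J[1], ... of A (0-based)."""
    return [[row[j] for j in J] for row in A]

def epsilon(J):
    """Sign of the permutation taking sorted(J) to J, for distinct entries."""
    inversions = sum(J[a] > J[b] for a in range(len(J)) for b in range(a + 1, len(J)))
    return -1 if inversions % 2 else 1

A = [[1, 2, 3, 4], [0, 1, 5, 2], [2, 0, 1, 3]]  # k x n with k = 3, n = 4
k, n = 3, 4

# permutations(range(n), k) enumerates every k-tuple of distinct column indices.
checks = [
    det(columns(A, J)) == epsilon(J) * det(columns(A, sorted(J)))
    for J in permutations(range(n), k)
]
print(all(checks))  # True
```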
So,
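$\begin{align*}\det(AB)&=\sum_{j_1,j_2,...,j_k\in \{1, 2, ..., n\} \text{ and all distinct}} b_{j_1,1}b_{j_2,2}...b_{j_k,k}\epsilon(j_1, j_2, ..., j_k)\det(A(J'))\\ &=\sum_{j_1,j_2,...,j_k\in \{1, 2, ..., n\} \text{ and all distinct}} \epsilon(j_1, j_2, ..., j_k) b_{j_1,1}b_{j_2,2}...b_{j_k,k}\det(A(j'_1, j'_2, ..., j'_k))\\ &=\sum_{1\leq j'_1<...<j'_k\leq n}\Big(\sum_{(l_1, l_2, ..., l_k) \text{ a permutation of } (1, 2, ..., k)} \epsilon(l_1, l_2, ..., l_k) b_{j'_{l_1},1}b_{j'_{l_2},2}...b_{j'_{l_k},k}\Big)\det(A(j'_1, j'_2, ..., j'_k))\\ &=\sum_{1\leq j'_1<...<j'_k\leq n}\det(B(j'_1, j'_2, ..., j'_k))\det(A(j'_1, j'_2, ..., j'_k)). \text{ QED} \end{align*}$
In the third line, the tuples of distinct $j$'s are grouped by their sorted tuple $J'$: each such tuple is written as $j_i=j'_{l_i}$ for a unique permutation $(l_1, l_2, ..., l_k)$ of $(1, 2, ..., k)$, and then $\epsilon(j_1, j_2, ..., j_k)=\epsilon(l_1, l_2, ..., l_k)$. The inner sum is the Leibniz expansion of $\det(B(J'))$, where $B(J')$ is the $k\times k$ matrix formed by rows $j'_1, j'_2, ..., j'_k$ of $B$.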
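As a final check, the Cauchy-Binet identity itself can be verified on a small example. Again this is only a sketch in plain Python: the matrices and the names `det`, `lhs`, and `rhs` are illustrative assumptions, $A(J')$ is taken to be the chosen columns of `A`, $B(J')$ the chosen rows of `B`, and indices are 0-based.

```python
from itertools import combinations, permutations

def det(M):
    """Determinant of a square matrix (list of rows) via the Leibniz formula."""
    k = len(M)
    total = 0
    for p in permutations(range(k)):
        inversions = sum(p[a] > p[b] for a in range(k) for b in range(a + 1, k))
        term = -1 if inversions % 2 else 1
        for col in range(k):
            term *= M[p[col]][col]
        total += term
    return total

A = [[1, 2, 3], [4, 5, 6]]        # k x n, illustrative values
B = [[7, 8], [9, 10], [11, 12]]   # n x k, illustrative values
k, n = len(A), len(B)

AB = [[sum(A[i][j] * B[j][l] for j in range(n)) for l in range(k)] for i in range(k)]
lhs = det(AB)

# Sum over increasing index tuples J' = (j'_1 < ... < j'_k).
rhs = 0
for Jp in combinations(range(n), k):
    A_cols = [[row[j] for j in Jp] for row in A]  # A(J'): columns J' of A
    B_rows = [B[j] for j in Jp]                   # B(J'): rows J' of B
    rhs += det(A_cols) * det(B_rows)

print(lhs, rhs)  # 36 36
```

For these matrices the three products of minors are $6$, $24$, and $6$, which sum to $\det(AB)=36$.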