Let $A = (a_{ij})$ be an $n\times n$ matrix with entries in a commutative ring $R$, let $M$ be a free $R$-module of rank $n$ with an ordered basis $(e_i)_{i \leq n}$, and let $\phi\colon M\to M$ be the endomorphism represented by $A$ (so that $\phi(e_j) = \sum_{i = 1}^n a_{ij}e_i$ for all $j$). Then $\bigwedge^n M$ is free of rank $1$ with basis $\{e_1\wedge ... \wedge e_n\}$, and there is a unique scalar $r \in R$ such that $$\left(\bigwedge^n \phi\right)(e_1\wedge ... \wedge e_n) = \phi(e_1)\wedge ... \wedge \phi(e_n) = r(e_1\wedge ... \wedge e_n).$$ This scalar is precisely the determinant $\det(A)$ of $A$. It is easy to show that the value of $\det(A)$ does not depend on the choice of $M$ or of the basis. I take this as the definition of the determinant. From it the other definitions can be deduced as theorems, including the formula $$\det(A) = \sum_{\sigma \in S_n}\mathrm{sgn}(\sigma)a_{\sigma(1)1}...a_{\sigma(n)n}.$$
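As a quick sanity check on this definition (not part of the argument; the helper names `wedge_coefficient` and `det_leibniz` are mine), here is a Python sketch that computes the scalar $r$ by expanding $\phi(e_1)\wedge ... \wedge \phi(e_n)$ multilinearly over the standard basis, one vector at a time, and compares the result with the permutation formula:

```python
from itertools import permutations
from math import prod

def wedge_coefficient(vectors):
    """Coefficient of e_1 ∧ ... ∧ e_n in v_1 ∧ ... ∧ v_n, computed by
    expanding the wedge product multilinearly, one vector at a time."""
    n = len(vectors)
    terms = {(): 1}  # sorted tuple of basis indices -> coefficient
    for v in vectors:
        new_terms = {}
        for idx, c in terms.items():
            for k, vk in enumerate(v):
                if vk == 0 or k in idx:
                    continue  # e_k ∧ e_k = 0
                swaps = sum(1 for m in idx if m > k)  # factors e_k must cross
                key = tuple(sorted(idx + (k,)))
                new_terms[key] = new_terms.get(key, 0) + (-1) ** swaps * c * vk
        terms = new_terms
    return terms.get(tuple(range(n)), 0)

def perm_sign(p):
    """Sign of a permutation, via an inversion count."""
    return (-1) ** sum(p[a] > p[b] for a in range(len(p)) for b in range(a + 1, len(p)))

def det_leibniz(A):
    """Determinant via the permutation (Leibniz) formula."""
    n = len(A)
    return sum(perm_sign(p) * prod(A[p[j]][j] for j in range(n))
               for p in permutations(range(n)))

A = [[2, -1, 3], [0, 4, 1], [5, 2, -2]]
columns = [[A[r][c] for r in range(3)] for c in range(3)]  # phi(e_j) = j-th column of A
assert wedge_coefficient(columns) == det_leibniz(A)
```

Here `terms` records the expansion of the partial wedge products as combinations of the basis wedges $e_{i_1}\wedge ... \wedge e_{i_m}$ with $i_1 < ... < i_m$.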
I understand that there is a shorter proof of the Laplace expansion of the determinant using the above definition. However, the book I consulted has a serious gap in its proof. Let me state what I want to prove:
Let $A = (a_{ij})$ be an $n\times n$ matrix with entries in a commutative ring $R$. Denote by $A_{ij}$ the $(n-1)\times(n-1)$ matrix obtained from $A$ by deleting the $i$-th row and the $j$-th column of $A$. Then, for all $i$, $$\det(A) = \sum_{j = 1}^n (-1)^{i + j} a_{ij}\det(A_{ij}).$$
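Before the abstract proof, a quick numerical check of the statement may be reassuring. A Python sketch (helper names are mine; indices are 0-based, which does not affect the sign $(-1)^{i+j}$ since $(i+1)+(j+1)$ and $i+j$ have the same parity):

```python
from itertools import permutations
from math import prod

def perm_sign(p):
    """Sign of a permutation, via an inversion count."""
    return (-1) ** sum(p[a] > p[b] for a in range(len(p)) for b in range(a + 1, len(p)))

def det(A):
    """Determinant via the permutation (Leibniz) formula."""
    n = len(A)
    return sum(perm_sign(p) * prod(A[p[j]][j] for j in range(n))
               for p in permutations(range(n)))

def minor(A, i, j):
    """A_{ij}: the matrix A with row i and column j deleted (0-based)."""
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]

A = [[3, 1, 4, 1], [5, 9, 2, 6], [5, 3, 5, 8], [9, 7, 9, 3]]
for i in range(len(A)):  # expansion along the i-th row
    expansion = sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j))
                    for j in range(len(A)))
    assert expansion == det(A)
```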
$\newcommand{\bw}{\bigwedge} \newcommand{\w}{\wedge}$ There is a simple proof (although tedious if we write down all the computations, which is what I did), but for that we need to interpret the quantity $\det(A_{ij})$ in terms of maps.
My notation will be the following: $\iota_k : R^{n-1}\to R^n$ is the map sending the basis vectors of $R^{n-1}$, in order, to the basis vectors of $R^n$ other than the $k$th one; $p_k : R^n\to R^{n-1}$ is the map that forgets the $k$th coordinate; $\rho_k = \iota_k\circ p_k : R^n\to R^n$ is the map that sets the $k$th coordinate to $0$; and finally $\pi_j : R^n \to R^n$ is the projection onto the $j$th coordinate (that is, $\pi_j = id_{R^n} - \rho_j$).
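These maps are all given by simple 0/1 matrices, so the identities $\rho_k = \iota_k\circ p_k$ and $\pi_k = id_{R^n} - \rho_k$ can be checked mechanically. A small Python sketch (function names are mine; 0-based indices):

```python
def iota(k, n):
    """Matrix of ι_k : R^{n-1} -> R^n; its columns are e_0, ..., ê_k, ..., e_{n-1}."""
    kept = [r for r in range(n) if r != k]
    return [[1 if kept[c] == r else 0 for c in range(n - 1)] for r in range(n)]

def proj(k, n):
    """Matrix of p_k : R^n -> R^{n-1}; forgets the k-th coordinate."""
    kept = [c for c in range(n) if c != k]
    return [[1 if kept[r] == c else 0 for c in range(n)] for r in range(n - 1)]

def matmul(X, Y):
    return [[sum(X[r][t] * Y[t][c] for t in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

n = 4
for k in range(n):
    rho = matmul(iota(k, n), proj(k, n))  # ι_k ∘ p_k: zeroes the k-th coordinate
    # ρ_k is the identity matrix with the (k, k) entry replaced by 0 ...
    assert rho == [[1 if r == c and r != k else 0 for c in range(n)] for r in range(n)]
    # ... so id - ρ_k is the projection π_k onto the k-th coordinate
    pi = [[(1 if r == c else 0) - rho[r][c] for c in range(n)] for r in range(n)]
    assert pi == [[1 if r == c == k else 0 for c in range(n)] for r in range(n)]
```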
Then, identifying matrices with linear maps, we have the following commutative square (commutativity here means $p_i\circ A\circ \iota_j = A_{ij}$):
$\require{AMScd} \begin{CD}R^n @>A>> R^n \\ @A\iota_jAA @Vp_iVV\\ R^{n-1} @>A_{ij}>>R^{n-1}\end{CD}$
This will be important later, because it lets us interpret $\det(A_{ij})$: indeed, take $\bigwedge^{n-1}$ of this diagram, and use that $\bigwedge^{n-1}R^{n-1}$ has rank $1$, so that $\bigwedge^{n-1}A_{ij}$ is multiplication by the scalar $\det(A_{ij})$; you get:
$\require{AMScd} \begin{CD}\bigwedge^{n-1}R^n @>\bigwedge^{n-1}A>> \bigwedge^{n-1}R^n \\ @A\bigwedge^{n-1}\iota_jAA @V\bigwedge^{n-1}p_iVV\\ \bigwedge^{n-1}R^{n-1} @>\det(A_{ij})>>\bigwedge^{n-1}R^{n-1}\end{CD}$
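The commutativity of the first square, $p_i\circ A\circ\iota_j = A_{ij}$, is easy to verify numerically as well. A Python sketch (helper names are mine; 0-based indices):

```python
def iota(k, n):
    """ι_k : R^{n-1} -> R^n as an n x (n-1) matrix, skipping e_k."""
    kept = [r for r in range(n) if r != k]
    return [[1 if kept[c] == r else 0 for c in range(n - 1)] for r in range(n)]

def proj(k, n):
    """p_k : R^n -> R^{n-1} as an (n-1) x n matrix, forgetting the k-th coordinate."""
    kept = [c for c in range(n) if c != k]
    return [[1 if kept[r] == c else 0 for c in range(n)] for r in range(n - 1)]

def matmul(X, Y):
    return [[sum(X[r][t] * Y[t][c] for t in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
n = len(A)
for i in range(n):
    for j in range(n):
        # commutativity of the square: p_i ∘ A ∘ ι_j = A_{ij}
        A_ij = [[A[r][c] for c in range(n) if c != j] for r in range(n) if r != i]
        assert matmul(proj(i, n), matmul(A, iota(j, n))) == A_ij
```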
Right, now let $e_1,...,e_n$ denote the standard basis of $R^n$ (I will let $b_1,...,b_{n-1}$ denote that of $R^{n-1}$), and fix any $i$; we have
$\bigwedge^n\phi(e_1\wedge ... \wedge e_n) = (-1)^{i-1} \bw^n\phi (e_i \w e_1 \w... \w \hat{e_i} \w ... \w e_n) = (-1)^{i-1} \phi(e_i) \w \bw^{n-1}\phi(e_1 \w ... \w \hat{e_i} \w ... \w e_n)$
where as usual, $\hat{e_i}$ means we omit $e_i$ from the term (the sign is $(-1)^{i-1}$, since moving $e_i$ to the front means crossing the $i-1$ factors before it). Moreover, $(e_1,... ,\hat{e_i}, ..., e_n) = (\iota_i(b_1), ..., \iota_i(b_{n-1}))$, so that
$\bigwedge^n\phi(e_1\wedge ... \wedge e_n) = (-1)^{i-1} \phi(e_i) \w \bw^{n-1}(\phi\circ \iota_i)(b_1\w...\w b_{n-1})$
Now, note that $\phi(e_i) = \sum_k a_{ki} e_k$. Therefore
$\bigwedge^n\phi(e_1\wedge ... \wedge e_n) = \sum_{k} (-1)^{i-1} a_{ki}\, e_k\w \bw^{n-1}(\phi \circ \iota_i)(b_1\w...\w b_{n-1})$
Here is the key point: in the $k$th term, we may replace $\phi\circ\iota_i$ by $\rho_k\circ\phi\circ\iota_i$. Indeed, write each factor as $\phi\iota_i(b_m) = \rho_k\phi\iota_i(b_m) + \pi_k\phi\iota_i(b_m)$ and expand the wedge product multilinearly in each slot: any term containing a $\pi_k$-factor is a wedge containing both the leading $e_k$ and a scalar multiple of $e_k$, hence vanishes.
Therefore our sum simplifies to
$\bigwedge^n\phi(e_1\wedge ... \wedge e_n) = \sum_k (-1)^{i-1} a_{ki}\, e_k \w \bw^{n-1}(\rho_k\circ \phi \circ \iota_i) (b_1\w...\w b_{n-1})$
We're almost there: $\rho_k = \iota_k\circ p_k$ as we mentioned earlier, so that
$\bigwedge^n\phi(e_1\wedge ... \wedge e_n) = \sum_k (-1)^{i-1} a_{ki}\, e_k \w \bw^{n-1}\iota_k \circ \bw^{n-1}(p_k \circ \phi \circ \iota_i)(b_1\w...\w b_{n-1})$
Our interpretation above yields that $\bw^{n-1}(p_k \circ \phi \circ \iota_i)(b_1\w...\w b_{n-1}) = \det(A_{ki})\, b_1\w...\w b_{n-1}$; and $\bw^{n-1}\iota_k (b_1\w...\w b_{n-1}) = e_1\w...\w \hat{e_k} \w ... \w e_n$, so that
$\bigwedge^n\phi(e_1\wedge ... \wedge e_n) = \sum_k (-1)^{i-1} a_{ki}\det(A_{ki})\, e_k \w e_1\w...\w \hat{e_k} \w ... \w e_n = \sum_k (-1)^{i-1}(-1)^{k-1} a_{ki}\det(A_{ki})\, e_1 \w... \w e_n$
All in all:
$$\bigwedge^n\phi(e_1\wedge ... \wedge e_n) = \sum_k (-1)^{i+k} a_{ki} \det(A_{ki}) e_1\w ... \w e_n$$
and in particular, $\det(A) = \sum_k (-1)^{i+k} a_{ki} \det(A_{ki})$
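The formula just obtained, an expansion along the $i$th column, can also be checked numerically. A Python sketch (0-based indices; helper names are mine):

```python
from itertools import permutations
from math import prod

def perm_sign(p):
    """Sign of a permutation, via an inversion count."""
    return (-1) ** sum(p[a] > p[b] for a in range(len(p)) for b in range(a + 1, len(p)))

def det(A):
    """Determinant via the permutation (Leibniz) formula."""
    n = len(A)
    return sum(perm_sign(p) * prod(A[p[j]][j] for j in range(n))
               for p in permutations(range(n)))

def minor(A, k, i):
    """A_{ki}: the matrix A with row k and column i deleted (0-based)."""
    return [[A[r][c] for c in range(len(A)) if c != i]
            for r in range(len(A)) if r != k]

A = [[2, 7, 1, 5], [8, 2, 8, 1], [1, 8, 2, 8], [4, 5, 9, 0]]
for i in range(len(A)):  # expansion along the i-th column: sum over the row index k
    assert det(A) == sum((-1) ** (i + k) * A[k][i] * det(minor(A, k, i))
                         for k in range(len(A)))
```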
What worries me a tad is that I get $\det(A) = \sum_k (-1)^{i+k} a_{ki}\det(A_{ki})$, the expansion along the $i$th column, instead of your expansion along the $i$th row. Of course this isn't fundamentally a problem: since $\det(A) = \det(A^T)$, applying the column expansion to $A^T$ yields your formula. But it's not the one you get if you just follow the proof through. Hopefully I didn't mix things up along the way.
As a passing (not so useful) comment, you can see how this is an instance of categorification: we have a completely concrete formula in terms of elements of $R$, and we interpret it as saying something about various maps instead of elements (if you look closely, all the equalities I wrote can be read as statements about maps); the computations then become more straightforward, and to get back the concrete statement you decategorify at the end.