I am trying to understand (an intuitive explanation will be fine) why the determinant is a multilinear function, and from that to learn how elementary row operations affect the determinant.
I understand that it has something to do with the definition of the determinant by permutations: because a permutation is a bijection, each product in the determinant contains exactly one entry from each row. But what comes next?




Consider a $2\times 2$ matrix $$ A=\left[\matrix{a_{11} & a_{12}\\ a_{21} & a_{22}}\right]. $$ Using the column notation $$ A_1=\left[\matrix{a_{11}\\ a_{21}}\right],\quad A_2=\left[\matrix{a_{12}\\ a_{22}}\right] $$ we can write $$ A=[A_1\ A_2], \qquad \det A=\det[A_1\ A_2]=f(A_1,A_2)=a_{11}a_{22}- a_{21}a_{12}, $$ that is, the determinant is a function of the matrix columns $A_1$ and $A_2$.
Let's see what happens when we multiply one column, say the first one, by a number $\color{red}{\lambda}$: $$ f(\color{red}{\lambda}A_1,A_2)= \det\left[\matrix{\color{red}{\lambda}a_{11} & a_{12}\\ \color{red}{\lambda}a_{21} & a_{22}}\right]=\color{red}{\lambda}a_{11}a_{22}- \color{red}{\lambda}a_{21}a_{12}=\color{red}{\lambda}(a_{11}a_{22}- a_{21}a_{12})=\color{red}{\lambda}f(A_1,A_2). $$ Thus multiplying one column by a number multiplies the whole determinant by that number.
Now let's see what happens when one column is a sum of two columns: $$ f(\color{red}{A_1'}+\color{blue}{A_1''},A_2)= \det\left[\matrix{\color{red}{a_{11}'}+\color{blue}{a_{11}''} & a_{12}\\ \color{red}{a_{21}'}+\color{blue}{a_{21}''} & a_{22}}\right]= (\color{red}{a_{11}'}+\color{blue}{a_{11}''})a_{22}- (\color{red}{a_{21}'}+\color{blue}{a_{21}''})a_{12}=\\ =\color{red}{a_{11}'}a_{22}- \color{red}{a_{21}'}a_{12}+\color{blue}{a_{11}''}a_{22}- \color{blue}{a_{21}''}a_{12}=f(\color{red}{A_1'},A_2)+f(\color{blue}{A_1''},A_2). $$ Thus, when one column is a sum of two columns, the determinant equals the sum of the two determinants obtained by using each summand in that column while keeping the other columns unchanged.
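The two identities above can be checked numerically. The following is only a minimal sketch in plain Python: the helper names `det2`, `scale` and `add`, as well as the sample columns, are made up for illustration and are not part of any library.

```python
def det2(col1, col2):
    """Determinant of the 2x2 matrix with columns col1 and col2: a11*a22 - a21*a12."""
    (a11, a21), (a12, a22) = col1, col2
    return a11 * a22 - a21 * a12

def scale(lam, col):
    """Multiply every entry of a column by the number lam."""
    return tuple(lam * x for x in col)

def add(u, v):
    """Entrywise sum of two columns."""
    return tuple(x + y for x, y in zip(u, v))

# Arbitrary sample data (assumed values, chosen only for the check).
A1p, A1pp, A2 = (2, 5), (-1, 4), (3, 7)
lam = 6

# Scaling the first column scales the determinant.
homog_lhs = det2(scale(lam, A1p), A2)
homog_rhs = lam * det2(A1p, A2)

# A sum in the first column splits into a sum of determinants.
add_lhs = det2(add(A1p, A1pp), A2)
add_rhs = det2(A1p, A2) + det2(A1pp, A2)

print(homog_lhs == homog_rhs, add_lhs == add_rhs)
```

Integer entries are used so that the comparisons are exact; with floating-point entries one would compare up to a small tolerance instead.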
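Functions with these two properties are called linear. However, the determinant is not linear with respect to the entire matrix $A$; it is linear only with respect to each column separately, with the other columns held fixed. (For instance, in the $2\times 2$ case $\det(\lambda A)=\lambda^2\det A$, not $\lambda\det A$.) That's why it is called a multilinear function of the matrix columns. The same holds for the rows. The generalization to the $n\times n$ case is straightforward: every product in the permutation formula contains exactly one entry from each column, so scaling or splitting one column scales or splits every term in the same way.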
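As a sketch of how this connects to elementary operations (written here for columns in the $2\times 2$ case; the row case is analogous): scaling a column by $\lambda$ scales the determinant by $\lambda$, as shown above. Swapping the two columns changes the sign, $$ f(A_2,A_1)=a_{12}a_{21}-a_{22}a_{11}=-f(A_1,A_2), $$ and a matrix with two equal columns has zero determinant, $$ f(A_2,A_2)=a_{12}a_{22}-a_{22}a_{12}=0. $$ Combining this with the two linearity properties, adding a multiple of one column to the other leaves the determinant unchanged: $$ f(A_1+\lambda A_2,A_2)=f(A_1,A_2)+\lambda f(A_2,A_2)=f(A_1,A_2). $$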