Given any $m \times n$ matrix $M$, one can write its Singular Value Decomposition as $$ M = U\Sigma V^T\,, $$ where $U$ and $V$ are orthogonal and $\Sigma$ is a diagonal matrix.
Now, the same $M$ can be written as:
$$M = \sum_{i=1}^r u_i c_i v_i^T\,,$$ where $u_i$ is the $i$th column of $U$, $v_i$ is the $i$th column of $V$ and $c_i$ is the $i$th diagonal entry of $\Sigma$.
I don't understand why the second representation is the same as the first one.
In general, how can matrix multiplication be expressed in terms of columns? I have learnt that matrices are multiplied row by column; that is the only way even professors do it. So how can a matrix multiplication be expressed as involving only column vectors?
Sorry if the question is too basic, but I am having a lot of trouble understanding how people use column vectors in matrix multiplications.
First, just for clarity: $U$ is $m\times m$, $V$ is $n\times n$, and $\Sigma$ is $m\times n$.
We have, according to the first decomposition, that for any $1\leq i\leq m$ and $1\leq j\leq n$, $$ M_{i,j} = (U\Sigma V^T)_{ij} = \sum_{k=1}^n (U\Sigma)_{ik} (V^T)_{kj} = \sum_{k=1}^n (U\Sigma)_{ik} V_{jk} = \sum_{k=1}^n \sum_{\ell=1}^m U_{i\ell}\Sigma_{\ell k} V_{jk} $$ Now, since $\Sigma$ is an $m\times n$ diagonal matrix, $\Sigma_{\ell k}$ will be $0$ if $k\neq \ell$, and equal to $c_k$ otherwise. (Note also that it will only be non-zero for $k\leq r\stackrel{\rm def}{=} \min(n,m)$, since after that there is no $c_k$). Therefore, the inner sum can be simplified out, and we get $$ M_{i,j} = \sum_{k=1}^r U_{ik} c_k V_{jk} \tag{$\dagger$} $$
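If it helps to see the entrywise formula $(\dagger)$ in action, here is a small numerical sketch using NumPy (the random matrix and its dimensions are just illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3
M = rng.standard_normal((m, n))

# Full SVD: U is m x m, Vh = V^T is n x n, and s holds the
# r = min(m, n) singular values (the diagonal of Sigma).
U, s, Vh = np.linalg.svd(M, full_matrices=True)
V = Vh.T
r = min(m, n)

# Check (dagger): M_ij = sum_{k=1}^r U_ik c_k V_jk for every entry (i, j).
M_entrywise = np.empty_like(M)
for i in range(m):
    for j in range(n):
        M_entrywise[i, j] = sum(U[i, k] * s[k] * V[j, k] for k in range(r))

print(np.allclose(M, M_entrywise))  # the two agree up to floating-point error
```

Note that the inner sum over $\ell$ never needs to be formed explicitly: the diagonality of $\Sigma$ has already collapsed it to the single term $\ell = k$.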
Now, let us look at the $(i,j)$-th entry of the other expression: since $u_k v_k^T$ is a matrix and $c_k$ a scalar, we have $$ \left(\sum_{k=1}^r u_k c_k v_k^T\right)_{i,j} = \left(\sum_{k=1}^r c_k (u_k v_k^T)\right)_{i,j} = \sum_{k=1}^r c_k (u_k v_k^T)_{i,j} $$ But what is $(u_k v_k^T)_{i,j}$? It is the product of the $i$-th entry of the vector $u_k$ and the $j$-th entry of the vector $v_k$, i.e. by definition it is $U_{ik} V_{jk}$. So overall, $$ \left(\sum_{k=1}^r u_k c_k v_k^T\right)_{i,j} = \sum_{k=1}^r c_k U_{ik} V_{jk} \tag{$\ddagger$} $$ and we get the same RHS as in $(\dagger)$, showing what we wanted.
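The rank-one form can be checked the same way: summing the matrices $c_k\, u_k v_k^T$ reproduces $M$ exactly. A minimal NumPy sketch (again with an arbitrary random matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 3
M = rng.standard_normal((m, n))

U, s, Vh = np.linalg.svd(M, full_matrices=True)
V = Vh.T
r = min(m, n)

# Sum of rank-one matrices c_k * u_k v_k^T; np.outer(u, v) builds u v^T.
M_rank_one = sum(s[k] * np.outer(U[:, k], V[:, k]) for k in range(r))

print(np.allclose(M, M_rank_one))  # the rank-one sum reconstructs M
```

Each term $c_k\, u_k v_k^T$ is an $m \times n$ matrix of rank one, which is why truncating this sum after the largest few singular values gives the best low-rank approximations of $M$.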