Consider the following optimization problem
$$\min_{\mathbf{Q}} \sum\limits_{i=1}^{n}{\|{{\mathbf{b}}_{i}}-{{\mathbf{Q}}^{T}}{{\mathbf{x}}_{i}}\|^2}+\lambda \|\mathbf{Q}\|^{2}$$
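where $\mathbf{b}_i$ is an $r$-dimensional vector, $\mathbf{Q}$ is an $n \times r$ matrix, and $\mathbf{x}_i$ is an $n$-dimensional vector. With $\mathbf{S}$ and $\mathbf{B}$ denoting the matrices whose columns are the $\mathbf{x}_i$ and the $\mathbf{b}_i$, respectively, the closed-form solution is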
$$\mathbf{Q}={{(\mathbf{S}{{\mathbf{S}}^{T}}+\lambda \mathbf{I})}^{-1}}\mathbf{S}{{\mathbf{B}}^{T}}$$
Why does it resemble the least-squares solution so much? How can it be derived?
Given $\mathrm A \in \mathbb R^{m \times n}$ and $\mathrm B \in \mathbb R^{m \times p}$, we form a matrix equation in $\mathrm X \in \mathbb R^{n \times p}$
$$\mathrm A \mathrm X = \mathrm B$$
If we want to find the least-squares solution, then we minimize $\| \mathrm A \mathrm X - \mathrm B \|_F$. However, if we do not want the norm of $\mathrm X$ to become too large, then we minimize the following objective function
$$\begin{array}{rl} \| \mathrm A \mathrm X - \mathrm B \|_F^2 + \lambda \|\mathrm X\|_F^2 &= \mbox{tr} ((\mathrm A \mathrm X - \mathrm B)^T (\mathrm A \mathrm X - \mathrm B)) + \lambda \,\mbox{tr} (\mathrm X^T \mathrm X)\\ &= \mbox{tr} (\mathrm X^T (\mathrm A^T \mathrm A + \lambda \mathrm I_n) \mathrm X - \mathrm B^T \mathrm A \mathrm X - \mathrm X^T \mathrm A^T \mathrm B + \mathrm B^T \mathrm B)\end{array}$$
where $\lambda > 0$. Differentiating with respect to $\mathrm X$, we obtain
$$2 (\mathrm A^T \mathrm A + \lambda \mathrm I_n) \mathrm X - 2 \mathrm A^T \mathrm B$$
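As a brief sketch of where this derivative comes from, one can differentiate the trace term by term using the standard identities $\partial_{\mathrm X} \,\mbox{tr} (\mathrm C \mathrm X) = \mathrm C^T$ and $\partial_{\mathrm X} \,\mbox{tr} (\mathrm X^T \mathrm M \mathrm X) = (\mathrm M + \mathrm M^T) \mathrm X$. The symbol $\mathrm M := \mathrm A^T \mathrm A + \lambda \mathrm I_n$ is introduced here only for brevity; it is symmetric, so
$$\begin{array}{rl} \partial_{\mathrm X} \,\mbox{tr} (\mathrm X^T \mathrm M \mathrm X) &= (\mathrm M + \mathrm M^T) \mathrm X = 2 \mathrm M \mathrm X\\ \partial_{\mathrm X} \,\mbox{tr} (\mathrm B^T \mathrm A \mathrm X) &= \mathrm A^T \mathrm B\\ \partial_{\mathrm X} \,\mbox{tr} (\mathrm X^T \mathrm A^T \mathrm B) &= \mathrm A^T \mathrm B\\ \partial_{\mathrm X} \,\mbox{tr} (\mathrm B^T \mathrm B) &= \mathrm O_{n \times p}\end{array}$$
Adding the first line and subtracting the second and third gives $2 \mathrm M \mathrm X - 2 \mathrm A^T \mathrm B$, which is the expression above.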
Finding where the derivative vanishes, we obtain the matrix equation
$$(\mathrm A^T \mathrm A + \lambda \mathrm I_n) \mathrm X = \mathrm A^T \mathrm B$$
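If $\lambda > 0$, then $\mathrm A^T \mathrm A + \lambda \mathrm I_n$ is always invertible: $\mathrm A^T \mathrm A$ is positive semidefinite, so adding $\lambda \mathrm I_n$ makes the sum positive definite. For the same reason the objective function is strictly convex, so the point where the derivative vanishes is the unique minimizer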
$$\hat{\mathrm X} := (\mathrm A^T \mathrm A + \lambda \mathrm I_n)^{-1} \mathrm A^T \mathrm B$$
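To see the formula in action, here is a minimal numerical sketch in NumPy. The function names `ridge_solution` and `ridge_via_augmented_lstsq`, the random test data, and the value of $\lambda$ are illustrative choices, not part of the derivation. The sketch checks three things: that the derivative vanishes at $\hat{\mathrm X}$; that the same $\hat{\mathrm X}$ is the ordinary least-squares solution of the stacked system with $\mathrm A$ on top of $\sqrt{\lambda}\, \mathrm I_n$ and $\mathrm B$ on top of a zero block (one way to see why the formula resembles least squares); and that, assuming $\mathbf{S}$ and $\mathbf{B}$ in the question hold the $\mathbf{x}_i$ and $\mathbf{b}_i$ as columns, the substitution $\mathrm A = \mathbf{S}^T$, $\mathrm X = \mathbf{Q}$, $\mathrm B = \mathbf{B}^T$ recovers the question's formula.

```python
import numpy as np

def ridge_solution(A, B, lam):
    """Minimizer of ||A X - B||_F^2 + lam * ||X||_F^2, assuming lam > 0."""
    n = A.shape[1]
    # Solve (A^T A + lam I) X = A^T B instead of forming the inverse explicitly.
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ B)

def ridge_via_augmented_lstsq(A, B, lam):
    """Same minimizer, computed as plain least squares on stacked matrices."""
    n, p = A.shape[1], B.shape[1]
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n)])  # [A; sqrt(lam) I]
    B_aug = np.vstack([B, np.zeros((n, p))])          # [B; 0]
    X, *_ = np.linalg.lstsq(A_aug, B_aug, rcond=None)
    return X

# Illustrative random data (sizes and lam are arbitrary choices).
rng = np.random.default_rng(0)
m, n, p, lam = 8, 5, 3, 0.7
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))

X_hat = ridge_solution(A, B, lam)

# Check 1: the derivative 2 (A^T A + lam I) X - 2 A^T B vanishes at X_hat.
grad = 2 * (A.T @ A + lam * np.eye(n)) @ X_hat - 2 * A.T @ B
print(np.allclose(grad, 0))

# Check 2: agreement with the augmented least-squares formulation.
print(np.allclose(X_hat, ridge_via_augmented_lstsq(A, B, lam)))

# Check 3: the question's notation, assuming S has columns x_i and
# B_q has columns b_i, so that A = S^T and B = B_q^T.
S, B_q = A.T, B.T
Q = np.linalg.solve(S @ S.T + lam * np.eye(n), S @ B_q.T)
print(np.allclose(Q, X_hat))
```

All three checks should print `True` up to floating-point tolerance. Using `np.linalg.solve` rather than an explicit inverse is a numerical preference in this sketch, not something the derivation requires.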