I have a problem with a matrix derivation. Suppose
- $A$ is some $m \times n$ matrix, with $m>n$.
- $B$ is an $m \times 1$ vector.
- $X$ is an $n \times 1$ vector.
If we define $M=(A^{T}A)^{-1}A^{T}$, then how do we rewrite $$(AX-B)^{T}(AX-B)$$ in the form $(X-K)^{T}\Sigma(X-K)$, where $\Sigma$ is a matrix?
Here's what I would do: expand both expressions and compare them term by term.
$X^{T}A^{T}AX = X^{T} \Sigma X \space\space(1)$
$X^{T}A^{T}B + B^{T}AX = X^{T} \Sigma K + K^{T} \Sigma X \space\space(2)$
$B^{T}B = K^{T} \Sigma K \space\space(3)$
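For reference, here are the two full expansions behind $(1)$ to $(3)$. No new symbols are introduced. The cross terms carry a minus sign on both sides, which is why it is dropped in $(2)$:

$$\begin{aligned}
(AX-B)^{T}(AX-B) &= X^{T}A^{T}AX - X^{T}A^{T}B - B^{T}AX + B^{T}B,\\
(X-K)^{T}\Sigma(X-K) &= X^{T}\Sigma X - X^{T}\Sigma K - K^{T}\Sigma X + K^{T}\Sigma K.
\end{aligned}$$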
From $(1)$ we get the following:
$\Sigma = A^{T}A \space\space(4)$
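Plugging $(4)$ into $(2)$ and requiring equality for every $X$, we get:
$A^{T}AK = A^{T}B \space\space(5)$
Note that this is weaker than $AK = B$, which generally has no solution when $m > n$. Because $m > n$, $A$ is not square and has no inverse, but if $A$ has full column rank then $A^{T}A$ is invertible. Multiplying $(5)$ on the left by $(A^{T}A)^{-1}$, with $M$ being the left pseudo-inverse of $A$:
$K = (A^{T}A)^{-1}A^{T}B = MB \space\space(6)$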
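Plug $(4)$ and $(6)$ into $(2)$ to verify. Note that $(3)$ holds only when $B$ lies in the column space of $A$. In general $K^{T}\Sigma K = B^{T}AMB$, so the exact identity is $(AX-B)^{T}(AX-B) = (X-K)^{T}\Sigma(X-K) + B^{T}(I-AM)B$, where the last term does not depend on $X$.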
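If you want a quick numerical sanity check, here is a minimal NumPy sketch. The sizes `m = 6`, `n = 3`, the random seed, and the variable names are my own illustrative choices, not part of the question. It assumes $A$ has full column rank, which a random Gaussian matrix has with probability 1. The variable `const` is the $X$-independent remainder $B^{T}(I-AM)B$ left over when $B$ is not in the column space of $A$.

```python
import numpy as np

# Illustrative sizes and seed (assumptions, not from the question)
rng = np.random.default_rng(0)
m, n = 6, 3
A = rng.standard_normal((m, n))  # full column rank with probability 1
B = rng.standard_normal((m, 1))
X = rng.standard_normal((n, 1))

M = np.linalg.inv(A.T @ A) @ A.T  # left pseudo-inverse of A
Sigma = A.T @ A                   # equation (4)
K = M @ B                         # equation (6)

lhs = ((A @ X - B).T @ (A @ X - B)).item()
quad = ((X - K).T @ Sigma @ (X - K)).item()
const = (B.T @ (np.eye(m) - A @ M) @ B).item()  # remainder independent of X

# Cross terms match: A^T B == Sigma K
print(np.allclose(A.T @ B, Sigma @ K))
# Full identity: lhs == quadratic form + constant remainder
print(np.isclose(lhs, quad + const))
```

Both checks should print `True`. Running it with a different `X` changes `lhs` and `quad` but leaves `const` unchanged.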