I have read that the matrix form for the following summation
$$ Error(w) = \sum_{i=0}^{m} (w^{T}x_i - y_i)^2 $$
- $w^T$ is the transpose of the weights vector in linear regression
- $x_i$ is the ith input in vector x
- and $y_i$ is the ith element of vector y.
is as follows:
$$ (Xw - y)^T (Xw-y) $$
But I need to see the step-wise matrix algebra that leads to this form. Can anybody help me understand this rewrite? It seems somewhat complicated to me.
Any help is appreciated.
I am not sure what your $w$ is. Since you mentioned linear regression, let's say the line is $y=ax+b$, and the data set is $\{(x_1,y_1),...(x_n,y_n)\}$. Then the equations are:
$$\begin{pmatrix} x_1&1\\ x_2&1\\ ...\\ x_n&1 \end{pmatrix}\begin{pmatrix} a\\ b\end{pmatrix}=\begin{pmatrix} y_1\\ y_2\\ ...\\ y_n\end{pmatrix}$$
If this is your $X\vec{w}=\vec{y}$, then the error is the squared 2-norm of $X\vec{w}-\vec{y}$. In matrix form, the squared 2-norm of a vector $\vec{u}$ is $\vec{u}^T\vec{u}$; in this case, then, it is $(X\vec{w}-\vec{y})^T(X\vec{w}-\vec{y})$. In equation form, it is the sum of the squares of the components, so
$$\sum_{i=1}^{n} ((ax_i+b)-y_i)^2=\sum_{i=1}^{n}(\vec{X}_i^T\vec{w}-y_i)^2=\sum_{i=1}^{n}(\vec{w}^T\vec{X}_i-y_i)^2$$
where $\vec{X}_i=\begin{pmatrix} x_i\\ 1 \end{pmatrix}$.
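As a quick numerical sanity check (a sketch using NumPy; the data points, the candidate $a$ and $b$, and the variable names are made-up example values, not part of the question), the summation form and the matrix form compute the same number:

```python
import numpy as np

# Made-up example data: n = 4 points (x_i, y_i), fitting y = a*x + b
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 7.1])

# Design matrix X: each row is X_i = (x_i, 1), the 1 multiplying the intercept b
X = np.column_stack([x, np.ones_like(x)])

# Candidate weight vector w = (a, b)
w = np.array([2.0, 1.0])

# Summation form: sum_i (w^T X_i - y_i)^2
sum_form = sum((w @ X[i] - y[i]) ** 2 for i in range(len(y)))

# Matrix form: (Xw - y)^T (Xw - y)
r = X @ w - y
matrix_form = r @ r

print(sum_form, matrix_form)  # the two numbers agree
```

Stacking each $\vec{X}_i^T$ as a row of $X$ is exactly what turns the per-point products $\vec{w}^T\vec{X}_i$ into the single matrix-vector product $X\vec{w}$.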
If we put weights $u_i$ on the squared error terms, the error becomes
$$\sum_{i=1}^{n} u_i(\vec{w}^{T}\vec{X}_i - y_i)^2$$
The matrix form is
$$(Xw-y)^TU(Xw-y)=[\begin{pmatrix} x_1&1\\ x_2&1\\ ...\\ x_n&1 \end{pmatrix}\begin{pmatrix} a\\ b\end{pmatrix}-\begin{pmatrix} y_1\\ y_2\\ ...\\ y_n\end{pmatrix}]^T \begin{pmatrix} u_1 & 0&...&0\\ 0&u_2&...&0\\ ...&...&...&...\\ 0&...&0&u_n \end{pmatrix} [\begin{pmatrix} x_1&1\\ x_2&1\\ ...\\ x_n&1 \end{pmatrix}\begin{pmatrix} a\\ b\end{pmatrix}-\begin{pmatrix} y_1\\ y_2\\ ...\\ y_n\end{pmatrix}]\\ =[(ax_1+b-y_1), ..., (ax_n+b-y_n)]\cdot \begin{pmatrix} u_1 & 0&...&0\\ 0&u_2&...&0\\ ...&...&...&...\\ 0&...&0&u_n \end{pmatrix} \begin{pmatrix} ax_1+b-y_1\\ ...\\ ax_n+b-y_n\end{pmatrix}\\ =[u_1(ax_1+b-y_1), ..., u_n(ax_n+b-y_n)]\cdot\begin{pmatrix} ax_1+b-y_1\\ ...\\ ax_n+b-y_n\end{pmatrix}\\ =\sum_{i=1}^{n} u_i(ax_i+b - y_i)^2\\ =\sum_{i=1}^{n} u_i(\vec{w}^{T}\vec{X}_i - y_i)^2.$$
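The weighted identity can be checked numerically the same way (again a sketch with made-up example values; the per-point weights $u_i$ are chosen arbitrarily):

```python
import numpy as np

# Same made-up example setup as before
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 7.1])
X = np.column_stack([x, np.ones_like(x)])
w = np.array([2.0, 1.0])

# Arbitrary per-point weights u_i and the diagonal matrix U
u = np.array([1.0, 0.5, 2.0, 1.5])
U = np.diag(u)

# Summation form: sum_i u_i (w^T X_i - y_i)^2
sum_form = sum(u[i] * (w @ X[i] - y[i]) ** 2 for i in range(len(y)))

# Matrix form: (Xw - y)^T U (Xw - y)
r = X @ w - y
matrix_form = r @ U @ r

print(sum_form, matrix_form)  # the two numbers agree
```

Because $U$ is diagonal, multiplying by it just scales the $i$-th residual by $u_i$ before the dot product, which is exactly the middle step of the derivation above.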