Shape of design matrix in linear regression

For a linear regression problem (Bishop book 3.1 pp. 138-143), the vector of basis functions evaluated at a data point is defined as:

$$\boldsymbol{\phi}_i = (\phi_0(\mathbf{x}_i), \phi_1(\mathbf{x}_i), \ldots, \phi_{M-1}(\mathbf{x}_i) )^{T} $$

for a data point $\mathbf{x}_i \in \mathbb{R}^D$, with $i \in \{1, 2, \ldots, N\}$ and $N$ the total number of samples.

I am trying to understand why the following expression $\sum_{i=1}^{N} \boldsymbol{\phi}_i \boldsymbol{\phi}_i^{T} = \boldsymbol{\Phi}^{T} \boldsymbol{\Phi}$ is true,

where $\boldsymbol{\Phi}$ is the design matrix:

$$\boldsymbol{\Phi} = \left[\begin{array}{cccc} \phi_0(\mathbf{x}_1) &\phi_1(\mathbf{x}_1) & \cdots & \phi_{M-1}(\mathbf{x}_1)\\ \vdots & \vdots & \ddots & \vdots\\ \phi_0(\mathbf{x}_N) &\phi_1(\mathbf{x}_N) & \cdots & \phi_{M-1}(\mathbf{x}_N)\end{array}\right] \in \mathbb{R}^{N \times M}$$

Any intuition why this holds?

There are 2 best solutions below

0
On BEST ANSWER

The identity as you wrote it, $\sum\limits_{i=1}^{N}\phi_{i}\phi_{i}^{T}=\Phi^{T}\Phi$, is correct when each $\phi_{i}$ is a column vector: both sides are $M\times M$ matrices, whereas the reversed product $\phi_{i}^{T}\phi_{i}$ would be a scalar. Let $$\phi_{i}:=\phi(x_{i})=\left(\phi_{0}(x_{i}),\phi_{1}(x_{i}),\ldots,\phi_{M-1}(x_{i})\right)^{T}$$ so that $\phi_{i}^{T}$ is the $i$-th row of

$$\Phi=\left( \begin{array}{cccc} \phi_{0}(x_{1}) & \phi_{1}(x_{1}) & \ldots & \phi_{M-1}(x_{1})\\ \phi_{0}(x_{2}) & \phi_{1}(x_{2}) & \ldots & \phi_{M-1}(x_{2})\\ \vdots & \vdots & & \vdots\\ \phi_{0}(x_{N}) & \phi_{1}(x_{N}) & \ldots & \phi_{M-1}(x_{N})\\ \end{array} \right) $$ Then the result follows by direct calculation of $\Phi^{T}\Phi$.

$$\Phi^{T}\Phi=\left( \begin{array}{cccc} \phi_{0}(x_{1}) & \phi_{0}(x_{2}) & \ldots & \phi_{0}(x_{N})\\ \phi_{1}(x_{1}) & \phi_{1}(x_{2}) & \ldots & \phi_{1}(x_{N})\\ \vdots & \vdots & & \vdots\\ \phi_{M-1}(x_{1}) & \phi_{M-1}(x_{2}) & \ldots & \phi_{M-1}(x_{N})\\ \end{array} \right) \left( \begin{array}{cccc} \phi_{0}(x_{1}) & \phi_{1}(x_{1}) & \ldots & \phi_{M-1}(x_{1})\\ \phi_{0}(x_{2}) & \phi_{1}(x_{2}) & \ldots & \phi_{M-1}(x_{2})\\ \vdots & \vdots & & \vdots\\ \phi_{0}(x_{N}) & \phi_{1}(x_{N}) & \ldots & \phi_{M-1}(x_{N})\\ \end{array} \right)$$

$$= \left( \begin{array}{cccc} \sum\limits_{i=1}^{N} \phi_{0}(x_{i})\phi_{0}(x_{i}) & \sum\limits_{i=1}^{N} \phi_{0}(x_{i})\phi_{1}(x_{i}) & \ldots & \sum\limits_{i=1}^{N} \phi_{0}(x_{i})\phi_{M-1}(x_{i})\\ \sum\limits_{i=1}^{N} \phi_{1}(x_{i})\phi_{0}(x_{i}) & \sum\limits_{i=1}^{N} \phi_{1}(x_{i})\phi_{1}(x_{i}) & \ldots & \sum\limits_{i=1}^{N} \phi_{1}(x_{i})\phi_{M-1}(x_{i})\\ \vdots & \vdots & & \vdots\\ \sum\limits_{i=1}^{N} \phi_{M-1}(x_{i})\phi_{0}(x_{i}) & \sum\limits_{i=1}^{N} \phi_{M-1}(x_{i})\phi_{1}(x_{i}) & \ldots & \sum\limits_{i=1}^{N} \phi_{M-1}(x_{i})\phi_{M-1}(x_{i})\\ \end{array} \right)$$

$$= \sum\limits_{i=1}^{N} \left( \begin{array}{cccc} \phi_{0}(x_{i})\phi_{0}(x_{i}) & \phi_{0}(x_{i})\phi_{1}(x_{i}) & \ldots & \phi_{0}(x_{i})\phi_{M-1}(x_{i})\\ \phi_{1}(x_{i})\phi_{0}(x_{i}) & \phi_{1}(x_{i})\phi_{1}(x_{i}) & \ldots & \phi_{1}(x_{i})\phi_{M-1}(x_{i})\\ \vdots & \vdots & & \vdots\\ \phi_{M-1}(x_{i})\phi_{0}(x_{i}) & \phi_{M-1}(x_{i})\phi_{1}(x_{i}) & \ldots & \phi_{M-1}(x_{i})\phi_{M-1}(x_{i})\\ \end{array} \right)$$

$$= \sum\limits_{i=1}^{N}\phi_{i}\phi_{i}^{T} $$

The last step uses the fact that the matrix inside the sum has $(j,k)$ entry $\phi_{j}(x_{i})\phi_{k}(x_{i})$, which is exactly the outer product $\phi_{i}\phi_{i}^{T}$.
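If it helps to see the shapes concretely, here is a small numerical check of the same identity. It is only a sketch: the sizes `N` and `M`, the random scalar inputs, and the polynomial basis $\phi_j(x) = x^j$ are assumptions chosen for illustration, not anything taken from Bishop.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5, 3  # hypothetical number of samples and basis functions

# Assumed basis for illustration: phi_j(x) = x**j, j = 0..M-1, scalar inputs.
x = rng.normal(size=N)
Phi = np.vander(x, M, increasing=True)  # shape (N, M); row i is phi_i^T

# Sum over samples of the outer products phi_i phi_i^T, each of shape (M, M).
S = sum(np.outer(Phi[i], Phi[i]) for i in range(N))

gram = Phi.T @ Phi  # shape (M, M)

print(Phi.shape, S.shape, gram.shape)
print(np.allclose(S, gram))
```

Both `S` and `gram` are $M \times M$, while `Phi @ Phi.T` would be $N \times N$, so the order of the transpose is what decides which identity you are looking at.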

0
On

I do not see any definition for $\phi_i$ as you wrote.