How do we find the elements of this matrix?


I have $\boldsymbol x_i$ which is a $(p+1) \times 1$ matrix (a column vector).

Also $\boldsymbol X=(\boldsymbol x_1^T, \cdots , \boldsymbol x_n^T)^T$ is an $n \times (p+1)$ matrix (a column vector whose elements are the row vectors $\boldsymbol x_i^T$).

Imagine I create the following matrix $\boldsymbol H =\boldsymbol{X(X^TX)^{-1}X^T}$

How do I find the $(ij)^{th}$ element of this matrix?

My book says, without any explanation, that it is very easy to see that $[\boldsymbol H]_{ij}=\boldsymbol{x_i^T(X^TX)^{-1}x_j}$

How is this even something easy to see? There must be an intuition behind it. This is just one example, but in general throughout the whole book, it seems very easy to find elements of such matrices. Is there a trick or intuition?


0
On BEST ANSWER

Suppose we have an $n$-by-$m$ matrix $H$ and a column $m$-vector $v=([v]_1,[v]_2,\cdots,[v]_m)^T$. Then the matrix product $Hv$ is a column $n$-vector; more precisely, if we write $H$ in terms of its column vectors $\{h_k\}$ we have

$$Hv=(h_1\, h_2\,\cdots \, h_m)\begin{pmatrix} [v]_1 \\ [v]_2 \\ \vdots \\ [v]_m \end{pmatrix}=h_1 [v]_1+h_2 [v]_2+\cdots +h_m [v]_m$$ i.e. $Hv$ is a linear combination of the columns of $H$. In particular, if $v=e_j$ (the $j$th standard basis vector) we have $He_j=h_j=([H]_{1j},[H]_{2j},\cdots, [H]_{nj})^T$, the $j$th column of $H$. If we instead want a linear combination of rows, we multiply on the left: $e_i^T H=([H]_{i1},[H]_{i2},\cdots,[H]_{im})$ is the $i$th row of $H$. Combining these gives $e_i^T H e_j = [H]_{ij}$ as the desired matrix element. For the case of interest, $e_i^T X=x_i^T$ is the $i$th row of $X$ and $X^T e_j=x_j$ is the $j$th column of $X^T$, so $$[H]_{ij}=e_i^T X (X^T X)^{-1} X^T e_j=x_i^T (X^T X)^{-1} x_j$$ in agreement with the text.
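
To make the selection by $e_i^T$ and $e_j$ concrete, here is a small worked example. The numbers are hypothetical (they are not from the book): take $n=3$ and $p+1=2$ with

$$X=\begin{pmatrix}1&0\\1&1\\1&2\end{pmatrix},\qquad x_2=\begin{pmatrix}1\\1\end{pmatrix},\qquad x_3=\begin{pmatrix}1\\2\end{pmatrix}.$$ Then $$e_2^TX=(0\;\,1\;\,0)\begin{pmatrix}1&0\\1&1\\1&2\end{pmatrix}=(1\;\,1)=x_2^T,\qquad X^Te_3=\begin{pmatrix}1&1&1\\0&1&2\end{pmatrix}\begin{pmatrix}0\\0\\1\end{pmatrix}=\begin{pmatrix}1\\2\end{pmatrix}=x_3.$$ Since $X^TX=\begin{pmatrix}3&3\\3&5\end{pmatrix}$ has inverse $\frac16\begin{pmatrix}5&-3\\-3&3\end{pmatrix}$, we get $$[H]_{23}=x_2^T(X^TX)^{-1}x_3=\frac16\,(1\;\,1)\begin{pmatrix}5&-3\\-3&3\end{pmatrix}\begin{pmatrix}1\\2\end{pmatrix}=\frac16\,(1\;\,1)\begin{pmatrix}-1\\3\end{pmatrix}=\frac13.$$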

0
On

Recall the definition of matrix multiplication: $[\mathbf A\mathbf B]_{ij}=\sum_k[\mathbf A]_{ik}[\mathbf B]_{kj}$, which says that each element of the matrix product is the product of the corresponding row of $\mathbf A$ and column of $\mathbf B$ (compare this to $\mathbf x^T\mathbf y=\sum_kx_ky_k$). The product of the matrix $\mathbf A$ and column vector $\mathbf x$ is given by $[\mathbf A\mathbf x]_i=\sum_k[\mathbf A]_{ik}x_k$. Comparing this to the matrix product formula shows that each column of the product $\mathbf A\mathbf B$ can be seen as the product of $\mathbf A$ and the corresponding column of $\mathbf B$. Similarly, each row of the product is the product of the corresponding row of $\mathbf A$ with $\mathbf B$.

So, $[\mathbf H]_{ij}$ is the product of the $i$th row of $\mathbf X$ and the $j$th column of $(\mathbf X^T\mathbf X)^{-1}\mathbf X^T$. The former is by definition $\mathbf x_i^T$. As we’ve seen above, the $j$th column of $(\mathbf X^T\mathbf X)^{-1}\mathbf X^T$ is $(\mathbf X^T\mathbf X)^{-1}$ times the $j$th column of $\mathbf X^T$, to wit, $\mathbf x_j$. Putting this all together, $[\mathbf H]_{ij}=\mathbf x_i^T(\mathbf X^T\mathbf X)^{-1}\mathbf x_j$ as required.
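
If it helps to see the identity checked numerically, the following is a minimal sketch in plain Python, using exact fractions so that equality can be tested directly. The data matrix and the helper names (`matmul`, `transpose`, `inverse_2x2`, `hat_entry`) are made up for illustration and are not from the book; the sketch assumes $p+1=2$ so that a closed-form $2\times 2$ inverse suffices. Note that the code indexes from 0, while the formulas above index from 1.

```python
from fractions import Fraction


def matmul(A, B):
    """Matrix product: entry (i, j) is (row i of A) dotted with (column j of B)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical data: n = 3 rows x_i^T, each of length p + 1 = 2.
X = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]

Xt = transpose(X)
G_inv = inverse_2x2(matmul(Xt, X))   # (X^T X)^{-1}
H = matmul(matmul(X, G_inv), Xt)     # H = X (X^T X)^{-1} X^T, built in full


def hat_entry(i, j):
    """x_i^T (X^T X)^{-1} x_j, computed from rows i and j of X only."""
    row = matmul([X[i]], G_inv)[0]   # the 1 x 2 row vector x_i^T (X^T X)^{-1}
    return sum(r * v for r, v in zip(row, X[j]))


# Compare every entry of the full matrix with the row-by-row formula.
all_match = all(H[i][j] == hat_entry(i, j) for i in range(3) for j in range(3))
print(all_match)
```

The point of the comparison is that `hat_entry` never forms $\mathbf H$: it only touches two rows of $\mathbf X$, which is exactly what the formula $[\mathbf H]_{ij}=\mathbf x_i^T(\mathbf X^T\mathbf X)^{-1}\mathbf x_j$ says.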