Basic conceptual questions about orthogonality in linear algebra


I have the Gram-Schmidt algorithm memorized, so I can always compute an orthonormal basis when I need one (on pen and paper; I'm not studying mathematical/scientific computing ... yet).

Could I think of the Gram-Schmidt algorithm geometrically in the two- and three-dimensional cases? (Algebraically I am fine with the proof, but I get no geometric insight from it.) How does...subtracting a bunch of inner products that scale vectors turn a linearly independent set into an orthogonal set?

Another orthogonality question I have is: I know the formula to compute the orthogonal projection of a vector $\vec v$ onto some subspace $W$ is given by:

$$P_W(\vec v) = \sum \langle \vec v, u_i\rangle u_i$$

where $\{u_i\}$ is an orthonormal basis for the subspace $W$.

But, there is another orthogonal projection that I know of:

If I look at the operator given by the matrix product

$$UU^T$$

where $U$ has orthonormal columns (an orthogonal matrix, if it is $n \times n$ square),

then we know that $UU^T$ represents orthogonal projection of vectors onto the column space of $U$.

Are these two orthogonal projections related? Can they be used...interchangeably?

Thanks,


BEST ANSWER

The geometric interpretation of Gram-Schmidt is as follows. Given a subspace $S\subseteq \Bbb R^n$, you can always decompose a vector $v\in \Bbb R^n$ as the sum of a vector parallel to $S$ and a vector orthogonal to $S$: $$v=v_\| + v_\bot$$

[image: a vector decomposed into components parallel and orthogonal to a plane]

NOTE: This is the best picture I can find -- sorry that it's not great. Hopefully you can see that the dark blue vector is being decomposed into the sum of the green vector parallel to the plane and the light blue vector orthogonal to the plane.

We can call these two vectors the projection of $v$ onto $S$ and the rejection of $v$ from $S$. Gram-Schmidt is just the process of removing the projections of each vector onto the $1$d subspaces formed by the span of each of the previous vectors.
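
To make the projection/rejection split concrete, here is a minimal NumPy sketch (the plane and the vector are arbitrary illustrative choices):

```python
import numpy as np

# Plane S spanned by two orthonormal vectors in R^3
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])

v = np.array([2.0, 3.0, 4.0])

# Projection of v onto S: sum of the projections onto each basis vector
v_par = (v @ e1) * e1 + (v @ e2) * e2   # parallel part ("projection")
v_perp = v - v_par                       # orthogonal part ("rejection")

print(v_par)    # [2. 3. 0.]
print(v_perp)   # [0. 0. 4.]
```

The two pieces sum back to $v$, and the rejection is orthogonal to everything in the plane.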

Let's see how it works for a $3$d subspace. We're given a basis $\mathcal B=\{b_1, b_2, b_3\}$ of a $3$d subspace $S$ and we want to find an orthonormal basis $\mathcal E=\{e_1, e_2, e_3\}$ for $S$.

First we start by just setting $e_1 = \frac{1}{\|b_1\|}b_1$ -- that is we just take $e_1$ to be the unit vector in the direction of $b_1$.

Then for $e_2$, we first decompose $b_2$ into its projection onto and rejection from $b_1$: $$b_2 = {b_2}_\| + {b_2}_\bot$$ Then we see that the part of $b_2$ that's orthogonal to $b_1$ is given by ${b_2}_\bot = b_2-{b_2}_\| = b_2 - \operatorname{proj}_{b_1}b_2$. This gets rid of the part of $b_2$ that's parallel to $b_1$, leaving only the part that's orthogonal to it. Then we set $e_2 = \frac{1}{\|{b_2}_\bot\|}{b_2}_\bot$.

Then for $e_3$, we follow the same procedure. We decompose $b_3$ into its parts parallel to and orthogonal to $\operatorname{span}(b_1, b_2)$: $$b_3 = {b_3}_\| + {b_3}_\bot = (\operatorname{proj}_{b_1}b_3 + \operatorname{proj}_{b_2}b_3) + {b_3}_\bot$$

Solving for ${b_3}_\bot$, we get $${b_3}_\bot = b_3 - \operatorname{proj}_{b_1}b_3 - \operatorname{proj}_{b_2}b_3$$ Then we just set $e_3=\frac{1}{\|{b_3}_\bot\|}{b_3}_\bot$.
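
The steps above can be sketched in a few lines of NumPy; `gram_schmidt` here is a hypothetical helper, not a library function, that at each step subtracts the projections onto the previously built $e_i$ and normalizes what remains:

```python
import numpy as np

def gram_schmidt(basis):
    """Orthonormalize linearly independent vectors: for each b,
    subtract its projections onto the previous e_i, then normalize."""
    es = []
    for b in basis:
        b_perp = b - sum((b @ e) * e for e in es)  # rejection from span(es)
        es.append(b_perp / np.linalg.norm(b_perp))
    return np.array(es)

B = [np.array([1.0, 1.0, 0.0]),
     np.array([1.0, 0.0, 1.0]),
     np.array([0.0, 1.0, 1.0])]
E = gram_schmidt(B)
print(E @ E.T)   # ≈ identity: the rows e_1, e_2, e_3 are orthonormal
```

The rows of `E` span the same subspace as `B` but are mutually orthogonal unit vectors.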

ANSWER

I can give a pretty straightforward answer to the first part of your question - yes there is a visualisation in 2 and 3 dimensions.

The first part of Gram-Schmidt (GS) normalises the first vector in the set; that just means we scale it to length $1$: $$u_1=\frac{v_1}{\|v_1\|}$$

Then we take the second vector and the dot product. Now, from your second question, I'm guessing you understand that the dot product tells you "how much" one vector projects onto another. So if you broke your second vector into components, it would be composed of a vector parallel to $u_1$ with length equal to the dot product, plus components perpendicular to $u_1$. So considering $v_2 - \langle u_1, v_2\rangle\, u_1$ leaves you with something perpendicular to $u_1$ -- you now have an orthogonal set.

To make it orthonormal, you just do step 1 again, scaling it so it has length=$1$.

An example helps: Consider $\{(1, 0), (2, 1)\}$ as your vectors. Clearly the first vector $v_1=(1, 0)$ is already normalised (scaled to have length=1), so we move to step 2. Using the standard basis for the plane, you can break $v_2=(2, 1)$ into two components that are perpendicular: $v_2=(2, 0)+(0, 1)$.

Now the dot product tells you that the component pointing in the direction of $v_1$ has length $\langle v_1, v_2\rangle=2$, so by subtracting $2$ copies of $u_1$, we're left with the vector $u_2=v_2-2\cdot u_1=(0, 1)$. As this is also already normalised, you're done.
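
The worked example can be checked numerically; this is just the two steps above in NumPy:

```python
import numpy as np

v1 = np.array([1.0, 0.0])
v2 = np.array([2.0, 1.0])

u1 = v1 / np.linalg.norm(v1)     # already unit length: (1, 0)
u2 = v2 - (u1 @ v2) * u1         # subtract <u1, v2> = 2 copies of u1
print(u2)                         # [0. 1.] -- already unit length too
```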

I hope this gives some insight; the generalisation to 3-D is straightforward: you next subtract from $v_3$ the components that point in the directions of $u_1$ and $u_2$, leaving behind the orthogonal part. Note that you need linear independence, as otherwise $v_3$ could lie in the span of $v_1, v_2$, and subtracting these components would leave you with the zero vector -- and that's not getting you any closer to an orthogonal set!

I'm afraid I can't answer your second question, but I know Paul R. Halmos has a pretty good discussion of projections and projective transforms in his book "Finite-Dimensional Vector Spaces", which I can heartily recommend.

EDIT: There's an excellent visualisation on wikipedia. https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process

ANSWER

As far as your second question goes, they’re in fact the same thing!

First, I hope that you see the connection between Gram-Schmidt and the formula for orthogonal projection onto a subspace. The latter is simply the sum of the projections onto the individual basis vectors of the space, which is precisely what you compute at each step of the G-S procedure.

Now, let $U$ be the matrix $\pmatrix{\mathbf u_1&\mathbf u_2&\cdots}$, i.e., the matrix with the basis vectors of the given subspace as its columns. Then $$\begin{align} \sum_i \langle\mathbf v,\mathbf u_i\rangle\mathbf u_i &= U\pmatrix{\langle\mathbf v,\mathbf u_1\rangle \\ \langle\mathbf v,\mathbf u_2\rangle \\ \vdots} \\ &= U\pmatrix{\mathbf u_1^T \\ \mathbf u_2^T \\ \vdots}\mathbf v \\ &= UU^T\mathbf v. \end{align}$$
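
A quick numerical sanity check of this identity, with orthonormal columns obtained from a QR factorization of a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
# U: 5x3 with orthonormal columns (reduced QR of a random matrix)
U, _ = np.linalg.qr(rng.standard_normal((5, 3)))
v = rng.standard_normal(5)

# Sum-of-projections form: sum_i <v, u_i> u_i
p_sum = sum((v @ u) * u for u in U.T)
# Matrix form
p_mat = U @ U.T @ v

print(np.allclose(p_sum, p_mat))  # True
```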

Note that this equality only holds for the Euclidean scalar product. The matrix operation expression for orthogonal projection relative to other scalar products is a bit more complicated.

Consider, for example, the space of polynomials of degree $\le3$ with real coefficients, with inner product $\langle f,g\rangle=\int_{-1}^1f(t)g(t)\,dt$. Taking $\{1,x,x^2,x^3\}$ as the basis, you can verify that this inner product is represented by the matrix $$ S=\pmatrix{ 2&0&\frac23&0 \\ 0&\frac23&0&\frac25 \\ \frac23&0&\frac25&0 \\ 0&\frac25&0&\frac27}, $$ that is, if you represent elements of this space as vectors in $\mathbb R^4$, then $\langle f,g\rangle=g^TSf$.

Relative to this inner product, an orthonormal basis for the space is $$\begin{align} u_1 &= \frac1{\sqrt2} \\ u_2 &= \frac{\sqrt6}2x \\ u_3 &= \frac{3\sqrt{10}}4x^2-\frac{\sqrt{10}}4 \\ u_4 &= \frac{5\sqrt{14}}4x^3-\frac{3\sqrt{14}}4x \end{align}$$

In the process of constructing this basis you had to find the orthogonal projection of $x^3$ onto the subspace of polynomials of degree at most two: $$ \langle u_1,x^3\rangle u_1+\langle u_2,x^3\rangle u_2+\langle u_3,x^3\rangle u_3 = \frac35x.$$

The matrix for orthogonal projection onto this subspace is given by $$\pi = UU^TS,$$ where $U$ is the matrix with columns $u_1,u_2,u_3$. The $S$ appears because of its role in the scalar product. A bit of tedious multiplication gives $$\pi=\pmatrix{1&0&0&0\\0&1&0&\frac35\\0&0&1&0\\0&0&0&0}.$$ You can see by inspection that this gives the same result for the vector $x^3$ as you got by computing the individual projections onto the unit vectors.
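
If you want to skip the tedious multiplication, the same computation can be sketched in NumPy, writing $u_1,u_2,u_3$ as coefficient vectors in the basis $\{1,x,x^2,x^3\}$:

```python
import numpy as np

# Gram matrix of <f,g> = ∫_{-1}^{1} f(t) g(t) dt in the basis {1, x, x^2, x^3}
S = np.array([[2,   0,   2/3, 0  ],
              [0,   2/3, 0,   2/5],
              [2/3, 0,   2/5, 0  ],
              [0,   2/5, 0,   2/7]])

# Columns: coefficient vectors of u1 = 1/√2, u2 = (√6/2)x,
# u3 = (3√10/4)x^2 - √10/4
U = np.array([[1/np.sqrt(2), 0,            -np.sqrt(10)/4 ],
              [0,            np.sqrt(6)/2,  0             ],
              [0,            0,             3*np.sqrt(10)/4],
              [0,            0,             0             ]])

pi = U @ U.T @ S                      # projection onto span(u1, u2, u3)
x3 = np.array([0.0, 0.0, 0.0, 1.0])   # coefficient vector of x^3
print(pi @ x3)                        # ≈ [0, 3/5, 0, 0], i.e. (3/5)x
```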

It turns out that you don’t even need an orthonormal basis. If the columns of $U$ are any basis for the subspace, then the orthogonal projection is given by $$ \pi=U(U^TSU)^{-1}U^TS. $$ The extra term in the middle takes care of any normalization that might be required.
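
A sketch of this more general formula, using the non-orthonormal basis $\{1,x,x^2\}$ of the same subspace; since the orthogonal projection onto a subspace is unique, it produces the same matrix $\pi$ as before:

```python
import numpy as np

# Same Gram matrix S as in the polynomial example above
S = np.array([[2,   0,   2/3, 0  ],
              [0,   2/3, 0,   2/5],
              [2/3, 0,   2/5, 0  ],
              [0,   2/5, 0,   2/7]])

# Columns of U: the non-orthonormal basis {1, x, x^2} as coefficient vectors
U = np.vstack([np.eye(3), np.zeros((1, 3))])

# pi = U (U^T S U)^{-1} U^T S
pi = U @ np.linalg.inv(U.T @ S @ U) @ U.T @ S

x3 = np.array([0.0, 0.0, 0.0, 1.0])
print(pi @ x3)   # ≈ [0, 3/5, 0, 0] -- the same projection (3/5)x
```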