What does it mean for two functions to be orthogonal?


When two finite dimensional vectors are orthogonal, i.e. perpendicular, their dot product is exactly zero, e.g. $$\mathbf{a}\cdot\mathbf{b}=a_1b_1+\cdots+a_nb_n=0.\tag{1}$$

When I studied functional analysis a long time ago, we used functions as "vectors" by way of an inner product. Consider the two functions $f(x)=\sin x$ and $g(x)=\cos x$. Functions $f$ and $g$ are said to be orthogonal over $[a,b]$ if (and only if?) $$\langle f,g\rangle=\int_a^bf(x)g(x)dx=0,\tag{2}$$ for some interval $[a,b]$. I kind of just accepted this definition, but the other day I was thinking about this...

Now, when it comes to finite-dimensional vectors satisfying equation (1), the geometric interpretation is that the two vectors are perpendicular, e.g. two lines through the origin in $\mathbb{R}^n$ that are normal to each other.

Supposing (2) holds, what does this say about $f$ and $g$, beyond the fact that they are simply called orthogonal? Does (2) imply some (not necessarily geometric) relationship between the functions $f$ and $g$?


BEST ANSWER

Let $V$ be the space of functions spanned by $f$ and $g$; that is, the space of all linear combinations $a f + b g$ (that is, the function $x \mapsto a f(x) + b g(x)$) where $a,b$ are real numbers.

Then, assuming $f$ and $g$ are linearly independent, $V$ is a two-dimensional inner product space, and you really can think of it in the same fashion as any other two-dimensional inner product space. Sure, $V$ is a small slice of the whole space of functions of the appropriate type, but it's the only slice that matters when $f$ and $g$ are the only functions under consideration!

In fact, you can even decompose the original space of functions into the sum of $V$ and its orthogonal complement $V^\bot$, in much the same way you can think of $\mathbb{R}^3$ as the sum of the $xy$-plane and the $z$-axis. (Except that, in the analogy, $V^\bot$ is a complicated space of functions rather than the simple $z$-axis.)

For example, if you take an orthonormal basis $u,v$ for $V$, then every function can be written as an ordered triple $h = (h_1, h_2, h_3)$, where $h_1,h_2 \in \mathbb{R}$ and $h_3 \in V^\bot$, corresponding to the function $h = h_1u + h_2v + h_3$. The inner product then becomes

$$ \langle h, k \rangle = h_1 k_1 + h_2 k_2 + \langle h_3, k_3 \rangle$$
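As a concrete sanity check of this decomposition, here is a sketch that approximates the inner product by a midpoint Riemann sum on $[0,2\pi]$ (the grid size and the sample function $h$ are arbitrary illustrative choices):

```python
import numpy as np

# Midpoint grid on [0, 2*pi]; the sum below approximates the integral.
n = 100_000
dx = 2 * np.pi / n
x = (np.arange(n) + 0.5) * dx

def inner(u, v):
    """Approximate <u, v> = integral of u*v over [0, 2*pi]."""
    return np.sum(u * v) * dx

f, g = np.sin(x), np.cos(x)

# f and g are orthogonal over [0, 2*pi], so normalizing them gives an
# orthonormal basis u, v of V = span{f, g}.
u = f / np.sqrt(inner(f, f))
v = g / np.sqrt(inner(g, g))

# Decompose an arbitrary function h as h1*u + h2*v + h3 with h3 in V-perp.
h = x * (2 * np.pi - x)            # arbitrary example function
h1, h2 = inner(h, u), inner(h, v)
h3 = h - h1 * u - h2 * v

print(inner(f, g))                 # ~0: f and g are orthogonal
print(inner(h3, u), inner(h3, v))  # both ~0: the remainder lies in V-perp
```

The remainder $h_3$ is orthogonal to $u$ and $v$ by construction, which is exactly the statement that $h_1u + h_2v$ is the piece of $h$ living inside $V$.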

ANSWER

Well, it means that the "signed area" of the region between the graph of $T(x) = f(x)g(x)$ and the $x$-axis is $0$ (over the interval $[a,b]$, of course).

This is the only geometric meaning I can think of.
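A quick numerical check of this signed-area reading (a sketch; the intervals are chosen for illustration):

```python
import math

# Midpoint Riemann sum for the signed area between T(x) = f(x)*g(x)
# and the x-axis on [a, b].
def signed_area(f, g, a, b, n=100_000):
    dx = (b - a) / n
    return sum(f(a + (k + 0.5) * dx) * g(a + (k + 0.5) * dx)
               for k in range(n)) * dx

# sin and cos are orthogonal over [0, 2*pi]: the positive and negative
# parts of sin(x)*cos(x) = sin(2x)/2 cancel exactly.
print(signed_area(math.sin, math.cos, 0, 2 * math.pi))   # ~0

# On [0, pi/2] the product is positive throughout, so the same pair is
# not orthogonal over that interval.
print(signed_area(math.sin, math.cos, 0, math.pi / 2))   # ~0.5
```

Note that orthogonality depends on the interval: the same two functions can be orthogonal over one interval and not over another.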

ANSWER

It is very difficult to speak of geometry in function spaces. We are all familiar with the geometry of the 2D and 3D spaces we meet in everyday life, and one might also claim to understand the geometry of higher dimensions such as 4, 5, etc. Understanding the geometry of an infinite-dimensional space, however, is much harder. Try to think of it the other way around. Look for the best approximation of $f$ by functions in the subspace spanned by $g$. You have probably seen that (for $g$ normalized so that $\langle g,g\rangle=1$) the best approximation of $f$ in that subspace is $$\langle f,g\rangle\, g=\left(\int f(x)g(x)\,dx\right) g.$$ Saying that $f$ is orthogonal to $g$ then means that the zero function is the best approximation. The same holds for vectors in $\mathbb{R}^n$. I hope that helps.
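A quick numerical illustration of this best-approximation view (a sketch; the interval $[0,\pi]$ and the sample functions are arbitrary choices), using the general projection coefficient $\langle f,g\rangle/\langle g,g\rangle$, which reduces to $\langle f,g\rangle$ when $g$ has unit norm:

```python
import numpy as np

# Midpoint grid on [0, pi].
n = 100_000
a, b = 0.0, np.pi
dx = (b - a) / n
x = a + (np.arange(n) + 0.5) * dx

def inner(u, v):
    """Approximate <u, v> = integral of u*v over [a, b]."""
    return np.sum(u * v) * dx

f, g = np.sin(x), np.cos(x)   # orthogonal over [0, pi]

# Best multiple of g approximating f (general g, not necessarily unit norm):
lam = inner(f, g) / inner(g, g)
print(lam)                    # ~0: the zero function is the best approximation

# For a function with an actual cos-component, the projection recovers it:
h = np.sin(x) + 0.3 * np.cos(x)
print(inner(h, g) / inner(g, g))   # ~0.3
```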

ANSWER

One view is to take the $L^2$ distance:

$$d(f,g)^2=\frac{1}{b-a}\int_{a}^b |f(x)-g(x)|^2 \,dx$$

If $f$ and $g$ are orthogonal, then for any real $\lambda$:

$$d(f,\lambda g)\geq d(f,0)$$

That is, no multiple of $g$ is "closer" to $f$ than zero is to $f$.

In other words, there is no scalar multiple of $g$ that is a "better" approximation to $f$ than the zero function.
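This is easy to check numerically; a minimal sketch (the grid size and the test pair $\sin$, $\cos$ are arbitrary choices), using the normalized distance defined above:

```python
import numpy as np

n = 100_000
a, b = 0.0, 2 * np.pi
dx = (b - a) / n
x = (np.arange(n) + 0.5) * dx

def dist(u, v):
    """The normalized L2 distance d(u, v) defined above."""
    return np.sqrt(np.sum((u - v) ** 2) * dx / (b - a))

f, g = np.sin(x), np.cos(x)   # orthogonal over [0, 2*pi]
d0 = dist(f, 0 * g)
print(d0)                     # ~1/sqrt(2), the "standard deviation" of sin

# No multiple of g gets closer to f than the zero function does.
lams = np.linspace(-3, 3, 61)
print(all(dist(f, lam * g) >= d0 - 1e-12 for lam in lams))  # True
```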

Of course, this rather raises the question: why the $L^2$ distance?

Statistically, we can think of $d(f,0)$ as the "standard deviation" away from $0$ of the function $f$. So if $f$ is an error, then no adjustment by a multiple of $g$ diminishes that error.

Again, that raises the question: what is up with the squares and square roots in measuring standard deviation?

There is certainly something much deeper going on with squares and square roots which makes it a convenient and useful measure of distance in general.

More importantly, if $f_i$ are orthogonal, it is much easier to minimize:

$$d(g,\sum \lambda_i f_i)$$

Because expanding the square with the inner product (the cross terms between different $f_i$ vanish by orthogonality) gives: $$d\left(g,\sum\lambda_i f_i\right)^2 = \frac{1}{b-a}\left[\langle g,g\rangle +\sum_i \left(\lambda_i^2\langle f_i,f_i\rangle -2\lambda_i\langle g,f_i\rangle\right)\right]$$

Note that we can pick each $\lambda_i$ independently to minimize its term, since there is no interaction between them, and a simple bit of calculus gives us: $$\lambda_i=\frac{\langle g,f_i\rangle}{\langle f_i,f_i\rangle}$$ which is why we prefer orthogonal functions with $\langle f_i,f_i\rangle =1$.
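A small numerical sketch of this decoupling (the target $g$ and the family $\sin(kx)$ on $[0,2\pi]$ are arbitrary illustrative choices):

```python
import numpy as np

# Midpoint grid on [0, 2*pi].
n = 200_000
dx = 2 * np.pi / n
x = (np.arange(n) + 0.5) * dx

def inner(u, v):
    return np.sum(u * v) * dx

# Target g and a mutually orthogonal family f_1, ..., f_5 over [0, 2*pi].
g = np.sin(x) + 0.5 * np.sin(3 * x) + np.cos(x)
fs = [np.sin(k * x) for k in range(1, 6)]

# Because the f_i are orthogonal, each lambda_i is found independently:
lams = [inner(g, f) / inner(f, f) for f in fs]
print(np.round(lams, 6))   # ~[1, 0, 0.5, 0, 0]

# The residual (here the cos-component of g) is orthogonal to every f_i.
approx = sum(l * f for l, f in zip(lams, fs))
print(max(abs(inner(g - approx, f)) for f in fs))  # ~0
```

Each coefficient is computed without reference to the others, so adding more basis functions later never disturbs the coefficients already found.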

It should be noted that Fourier series exhibit a relationship between $L^2$ and a kind of infinite-dimensional version of Euclidean space, $\ell^2$, whose vectors have infinitely many components, $\mathbf a=(a_i)_{i=0}^\infty$, with norm $\|\mathbf a\|=\sqrt{\sum a_i^2}$. To make this work, we include in $\ell^2$ only those "vectors" for which this sum converges. The vector space $\ell^2$ carries a natural dot product "inherited" from the finite case:

$$\langle \mathbf a,\mathbf b\rangle = \sum_{i=0}^\infty a_ib_i$$

ANSWER

Take a partition of $[a,b]$ into $n$ equally-spaced subintervals. Approximate a function by assigning to each subinterval its value at the middle of that subinterval, resulting in a vector in $\mathbb{R}^n$.

To check whether two functions are (approximately) orthogonal, you take the inner product of these vectors in $\mathbb{R}^n$: multiply the functions' values on each subinterval, sum the products, and scale by the subinterval width $(b-a)/n$. This is exactly a Riemann sum for $\int_a^b f(x)g(x)\,dx$.

When you let $n\to \infty$, this scaled inner product converges to the standard $L^2$ inner product. So I think of orthogonality of functions in terms of the algebraic inner product on $\mathbb{R}^n$: if $f$ and $g$ are orthogonal, then for a fine enough partition the corresponding $\mathbb{R}^n$ inner product is as close to zero as you like.
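A sketch of this discretization (midpoint sampling on $[0,2\pi]$ with $\sin$ and $\cos$ as the test pair; all choices here are illustrative):

```python
import numpy as np

def sampled(func, a, b, n):
    """Vector in R^n: the function's values at the midpoints of n equal subintervals."""
    x = a + (np.arange(n) + 0.5) * (b - a) / n
    return func(x)

a, b = 0.0, 2 * np.pi
for n in (10, 100, 1000):
    u = sampled(np.sin, a, b, n)
    v = sampled(np.cos, a, b, n)
    # The plain R^n dot product, scaled by the subinterval width (b - a) / n,
    # is a Riemann sum for the L2 inner product of sin and cos.
    print(n, u @ v * (b - a) / n)   # ~0 for each n
```

Even with a coarse partition the scaled dot product is already essentially zero here, reflecting the orthogonality of $\sin$ and $\cos$ over a full period.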