How do basis functions work?


Hopefully this isn't too broad of a question.

I recently had it explained to me that the discrete Fourier transform was really a change in basis (thus the "dot product" like look of the sigma with the multiply in it), using the sine and cosine functions as the basis functions.

I understand using dot product against new basis vectors to change points from one space to another, but I have no idea how this would work for a basis which was a function instead of a constant.

Can anyone explain the intuition for how that works? I also have been unable to find the correct terms to search for online so am a bit lost there.

Thanks!

There are 4 answers below.

BEST ANSWER

First we need to be precise about what space of functions we're talking about. A choice where your question makes sense is $L^2([-\pi,\pi])$, which means the functions $f:[-\pi,\pi]\to\mathbb C$ satisfying $$\int_{-\pi}^\pi|f(x)|^2dx<\infty.$$ (There are also issues of measurability in specifying these functions but let's ignore that complication; see Stein's Real Analysis.) Now we can address your question about how to take the dot product of two functions. In this context we call it an inner product and it's written $$(f,g)_{L^2}=\int_{-\pi}^\pi f(x)\overline{g(x)}dx.$$ (Notice the parallel between this and the finite-dimensional Euclidean dot product.)

From here you can prove that the functions $\{e^{inx}\}_{n\in\mathbb Z}$ (these encode the sines and cosines you asked about, via Euler's formula) form an orthogonal basis; dividing each by its norm $\sqrt{2\pi}$ makes it orthonormal (see pp. 200-201 in Stein). Note that you need a topology to talk about a basis in this context, and it is determined by the metric coming from the above inner product.

Now, as you know from the finite-dimensional case, you can write a function in a basis by taking the dot product: for $n\in\mathbb Z$, $$\hat f(n)=(f,e^{inx})_{L^2}=\int_{-\pi}^\pi e^{-inx}f(x)dx.$$ So $\hat f(n)$ is just the $n$th Fourier coefficient, or the Fourier transform evaluated at $n$! This hints that the Fourier transform is an isomorphism between $L^2([-\pi,\pi])$ and $\ell^2(\mathbb Z)$, where the latter is defined as the space of square-summable sequences (you can probably guess how its inner product is defined).
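This can be checked numerically. Here is a small Python sketch (my own illustration, not part of the answer) that approximates the inner product above with a midpoint Riemann sum and recovers a Fourier coefficient:

```python
import cmath
import math

def inner_product(f, g, a=-math.pi, b=math.pi, steps=20000):
    """Approximate (f, g)_{L^2} = integral of f(x) * conj(g(x)) dx
    with a midpoint Riemann sum."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h).conjugate()
               for k in range(steps)) * h

def fourier_coefficient(f, n):
    """hat f(n) = (f, e^{inx})_{L^2}, i.e. the dot product against
    the n-th basis function."""
    return inner_product(f, lambda x, n=n: cmath.exp(1j * n * x))

# Example: f(x) = e^{2ix} is itself a basis element, so its dot product
# with e^{2ix} gives the squared norm 2*pi, and every other coefficient
# vanishes -- exactly like reading off coordinates in R^n.
f = lambda x: cmath.exp(2j * x)
print(abs(fourier_coefficient(f, 2) - 2 * math.pi) < 1e-6)   # True
print(abs(fourier_coefficient(f, 3)) < 1e-6)                 # True
```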

P.S. If you're looking for search terms, "Hilbert space" refers to vector spaces (often of functions) which have a "dot product" behaving enough like the Euclidean one that you can use it to do calculus.

P.P.S. I realized too late that you were asking about the discrete Fourier transform, but I hope this is useful nonetheless.


So there's something called the Schwartz space. This is a vector space upon which the Fourier transform is a naturally occurring isomorphism (automorphism, to be more precise). This space is ubiquitous in the study of partial differential equations for precisely this reason.

Now, I personally don't look at the Fourier transform, as it acts on this space, as a "change of basis," and my reasons for this are somewhat philosophical. I personally wouldn't look at any linear transformation on an infinite dimensional vector space as a change of basis, except in the rare instance where such a view is beneficial. In the context of partial differential equations, I don't see how it is a useful viewpoint. Further, usually one doesn't write down (typically cannot write down) a basis of such a space, so why use this description?

Now the discrete Fourier transform is a little different. The discrete Fourier transform is a linear transformation $F: \mathbb{C}^n \rightarrow \mathbb{C}^n$, and if you really wanted to, you could choose a basis of $\mathbb{C}^n$ and find a matrix representation of it. You will encounter some $e^{-2 \pi i k \frac{m}{n}}$-like entries in this matrix, and thus, expanding it out using Euler's formula, you'll get some sines and cosines. But this really isn't anything sophisticated; you won't get straight-up functions as basis vectors (as I feel you may be thinking), since $\mathbb{C}^n$ isn't even a function space. This isn't anything different from standard linear algebra, really.
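To make this concrete, here is a small Python sketch (my illustration, not the answerer's) that builds that matrix with entries $e^{-2\pi i k m/n}$ and applies it to an ordinary vector in $\mathbb{C}^4$:

```python
import cmath

def dft_matrix(n):
    """The n x n DFT matrix F with entries F[k][m] = e^{-2*pi*i*k*m/n}."""
    return [[cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)]
            for k in range(n)]

def apply(matrix, vec):
    """Ordinary matrix-vector product over C^n."""
    return [sum(row[m] * vec[m] for m in range(len(vec))) for row in matrix]

# Transforming the standard basis vector e_1 = (0, 1, 0, 0) just picks
# out column 1 of the matrix, whose entries e^{-2*pi*i*k/4} are 4th
# roots of unity -- plain linear algebra, no function space in sight.
F = dft_matrix(4)
e1 = [0, 1, 0, 0]
print([complex(round(z.real), round(z.imag)) for z in apply(F, e1)])
# -> [(1+0j), -1j, (-1+0j), 1j]
```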


Let us consider the DST (Discrete Sine Transform) which is very close to the DFT (Discrete Fourier Transform) and slightly simpler to understand:

The DST of order $N$ is defined by a matrix acting on vectors that represent discretized functions in this way:

$$\begin{pmatrix}\sin(a)&\sin(2a)&\cdots&\sin(qa)&\cdots&\sin(na)\\ \sin(2a)&\sin(4a)&\cdots&\sin(2qa)&\cdots&\sin(2na)\\ \vdots&\vdots&&\vdots&&\vdots\\ \sin(pa)&\sin(2pa)&\cdots&\sin(pqa)&\cdots&\sin(pna)\\ \vdots&\vdots&&\vdots&&\vdots\\ \sin(na)&\sin(2na)&\cdots&\sin(nqa)&\cdots&\sin(n^2a) \end{pmatrix} \begin{pmatrix}f(h)\\f(2h)\\\vdots\\f(ph)\\\vdots\\f(nh) \end{pmatrix}=\begin{pmatrix}g(h)\\g(2h)\\\vdots\\g(ph)\\\vdots\\g(nh) \end{pmatrix}$$

with $a=\dfrac{\pi}{n+1}$ (the choice for which the columns of the matrix are orthogonal; see Remark 1 below), and $h$ the discretization step.

This matrix transforms the discretized function $f$ into the discretized function $g$.

The generic row-by-column product (the row numbered $p$ times the column vector of $f$) is:

$$\sin(pa)f(h)+\sin(2pa)f(2h)+\cdots+\sin(pqa)f(qh)+\cdots+\sin(pna)f(nh)=g(ph)$$

which can be put into the form:

$$g(ph)=\sum_{q=1}^n \sin(pqa)f(qh)$$
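This sum can be computed directly. Here is a small Python sketch (my own, assuming the choice $a=\pi/(n+1)$, under which the matrix squares to $\frac{n+1}{2}I$):

```python
import math

def dst(f_samples):
    """Discrete sine transform via the row-by-column sums above,
    with a = pi/(n+1) (the DST-I convention, for which the matrix
    columns are orthogonal)."""
    n = len(f_samples)
    a = math.pi / (n + 1)
    return [sum(math.sin(p * q * a) * f_samples[q - 1]
                for q in range(1, n + 1))
            for p in range(1, n + 1)]

# The DST-I matrix is symmetric with orthogonal columns of common
# squared norm (n+1)/2, so transforming twice recovers the input
# up to that factor.
f = [1.0, 2.0, 3.0, 4.0]
g = dst(dst(f))
print([round(x / ((len(f) + 1) / 2), 6) for x in g])   # -> [1.0, 2.0, 3.0, 4.0]
```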

It now suffices, "by analogy", to replace the $\sum$ signs by $\int$ signs and the discrete values by continuous ones:

$$g(s)=\int_{t=0}^1 \sin(2\pi st)f(t)dt$$

Remark: the "details" concerning the replacement of a finite range of values by a continuous one may look rather arbitrary. In fact, other choices "by analogy" could be taken, in particular for dealing with infinite ranges of continuous variables. This would be necessary for the recovery of the so-called classical "continuous" sine transform. But our purpose here is only pedagogical.

These transformation methods, illustrated here by building a continuous transform out of a finite discrete one, constitute a very powerful heuristic tool: a method that helps you discover (and understand) properties, relationships, etc.

But, let us repeat it, we work by analogy and no more than that.

Remark 1: It can be proved that the columns of the matrix of the discrete sine transform are orthogonal and have a common norm. Dividing by this norm, one has a privileged orthonormal basis, a point that is important in the "analog" switch to the Hilbert space $L^2$ (mentioned by @Stan Palasek), where so many things depend upon decompositions onto an orthogonal (or Hilbert-space) basis.
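This remark can be checked numerically. A small Python sketch (my own, assuming $a=\pi/(n+1)$ as above):

```python
import math

n = 6
a = math.pi / (n + 1)   # the DST-I choice a = pi/(n+1)
# S[p][q] = sin((p+1)(q+1)a) for p, q = 0..n-1
S = [[math.sin(p * q * a) for q in range(1, n + 1)] for p in range(1, n + 1)]

def col_dot(q1, q2):
    """Euclidean dot product of columns q1 and q2 of S."""
    return sum(S[p][q1] * S[p][q2] for p in range(n))

# Distinct columns are orthogonal, and every column has squared norm
# (n+1)/2; dividing by sqrt((n+1)/2) yields an orthonormal basis of R^n.
print(abs(col_dot(0, 1)) < 1e-12)                  # True: orthogonal
print(abs(col_dot(2, 2) - (n + 1) / 2) < 1e-12)    # True: common norm
```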

Remark 2: I could have taken a different path and considered extending our matrix to infinity ($p,q=1,\cdots,\infty$) with a convenient mathematical apparatus. This would have given a different "world", but it is better not to mix up things that are not of the same nature.

Remark 3: You may have a look at a question I asked some time ago about the relationship between the discrete and continuous "worlds" and the bridges that can be put between them: (Looking for examples of Discrete / Continuous complementary approaches).


for a basis which was a function instead of a constant

See, there is not really a dichotomy between functions and constants. A function is simply a constant of a different type than a number or a tuple. What's not constant are function values, i.e. the results of evaluating a function for some argument. Unfortunately, the notation many people use completely obscures this distinction. To make it clear: let $f : \mathbb{R} \to \mathbb{R}$ a function. Then

  • $f(x)$ is not a function. It is just an expression that depends on the free variable $x$. Thus it is not constant, and indeed does not make sense to consider as an element of a vector space.
    Lots of people will tell you that $f(x)$ is a function. They're wrong.
  • The function is just $f$, when not applied to any argument. If you have a proper definition like $$ \forall x .\ f(x) = \frac1{1+x^2} $$ (note that the universal quantifier $\forall$ – which is usually not written out but assumed implicitly – makes $x$ a bound variable) then $f$ as a whole is a perfectly good function-constant, just like $5$ or $\sqrt{2}$ or $b$ are perfectly good number-constants (if you previously defined it as $b = 17$).

As a matter of fact, you might consider vectors $v\in\mathbb{R}^3$ also as functions: $$ v : \{0,1,2\} \to \mathbb{R} $$ or, less precisely, $$ v : \mathbb{N} \to \mathbb{R}. $$ This means, $v(i)$ (or, as it's more commonly written, $v_i$) is a number, whereas $v$ as a whole is a vector.

Now, with such finite-dimensional vectors a scalar product will generally be a simple sum. $$ \langle v,w\rangle_{_{\mathbb{R}^3}} = \sum_{i=0}^2 v_i\cdot w_i $$ Such a definition is equally possible if the vectors aren't functions of a natural number but of a real one; for instance, you might define $$ \langle f,g\rangle_{_\text{Bogus}} = \sum_{x\in \{1,3,\pi^2\}} f(x) \cdot g(x) $$ Only, that's not a scalar product: it's not positive definite. For instance, if $f(x)$ has nonzero values only for negative $x$, then $\langle f,f\rangle_{_\text{Bogus}} = 0$ even though $f\neq 0$.
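A quick Python sketch (my own illustration) makes the failure of positive definiteness concrete, using the sample points from the "Bogus" definition above:

```python
import math

# A "scalar product" that only samples three points: it is symmetric
# and bilinear, but not positive definite, so not a genuine inner product.
SAMPLE_POINTS = [1, 3, math.pi ** 2]

def bogus(f, g):
    return sum(f(x) * g(x) for x in SAMPLE_POINTS)

# A nonzero function supported only on the negative reals is
# invisible to it:
f = lambda x: 1.0 if x < 0 else 0.0
print(bogus(f, f))   # -> 0.0, even though f is not the zero function
```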

Actually useful scalar products avoid this by evaluating the functions everywhere on their domain. This is not really feasible with sums, but you can do it with integrals, like in the classical $L^2$ Hilbert space: $$ \langle f,g\rangle_{_{L^2(\mathbb{R})}} = \int_{\mathbb{R}}\!\mathrm{d}x \ f(x) \cdot g(x). $$
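A numerical sketch of this $L^2$ scalar product (my own; a midpoint-rule approximation on a truncated domain, adequate for rapidly decaying functions):

```python
import math

def l2_inner(f, g, limit=10.0, steps=100000):
    """Approximate <f, g>_{L^2(R)} by a midpoint sum on [-limit, limit].
    For functions that decay fast, the truncation error is negligible."""
    h = 2 * limit / steps
    return sum(f(-limit + (k + 0.5) * h) * g(-limit + (k + 0.5) * h)
               for k in range(steps)) * h

gauss = lambda x: math.exp(-x * x)
odd = lambda x: x * math.exp(-x * x)

# An odd and an even function are orthogonal in L^2(R):
print(abs(l2_inner(gauss, odd)) < 1e-7)                             # True
# <gauss, gauss> = integral of e^{-2x^2} over R = sqrt(pi/2):
print(abs(l2_inner(gauss, gauss) - math.sqrt(math.pi / 2)) < 1e-7)  # True
```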


You may object that this is still not positive definite: if you plug in a function that's zero almost everywhere but has e.g. $f(0)=1$, then the $L^2$ norm yields $0$ although $f\neq0$. The elements of the $L^2$ Hilbert space are in fact not general square-integrable functions, but equivalence classes of such functions modulo differences on null sets; hence we can say $f=0$ if it's zero almost everywhere. (Alternatively, you can just say: the $L^2$ norm is a genuine norm on the continuous functions.)