Definition of convolution?


Why do we use $x - y$ rather than $x + y$ in the definition of the convolution? Is it just convention? (If we are thinking of convolutions as weighted averages, for instance against "good kernels," it should make no difference.)

Why $(f * g) (x) = \int f(y) g(x - y) dy$ rather than $(f * g) (x) = \int f(y) g(x + y) dy$?

Edit: I'm finding it really hard to choose a best answer. There are at least three very good ones here.


There are 5 best solutions below

0
On

Intuitively, and abusing the notation a bit, you can consider the convolution as

$$ (f*g)(x) = \int_{p+q=x} f(p)g(q) $$

This makes it clear that $f*g = g*f$. On the other hand with your alternative definition we would get $$ (f*'g)(x) = \int_{q-p=x} f(p)g(q) $$ and therefore $(f*'g)(x) = (g*'f)(-x)$, which is untidy for no good reason.
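The two index sets can be compared on a small discrete example. This is only a sketch: the names `conv` and `alt` and the sample sequences are made up for illustration, and sequences are stored as dictionaries mapping an integer index to a value.

```python
from collections import defaultdict

def conv(f, g):
    """Usual convolution: collect f(p) * g(q) over all pairs with p + q = x."""
    out = defaultdict(int)
    for p, fp in f.items():
        for q, gq in g.items():
            out[p + q] += fp * gq
    return dict(out)

def alt(f, g):
    """Alternative definition: collect f(p) * g(q) over pairs with q - p = x."""
    out = defaultdict(int)
    for p, fp in f.items():
        for q, gq in g.items():
            out[q - p] += fp * gq
    return dict(out)

# Hypothetical sample sequences, zero outside the listed indices.
f = {0: 1, 1: 2, 2: 3}
g = {0: 4, 1: 5}

# The usual convolution is symmetric in f and g.
symmetric = conv(f, g) == conv(g, f)

# The alternative is symmetric only up to the reflection x -> -x.
reflected = alt(f, g) == {-x: val for x, val in alt(g, f).items()}

print(symmetric, reflected, alt(f, g) == alt(g, f))
```

On this example the first two checks succeed and the last one fails, which matches the identity $(f*'g)(x) = (g*'f)(-x)$.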

0
On

For one thing, we want convolution (a linear operator) to be commutative. That holds for the traditional definition

$$(f * g) (x) = \int f(y)\, g(x - y)\, dy \overset{z = x - y}{=} \int g(z)\, f(x - z)\, dz = (g * f) (x)$$

It doesn't hold with the other definition, where the substitution $z = x + y$ gives $(g * f)(-x)$ instead of $(g * f)(x)$:

$$(f * g) (x) = \int f(y) g(x + y) dy = \int g(z) f(z-x)dz$$

Another nice property we would lose is the convolution theorem.
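To make this concrete, here is a short computation, assuming the Fourier transform convention $\hat f(\xi) = \int f(x)\, e^{-i x \xi}\, dx$ and enough integrability to swap the integrals. The symbol $*'$ for the $x + y$ version is notation introduced here, not part of the question. With the usual definition, substituting $u = x - y$ gives

$$ \widehat{f * g}(\xi) = \int\!\!\int f(y)\, g(x - y)\, e^{-i x \xi}\, dy\, dx = \int f(y)\, e^{-i y \xi}\, dy \int g(u)\, e^{-i u \xi}\, du = \hat f(\xi)\, \hat g(\xi), $$

whereas with the $x + y$ version, substituting $u = x + y$ gives

$$ \widehat{f *' g}(\xi) = \int\!\!\int f(y)\, g(x + y)\, e^{-i x \xi}\, dy\, dx = \int f(y)\, e^{+i y \xi}\, dy \int g(u)\, e^{-i u \xi}\, du = \hat f(-\xi)\, \hat g(\xi). $$

So the transform of the alternative product is no longer the plain product of the transforms: one factor is evaluated at $-\xi$.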

2
On

Consider the discrete analogue: Given two functions $a:\>k\mapsto a(k)$ and $b:\>l\mapsto b(l)$ we are collecting (i.e., summing up) for given $r$ all products $a(k)\,b(l)$ where $k+l=r$. This is the right thing to do, e.g., when multiplying two power series $$a(z):=\sum_{k=0}^\infty a_k z^k, \quad b(z):=\sum_{l=0}^\infty b_lz^l\ .$$ Then $c(z):=a(z)b(z)$ can be written as $c(z)=\sum_{r=0}^\infty c_r z^r$ with $$c_r:=\sum\nolimits_{k+l=r} a_k b_l=\sum_{l=0}^r a_{r-l}\, b_l\qquad(r\geq0)\ .$$ This is expressed by saying that the sequence $c:=(c_r)_{r\geq0}$ is the convolution of the two sequences $a:=(a_k)_{k\geq0}$ and $b:=(b_l)_{l\geq0}$, in short: $c=a*b$.
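The formula for $c_r$ can be tried out on truncated series, i.e. polynomials. The following is a minimal sketch; the function names and the sample coefficients are chosen only for illustration.

```python
def cauchy_product(a, b):
    """Coefficients c_r = sum_l a[r - l] * b[l] of the product a(z) * b(z)."""
    n = len(a) + len(b) - 1
    c = []
    for r in range(n):
        # Keep only the terms where both indices are inside the lists.
        c.append(sum(a[r - l] * b[l]
                     for l in range(len(b))
                     if 0 <= r - l < len(a)))
    return c

def evaluate(coeffs, z):
    """Evaluate sum_k coeffs[k] * z**k."""
    return sum(ck * z**k for k, ck in enumerate(coeffs))

a = [1, 2, 3]   # a(z) = 1 + 2z + 3z^2
b = [4, 5]      # b(z) = 4 + 5z
c = cauchy_product(a, b)

print(c)                                              # [4, 13, 22, 15]
print(evaluate(c, 2) == evaluate(a, 2) * evaluate(b, 2))  # True
```

The second line of output checks at the sample point $z = 2$ that the convolved coefficients really are those of the product $a(z)\,b(z)$.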

A similar argument can be put forward when dealing with the sum of two independent random variables $X$ and $Y$ having probabilities $p_k$ and $q_l$ of assuming the values $k$ and $l$, respectively.
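As an illustration of the probabilistic version, take two fair dice (an example chosen here, not one from the answer). The distribution of the sum is obtained by collecting the products $p_k q_l$ over all pairs with $k + l$ fixed:

```python
from fractions import Fraction

def pmf_of_sum(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q.

    P(X + Y = r) is the sum of p[k] * q[l] over all pairs with k + l = r.
    """
    out = {}
    for k, pk in p.items():
        for l, ql in q.items():
            out[k + l] = out.get(k + l, Fraction(0)) + pk * ql
    return out

# A fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
total = pmf_of_sum(die, die)

print(total[2], total[7], sum(total.values()))  # 1/36 1/6 1
```

Exact fractions are used so that the probabilities can be compared without rounding error.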

Translating this into a continuous setting we have $$(f*g)(x)=\int_{-\infty}^\infty f(x-t)\,g(t)\ dt\ ,$$ assuming that the integral on the right-hand side makes sense.

1
On

You can think of a simple example like this:

The impulse response $g(x)$ is zero except at $x=10$, where $g(10) = 1$. This could mean "the dog barks 10 seconds after it has seen a cat".

Then the convolution can be read as follows: the volume at which the dog barks at time $t$ is the number of cats it saw 10 seconds earlier, that is, at time $t - 10$. This is where the $x - y$ in the definition comes from.
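The dog example can be simulated in discrete time. This is a sketch with one-second time steps and a made-up list of cat sightings; the names `convolve`, `cats`, `response` and `bark` are illustrative only.

```python
def convolve(signal, response):
    """out[t] = sum_s signal[s] * response[t - s], for finite lists."""
    out = [0] * (len(signal) + len(response) - 1)
    for s, x in enumerate(signal):
        for d, r in enumerate(response):
            # A sighting at time s contributes to the output at time s + d.
            out[s + d] += x * r
    return out

delay = 10
response = [0] * delay + [1]   # g is zero except g(10) = 1

cats = [0, 2, 0, 0, 1]         # two cats at t = 1, one cat at t = 4
bark = convolve(cats, response)

print(bark[11], bark[14])      # 2 1
```

The barking at $t = 11$ and $t = 14$ is the cat count from 10 seconds earlier, so the output is the input delayed by 10 steps.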

0
On

In addition to other useful remarks, it might be worthwhile to note that thinking in terms of representations of a topological group $G$ (on topological vector spaces $V$) shows what "convolution" must be, in the following way. For simplicity, suppose $G$ is unimodular, in the sense that left and right Haar measures are the same.

Let $G\times V\to V$ be a continuous group representation, so including associativity $g(hv)=(gh)v$ and that the identity of $G$ acts trivially. For a broad class of topological vector spaces $V$ (quasi-complete locally convex, including Hilbert, Banach, Fréchet, LF, their weak duals...) compactly-supported continuous functions $f$ on $G$ act by integrating (e.g., Gelfand-Pettis "weak" integrals suffice) $$ f\cdot v \;=\; \int_G f(g)\;gv\;dg $$

If we characterize an operation $f*F$ by requiring $(f*F)v=f(Fv)$, we are led to an expression (or two) for that convolution: first, $$ f(Fv) \;=\; \int_G f(g)\;g (Fv)\;dg \;=\; \int_G f(g)\,g\Big(\int_G F(h)\,hv\;dh\Big)dg \;=\; \int\int f(g)\,F(h)\,ghv\;dh\,dg $$ There are at least two reasonable choices now: replace $g$ by $gh^{-1}$, or replace $h$ by $g^{-1}h$ (both are allowed by invariance of the Haar measure). In the former, we have $$ f(Fv) \;=\; \int\int f(gh^{-1})\,F(h)\;gv\;dh\,dg \;=\; \int\Big(\int f(gh^{-1})\,F(h)\,dh\Big)\,gv\;dg \;=\; \Big(g\to \int f(gh^{-1})\,F(h)\,dh\Big)\cdot v $$ which shows that $$ (f*F)(g)\;=\; \int_G f(gh^{-1})\,F(h)\;dh $$

For the real numbers, this gives the $x-y$ rather than $x+y$. But, to my mind, a larger point is that we can deduce what convolution is, rather than "guessing" a "definition" and "checking" whether or not it works as we hope.
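The defining property $(f*F)\cdot v = f\cdot(F\cdot v)$ can be checked by hand on a finite group, where the Haar integral becomes a plain sum. The sketch below assumes $G = S_3$ acting on $\mathbb{R}^3$ by permuting coordinates; that choice, the function names, and the sample values of $f$, $F$ and $v$ are all made up for illustration.

```python
from itertools import permutations

# The six elements of S3, each stored as a tuple g with g[i] = image of i.
G = list(permutations(range(3)))

def compose(g, h):
    """Group product gh: apply h first, then g."""
    return tuple(g[h[i]] for i in range(3))

def inverse(g):
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def act_group(g, v):
    """Permutation representation: g sends basis vector e_i to e_{g(i)}."""
    out = [0] * 3
    for i in range(3):
        out[g[i]] = v[i]
    return out

def act_function(f, v):
    """f . v = sum over g of f(g) * (g v), the finite analogue of the integral."""
    out = [0] * 3
    for g in G:
        gv = act_group(g, v)
        for i in range(3):
            out[i] += f[g] * gv[i]
    return out

def convolve(f, F):
    """(f * F)(g) = sum over h of f(g h^{-1}) * F(h)."""
    return {g: sum(f[compose(g, inverse(h))] * F[h] for h in G) for g in G}

# Arbitrary integer-valued sample functions on G and a sample vector.
f = {g: k + 1 for k, g in enumerate(G)}
F = {g: (k * k) % 5 for k, g in enumerate(G)}
v = [1, 10, 100]

lhs = act_function(convolve(f, F), v)
rhs = act_function(f, act_function(F, v))
print(lhs == rhs)  # True
```

Integer values keep the comparison exact. The equality holds for any choice of $f$, $F$ and $v$, by the same change of variables as in the derivation.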