I am trying to understand a step taken in the derivation of convolution as a linear system applied to a signal.
The derivation goes as follows:
- A signal $f$ can be decomposed into a sum of impulse functions (aka the sifting property): $f(t) = \int f(\tau)\delta(t - \tau)d\tau$
- By linearity, applying the linear operator $L$ to the signal $f$ can be rewritten as $L\{f(t)\} = L\{\int f(\tau)\delta(t - \tau)d\tau\} = \int f(\tau)L\{\delta(t - \tau)\}d\tau$
- Define the impulse response $h$ as $h(t) = L\{\delta(t)\}$
- Assuming $L$ is also time-invariant, so that $L\{\delta(t - \tau)\} = h(t - \tau)$, substitute $h$ into step 2 and out comes the definition of convolution: $L\{f(t)\} = \int f(\tau)h(t - \tau)d\tau = (f \circledast h)(t)$
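The derivation above can be checked numerically in the discrete, finite-length setting. This is a minimal sketch, assuming a hypothetical LTI operator `L` (a circular 3-point moving average, chosen only for illustration); it confirms that convolving $f$ with $h = L\{\delta\}$ reproduces $L\{f\}$:

```python
import numpy as np

# A hypothetical LTI operator L: a circular 3-point moving average.
# (Circular so that shifts behave cleanly on a finite grid.)
def L(x):
    return (np.roll(x, -1) + x + np.roll(x, 1)) / 3.0

N = 16
delta = np.zeros(N)
delta[0] = 1.0

h = L(delta)                   # impulse response: h = L{delta}

rng = np.random.default_rng(0)
f = rng.standard_normal(N)     # an arbitrary input signal

# Discrete (circular) convolution: (f * h)(t) = sum_k f(k) h(t - k)
conv = np.array([sum(f[k] * h[(t - k) % N] for k in range(N))
                 for t in range(N)])

print(np.allclose(conv, L(f)))  # True: L{f} = f * h
```

The same check fails for a linear but time-varying operator, which is why the time-invariance assumption in the last step matters.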
From past experience, I know there exist impulse responses (aka kernels) whose range contains arbitrarily many distinct values (e.g. a Gaussian).
I can't reconcile this fact with the definition of $h$ via the Dirac delta. How is it possible that $h(n) \neq h(m)$ for some $m \neq 0$, $n \neq 0$, $m \neq n$? Since $h(t) = L\{\delta(t)\}$, wouldn't $h(m) = L\{\delta(m)\} = L\{0\}$ and $h(n) = L\{\delta(n)\} = L\{0\}$, so that both $h(m)$ and $h(n)$ equal $L\{0\}$? Based on how impulse functions have been defined, it seems to me that an impulse response should only ever be able to take at most two distinct values: $h(t=0)$ and $h(t\neq0)$.
Summarizing the answer from the comments:
My mistake was thinking that $L\{\delta(t)\}$ meant: first evaluate the Dirac delta at $t$ to obtain a scalar, and then apply the linear system to that scalar.
In reality, $L$ is a linear map from one infinite-dimensional vector (the entire function $\delta$) to another infinite-dimensional vector (the entire function $h$), and $t$ merely picks out a coordinate of the output vector. So $h(m)$ and $h(n)$ are different coordinates of the single object $L\{\delta\}$, not separate applications of $L$ to the scalars $\delta(m)$ and $\delta(n)$.
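The finite-dimensional analogue makes this concrete. Below is a sketch, assuming a hypothetical shift-invariant $L$ represented as a circulant matrix built from an arbitrary kernel: $L$ maps the whole delta vector to the whole impulse-response vector, and indexing with $t$ just reads off one coordinate of that output:

```python
import numpy as np

N = 8
rng = np.random.default_rng(1)

# A hypothetical shift-invariant L as a circulant matrix:
# entry (i, j) depends only on the shift (i - j) mod N.
kernel = rng.standard_normal(N)
L = np.array([[kernel[(i - j) % N] for j in range(N)] for i in range(N)])

delta = np.zeros(N)
delta[0] = 1.0

h = L @ delta        # one application of L yields the ENTIRE vector h

# h[t] is the t-th coordinate of L{delta}, not L applied to the scalar delta(t).
# Here that coordinate is L[t, 0], since delta is the 0-th basis vector:
print(h[3], L[3, 0])  # equal
```

Because $\delta$ is the 0-th standard basis vector, $h$ is simply the first column of the matrix, which is why $h$ can take arbitrarily many distinct values even though $\delta$ itself takes only two.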