Integral values before/after interpolation


TL;DR: is the integral of an oversampled/interpolated signal close to the integral before oversampling, or can they be arbitrarily far from each other?


Let's say we have a signal $f(x)$ sampled at $x_k = 0, 1, 2, 3, ..., N$.

Given this sampling, we have an approximation of the integral $I := \int_0^N f(x) dx$ given by:

$$I_1 = \sum_{k=0}^{N-1} f(x_k)$$ or $$J_1 = \frac12 f(x_0) + \sum_{k=1}^{N-1} f(x_k) + \frac12 f(x_N)$$.

Now let's say we do a 10× oversampling / interpolation (linear, spline, or other), obtaining $g(y_k)$ on the points $y_k = 0, 0.1, 0.2, 0.3, ..., N$ (so $M = 10N$).

Are there results that ensure that $I_1$ (respectively $J_1$) is close to

$$I_2 = \sum_{k=0}^{M-1} g(y_k)$$ or $$J_2 = \frac12 g(y_0) + \sum_{k=1}^{M-1} g(y_k) + \frac12 g(y_M)$$.

?


Fast answer: No.

  1. You forgot to multiply by the subinterval width $\Delta x = x_{k+1}-x_{k}$ in the sums over $g$.
  2. The corrected (see below) results $I_2$ and $J_2$ are better than $I_1$ and $J_1$.

Detailed answer:

Let $f(x)$ be a function defined on $\left[a, \ b\right]$ and $I$ its integral:

$$I := \int_{a}^{b} f(x) \ dx\tag{1}\label{1}$$

This integral can be computed by splitting it into many smaller ones using $(n+1)$ points $x_{k}$:

$$I = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}}f(x) \ dx\tag{2}\label{2}$$

with $x_0 = a$ and $x_{n}=b$.

There are many methods to integrate over each subinterval. Most of them involve interpolating the function $f(x)$ by another function $h_{k}(x)$ on the interval $\left[x_{k}, \ x_{k+1}\right]$.

Constant interpolation:

Left point: You can say that $h_k(x) = f(x_{k})$:

$$I = \sum_{k=0}^{n-1}\int_{x_{k}}^{x_{k+1}}f(x) \ dx\approx \sum_{k=0}^{n-1}\int_{x_{k}}^{x_{k+1}}h_{k}(x) \ dx = \sum_{k=0}^{n-1} (x_{k+1}-x_{k}) f(x_{k})$$

For equally spaced points: $$I \approx \dfrac{b-a}{n}\sum_{k=0}^{n-1}f(x_{k})$$

Right point: You say $h_k(x)= f(x_{k+1})$:

$$I = \sum_{k=0}^{n-1}\int_{x_{k}}^{x_{k+1}}f(x) \ dx\approx \sum_{k=0}^{n-1}\int_{x_{k}}^{x_{k+1}}h_{k}(x) \ dx = \sum_{k=0}^{n-1} (x_{k+1}-x_{k}) f(x_{k+1})$$

For equally spaced points: $$I \approx \dfrac{b-a}{n}\sum_{k=0}^{n-1}f(x_{k+1})$$

Middle point: You say $h_k(x)= f\left(\dfrac{x_k+x_{k+1}}{2}\right)$:

$$I = \sum_{k=0}^{n-1}\int_{x_{k}}^{x_{k+1}}f(x) \ dx\approx \sum_{k=0}^{n-1}\int_{x_{k}}^{x_{k+1}}h_{k}(x) \ dx = \sum_{k=0}^{n-1}(x_{k+1}-x_{k}) \cdot f\left(\dfrac{x_k+x_{k+1}}{2}\right)$$

For equally spaced points: $$I \approx \dfrac{b-a}{n}\sum_{k=0}^{n-1}f\left(\dfrac{x_k+x_{k+1}}{2}\right)$$
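As a quick sanity check, the three constant-interpolation rules above can be sketched in a few lines of Python (the test function $\sin$ on $[0, \pi]$ is just an illustrative choice, not from the question):

```python
import math

def left_rule(f, a, b, n):
    # Left-point rule: each subinterval contributes (width) * f(left endpoint)
    h = (b - a) / n
    return h * sum(f(a + k * h) for k in range(n))

def right_rule(f, a, b, n):
    # Right-point rule: (width) * f(right endpoint)
    h = (b - a) / n
    return h * sum(f(a + (k + 1) * h) for k in range(n))

def midpoint_rule(f, a, b, n):
    # Middle-point rule: (width) * f(midpoint)
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Integral of sin over [0, pi] is exactly 2
for rule in (left_rule, right_rule, midpoint_rule):
    approx = rule(math.sin, 0.0, math.pi, 100)
    print(rule.__name__, approx, abs(approx - 2.0))
```

Note that every rule carries the factor $h = \frac{b-a}{n}$; that factor is exactly what goes missing in the $I_2$ and $J_2$ of the question.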

Your first sum $I_1$ is the left-point rule with equally spaced points at distance $1$, so you can omit the factor $(x_{k+1}-x_{k})$ from the sum.

The same goes for $I_2$, which is also a left-point rule, just with many more points inside the interval. Unfortunately, for this sum you cannot omit the factor $(x_{k+1}-x_{k})$, since it equals $0.1$, not $1$.
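This is easy to see numerically. Below is a minimal sketch, using an arbitrary smooth function in place of the question's signal: without the $\Delta x = 0.1$ factor, the oversampled left-point sum comes out roughly $10\times$ too large.

```python
# Hypothetical smooth signal standing in for f(x); any smooth function works.
def f(x):
    return 3.0 + 0.5 * x

N = 10  # exact integral of f over [0, 10] is 55

# Left-point sum on integer samples (spacing 1, so the Delta x = 1 factor can be dropped)
I1 = sum(f(k) for k in range(N))

# Left-point sum on the 10x oversampled grid WITHOUT the Delta x = 0.1 factor
I2_wrong = sum(f(k / 10) for k in range(10 * N))

# Same sum WITH the Delta x = 0.1 factor
I2_right = 0.1 * sum(f(k / 10) for k in range(10 * N))

print(I1, I2_wrong, I2_right)  # I2_wrong is about 10x the other two
```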

Linear interpolation:

Trapezoidal: Your interpolation function $h_{k}(x)$ is linear: $$h_{k}(x) = \dfrac{x_{k+1}-x}{x_{k+1}-x_{k}}\cdot f\left(x_{k}\right) + \dfrac{x-x_{k}}{x_{k+1}-x_{k}} \cdot f\left(x_{k+1}\right)$$

Then, you have the integral as $$I \approx \sum_{k=0}^{n-1} \int_{x_{k}}^{x_{k+1}} \left(\dfrac{x_{k+1}-x}{x_{k+1}-x_{k}}\right)\cdot f\left(x_{k}\right) + \left(\dfrac{x-x_{k}}{x_{k+1}-x_{k}}\right) \cdot f\left(x_{k+1}\right) \ dx$$

Expanding you get $$I \approx \sum_{k=0}^{n-1} \dfrac{x_{k+1}-x_{k}}{2}\left(f(x_{k})+f(x_{k+1})\right)$$ For equally spaced points, you get $$I \approx \dfrac{b-a}{n} \left(\dfrac{1}{2}f(x_0) + \sum_{k=1}^{n-1} f(x_{k}) + \dfrac{1}{2}f(x_{n})\right)$$

Your equation for $J_1$ is the trapezoidal rule in the special case $b-a = n$, where the prefactor is $1$. But for $J_2$, $b-a \ne n$: the prefactor $\frac{b-a}{n}$ equals $0.1$, so dropping it gives a wrong value.
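With the prefactor kept, the composite trapezoidal rule behaves as expected: refining the grid brings it closer to the true integral. A minimal sketch (again with $\sin$ on $[0, \pi]$ as an illustrative test function):

```python
import math

def trapezoid(f, a, b, n):
    # Composite trapezoidal rule: h * (f(x0)/2 + f(x1) + ... + f(x_{n-1}) + f(xn)/2)
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n)))

exact = 2.0  # integral of sin over [0, pi]
coarse = trapezoid(math.sin, 0.0, math.pi, 10)
fine = trapezoid(math.sin, 0.0, math.pi, 100)  # 10x finer grid
print(coarse, fine)  # the fine grid is much closer to 2
```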

Higher orders: As we did constant and linear interpolation of $f(x)$ on each interval $\left[x_k, \ x_{k+1}\right]$, the same can be done with quadratic, cubic, and higher-order interpolation. These are well known as the composite Simpson rules (writing $\Delta x = x_{k+1}-x_{k}$):

  • Quadratic: Simpson's 1/3 rule $$\int_{x_k}^{x_{k+1}} f(x) \ dx \approx \dfrac{\Delta x}{6} \left[f(x_{k}) + 4f\left(\dfrac{x_{k}+x_{k+1}}{2}\right)+f(x_{k+1})\right]$$
  • Cubic: Simpson's 3/8 rule $$\int_{x_k}^{x_{k+1}} f(x) \ dx \approx \dfrac{\Delta x}{8} \left[f(x_{k}) + 3f\left(\dfrac{2x_{k}+x_{k+1}}{3}\right)+ 3f\left(\dfrac{x_{k}+2x_{k+1}}{3}\right)+f(x_{k+1})\right]$$

Some other questions you may have in mind:

  1. Is oversampling a good idea to get a better approximation of $I$?

Usually the error is measured like

$$ |I-K| \le c \cdot \left(\max_{x\in \left[a, \ b\right]} \left|f^{(p+1)}(x)\right|\right)\cdot \left(\dfrac{b-a}{n}\right)^{p+1}$$

where

  • $K$ is the approximation of $I$
  • $c$ is some constant
  • $p$ is the interpolation degree you used to compute $K$. For example, left point is $p=0$, linear is $p=1$, quadratic is $p=2$ and so on.
  • $f^{(p+1)}$ is the $(p+1)$-th derivative of $f$.

So if you increase the number $n$, you should get a smaller error.

Problem: if each evaluation of $f$ is expensive, increasing $n$ greatly increases the cost.
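The error bound can be observed empirically. For the trapezoidal rule ($p=1$) the error should shrink like $\left(\frac{b-a}{n}\right)^{2}$, so doubling $n$ should cut the error by about $4$. A sketch, again using $\sin$ on $[0, \pi]$ as an illustrative test function:

```python
import math

def trapezoid(f, a, b, n):
    # Composite trapezoidal rule with the h = (b-a)/n prefactor kept
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n)))

exact = 2.0  # integral of sin over [0, pi]
errors = [abs(trapezoid(math.sin, 0.0, math.pi, n) - exact) for n in (10, 20, 40, 80)]
ratios = [errors[i] / errors[i + 1] for i in range(3)]
print(ratios)  # each doubling of n cuts the error by roughly 4 = 2^2
```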

  2. Are higher-order polynomials better?

Not always. When you use higher orders to shrink the step-size factor $\frac{b-a}{n}$ raised to a higher power, the derivative $f^{(p+1)}$ can grow, which may give you worse results. There is also the oscillation near the extremities, known as Runge's phenomenon.

Normally, the maximum used is $p=3$, because it is often better to decrease the spacing between the points $x_k$ than to increase the order.
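Runge's phenomenon is easy to reproduce with the classic example $f(x) = \frac{1}{1+25x^2}$ on $[-1, 1]$: a single degree-10 polynomial through 11 equally spaced nodes oscillates wildly near the endpoints, even though $f$ is smooth and bounded by $1$. A minimal sketch using plain Lagrange interpolation:

```python
def lagrange_eval(xs, ys, x):
    # Evaluate the Lagrange interpolating polynomial through (xs, ys) at x
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)

n = 10  # degree-10 interpolation on 11 equally spaced nodes in [-1, 1]
xs = [-1.0 + 2.0 * k / n for k in range(n + 1)]
ys = [runge(x) for x in xs]

grid = [i / 1000.0 for i in range(-1000, 1001)]
max_err = max(abs(lagrange_eval(xs, ys, x) - runge(x)) for x in grid)
print(max_err)  # large, even though |f| never exceeds 1
```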

  3. If I correct the formulas $I_2$ and $J_2$, will they be near $I_1$ and $J_1$?

$$I_1 = \sum_{k=0}^{n-1} f\left(k\right)$$ $$J_1 = \dfrac{1}{2}f(0) + \sum_{k=1}^{n-1} f\left(k\right) + \dfrac{1}{2} f(n)$$

Correcting the formulas $I_2$ and $J_2$ to:

$$I_2 = \dfrac{1}{10} \sum_{k=0}^{10n-1} f\left(\dfrac{k}{10}\right)$$ $$J_2 = \dfrac{1}{10} \left[\dfrac{1}{2}f(0) + \sum_{k=1}^{10n-1} f\left(\dfrac{k}{10}\right) + \dfrac{1}{2}f(n)\right]$$

Then $I_2$ and $J_2$ are better approximations of $I$ than $I_1$ and $J_1$, respectively.

If $f(x)$ is linear, then $J_1 = J_2$ no matter which subdivision you choose.
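Both claims can be checked numerically. The sketch below uses $f(x) = x^2$ on $[0, 10]$ (an arbitrary nonlinear example, exact integral $1000/3$) for the first claim, and an arbitrary linear function for the second:

```python
def f(x):
    return x * x  # nonlinear example; integral over [0, N] is N**3 / 3

N = 10
exact = N ** 3 / 3.0

I1 = sum(f(k) for k in range(N))
J1 = 0.5 * f(0) + sum(f(k) for k in range(1, N)) + 0.5 * f(N)
# Corrected oversampled sums, with the 1/10 factor included
I2 = 0.1 * sum(f(k / 10) for k in range(10 * N))
J2 = 0.1 * (0.5 * f(0) + sum(f(k / 10) for k in range(1, 10 * N)) + 0.5 * f(N))

print(abs(I1 - exact), abs(I2 - exact))  # I2 is closer to the exact integral
print(abs(J1 - exact), abs(J2 - exact))  # J2 is closer to the exact integral

def g(x):
    return 2.0 * x + 1.0  # linear: the trapezoidal rule is exact, so J1 == J2

J1g = 0.5 * g(0) + sum(g(k) for k in range(1, N)) + 0.5 * g(N)
J2g = 0.1 * (0.5 * g(0) + sum(g(k / 10) for k in range(1, 10 * N)) + 0.5 * g(N))
print(J1g, J2g)  # equal (both give the exact integral N**2 + N = 110)
```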