How can I recover a sequence of numbers given a corrupted version of it?


I have an unknown sequence of real numbers $x_i$ and a known sequence of real numbers $y_i$; $y_i$ is a corrupted version of $x_i$, i.e.,

$$y_i=x_i+n_i$$

where $n_i$ is a random number drawn from a known probability distribution $f(x,\boldsymbol{\theta})$, and $\boldsymbol{\theta}$ is the known set of parameters of that distribution. For example, when $f$ is a normal distribution, $\boldsymbol{\theta}=(\mu,\sigma)$, where $\mu$ is the mean and $\sigma$ is the standard deviation.

Given $y_i$, $f$, and $\boldsymbol{\theta}$, is it possible to recover $x_i$?

Update:

Since the problem seems too broad (see the comments 1 and 2) I would like to add the constraint that $x_i>x_{i+1},\forall{i}$.


There are 2 best solutions below

1
On

How could it be?

If there are no constraints on the $x_i$, all you know are confidence intervals of the $x_i$ around their respective $y_i$. But the exact values of the $x_i$ remain essentially unknown.

For instance, with $n$ uniformly distributed in $[-1,1]$ and $Y=\{0.5,3\}$, you can say that

  • $x_0\in[-0.5,1.5]$ and $x_1\in[2,4]$ with probability $1$.

  • $x_0\in[0.4,0.6]$ and $x_1\in[2.9,3.1]$ with probability $\frac1{100}$.

  • ...
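The two probabilities above can be checked with a quick simulation. This is only a sketch: it assumes the noise terms are independent and uniform on $[-1,1]$, and the function name `joint_coverage` and its parameters are made up for the illustration. Since $y_i = x_i + n_i$, the interval $[y_i-h,\,y_i+h]$ contains $x_i$ exactly when $|n_i|\le h$, so the result does not depend on the true $x_i$.

```python
import random

def joint_coverage(half_width, n_points=2, trials=200_000, seed=0):
    """Estimate P(every x_i lies in [y_i - h, y_i + h]) for uniform noise on [-1, 1].

    Because y_i = x_i + n_i, the interval covers x_i exactly when |n_i| <= h,
    so only the noise needs to be simulated.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Independent noise samples, one per sequence element (an assumption).
        noise = [rng.uniform(-1.0, 1.0) for _ in range(n_points)]
        if all(abs(n) <= half_width for n in noise):
            hits += 1
    return hits / trials

print(joint_coverage(1.0))  # intervals of half-width 1 always cover: prints 1.0
print(joint_coverage(0.1))  # should be close to (0.2 / 2) ** 2 = 0.01
```

Narrowing the intervals lowers the probability that they contain the true values, which is the trade-off the bullets describe.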

Adding the monotonicity constraint will somewhat modify the extent of the confidence intervals, but the $x_i$ remain just as elusive.
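To illustrate how an ordering constraint can be used to form an estimate (not an exact recovery), here is a minimal sketch of isotonic regression via the pool-adjacent-violators algorithm. It is not part of the answer above; it is a standard technique swapped in for illustration. It assumes a least-squares criterion, which corresponds to maximum likelihood only for zero-mean Gaussian noise with equal variance, and it enforces the weak constraint $x_i \ge x_{i+1}$ rather than the strict one. The function name `pava_nonincreasing` is hypothetical.

```python
def pava_nonincreasing(y):
    """Least-squares fit of a non-increasing sequence to y (pool adjacent violators)."""
    blocks = []  # each block is [sum of values, number of values]
    for v in y:
        blocks.append([float(v), 1])
        # Merge while the previous block's mean is below the current block's mean,
        # since that would violate the non-increasing order.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    # Expand each block back to one fitted value per original sample.
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit

print(pava_nonincreasing([5, 3, 4, 1]))  # [5.0, 3.5, 3.5, 1.0]
```

The fitted values are still only estimates: the out-of-order pair $3, 4$ is replaced by its average, but nothing says the true values were $3.5$.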

0
On

Let us assume that $x = n$ ($n$ being unknown). Without further assumptions, there is no way to tell whether $y$ is $0.5n+0.5n$ or $0.7n+0.3n$. However, a whole field deals with producing acceptable estimates of $x$, given partial knowledge of the properties of $x$ or $n$.

It is called signal (or image) processing (subfields: filtering, smoothing, source separation). As the topic is too broad to be answered here, I will just mention a few current trends: Bayesian modeling, non-linear risk estimation, sparse approximations.

For further details, have a look at StackExchange Signal Processing: SE.DSP. A related question is asked and answered in "Does time series data always contain noise?"