Why do we use the expectation conditional on the signal from one period before as the estimate of the state variable in the Kalman filter?


Consider a simple state-space model $$x_{t+1}=Ax_t+Cw_{t+1}\\ y_t=Gx_t+v_t$$ Besides the orthogonality assumption on $w_{t+1}$ and $v_t$, we assume that $w_{t+1}\sim N(0,I)$ and $v_t\sim N(0,R)$. We observe $y_t$ and want to estimate the sequence $x_t$ using the Kalman filter. During the process we derive the following two conditional expectations: \begin{align*} \mathbb{E}[x_t|y_{t-1}],\qquad \mathbb{E}[x_t|y_{t}]. \end{align*} When applying the Kalman filter, we use the first expression as our estimate of $x_t$. I want to know why we do not use the second conditional expectation instead. My intuition is that we start the algorithm with a prior $\hat{x}_0=\mathbb{E}[x_0]$, but $\mathbb{E}[x_0|y_0]$ clearly contains more information than this trivial prior. That is, at the beginning we assume \begin{align*} x_0\sim N(\mu_0,\Sigma_0), \end{align*} and it is not difficult to obtain the following result: \begin{align*} \mathbb{E}[x_0|y_0]=\mu_0+L_0(y_0-G\mu_0), \end{align*} where \begin{align*} L_0=\Sigma_0G'(G\Sigma_0G'+R)^{-1}. \end{align*} Which one, then, should we take as the estimate $\hat{x}_0$: $\mu_0$ (which could be regarded as $\mathbb{E}[x_0|y_{-1}]$) or $\mathbb{E}[x_0|y_0]$?
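To make the time-0 update in the question concrete, here is a minimal numerical sketch of the gain $L_0$ and the posterior mean $\mathbb{E}[x_0|y_0]=\mu_0+L_0(y_0-G\mu_0)$. All matrix values ($\mu_0$, $\Sigma_0$, $G$, $R$, $y_0$) are made-up toy numbers, not from the question:

```python
import numpy as np

# Toy scalar example (assumed values): x_0 ~ N(mu_0, Sigma_0),
# y_0 = G x_0 + v_0 with v_0 ~ N(0, R).
mu0 = np.array([0.0])        # prior mean mu_0
Sigma0 = np.array([[2.0]])   # prior covariance Sigma_0
G = np.array([[1.0]])        # observation matrix
R = np.array([[0.5]])        # observation-noise covariance
y0 = np.array([1.0])         # a hypothetical observed y_0

# Gain: L_0 = Sigma_0 G' (G Sigma_0 G' + R)^{-1}
L0 = Sigma0 @ G.T @ np.linalg.inv(G @ Sigma0 @ G.T + R)

# Posterior mean: E[x_0 | y_0] = mu_0 + L_0 (y_0 - G mu_0)
x0_post = mu0 + L0 @ (y0 - G @ mu0)
```

With these numbers $L_0 = 2/(2+0.5) = 0.8$, so the posterior mean is pulled 80% of the way from the prior mean toward the observation.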


I would like to see the reference for

When applying the Kalman filter, we use the first expression as our estimate of $x_t$,

because this claim sounds incomplete; it is either wrong or misinterpreted.

In Kalman filters, we are really aiming at $\mathbb{E}[x_t | y_{1:t}]$. This expectation is usually taken as our estimate of $x_t$, as it encodes the best estimate of the random variable $x_t$ given all the information in the data up until time $t$ (recall the definition of conditional expectation). Here I denote $y_{1:t} = \{ y_1, y_2, \ldots, y_t\}$.

The core of Kalman filters is to recursively compute $\mathbb{E}[x_t | y_{1:t}]$ from the previous estimate $\mathbb{E}[x_{t-1} | y_{1:t-1}]$ for $t=1,2,\ldots$. Note that we define $\mathbb{E}[x_{0} | y_{1:0}]= \mathbb{E}[x_{0}]$.

The quantity $\mathbb{E}[x_t | y_{1:t-1}]$ is called the predictive mean; it is an essential intermediate quantity for obtaining the filtering distributions.
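The recursion described above can be sketched as a predict/update pair: the predictive mean $\mathbb{E}[x_t | y_{1:t-1}]$ is computed first and then corrected by the new observation to give the filtering mean $\mathbb{E}[x_t | y_{1:t}]$. The matrices below are assumed toy values for the scalar case, not from the question:

```python
import numpy as np

# Toy scalar model (assumed values): x_{t+1} = A x_t + C w_{t+1},
# y_t = G x_t + v_t, with w ~ N(0, I) and v ~ N(0, R).
A = np.array([[0.9]])
C = np.array([[1.0]])
G = np.array([[1.0]])
R = np.array([[0.5]])

def kalman_step(m, P, y):
    # Predict: E[x_t | y_{1:t-1}] and its covariance.
    m_pred = A @ m
    P_pred = A @ P @ A.T + C @ C.T
    # Update: E[x_t | y_{1:t}] via the Kalman gain K.
    S = G @ P_pred @ G.T + R
    K = P_pred @ G.T @ np.linalg.inv(S)
    m_filt = m_pred + K @ (y - G @ m_pred)
    P_filt = P_pred - K @ G @ P_pred
    return m_filt, P_filt

# Initial condition: E[x_0 | y_{1:0}] = E[x_0].
m, P = np.array([0.0]), np.array([[1.0]])
for y in [np.array([1.0]), np.array([0.5])]:
    m, P = kalman_step(m, P, y)
```

Each pass through the loop produces the filtering mean for one time step; the predictive mean `m_pred` appears only inside `kalman_step`, exactly as an intermediate quantity.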

Your intuition is correct, but $\mathbb{E}[x_0 | y_0]$ is not useful because, in practice, we don't observe $y_0$; the data start at $y_1$. You start the Kalman filtering equations from the initial condition on $x_0$.