Example of Kalman-filter


I am trying to understand Example 2 in Kalman's original article. I would like to use the notation of Theorem 2.5 in my lecture notes to derive the Kalman equations. The example also appears as an exercise in the same lecture notes. Let me restate the problem:

$x_1$ is the position of some object that is at the origin at time $t=0$ and moves with constant speed $x_2$. This is captured by

$$ \begin{aligned} x_1(t+1)&=x_1(t)+x_2(t)\\ x_2(t+1)&=x_2(t) \end{aligned} $$

and the initial conditions $E[x_1^2(0)]=E[x_2(0)]=0$, $E[x_2(0)^2]=a^2>0$.

(Even though $E[x_2(0)]=0$ seems to be a bit confusing here? Possibly should read as $E[x_3(0)]=0$?)

Then $E[x_1^2(0)]=0$ implies $x_1(0)=0$ (start from the origin), $x_2(t)=x_2(0)$ for any $t$ (constant velocity), and $$x_1(t+1)= x_1(t)+x_2(t)=x_1(t-1)+x_2(t-1)+x_2(0)=\ldots=(t+1)x_2(0),$$ which coincides with the kinematic formula for constant velocity, $x(t)=v_0 t$. Further,
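As a quick sanity check of the closed form $x_1(t)=t\,x_2(0)$, the recursion can be iterated directly. This is only a sketch: the function name `simulate_position` and the velocity value `1.5` are chosen for illustration and do not come from the article.

```python
def simulate_position(v0, steps):
    """Iterate x1(t+1) = x1(t) + x2(t), x2(t+1) = x2(t), starting from x1(0) = 0."""
    x1, x2 = 0.0, v0
    positions = [x1]
    for _ in range(steps):
        # Both states are updated from their values at time t.
        x1, x2 = x1 + x2, x2
        positions.append(x1)
    return positions


# Illustrative (assumed) initial velocity x2(0) = 1.5.
positions = simulate_position(v0=1.5, steps=5)

# Compare against the closed form x1(t) = t * x2(0).
closed_form = [t * 1.5 for t in range(6)]
matches = all(abs(p - c) < 1e-12 for p, c in zip(positions, closed_form))
```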

$$ \begin{aligned} x_3(t+1)&=\phi x_3(t)+u_3(t)\\ y(t)&=x_1(t)+x_3(t), \end{aligned} $$

where $\phi$ is a constant and $E[u_3(t)]=0$ and $E[u_3^2(t)]=b^2>0$.

(Again, unfortunately no information about $E[x_3(t)]$. Is this alright so far?)

Here one can also observe that $x_3$ should somehow act as perturbation noise for the observation $y$. However, I cannot really tell why you would want to call the noise $x_3$, since this would (to my mind) suggest that the noise is a signal itself?

(Additionally, one can observe, that for $|\phi|<1$ noises decay over time, that for $|\phi|=1$ the noises all add up and that for $|\phi|>1$ the noises get even amplified over time?)
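The three regimes can be made concrete with a short variance calculation. It assumes, which the problem statement as quoted here does not say explicitly, that $u_3(t)$ is uncorrelated with $x_3(t)$:

$$ \begin{aligned} Var(x_3(t+1))&=\phi^2\,Var(x_3(t))+b^2\\ Var(x_3(t))&=\phi^{2t}\,Var(x_3(0))+b^2\sum_{k=0}^{t-1}\phi^{2k} \end{aligned} $$

Under that assumption, $|\phi|<1$ gives a variance that converges to $b^2/(1-\phi^2)$, $|\phi|=1$ gives linear growth $Var(x_3(0))+t\,b^2$, and $|\phi|>1$ gives geometric growth.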

Anyways, the example can be rewritten as

$$ \begin{aligned} x(t+1)&=\begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 0 \\ 0 & 0 & \phi \end{pmatrix} x(t) + \pmatrix{0 \\ 0 \\ u_3(t)}\\ y(t)&=\begin{pmatrix}1 & 0 & 1 \end{pmatrix} x(t) \end{aligned} $$

On page 6, Kalman states that $u$ is a Gaussian vector (assuming that $u=(u_1,u_2,u_3)^\top$), so I assume one could rewrite this (if that even makes sense) as

$$ \begin{aligned} x(t+1)&=\begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 0 \\ 0 & 0 & \phi \end{pmatrix} x(t) + \pmatrix{0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 1}u(t)\\ y(t)&=\begin{pmatrix}1 & 0 & 1 \end{pmatrix} x(t) \end{aligned} $$

Now, my lecture notes (page 29) say that I should consider

$$ \begin{aligned} X(t)&=a_0(t)+a_1(t)X(t-1)+a_2(t)Y(t-1)+b_1(t)\varepsilon(t)+b_2(t)\xi(t)\\ Y(t)&=A_0(t)+A_1(t)X(t-1)+A_2(t)Y(t-1)+B_1(t)\varepsilon(t)+B_2(t)\xi(t) \end{aligned} $$

where typically $\varepsilon$ would be a noise of the signal $X$ and $\xi$ noise of the observation $Y$, in comparison to the first pages. Then I would like to find with Theorem 2.5 the Kalman-Bucy equations

$$ \begin{aligned} P(t)&=a_1P(t-1)a_1^\top+b_1b_1^\top+b_2b_2^\top-(a_1P(t-1)A_1^\top+b_1B_1^\top+b_2B_2^\top)\cdot\\ &\quad(A_1P(t-1)A_1^\top+B_1B_1^\top+B_2B_2^\top)^{-1}(a_1P(t-1)A_1^\top+b_1B_1^\top+b_2B_2^\top)^\top\\ \hat{X}(t)&=a_0+a_1\hat{X}(t-1)+a_2Y(t-1)+(a_1P(t-1)A_1^\top+b_1B_1^\top+b_2B_2^\top)\cdot\\ &\quad(A_1P(t-1)A_1^\top+B_1B_1^\top+B_2B_2^\top)^{-1}(Y(t)-A_0-A_1\hat{X}(t-1)-A_2Y(t-1)) \end{aligned} $$
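A minimal sketch of one step of this recursion, written without external libraries, may help to see how the pieces fit together. It makes simplifying assumptions that are not in the lecture notes: the observation $Y$ is scalar (so the inverse is a plain division), the terms $a_0,a_2,A_0,A_2$ are zero, and the two noises are stacked into one vector so that $b_1,b_2$ and $B_1,B_2$ become single matrices `b` and `B`. The names `filter_step`, `matmul`, `transpose` and `add` are made up for this illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def filter_step(a1, b, A1, B, P_prev, xhat_prev, y):
    """One update of P(t) and Xhat(t) for a scalar observation y = Y(t).

    Shapes: a1 is n x n, b is n x m, A1 is 1 x n, B is 1 x m,
    P_prev is n x n, xhat_prev is n x 1.
    """
    n = len(a1)
    # cross = a1 P A1^T + b B^T  (n x 1)
    cross = add(matmul(matmul(a1, P_prev), transpose(A1)), matmul(b, transpose(B)))
    # s = A1 P A1^T + B B^T  (a scalar because Y is scalar)
    s = matmul(matmul(A1, P_prev), transpose(A1))[0][0] + matmul(B, transpose(B))[0][0]
    # pred = a1 P a1^T + b b^T
    pred = add(matmul(matmul(a1, P_prev), transpose(a1)), matmul(b, transpose(b)))
    P = [[pred[i][j] - cross[i][0] * cross[j][0] / s for j in range(n)] for i in range(n)]
    # Innovation: Y(t) - A1 Xhat(t-1)
    innovation = y - matmul(A1, xhat_prev)[0][0]
    prior = matmul(a1, xhat_prev)
    xhat = [[prior[i][0] + cross[i][0] * innovation / s] for i in range(n)]
    return P, xhat


# Toy scalar model (assumed): X(t) = X(t-1) + eps(t), Y(t) = X(t-1) + xi(t).
P, xhat = filter_step(
    a1=[[1.0]], b=[[1.0, 0.0]], A1=[[1.0]], B=[[0.0, 1.0]],
    P_prev=[[1.0]], xhat_prev=[[0.0]], y=2.0,
)
```

With NumPy the same step would be a few lines; the list-based helpers are used here only to keep the sketch dependency-free.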

with initial values

$$ \begin{aligned} P(0)&=Cov(X(0))-Cov(X(0),Y(0))Cov(Y(0))^{-1}Cov(X(0),Y(0))^\top\\ \hat{X}(0)&=E[X(0)]+Cov(X(0),Y(0))Cov(Y(0))^{-1}(Y(0)-E[Y(0)]) \end{aligned} $$

Now the very first problem I have with both notations is that Kalman states that $x_3$ should be noise. This seems odd compared with the notation of my lecture notes, where, as far as I know, noise should be represented as $\varepsilon$ or $\xi$.

However, if I just try to do the computations with the formulas of the lecture notes, the results do not seem to match the results of the example in Kalman's article. So unfortunately I am confused on both ends and a bit lost in my attempt to understand this example. I will add more details about what I have done so far later, but maybe someone has already translated this example into another notation.

My attempt:

$$E[X(0)]=(E[x_1(0)],E[x_2(0)],E[x_3(0)])^\top=(0,0,E[x_3(0)])^\top$$

$$Y(0)=x_1(0)+x_3(0)=x_3(0)$$

$$E[Y(0)]=E[x_3(0)]$$

$$Cov(X(0),Y(0))=Cov(X(0),x_3(0))=(0,Cov(x_2(0),x_3(0)), Var(x_3(0)))^\top$$

$$Cov(Y(0))^{-1}=Var(x_3(0))^{-1}$$

$$Cov(X(0))=\begin{pmatrix} 0 & 0 & 0 \\ 0 & a^2 & Cov(x_2(0),x_3(0)) \\ 0 & Cov(x_2(0),x_3(0)) & Var(x_3(0)) \end{pmatrix}$$

This would result in

$$\hat{X}(0)=\begin{pmatrix} 0 \\ \frac{Cov(x_2(0),x_3(0))}{Var(x_3(0))}(x_3(0)-E[x_3(0)])\\ x_3(0) \end{pmatrix}$$

$$P_0=\begin{pmatrix} 0 & 0 & 0 \\ 0 & a^2-\frac{Cov(x_2(0),x_3(0))^2}{Var(x_3(0))} & 0 \\ 0 & 0 & 0 \\ \end{pmatrix}$$
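The structure of $P_0$ can be checked numerically by plugging made-up numbers into the initial-value formula. The values of `a_sq`, `c` and `v` below are purely illustrative placeholders for $a^2$, $Cov(x_2(0),x_3(0))$ and $Var(x_3(0))$; they are not given anywhere in the problem.

```python
# Illustrative (assumed) values; none of these appear in the original example.
a_sq = 2.0   # a^2 = E[x2(0)^2]
c = 0.5      # Cov(x2(0), x3(0))
v = 1.5      # Var(x3(0))

# Cov(X(0)) with x1(0) = 0, so the first row and column vanish.
cov_x = [
    [0.0, 0.0, 0.0],
    [0.0, a_sq, c],
    [0.0, c, v],
]
# Cov(X(0), Y(0)) with Y(0) = x3(0), and Cov(Y(0)) = Var(x3(0)) = v.
cov_xy = [0.0, c, v]

# P(0) = Cov(X) - Cov(X,Y) Cov(Y)^{-1} Cov(X,Y)^T, entry by entry.
P0 = [[cov_x[i][j] - cov_xy[i] * cov_xy[j] / v for j in range(3)] for i in range(3)]

# Only the (2,2) entry should survive, equal to a^2 - c^2 / v.
expected_22 = a_sq - c * c / v
```

For these numbers every entry of `P0` except the middle one is zero, which agrees with the matrix above: observing $Y(0)=x_3(0)$ removes all uncertainty about $x_3(0)$ and reduces the uncertainty about $x_2(0)$ only through their correlation.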

I am not sure if this is really what you want as an initial value. I do not know how I should compute $Cov(x_2(0),x_3(0))$ and $Var(x_3(0))$.

Now, I want to rewrite

$$ \begin{aligned} x(t+1)&=\begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 0 \\ 0 & 0 & \phi \end{pmatrix} x(t) + \pmatrix{0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 1}u(t)\\ y(t)&=\begin{pmatrix}1 & 0 & 1 \end{pmatrix} x(t) \end{aligned} $$

into the form

$$ \begin{aligned} X(t)&=a_0(t)+a_1(t)X(t-1)+a_2(t)Y(t-1)+b_1(t)\varepsilon(t)+b_2(t)\xi(t)\\ Y(t)&=A_0(t)+A_1(t)X(t-1)+A_2(t)Y(t-1)+B_1(t)\varepsilon(t)+B_2(t)\xi(t) \end{aligned} $$

Now comes a bit of guesswork. I suppose $u$ is the Gaussian noise $\varepsilon$, since it appears on the signal side. I find

$$a_0=a_2=b_2=0$$

$$a_1=\begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 0 \\ 0 & 0 & \phi \end{pmatrix},\quad b_1=\pmatrix{0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 1}$$

The equation for $y$ is not yet in the required form. Does one need to consider $y(t)$ as depending on the previous step?

$$ \begin{aligned} y(t+1)&=x_1(t+1)+x_3(t+1)=x_1(t)+x_2(t)+\phi x_3(t) + u_3(t)\\ &=\pmatrix{1 & 1 & \phi}x(t) + \pmatrix{0 & 0 & 1}u(t) \end{aligned} $$

yielding

$$A_0=A_2=B_2=0$$ $$A_1= \pmatrix{1 & 1 & \phi},\quad B_1=\pmatrix{0 & 0 & 1}. $$

Defining $\alpha:=a^2-\frac{Cov(x_2(0),x_3(0))^2}{Var(x_3(0))}$, I find

$$P_1=\pmatrix{\alpha &\alpha & 0 \\ \alpha &\alpha & 0 \\ 0 & 0 & 1}-\frac{1}{1+\alpha}\pmatrix{\alpha^2 &\alpha^2 &\alpha \\ \alpha^2 &\alpha^2 &\alpha \\ \alpha &\alpha & 1 }=\pmatrix{\frac{\alpha}{1+\alpha} &\frac{\alpha}{1+\alpha} &-\frac{\alpha}{1+\alpha} \\ \frac{\alpha}{1+\alpha} &\frac{\alpha}{1+\alpha} &-\frac{\alpha}{1+\alpha} \\ -\frac{\alpha}{1+\alpha} &-\frac{\alpha}{1+\alpha} &\frac{\alpha}{1+\alpha} }$$
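The arithmetic behind $P_1$ can be verified by evaluating the covariance formula with the matrices found above. The sketch below assumes illustrative values for `alpha` and `phi` and, like the choice of $b_1$ and $B_1$ above, treats the noise as having unit variance; the helper names are made up. It confirms only the matrix algebra, not that this mapping reproduces Kalman's $P^*(1)$.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


# Illustrative (assumed) values for alpha and phi.
alpha, phi = 0.8, 0.5

a1 = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, phi]]
b1 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
A1 = [[1.0, 1.0, phi]]
B1 = [[0.0, 0.0, 1.0]]
P0 = [[0.0, 0.0, 0.0], [0.0, alpha, 0.0], [0.0, 0.0, 0.0]]

# cross = a1 P0 A1^T + b1 B1^T, which should equal (alpha, alpha, 1)^T.
cross = add(matmul(matmul(a1, P0), transpose(A1)), matmul(b1, transpose(B1)))
# s = A1 P0 A1^T + B1 B1^T, which should equal 1 + alpha.
s = matmul(matmul(A1, P0), transpose(A1))[0][0] + matmul(B1, transpose(B1))[0][0]
# pred = a1 P0 a1^T + b1 b1^T.
pred = add(matmul(matmul(a1, P0), transpose(a1)), matmul(b1, transpose(b1)))

P1 = [[pred[i][j] - cross[i][0] * cross[j][0] / s for j in range(3)] for i in range(3)]

# Closed form from the text: every entry is +/- alpha / (1 + alpha).
r = alpha / (1.0 + alpha)
expected = [[r, r, -r], [r, r, -r], [-r, -r, r]]
agrees = all(abs(P1[i][j] - expected[i][j]) < 1e-12 for i in range(3) for j in range(3))
```

Note that $\phi$ drops out of $P_1$ here because the third row and column of $P_0$ are zero.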

The only thing I could compare this to is $P^*(1)$ in the Kalman article, but that is a completely different matrix. So I am probably not on the right track here, and I do not know whether I can adjust something to make it match $P^*(1)$.

Any hint or help is appreciated! Thank you in advance!


Lots of questions scattered throughout your question, but I'll try to hit them all.

(Even though $E[x_2(0)]=0$ seems to be a bit confusing here? Possibly should read as $E[x_3(0)]=0$?)

$E[x_2(0)]=0$ is a totally reasonable assumption. Particles are given a random velocity with expected value of zero, but they could go in either direction. It is not necessarily true that $E[x_3(0)]=0$. $x_3$ is time-correlated noise following a Gauss-Markov model. It's fairly common in the estimation literature, so I'm guessing that Kalman didn't think it was worth saying that explicitly.

(Again, unfortunately no information about $E[x_3(t)]$. Is this alright so far?)

In his example he only does the covariance updates, so he might have neglected to provide the mean value. If you were to implement the full state estimate then you would at the very least need $E[x_3(0)]$.

Here one can also observe that $x_3$ should somehow act as perturbation noise for the observation $y$. However, I cannot really tell why you would want to call the noise $x_3$, since this would (to my mind) suggest that the noise is a signal itself?

Very good question. Basically what is happening is that there is a time-correlated noise affecting the measurements. The only way that the filter is able to account for this is by including the bias as a state to be estimated. It looks like the notes you're referencing do not include biased measurements, but it's a common topic in other estimation problems.

(Additionally, one can observe, that for $|\phi|<1$ noises decay over time, that for $|\phi|=1$ the noises all add up and that for $|\phi|>1$ the noises get even amplified over time?)

I think I see your misunderstanding. Yes, the value that is added onto the measurement may be growing or shrinking over time, but this is just a bias in the measurement. The variance of the noise that is added to the bias stays the same. This is the Gauss-Markov model, and Kalman is using the state-augmentation strategy to deal with it. Lots of references will cover it. Off the top of my head, I know that "Introduction to Random Signals and Applied Kalman Filtering with Matlab Exercises" by Brown and Hwang does.

Now the very first problem I have with both notations is that Kalman states that $x_3$ should be noise. This seems odd compared with the notation of my lecture notes, where, as far as I know, noise should be represented as $\varepsilon$ or $\xi$.

Different sources will use different notation. I'm hoping that what I said above about the bias in the measurement will clear up some of your confusion.

Did you pick this particular example for self-study? Or is it part of a homework assignment? If you're doing it for self-study, I would suggest picking up a modern textbook and working through that instead of going through Kalman's original paper. It seems like he assumes a lot of background knowledge that may confuse people that are new to the field. If this is for a homework assignment then I'm reluctant to do the math for you, but I'm hoping that I helped clear up your conceptual questions.