Derivatives of Gaussian in deriving Kalman filter


Reading Probabilistic Robotics by Thrun et al.: in chapter 3, the derivation of the Kalman filter twice sets the first derivative of the quadratic to 0 to find the mean, and notes that the second derivative represents curvature, whose inverse is the covariance. I don't understand these steps as shown below, and wonder if anyone has insight. My only thought was that it might have to do with a Taylor series approximation of the quadratic.

The referenced derivation can be seen in this draft of the book on page 40

PROBABILISTIC ROBOTICS https://docs.ufpr.br/~danielsantos/ProbabilisticRobotics.pdf

Much appreciated.

Accepted answer:

I think the goal is to expand $L_t$ into a summation of two functions: \begin{equation} L_t = L_t(x_t,x_{t-1}) + L_t(x_t) \end{equation}

(as in Equation 3.11), so that when exponentiated you can take the $\exp[-L_t(x_t)]$ out of the integral as a constant (Equation 3.12).

One of the goals is for the integral of $\exp[-L_t(x_t,x_{t-1})]$ over $x_{t-1}$ not to depend on $x_t$, so that the integral $\int_{x_{t-1}}\exp[-L_t(x_t,x_{t-1})]\,dx_{t-1} = \alpha$ is just a constant that gets wrapped up into the normalizing constant they denote by $\eta$. This is Equation 3.13.

The integral of a Gaussian kernel depends only on its covariance, not on its mean. So if you can construct $\exp[-L_t(x_t,x_{t-1})]$ as a Gaussian distribution (or distribution kernel) on $x_{t-1}$, where $x_t$ enters only through the mean, then the integral over $x_{t-1}$ will not depend on $x_t$, but only on the covariance.
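You can check this claim numerically. Here is a minimal sketch (the integration interval, step count, and the particular means and variance are my own toy choices, not from the book):

```python
import math

def gauss_integral(mean, var, lo=-50.0, hi=50.0, n=20000):
    """Trapezoid-rule integral of exp(-(x - mean)^2 / (2*var)) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        w = 0.5 if i in (0, n) else 1.0  # trapezoid endpoint weights
        total += w * math.exp(-(x - mean) ** 2 / (2 * var))
    return total * h

# The value depends only on the variance, not on the mean:
a = gauss_integral(mean=0.0, var=2.0)
b = gauss_integral(mean=7.0, var=2.0)
print(abs(a - b) < 1e-6)                             # True
print(abs(a - math.sqrt(2 * math.pi * 2.0)) < 1e-6)  # True: sqrt(2*pi*var)
```

Shifting the mean just translates the bump; the area under it, $\sqrt{2\pi\,\mathrm{var}}$, is unchanged, which is exactly why the integral over $x_{t-1}$ can be absorbed into $\eta$.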

The term $L_t(x_t,x_{t-1})$ is quadratic in $x_{t-1}$, so $\exp[-L_t(x_t,x_{t-1})]$ is a Gaussian in $x_{t-1}$, and you can find the mean by minimizing the quadratic (equivalently, maximizing the exponentiated term), i.e. by setting the first derivative to zero. This works because the mean of a Gaussian is also its mode, due to symmetry. That is the point of Equations 3.16 and 3.17: to find the mean of the Gaussian on $x_{t-1}$.
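As a concrete one-dimensional sketch (the mean and variance here are my own toy numbers, not the book's): zeroing the derivative of the quadratic and maximizing the exponentiated term both land on the mean.

```python
import math

mu, var = 3.0, 0.5  # toy mean and variance

def L(x):
    """Quadratic exponent of a 1-D Gaussian (up to constants)."""
    return (x - mu) ** 2 / (2 * var)

# Setting dL/dx = (x - mu)/var to zero gives x = mu. A coarse grid search
# over exp(-L) finds the same point, since the mode of a Gaussian is its mean.
xs = [i * 0.001 for i in range(-10000, 10001)]
x_star = max(xs, key=lambda x: math.exp(-L(x)))
print(abs(x_star - mu) < 1e-3)  # True
```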

The second derivative gives you the (inverse) covariance through properties of quadratic functions. As an example, suppose you had the following simple quadratic:

\begin{equation} \begin{split} f(x) & = (x-\mu)^T\Sigma(x-\mu)\\ & = x^T\Sigma x - 2\mu^T\Sigma x + \mu^T\Sigma\mu \end{split} \end{equation}

Then you can see that the second derivative with respect to $x$ is $2\Sigma$. In a Gaussian density the exponent is $\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)$, so the second derivative of that quadratic is the inverse covariance $\Sigma^{-1}$. Thus the second derivative with respect to $x_{t-1}$ of the more complicated quadratic given in Equation 3.10 gives the inverse covariance of the Gaussian distribution on $x_{t-1}$, and inverting it recovers the covariance.
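A quick finite-difference check of this, with toy numbers of my own: for the one-dimensional exponent $L(x)=\frac{1}{2}(x-\mu)^2\,\Sigma^{-1}$, the second derivative is the precision, and its inverse is the covariance.

```python
mu = 1.0
sigma2 = 4.0         # covariance (a variance, in one dimension)
prec = 1.0 / sigma2  # precision = inverse covariance

def L(x):
    """Gaussian exponent 0.5 * (x - mu)^2 / sigma2."""
    return 0.5 * (x - mu) ** 2 * prec

# Central second difference; exact for a quadratic up to rounding error,
# and constant in x, so any evaluation point works.
h = 1e-4
x0 = 0.3
second = (L(x0 + h) - 2 * L(x0) + L(x0 - h)) / h ** 2
print(abs(second - prec) < 1e-6)          # True: second derivative = precision
print(abs(1.0 / second - sigma2) < 1e-4)  # True: its inverse = covariance
```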

When you use Equations 3.15 and 3.17 together you get a quadratic in the form required by the Gaussian kernel in Equation 3.18. This form is convenient because the integral of a Gaussian kernel does not depend on the mean, and hence not on the variable $x_t$ that appears only through the mean. This shows that $L_t(x_t,x_{t-1})$ as specified in Equation 3.18 satisfies the requirements outlined earlier.
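The net effect of integrating out $x_{t-1}$ is the familiar prediction step. Here is a Monte Carlo sanity check with made-up numbers, using a scalar model $x_t = a\,x_{t-1} + w$ in place of the book's matrix form: marginalizing a Gaussian $x_{t-1}$ leaves a Gaussian in $x_t$ with variance $a^2\sigma^2 + r$.

```python
import math
import random

random.seed(0)
a, mu, s2, r = 0.8, 2.0, 1.5, 0.3  # toy transition, prior mean/var, noise var

# Draw x_{t-1} ~ N(mu, s2), then x_t = a*x_{t-1} + w with w ~ N(0, r);
# marginalizing x_{t-1} should give x_t ~ N(a*mu, a*a*s2 + r).
samples = [a * random.gauss(mu, math.sqrt(s2)) + random.gauss(0.0, math.sqrt(r))
           for _ in range(200000)]
m = sum(samples) / len(samples)
v = sum((s - m) ** 2 for s in samples) / len(samples)
print(abs(m - a * mu) < 0.02)            # sample mean close to a*mu = 1.6
print(abs(v - (a * a * s2 + r)) < 0.05)  # sample var close to 0.96 + 0.3 = 1.26
```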