What is the intuition/explanation of using variances in Kalman filters?
Also, I am confused about the joint PDF considered in the Kalman filter. For example, take the 1D kinematics case widely used in Kalman filter tutorials: the position and velocity of an object moving in the x direction. Both $x_t$ and $\dot{x}_t$ are dependent (velocity is derived from displacement over time). So if we take a joint PDF we should bring in Bayes' rule, i.e., use the PDFs $f(x_t)$ and $f(\dot{x}_t \mid x_t)$, but we seem to just multiply the PDFs $f(x_t)$ and $f(\dot{x}_t)$, which I am unable to comprehend. What am I missing here?
Why (co-)variances? Because no system, no model, and no measurement is perfect. Every position measurement provides more evidence to prefer or reject various models and parameter values for those models. However, no amount of evidence can absolutely select a "correct" model or "correct" parameters. The filter is designed to partition the deviation between the predicted and observed measurements into the part that conforms with the model (i.e., is compatible with the predicted distribution of observables from the prior time step) and the part that is "noise" (i.e., whatever disagreement with the prediction remains). Note that measurement noise, unconverged model parameters, unmodeled influences, and mismatch between model and system all contribute to the "noise".
You don't have independent (marginal) distributions of $x_t$ and $\dot{x}_t$. You have a joint PDF of $(x_t, \dot{x}_t)$ pairs. When you use this joint PDF to predict the distribution for the next time step, you introduce correlation (and covariance) because, for each choice of $x_t$, the image of that vertical slice of the PDF is angled -- the parts corresponding to larger $\dot{x}_t$ are shifted further to the right and those with smaller $\dot{x}_t$ are shifted further to the left. At each prediction, the independent variances get mixed into the covariance; a small numerical sketch of this follows. This is illustrated and explained in different words (and with equations) at [B]. Also, at each new prediction, the parts of the previous PDF that are incompatible with the new observation have their likelihood suppressed in the update. (That is, if the observation keeps being a little to the right for several updates, both the left tail of the position and the "to the left" tail of the velocity are suppressed, tending to push the prediction to "catch up".)
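To make the variance-mixing concrete, here is a minimal numerical sketch (Python/NumPy) of just the covariance part of the prediction step, $P' = F P F^\top$, for a constant-velocity model. The time step, the variance values, and the omission of process noise are illustrative assumptions, not something stated above.

```python
import numpy as np

# Constant-velocity transition for the state [x, x_dot] over one time step dt.
# dt and the variance values are illustrative choices.
dt = 1.0
F = np.array([[1.0, dt],
              [0.0, 1.0]])

# Start with independent position and velocity uncertainty: diagonal covariance.
P = np.diag([1.0, 0.25])   # var(x) = 1, var(x_dot) = 0.25, cov = 0

# Prediction step: push the joint Gaussian forward through the model.
# (Process noise Q is omitted here to isolate the mixing effect.)
P_pred = F @ P @ F.T

print(P_pred)
# [[1.25 0.25]
#  [0.25 0.25]]
# The off-diagonal entries are now nonzero: pushing the distribution forward
# "angles" it, so position and velocity become correlated even though they
# started out independent.
```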
It is not that $x_t$ and $\dot{x}_t$ are dependent -- they are not; one can easily imagine prior histories giving any pair of current position and velocity. (Note that specific models may implement a constraint causing a dependency, but your question doesn't indicate that you are thinking of model constraints.) However, $x_{t+1}$ is strongly dependent on both $x_t$ and $\dot{x}_t$. In addition, $\dot{x}_{t+1}$ is weakly dependent on $\dot{x}_t$, since infinite acceleration is very expensive. Since one has only a simultaneous distribution of $(x_t, \dot{x}_t)$ pairs, one pushes that distribution forward in time, subject to model constraints, to produce a prediction of the next state. Then the system is observed and that new information is used to update the predicted distribution. The Kalman filter does not bother to represent an arbitrary joint distribution; it uses a single joint Gaussian, so it only needs to track the mean and the covariance matrix.
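Here is a minimal sketch of one full predict/update cycle for the same constant-velocity model, tracking only the mean and covariance of the joint Gaussian. The noise matrices and the measurement value are made-up illustrative numbers, not part of the answer above.

```python
import numpy as np

# One predict/update cycle for the 1D constant-velocity model over (x, x_dot).
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
H = np.array([[1.0, 0.0]])              # we observe position only
Q = np.diag([0.05, 0.05])               # process noise (model imperfection), illustrative
R = np.array([[0.5]])                   # measurement noise variance, illustrative

x = np.array([[0.0], [1.0]])            # prior mean: position 0, velocity 1
P = np.diag([1.0, 0.25])                # prior covariance (independent to start)

# Predict: push the distribution forward in time, subject to the model.
x_pred = F @ x
P_pred = F @ P @ F.T + Q

# Update: fold a new position measurement z into the predicted distribution.
z = np.array([[1.3]])                   # illustrative observation
y = z - H @ x_pred                      # innovation (observation minus prediction)
S = H @ P_pred @ H.T + R                # innovation covariance
K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
x = x_pred + K @ y
P = (np.eye(2) - K @ H) @ P_pred

print(x.ravel())   # updated mean for (x, x_dot)
print(P)           # updated covariance
```

Because the predicted covariance couples position and velocity, a position-only measurement that lands to the right of the prediction nudges both the position and the velocity estimates upward, which is the "catch up" behaviour described above.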
[B] Babb, Tim, "How a Kalman filter works, in pictures", http://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures/