I am reading a book on Bayesian filtering and have a question about calculating the transition density $p(X_t|X_{t-1})$. How is the term $p(X_t|X_{t-1}, V_t=v)$ converted to the Dirac delta function $\delta(X_t - f(X_{t-1}, v))$? Please see the image below.
Another question: suppose we have a discrete-time nonlinear state equation with additive noise, $X_t = f(X_{t-1}) + V_t$. How can we compute the following PDFs?
1- $p(X_t)$
2- $p(X_t|X_{t-1}, V_t)$
Suppose that the distribution of the noise, $p(V_t)$, is known. Note that $p(X_t|X_{t-1})$ then follows from the law of total probability.
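For concreteness, the total-probability step I have in mind is the marginalization over the noise written out below. This is only a sketch of my understanding: it assumes that $V_t$ is independent of $X_{t-1}$, and $p_V$ is my own shorthand for the known noise density.

$$
p(X_t \mid X_{t-1}) = \int p(X_t \mid X_{t-1}, V_t = v)\, p_V(v)\, dv
$$

The integrand contains exactly the conditional density from the first question, so how that term is evaluated determines the result.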
