Markovian systems: Why must controls be independent of state?


I am currently working my way through Probabilistic Robotics by Thrun, Burgard, and Fox. On p. 91, I encountered the following statement:

The Markovian assumption implies independence between $x_{t-1}$ and $u_t$, and thus $p(x_{t-1}|u_t) = p(x_{t-1})$.

$$x_{t-1} \ldots \text{system state at time } t-1 \\ u_t \ldots \text{control input immediately before time } t \\ p(y) \ldots \text{probability of } y $$

I thought hard, but I could not come up with a way to prove the independence between $x_{t-1}$ and $u_t$. Why, for example, is a system with a simple proportional controller $u_t = -x_{t-1}$ not Markovian?
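To make the tension concrete, here is a small Python sketch (my own toy example, not from the book) of exactly that proportional controller: conditioning on the control completely determines the previous state, so the claimed independence fails under state feedback.

```python
import numpy as np

# Toy check: with a proportional controller u_t = -x_{t-1},
# the control and the previous state are maximally dependent.
rng = np.random.default_rng(0)
x_prev = rng.choice([-1.0, 1.0], size=100_000)  # p(x_{t-1} = ±1) = 0.5
u = -x_prev                                     # deterministic state feedback

p_x_given_u = np.mean(x_prev[u == 1.0] == -1.0)  # p(x_{t-1} = -1 | u_t = 1)
p_x = np.mean(x_prev == -1.0)                    # p(x_{t-1} = -1)

# p_x_given_u is exactly 1.0 while p_x is about 0.5, so
# p(x_{t-1} | u_t) != p(x_{t-1}) under state feedback.
```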



BEST ANSWER

I did some further research and found out the following:

The statement $p(x_{t-1}|u_t) = p(x_{t-1})$ has nothing to do with the Markov assumption. It rather states that control is randomly chosen and not a function of the state.

The assumption of random controls is essential for the Bayes filter: Without this assumption, the Bayes filter algorithm does not hold (cf. p. 32).

Since the quote provided in the original post stems from the mathematical derivation of the histogram filter, which is just an implementation of the Bayes filter, it simply restates the random controls assumption of the Bayes filter.
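As an illustration of where that assumption enters, here is a minimal discrete Bayes filter step in Python (the state space, transition model, and measurement model are made-up toy values, not from the book). The prediction line uses the belief $bel(x_{t-1})$ directly in place of a control-dependent prior, which is exactly the random-controls assumption $p(x_{t-1} \mid u_t) = p(x_{t-1})$.

```python
import numpy as np

def bayes_filter_step(bel, u, z, trans, meas):
    """One discrete Bayes filter step over a finite state space.

    bel   : belief over x_{t-1}, shape (N,)
    trans : trans[u][i, j] = p(x_t = j | x_{t-1} = i, u_t = u)
    meas  : meas[j, k]     = p(z_t = k | x_t = j)
    """
    # Prediction: marginalise out x_{t-1}. Using bel(x_{t-1}) here, rather
    # than a control-dependent p(x_{t-1} | u_t), is the random-controls
    # assumption p(x_{t-1} | u_t) = p(x_{t-1}).
    bel_bar = trans[u].T @ bel
    # Correction: weight by the measurement likelihood and renormalise.
    bel_new = meas[:, z] * bel_bar
    return bel_new / bel_new.sum()

# Toy two-state world: action 1 tends to move state 0 to state 1.
trans = {0: np.eye(2),
         1: np.array([[0.2, 0.8],
                      [0.0, 1.0]])}
meas = np.array([[0.9, 0.1],   # p(z | x = 0)
                 [0.2, 0.8]])  # p(z | x = 1)

bel = bayes_filter_step(np.array([0.5, 0.5]), u=1, z=1, trans=trans, meas=meas)
```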

ANSWER

I perused the book you mention and found no formal definition of the Markov assumption (the author does discuss the Markov assumption, but it is more a critique of the hypothesis than a definition; I believe the actual definition appears elsewhere, stated in terms of conditional independence).

Usually the Markov assumption is written as $P(x_{t+1} \mid x_0,\ldots, x_{t}) = p(x_{t+1}\mid x_{t})$ (meaning the future depends only on the present and nothing more). In your case, however, you are dealing with control data $u_t$ as well as the state variable $x_t$, so you have the following object:

$$p(x_t \mid x_{t-1},u_t) $$

The probability $p(x_t \mid x_{t-1}, u_t)$ is the state transition probability. It specifies how the environmental state evolves over time as a function of the robot controls $u_t$.
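For completeness, the classical Markov property $p(x_{t+1} \mid x_0,\ldots,x_t) = p(x_{t+1} \mid x_t)$ can be verified exactly on a tiny chain (the transition matrix and initial distribution below are arbitrary toy values):

```python
import numpy as np

# Two-state Markov chain: T[i, j] = p(x_{t+1} = j | x_t = i)
T = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi0 = np.array([0.5, 0.5])  # initial distribution over x_0

# Joint p(x0, x1, x2) built from the chain structure:
# joint[a, b, c] = pi0[a] * T[a, b] * T[b, c]
joint = pi0[:, None, None] * T[:, :, None] * T[None, :, :]

# p(x2 | x0 = a, x1 = b) for every full history (a, b)
cond_full = joint / joint.sum(axis=2, keepdims=True)

# Markov property: the conditional depends only on x1 = b, not on x0 = a
for a in range(2):
    for b in range(2):
        assert np.allclose(cond_full[a, b], T[b])
```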

Conditional independence then reads as

$$p(x_t \mid x_{0:t-1}, z_{1:t-1}, u_{1:t}) = p(x_t \mid x_{t-1}, u_t) $$

This means that future and past are independent given the present. The author says that the variable $u_t$ always corresponds to the change of state in the time interval $(t-1, t]$.

Therefore, since $x_{t-1}$ is the state at time $t-1$, the variable $u_t$ concerns only what happens after $x_{t-1}$; if it is chosen independently of the state, we may say that $p(u_t \mid x_{t-1}) = p(u_t)$.

This gives $$p(u_t \mid x_{t-1}) = \frac{p(u_t, x_{t-1})}{p(x_{t-1})} = p(u_t) \Rightarrow p(x_{t-1} \mid u_t) = \frac{p(u_t, x_{t-1})}{p(u_t)} = p(x_{t-1})$$

This is legitimate as long as $p(x_{t-1}) > 0$ and $p(u_t) > 0$.
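The factorisation in the last display can be checked numerically; the marginals below are arbitrary toy values of my own choosing.

```python
import numpy as np

# Discrete marginals for u_t and x_{t-1} (toy values)
p_u = np.array([0.3, 0.7])
p_x = np.array([0.25, 0.5, 0.25])

# Independence ("random controls"): the joint factorises into an outer product
joint = np.outer(p_u, p_x)     # joint[u, x] = p(u_t = u, x_{t-1} = x)

# p(x_{t-1} | u_t) = p(u_t, x_{t-1}) / p(u_t), row by row
cond = joint / p_u[:, None]

# Every row recovers the marginal p(x_{t-1}), matching the derivation above
assert np.allclose(cond, p_x)
```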