Often the following definition of a Markov process is given:
Suppose $\{X_{t}\}$ is a stochastic process defined on a probability space $(\Omega, \mathscr{F}, \mathbb{P})$. Then $\{X_{t}\}$ is a Markov process iff for all $t, s \geq 0$ and all bounded measurable functions $f$ we have:
$$\mathbb{E}[f(X_{t + s}) \mid \sigma\{X_{u} : u \leq t\}] = \mathbb{E}[f(X_{t+s}) \mid X_{t}].$$
It is often explained that, intuitively, this means the future is conditionally independent of the past once we know the current value of the process. What I do not understand is why we can't just impose
$$\mathbb{E}[X_{t + s} \mid \sigma\{X_{u} \vert u \leq t\}] = \mathbb{E}[X_{t+s} \mid X_{t}],$$
which seems to be a more natural definition (assuming these expectations exist).
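To make the two conditions concrete, here is a hypothetical Python sketch (not part of the question's setup) that estimates both sides of the second, mean-only condition for a simple symmetric $\pm 1$ random walk, which is Markov; all names in it are my own. It empirically compares $\mathbb{E}[X_{2} \mid X_{0}, X_{1}]$ with $\mathbb{E}[X_{2} \mid X_{1}]$, and for this process the two agree. Of course, a single Markov example cannot distinguish the two definitions; it only illustrates what each side of the equality is computing.

```python
import random
from collections import defaultdict

random.seed(0)

# Symmetric +/-1 random walk started at 0; simulate X_0, X_1, X_2 many times.
# Group samples of X_2 by the full history (X_0, X_1) and by X_1 alone,
# so the sample means estimate E[X_2 | X_0, X_1] and E[X_2 | X_1].
by_history = defaultdict(list)   # samples of X_2 keyed by (X_0, X_1)
by_current = defaultdict(list)   # samples of X_2 keyed by X_1 alone
for _ in range(200_000):
    x0 = 0
    x1 = x0 + random.choice([-1, 1])
    x2 = x1 + random.choice([-1, 1])
    by_history[(x0, x1)].append(x2)
    by_current[x1].append(x2)

# Empirical conditional means under each conditioning.
cond_mean_history = {h: sum(v) / len(v) for h, v in by_history.items()}
cond_mean_current = {x: sum(v) / len(v) for x, v in by_current.items()}
```

For this walk, `cond_mean_history[(x0, x1)]` and `cond_mean_current[x1]` both come out close to `x1`, as expected, since conditioning on the extra history $X_{0}$ adds no information here.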