I am trying to understand Markov processes but am still confused by their definition. In particular, the Wikipedia page gives this example of a non-Markov process: pulling different coins out of a bag at random. The article seems to imply that whenever information about previous states is required to determine the next state, the process is not Markov.
But isn't that what higher-order Markov chains are for? In the Wikipedia article, couldn't you have represented the process with an nth-order Markov chain, where n is the number of coins in the bag? (I feel sure I am wrong here, but I can't see exactly how.)
Disclaimer: Apologies if this question is below the normal quality - it's because I study computer science, not mathematics.
EDIT
Just clarifying what my confusion is. Why couldn't we represent the coins being removed from a bag with states such as:
- initial states {coin1, coin2, ..., coinN}
- states at t=1 {coin1&coin2, coin1&coin3, ..., coin(N-1)&coin(N)}
...where each state records which coins have been chosen so far? Transition probabilities could be assigned between these states that reflect the previous draws, but you would still only need to know the current state in order to predict the next one. So why isn't drawing coins from a bag a Markov process?
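To make my construction concrete, here is a small sketch (my own illustration, not from the Wikipedia article). The state is the multiset of coins still in the bag, which is equivalent to knowing which coins have been drawn so far, and the distribution over next states depends only on the current state:

```python
from collections import Counter
from fractions import Fraction

def transitions(state):
    """Distribution over next states, given only the current state.

    A state is a sorted tuple of the coin values remaining in the bag;
    it encodes exactly 'the coins chosen so far'.
    """
    state = list(state)
    dist = Counter()
    for i, coin in enumerate(state):
        # Removing any one coin is equally likely.
        nxt = tuple(sorted(state[:i] + state[i+1:]))
        dist[nxt] += Fraction(1, len(state))
    return dict(dist)

# Hypothetical 3-coin bag: a penny, a nickel, and a quarter (values in cents).
print(transitions((1, 5, 25)))
# Each of the three possible 2-coin bags has probability 1/3.
```

Since `transitions` needs nothing but the current state, the process on this enlarged state space is Markov by construction.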
I think the idea of somehow redefining the state space is dubious. If we use the Wiki definition, or the one in most texts, a discrete stochastic process is simply a sequence X_1, X_2, X_3, ... of random variables, and this particular stochastic process is said to satisfy the Markov property provided the conditional probability that X_n = x_n, given the whole history X_1, ..., X_(n-1), depends only on the value of X_(n-1) [for the "homogeneous case"] or else on both the value of n and the value of X_(n-1) [for the "nonhomogeneous case"].
In other words, I think that to refer to a specific stochastic process and then ask if it has a certain property involves sticking to the specific sequence X_n of random variables while determining whether the stochastic process has the property, in this case the Markov property. We cannot change our sequence X_n and say "the process is Markov after all" or the like. Of course we may define another sequence of random variables Y_n = F(X_1, ..., X_n) and check if this Y_n is Markov. But that seems unrelated to whether the original sequence X_1, ... is Markov or not, at least to me.
Here's a variation on the coins-out-of-the-bag example that has the full set of positive integers as the "time set". Set the bag to initially hold 1 quarter, 6 pennies, and 6 nickels. We progressively draw coins from the bag without replacing them, and let X_n be the sum drawn so far. There being initially 13 coins in the bag, we can say with probability 1 that X_13 = 61 cents; we simply DEFINE X_14, X_15, ... to all be 61 cents with probability 1.
Now look at X_6. There are two ways to have X_6 = 30 cents: using one quarter and 5 pennies, or else using 6 nickels. In the first case the coins remaining in the bag are 6 nickels and 1 penny, so the probability that X_7 = 31 cents (i.e. that the seventh draw is a penny) is 1/7; in the second case the remaining coins are 1 quarter and 6 pennies, and that probability is 6/7. So knowing that X_6 = 30 cents, even including knowing that the "time" is 6, i.e. that this is the sixth draw, does not determine the probability that X_7 = 31 cents.
On the other hand, knowing the values of X_1 through X_6 inclusive, we can certainly find the probability that X_7 = 31.
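A quick enumeration bears this out. The sketch below (my own check, using Python's `fractions` module) treats each 6-element subset of the 13 coins as an equally likely set of first six draws; for every history with X_6 = 30 cents, it computes P(X_7 = 31) from what remains in the bag:

```python
from fractions import Fraction
from itertools import combinations

# Labeled coins: 1 quarter, 6 pennies, 6 nickels (values in cents).
coins = [25] + [1] * 6 + [5] * 6

# Every 6-element subset of the 13 coins is an equally likely
# "set of coins drawn in the first six draws".
subsets = list(combinations(range(13), 6))

probs = []  # P(X_7 = 31 | history) for each history with X_6 = 30
for drawn in subsets:
    if sum(coins[i] for i in drawn) != 30:
        continue
    remaining = [coins[i] for i in range(13) if i not in drawn]
    # X_7 = 31 iff the seventh draw is a penny.
    probs.append(Fraction(remaining.count(1), len(remaining)))

print(sorted(set(probs)))  # two distinct values: 1/7 and 6/7
```

Knowing the full history pins down one of these two values, but knowing only X_6 = 30 does not, which is exactly the failure of the Markov property for X_n.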