I have a process with 100 possible states and independent entities moving through it. Every entity has been observed at the end of each month over a span of 5 years. At each observation, the observed state was assigned to the entity for that month and recorded. From one month to the next, an entity can move from any state to any other state, or it can stay in the same state, according to a probability distribution that is unknown. This can be seen in the image:
It is known that the probability distribution governing the state an entity moves to next month depends strongly on its previous states, and possibly also on the time spent in those states. For instance, it has been observed that an entity that has spent a long time in state 2 is very likely to move to state 1 soon.
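To make the "time spent in a state" idea concrete, here is a minimal sketch of how a monthly history could be collapsed into (state, months spent) pairs. The function name `run_lengths` and the example history are hypothetical, not taken from my actual data.

```python
from itertools import groupby


def run_lengths(states):
    """Collapse a monthly state sequence into (state, months_spent) pairs."""
    # groupby groups consecutive equal values, so each group is one stay in a state
    return [(state, sum(1 for _ in group)) for state, group in groupby(states)]


# Hypothetical history of one entity: one month in state 5,
# four months in state 2, then two months in state 1.
history = [5, 2, 2, 2, 2, 1, 1]
print(run_lengths(history))  # [(5, 1), (2, 4), (1, 2)]
```

A representation like this would let a model condition on the current state and its duration instead of on a long window of raw monthly states, though whether that captures enough of the dependence is an assumption.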
The goal is to predict which states the entities will move to in the future.
What algorithm would you suggest for such a simple problem? An obvious answer for me is to use variable-order Markov models, but the complexity increases dramatically as the number of states and the order increase. If I choose an order of 3 with 100 states, I will have 100^4 = 100,000,000 transition probabilities to estimate, which is not feasible.
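For reference, the count above can be checked with a small sketch. It assumes a full fixed-order chain in which every context of length `order` has its own distribution over the next state; the helper name `transition_table_size` is hypothetical.

```python
def transition_table_size(n_states: int, order: int) -> tuple[int, int]:
    """Return (table entries, free parameters) for a full fixed-order Markov chain."""
    contexts = n_states ** order            # every possible history of length `order`
    entries = contexts * n_states           # one probability per (context, next state)
    free = contexts * (n_states - 1)        # each row sums to 1, so one entry is implied
    return entries, free


for order in (1, 2, 3):
    entries, free = transition_table_size(100, order)
    print(f"order {order}: {entries:,} entries, {free:,} free parameters")
# order 1: 10,000 entries, 9,900 free parameters
# order 2: 1,000,000 entries, 990,000 free parameters
# order 3: 100,000,000 entries, 99,000,000 free parameters
```

By comparison, 5 years of monthly observations give 60 observations per entity, so each entity contributes at most 57 order-3 transitions.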
