I do not understand why is there an increase in parameters when moving from first to second order markov model
For example considering a feature space of (a - z)
For first order markov model, the parameters would be (a - z)
However, for second order (a - zz)
However, i would assume the number of combinations to be the same?
Not sure if this is the right place to ask this question. Basically what i am asking is why is there an increase in parameters if the order level increases
If you have a state space $\mathcal{X}$ of cardinality $N$, then a first-order Markov chain requires $N$ probability mass functions with with $N$ mass points each. Equivalently, this first-order Markov chain can be described by a $N\times N$ transition probability matrix. Hence, there are $N(N-1)$ parameters (the probability mass function sums to one).
If you have a second-order Markov chain, then there are $N^2$ possible state tuples for the past state on which the probability mass functions are conditioned. Hence, there are $N^2$ probability mass functions with $N$ mass points each. Hence, there are $N^2(N-1)$ parameters.