I am trying to model a discrete-time Markov chain whose transition probabilities depend on some function of parameters, i.e., $p_{ij} = f(X_0,X_1)$. Suppose we take a log-linear model where $p_{ij} = e^{\beta_0+\beta_1X_0+\beta_2X_1}$. I have observations of the state distribution $q^t$ at successive time steps. I want to estimate the $\beta_0,\beta_1,\beta_2$ that minimize the expression
$\sum_{t=1}^T\|q^tP-q^{t+1}\|_2$, where $P$ is the matrix with entries $p_{ij}$.
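To make the setup concrete, here is a minimal sketch of the objective and a naive fit in Python. It rests on several assumptions that are not fixed above: `X0` and `X1` are taken to be $n \times n$ arrays holding one value per transition $(i,j)$, the observations are stacked as rows of an array `Q` (so row $t$ is $q^t$), and the function names are purely illustrative. The model is implemented exactly as written, so nothing forces the rows of $P$ to sum to 1. The choice of `scipy.optimize.minimize` with Nelder-Mead is just one derivative-free option, not a recommendation.

```python
import numpy as np
from scipy.optimize import minimize


def transition_matrix(beta, X0, X1):
    """Log-linear model as stated: p_ij = exp(b0 + b1*X0_ij + b2*X1_ij).

    Assumption: X0 and X1 are (n, n) arrays, one value per transition (i, j).
    Rows of the result are NOT normalised to sum to 1.
    """
    b0, b1, b2 = beta
    return np.exp(b0 + b1 * X0 + b2 * X1)


def objective(beta, Q, X0, X1):
    """Sum over t of ||q^t P - q^{t+1}||_2, where the rows of Q are q^1, ..., q^{T+1}."""
    P = transition_matrix(beta, X0, X1)
    residuals = Q[:-1] @ P - Q[1:]  # row t holds q^t P - q^{t+1}
    return np.linalg.norm(residuals, axis=1).sum()


def fit(Q, X0, X1, beta_init=(0.0, 0.0, 0.0)):
    """Minimise the objective over beta, starting from beta_init."""
    # Nelder-Mead needs no gradients, which avoids the kink of the
    # Euclidean norm at zero residual.
    result = minimize(
        objective,
        np.asarray(beta_init, dtype=float),
        args=(Q, X0, X1),
        method="Nelder-Mead",
    )
    return result.x, result.fun
```

Calling `fit(Q, X0, X1)` returns the estimated $(\beta_0,\beta_1,\beta_2)$ and the attained objective value. Since the objective is not convex in general, the result may depend on the starting point.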
I have been looking for material that deals with such parametric models for transition probabilities but haven't found anything really useful. I'd appreciate any suggestions or links to resources on how to go about solving this. Thanks.