I'm trying to implement a hidden Markov model (HMM) to model the effect of a specific protein that can cut an RNA when the ribosome is translating that RNA slowly.
Some brief background:
The ribosome translates mRNA into protein
The ribosome can occasionally pause on the mRNA due to things such as secondary structure
There is a protein (call it protein $X$) that can cleave the mRNA when the ribosome is paused.
I'd like to build an HMM that can model this process; however, it's been difficult to find references. I've seen that some people use absorbing states in continuous-time HMMs that model disease progression (for example, here), but I'd like to do this for the discrete-time case. Most simply, my HMM would look like:
with the arrows modeling the transitions.
The emissions would be things like secondary structure free energy (a continuous parameter), and which codon is being translated (discrete parameter).
The "cut" state is an absorbing state that cannot be recovered from. Ideally I'd like to be able to run the same sequence through this model and sometimes become "cut" with a certain probability. In the end, I'd like this model to be able to predict whether or not a sequence will be "cut" enough to observe a significant difference between it and a sequence that is less likely to become "cut".
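To make the absorbing-state idea concrete, here is a minimal sketch of the hidden state chain only, with no emissions. The state names (`translating`, `paused`, `cut`) and every probability in the matrix are assumptions chosen for illustration, not values from any data. The row for `cut` puts all of its mass on itself, which is what makes the state absorbing.

```python
import numpy as np

# Hypothetical state names; the real model may have more states.
STATES = ["translating", "paused", "cut"]
CUT = STATES.index("cut")

# Transition matrix: rows are the current state, columns the next state.
# All values are illustrative assumptions, not fitted parameters.
A = np.array([
    [0.90, 0.10, 0.00],  # translating: may pause, cannot be cut directly
    [0.30, 0.60, 0.10],  # paused: may resume, stay paused, or be cut
    [0.00, 0.00, 1.00],  # cut: absorbing, never left once entered
])

def simulate(n_steps, rng, start=0):
    """Sample one state path of length n_steps from the chain."""
    path = [start]
    for _ in range(n_steps - 1):
        nxt = rng.choice(len(STATES), p=A[path[-1]])
        path.append(int(nxt))
    return path

def cut_probability(n_steps, start=0):
    """Exact probability of being in the cut state at the last of n_steps positions."""
    dist = np.zeros(len(STATES))
    dist[start] = 1.0
    dist = dist @ np.linalg.matrix_power(A, n_steps - 1)
    return float(dist[CUT])

rng = np.random.default_rng(0)
path = simulate(50, rng)
print([STATES[s] for s in path[:10]])
print(cut_probability(50))
```

Running `simulate` repeatedly on the same length gives paths that sometimes end in `cut` and sometimes do not, while `cut_probability` gives the corresponding exact probability. In a sequence-dependent version, the fixed matrix `A` would presumably be replaced by per-position probabilities, which is beyond this sketch.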
Any help/references would be very much appreciated. Thanks!
I don't know whether these will help, but it looks like you might want a bivariate distribution for your emission probabilities, with one variable being the folding energy (continuous) and the other being the codon (discrete).
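As a rough illustration of such a mixed emission, one simple option is to assume that codon and folding energy are conditionally independent given the hidden state, so the joint emission factorizes as P(codon | state) times a Gaussian density for the energy. That independence assumption, the Gaussian form, the state names, the three-codon alphabet, and all numbers below are hypothetical choices for the sketch, not something taken from the question.

```python
import math

# Hypothetical per-state codon probabilities (toy three-codon alphabet).
CODON_PROBS = {
    "translating": {"AAA": 0.5, "CGA": 0.1, "GCU": 0.4},
    "paused":      {"AAA": 0.2, "CGA": 0.6, "GCU": 0.2},
}

# Hypothetical per-state (mean, standard deviation) of folding free energy.
ENERGY_PARAMS = {
    "translating": (-2.0, 1.5),
    "paused":      (-8.0, 2.0),
}

def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)

def log_emission(state, codon, energy):
    """Log of the joint emission: a discrete codon term plus a continuous energy term.

    Assumes codon and energy are conditionally independent given the state.
    """
    mean, sd = ENERGY_PARAMS[state]
    return math.log(CODON_PROBS[state][codon]) + gaussian_logpdf(energy, mean, sd)

# Compare how well each state explains the same observation.
for state in CODON_PROBS:
    print(state, log_emission(state, "CGA", -8.0))
```

If the independence assumption is too strong, the energy parameters could instead be indexed by both state and codon, which keeps the same factorization but lets the continuous part depend on the discrete one.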
Here are some references I found quickly regarding the construction of a joint distribution for which one variable is continuous and the other discrete:
Mixture of Continuous and Discrete Random Variables
joint distribution, discrete and continuous random variables
joint probability distribution of one discrete, one continuous random variable
Density/probability function of discrete and continuous random variables
A basic doubt on joint distribution
Joint pdf of discrete and continuous random variables
This reference comes from a textbook on Bayesian data analysis, which, given your choice of a Bayesian network for a model, might be particularly helpful:
Introduction to Bayesian Statistics
Depending on the depth of your statistics background, the following two references on modeling joint distributions of continuous and discrete data may be more or less helpful:
Longitudinal Data Analysis
Topics in Modelling of Clustered Data