I am currently working on a project in which I aim to fit a model to a time series of observed data points. Each data point is a probability vector whose entries sum to 1.
For example, one observed probability vector might be:
$O_1 = \begin{pmatrix}0.5 \\ 0 \\ 0.3\\ 0.2\\0 \end{pmatrix}$
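For concreteness, the observation sequence can be thought of as a matrix whose rows are probability vectors, one per time step. A minimal sketch (the rows after $O_1$ are made-up placeholders, not real data):

```python
import numpy as np

# Each row is one observed probability vector (one time step).
# The first row is O_1 from above; the remaining rows are
# illustrative placeholders only.
observations = np.array([
    [0.5, 0.0, 0.3, 0.2, 0.0],  # O_1
    [0.1, 0.4, 0.2, 0.3, 0.0],
    [0.0, 0.2, 0.5, 0.1, 0.2],
])

# Every row sums to 1, i.e. each observation lies on the simplex.
assert np.allclose(observations.sum(axis=1), 1.0)
```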
Intuitively, I thought of using a Hidden Markov Model (HMM) for my data. However, after searching the web a bit and not finding anything promising, I am unsure whether my observed variables can be used to learn a model in their current form.
Due to the nature of the project, it is imperative that the observed variables are not changed. Modifying them, for example by keeping only the highest-probability entry so that each time step has a single discrete emission, is therefore not an option.
If possible, I would like the HMM to reproduce the time series by outputting a distribution for each time step; the goal is then to sample from this distribution.
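To illustrate the desired output: if the model produced a probability vector for a given time step, sampling from it would just be a categorical draw. A small sketch, assuming a hypothetical model output `p_hat` (placeholder values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model output for one time step: a probability
# vector over the 5 categories (placeholder values, not from
# any fitted model).
p_hat = np.array([0.5, 0.0, 0.3, 0.2, 0.0])

# Draw a category index according to this distribution.
sample = rng.choice(len(p_hat), p=p_hat)

# Only categories with nonzero probability can be drawn.
assert sample in (0, 2, 3)
```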
My questions are:

1. Is it possible to adapt the HMM to work with my input and return my desired output?
2. Is there a more viable way to solve my problem?
(Since I have already sunk a fair amount of time into the HMM approach, I would be especially happy to hear suggestions regarding my first question!)
Research I have done so far: I have looked into HMMs, but all the examples and papers I have read either emit a single discrete value per time step or emit multiple discrete values, neither of which is what I am looking for. I also stumbled upon Gaussian Mixture Model-Hidden Markov Models (GMM-HMMs) thanks to ChatGPT, but I am unsure whether this is the correct modification for my task. A quick confirmation would be a huge help as well!