I am a beginner in mathematics/statistics and apologise in advance for any imprecise use of language.
I am working on a problem in statistical genetics. Given a dataset, $D$, I obtained probabilities for each of three possible states ($p_{AA}$, $p_{AB}$, $p_{BB}$), which always sum to 1, and where AA, AB, and BB denote the possible genotype states. For the same genotypes, I have used $n$ different datasets ($D_{1}, D_{2}, \dots, D_{n}$), which gives me $n$ different probability distributions.
Now, I would like to approximate $p_{AA}$, $p_{AB}$, $p_{BB}$ given all of $D_{1}, D_{2}, \dots, D_{n}$ together. This is not straightforward, because the datasets differ in their dimensions.
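In symbols (where $G$ is just my own notation for the unknown genotype state, not something taken from the method I used), what I have for each dataset is

$$P(G = g \mid D_i), \qquad g \in \{AA, AB, BB\}, \quad i = 1, \dots, n, \qquad \sum_{g} P(G = g \mid D_i) = 1,$$

and what I would like to approximate is

$$P(G = g \mid D_1, D_2, \dots, D_n), \qquad g \in \{AA, AB, BB\}.$$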
Each dataset consists of a set of other genotypes. These are represented as data structures that carry information of different types (e.g. a continuous parameter value, a vector of observed binary states for AA, AB, BB and their corresponding probabilities, etc.). The datasets differ in that each contains other genotypes, and a different number of them.
The method by which the probabilities were estimated from a single dataset is not relevant here, but if I needed to specify a model for my analysis, I could construct an approximate one that conditions $p_{AA}$, $p_{AB}$, $p_{BB}$ on the data in some way.
However, I am not sure how to approach this problem. Suppose I construct a model and fit it to each pair of probability distribution and dataset to obtain the model parameters. I do not know how to get from there to the probability distribution given $D_{1}, D_{2}, \dots, D_{n}$.
As I said, I am only a beginner, so simply pointing out some possible methods would already help.
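To make the question concrete, here is a minimal Python sketch of the only two naive combinations I can think of. The numbers are made up for illustration, and the names `probs`, `linear_pool` and `product_pool` are my own labels. The first simply averages the $n$ distributions. The second multiplies them and renormalises, which as far as I understand would only be justified if the datasets were independent given the true genotype, if every per-dataset distribution had been computed with the same prior, and if all probabilities were strictly positive.

```python
import numpy as np

# Hypothetical values: one row per dataset D_i, columns are (p_AA, p_AB, p_BB).
probs = np.array([
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.80, 0.15, 0.05],
])


def linear_pool(p, weights=None):
    """Weighted arithmetic mean of the rows (equal weights by default)."""
    p = np.asarray(p, dtype=float)
    n = p.shape[0]
    if weights is None:
        weights = np.full(n, 1.0 / n)
    pooled = np.asarray(weights, dtype=float) @ p
    return pooled / pooled.sum()


def product_pool(p, prior=None):
    """Multiply the rows and renormalise.

    Assumes the datasets are conditionally independent given the genotype,
    that each row was computed with the same prior, and that all entries
    are strictly positive. With a uniform prior (the default) this is the
    plain normalised product of the rows.
    """
    p = np.asarray(p, dtype=float)
    n, k = p.shape
    if prior is None:
        prior = np.full(k, 1.0 / k)
    prior = np.asarray(prior, dtype=float)
    # P(g | D_1..D_n) is proportional to prior(g)^(1 - n) * prod_i P(g | D_i)
    log_post = (1 - n) * np.log(prior) + np.log(p).sum(axis=0)
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()


print("linear pool: ", linear_pool(probs))
print("product pool:", product_pool(probs))
```

I do not know whether either of these is appropriate for my data, in particular because the datasets may not be independent of each other.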