Can I compute the posterior given only conditional probabilities (between the features and between the class and features)?


Here is a simple version of my problem:

Given: $P(y|x_1), P(y|x_2), P(x_1|y), P(x_2|y), P(x_1|x_2), P(x_2|x_1)$

Problem: Find the posterior $P(y|x_1,x_2)$ without any priors. Can I compute the posterior assuming the features are not independent? Can I compute the posterior if they are independent?


More generally, I have a complete matrix of conditional probabilities between features $x_1,...,x_m$: [$P(x_i | x_j), \forall i,j$], as well as the conditional probabilities $P(y|x_i) \text{ and } P(x_i|y), \forall i$. Again, I am wondering whether it is possible to compute the posterior $P(y|x_1,...,x_m)$ without the independence assumption or any priors.


For reference, I am trying to do the above but for $n$ classes $y_1,...,y_n$ given a subset of the features $x_1,...,x_m$ (e.g. $x_1,x_9,x_{12}$).


UPDATE 1: If $x_1$ and $x_2$ are conditionally independent given $y$ (big assumption), then

$$\begin{aligned} P(y | x_1,x_2) &= \frac{P(x_1,x_2 | y)P(y)}{P(x_2 | x_1)P(x_1)} \\ &= \frac{P(x_1| y)P(x_2| y)P(y)}{P(x_2 | x_1)P(x_1)} \\ &= \frac{P(y| x_1)P(x_2| y)}{P(x_2 | x_1)} \end{aligned}$$
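The derivation above can be sanity-checked numerically. The sketch below (all probability tables are made-up assumptions, chosen only so that $x_1 \perp x_2 \mid y$ holds by construction) computes the posterior directly from the joint and via the final formula, and confirms they agree:

```python
import itertools

# A small joint distribution p(y, x1, x2) over binary variables,
# constructed so that x1 and x2 are conditionally independent given y:
# p(y, x1, x2) = p(y) * p(x1|y) * p(x2|y). Numbers are illustrative.
p_y = {0: 0.3, 1: 0.7}
p_x1_given_y = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.4, 1: 0.6}}  # p_x1_given_y[y][x1]
p_x2_given_y = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}}  # p_x2_given_y[y][x2]

joint = {(y, x1, x2): p_y[y] * p_x1_given_y[y][x1] * p_x2_given_y[y][x2]
         for y, x1, x2 in itertools.product([0, 1], repeat=3)}

def marg(keep):
    """Marginal over the listed variable positions (0=y, 1=x1, 2=x2)."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

# Direct posterior p(y=1 | x1=1, x2=0) from the joint.
p_x1x2 = marg([1, 2])
direct = joint[(1, 1, 0)] / p_x1x2[(1, 0)]

# Same posterior via the derived formula:
# p(y|x1,x2) = p(y|x1) p(x2|y) / p(x2|x1).
p_yx1 = marg([0, 1]); p_x1 = marg([1]); p_x2x1 = marg([2, 1])
p_y1_given_x1 = p_yx1[(1, 1)] / p_x1[(1,)]
p_x2_given_y1 = p_x2_given_y[1][0]                  # p(x2=0 | y=1)
p_x2_given_x1 = p_x2x1[(0, 1)] / p_x1[(1,)]
formula = p_y1_given_x1 * p_x2_given_y1 / p_x2_given_x1

print(abs(direct - formula) < 1e-12)  # → True
```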

There is 1 answer below.

Not in general, no. It is having some underlying structure that allows you to solve for it. See here for more info. That said, if you had other information, there are still ways to solve it. E.g.

P(y|x1, x2) = P(y, x1|x2)/P(x1|x2)
P(y|x1, x2) * P(x1, x2) = P(x1, x2|y) P(y)

By summing and normalizing over the conditionals you can work towards solving some of this, but if there is any dependence between x1 and x2 you would also require P(x1, x2|y), and so on for all n variables. Note that I haven't rigorously proved impossibility here, and such a proof would be difficult to attain, so if anyone sees errors in my logic or assumptions, feel free to edit this post.
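One way to see the "not in general" concretely is a counterexample: the sketch below (my own construction, not from the question) builds two joint distributions, a uniform one and a perturbed one of the form $1/8 + \epsilon(-1)^{y+x_1+x_2}$, that share every pairwise conditional yet give different posteriors:

```python
import itertools

def make_joint(eps):
    """p(y, x1, x2) = 1/8 + eps * (-1)^(y + x1 + x2) over binary variables."""
    return {(y, x1, x2): 1 / 8 + eps * (-1) ** (y + x1 + x2)
            for y, x1, x2 in itertools.product([0, 1], repeat=3)}

def cond(joint, a, b):
    """Conditional table p(var_a | var_b) as {(va, vb): prob}; positions 0=y, 1=x1, 2=x2."""
    pair, marg_b = {}, {}
    for assignment, p in joint.items():
        va, vb = assignment[a], assignment[b]
        pair[(va, vb)] = pair.get((va, vb), 0.0) + p
        marg_b[vb] = marg_b.get(vb, 0.0) + p
    return {k: v / marg_b[k[1]] for k, v in pair.items()}

j0, j1 = make_joint(0.0), make_joint(1 / 16)

# Every pairwise conditional is identical between the two joints
# (the eps terms cancel when summing out the third variable)...
for a, b in [(0, 1), (0, 2), (1, 0), (2, 0), (1, 2), (2, 1)]:
    for k in cond(j0, a, b):
        assert abs(cond(j0, a, b)[k] - cond(j1, a, b)[k]) < 1e-12

# ...yet the posteriors p(y=1 | x1=0, x2=0) differ.
post = lambda j: j[(1, 0, 0)] / (j[(0, 0, 0)] + j[(1, 0, 0)])
print(post(j0), post(j1))  # → 0.5 0.25
```

So the pairwise conditionals alone cannot pin down P(y|x1, x2) without further structural assumptions.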

For more info on this kind of computation, see the link above or look up Bayesian networks, where independences are modeled with DAGs, allowing all the distributions to be computed quickly from a limited subset of conditionals.
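To illustrate how such structure helps: under a naive-Bayes DAG (y → x1, y → x2, encoding x1 ⊥ x2 | y) the posterior factors as p(y|x1, x2) ∝ p(y) p(x1|y) p(x2|y), so the limited set of conditionals, plus a prior, suffices. A minimal sketch with made-up tables:

```python
# Naive-Bayes DAG y -> x1, y -> x2: the posterior factors as
# p(y|x1,x2) proportional to p(y) * p(x1|y) * p(x2|y).
# All numbers below are illustrative assumptions, not from the question.
p_y = [0.4, 0.6]                          # prior p(y)
p_x1_given_y = [[0.7, 0.3], [0.2, 0.8]]   # p_x1_given_y[y][x1]
p_x2_given_y = [[0.9, 0.1], [0.5, 0.5]]   # p_x2_given_y[y][x2]

def posterior(x1, x2):
    """Return [p(y=0|x1,x2), p(y=1|x1,x2)] by normalizing the factored product."""
    unnorm = [p_y[y] * p_x1_given_y[y][x1] * p_x2_given_y[y][x2] for y in (0, 1)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

print(posterior(1, 0))
```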