Which of the following factorizations most accurately captures the assumptions described?


Suppose you want to use a Bayesian network to model your performance in an exam. Your performance on the exam will depend only on whether the exam is easy and whether you studied enough. In addition, how much time you devote to studying for this exam will depend on whether you will have another exam that week. Formally, let X1, X2, X3, X4 be binary variables, where:

X1 represents your performance in the exam (1 denotes a good score, 0 a not so good one)

X2 = 1 if you studied enough, 0 otherwise.

X3 = 1 if you will have another exam that week, and 0 otherwise.

X4 = 1 if the exam is easy, and 0 otherwise.

Additionally X2 and X4 are marginally independent.

I have broken down the description into the following:

X1 has two parent nodes, X2 and X4, while X2 has one parent node, X3. This gives me

P(X1, X2, X3, X4) = P(X1|X2,X4)P(X2|X3)P(X3)P(X4)

However, I am unsure of one part: does P(X1|X2)P(X1|X4) or P(X1|X2,X4) better describe the given context?
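A minimal sketch that multiplies out the factorization P(X1|X2,X4)P(X2|X3)P(X3)P(X4) numerically, assuming the usual Bayesian-network convention of one factor per node conditioned on all of its parents. All probability values are made up for illustration, and the helper names (`bern`, `joint`, `marginal`) are hypothetical. It checks that the joint sums to 1 and that X2 and X4 come out marginally independent, as the description requires.

```python
from itertools import product

# Hypothetical CPT values, chosen only for illustration (not from the question).
p_x3 = {1: 0.3, 0: 0.7}                      # P(X3 = x3)
p_x4 = {1: 0.4, 0: 0.6}                      # P(X4 = x4)
p_x2_given_x3 = {1: 0.2, 0: 0.8}             # P(X2 = 1 | X3 = x3)
p_x1_given_x2_x4 = {                         # P(X1 = 1 | X2 = x2, X4 = x4)
    (1, 1): 0.95, (1, 0): 0.70,
    (0, 1): 0.50, (0, 0): 0.10,
}


def bern(p_one, value):
    """Probability of a binary value, given the probability of value 1."""
    return p_one if value == 1 else 1.0 - p_one


def joint(x1, x2, x3, x4):
    """P(X1, X2, X3, X4) = P(X1|X2,X4) P(X2|X3) P(X3) P(X4)."""
    return (bern(p_x1_given_x2_x4[(x2, x4)], x1)
            * bern(p_x2_given_x3[x3], x2)
            * p_x3[x3]
            * p_x4[x4])


def marginal(**fixed):
    """Sum the joint over every assignment consistent with the fixed values."""
    total = 0.0
    for x1, x2, x3, x4 in product((0, 1), repeat=4):
        assignment = {"x1": x1, "x2": x2, "x3": x3, "x4": x4}
        if all(assignment[name] == value for name, value in fixed.items()):
            total += joint(x1, x2, x3, x4)
    return total


# The joint should be a proper distribution.
total = marginal()

# Marginal independence of X2 and X4: P(X2, X4) should equal P(X2) P(X4).
max_gap = max(
    abs(marginal(x2=a, x4=b) - marginal(x2=a) * marginal(x4=b))
    for a, b in product((0, 1), repeat=2)
)

print("sum of joint:", total)
print("largest |P(X2,X4) - P(X2)P(X4)|:", max_gap)
```

The X1 factor is a single table indexed by both X2 and X4, which is what P(X1|X2,X4) denotes. The independence check should hold for any choice of numbers, because X4 has no parents and X2 depends only on X3.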


There is 1 solution below


Well, I'm not sure either.

Do you have data available? If so, there are two ways to find the dependencies: score-based (say, using the GS or GSG algorithm) or constraint-based (using conditional independence tests).
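A rough sketch of the basic building block of the constraint-based route: a Pearson chi-square independence test on two binary variables. It is only the unconditional case, and a full constraint-based search would repeat such tests over many conditioning sets. The data here are simulated from made-up probabilities, and the function names (`sample_exam`, `chi_square_2x2`) are hypothetical. The threshold 3.841 is the 5% critical value of the chi-square distribution with one degree of freedom.

```python
import random


def sample_exam(rng):
    """Draw one (x1, x2, x3, x4) sample from a made-up version of the network."""
    x3 = 1 if rng.random() < 0.3 else 0                       # another exam?
    x4 = 1 if rng.random() < 0.4 else 0                       # exam easy?
    x2 = 1 if rng.random() < (0.2 if x3 else 0.8) else 0      # studied enough?
    p_good = {(1, 1): 0.95, (1, 0): 0.70, (0, 1): 0.50, (0, 0): 0.10}[(x2, x4)]
    x1 = 1 if rng.random() < p_good else 0                    # good score?
    return x1, x2, x3, x4


def chi_square_2x2(pairs):
    """Pearson chi-square statistic for a list of (a, b) binary pairs."""
    n = len(pairs)
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for a, b in pairs:
        counts[(a, b)] += 1
    stat = 0.0
    for a in (0, 1):
        for b in (0, 1):
            row_total = counts[(a, 0)] + counts[(a, 1)]
            col_total = counts[(0, b)] + counts[(1, b)]
            expected = row_total * col_total / n
            if expected > 0:
                stat += (counts[(a, b)] - expected) ** 2 / expected
    return stat


CRITICAL_5_PERCENT = 3.841  # chi-square, 1 degree of freedom

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [sample_exam(rng) for _ in range(5000)]

# X1 vs X2: an edge exists in the assumed network, so the test should reject.
stat_x1_x2 = chi_square_2x2([(x1, x2) for x1, x2, _, _ in data])

# X2 vs X4: marginally independent by assumption, so the statistic is
# expected to be small most of the time (about 5% false rejections).
stat_x2_x4 = chi_square_2x2([(x2, x4) for _, x2, _, x4 in data])

print("X1 vs X2:", stat_x1_x2, "dependent" if stat_x1_x2 > CRITICAL_5_PERCENT else "no evidence")
print("X2 vs X4:", stat_x2_x4, "dependent" if stat_x2_x4 > CRITICAL_5_PERCENT else "no evidence")
```

With real data, the simulated list would be replaced by the observed records. A test that fails to reject independence is evidence against an edge, not proof that none exists.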