I asked this on Cross Validated but got no answers, so I will try here. I have a table with the following variables:
- level of poisonous gas (radon) in a house
- type of house (with or without a basement)
- the county in Minnesota where the house is located (84 of those)
- level of uranium in the soil in each of the counties
Now I am supposed to do two things:
- model this as a hierarchical Bayesian network using a directed acyclic graph (DAG)
- complete the network by specifying all probability distributions.
I am also given the following information:
- uranium is the source of radon
- radon comes from the ground
From what I understand, to complete the first task I need to work out what depends on what:
- the level of radon in the house depends on what county the house is in, whether the house has a basement (is built into the ground), and how much uranium is in the soil of that county
- the level of uranium in the soil depends on what county it is in
Those are the only two I am sure of. Perhaps the type of house also depends on the level of uranium in the soil and on what county it is in?
I came up with this diagram (forgive my terrible PowerPoint skills):

And if this is correct (which I am not sure about), how would I specify the probability distributions? I am completely new to Bayesian statistics, so any help is appreciated.
Let's write:
- $R$ for the level of radon in the house
- $B$ for whether the house has a basement
- $C$ for the county
- $U$ for the level of uranium in the soil
Not required, but just for context: $R$ is a positive random variable, so you might choose to model $R \sim \text{Exponential}(\cdot)$. $B \sim \text{Bernoulli}(\cdot)$ since it is binary, $C \sim \text{Categorical}(\cdot)$ since it can take one of 84 possible values, and $U$ is another positive variable, so you could use the exponential again (with a different parameter).
Now, we can use a DAG (directed acyclic graph) to represent the joint distribution, i.e. the distribution of $(R, B, C, U)$. In general, let's say we have random variables $(S_1, S_2, \dots, S_n)$ represented as a DAG; then the joint distribution is:
$$ P(S_1, \dots, S_n) = \prod_{i=1}^n P(S_i \mid \text{parents}(S_i)) $$
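The parents of $S_i$ are the nodes in the DAG that have an arrow pointing towards $S_i$. Now, back to the question, using the information given:
- $C$ has no parents.
- $U$ depends on $C$, since the uranium level is given per county.
- $B$ depends on $C$.
- $R$ depends on $U$ and $B$, since uranium is the source of radon and radon comes from the ground.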
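To make the parent rule concrete, here is a minimal Python sketch that turns a "node to parents" mapping into the product of conditionals. The names `parents` and `factorization` are hypothetical, chosen only for illustration, and the graph encoded below is the one that matches the factorization this answer ends with; other structures are possible.

```python
# Hypothetical encoding of the DAG: each node maps to the list of its parents.
parents = {
    "C": [],          # county: no parents
    "U": ["C"],       # uranium level depends on county
    "B": ["C"],       # basement indicator depends on county
    "R": ["U", "B"],  # radon level depends on uranium and basement
}

def factorization(parents):
    """Write the joint distribution as a product of P(node | parents(node))."""
    factors = []
    for node, pa in parents.items():
        if pa:
            factors.append(f"P({node}|{','.join(pa)})")
        else:
            factors.append(f"P({node})")
    joint = ",".join(parents)
    return f"P({joint}) = " + " ".join(factors)

print(factorization(parents))
# prints: P(C,U,B,R) = P(C) P(U|C) P(B|C) P(R|U,B)
```

Changing an entry in `parents` (for example, adding `"U"` as a parent of `"B"`) changes the corresponding factor, which is a quick way to compare candidate structures.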
Note that there are some arbitrary choices here; the structure of the DAG is largely up to the modeler. In the end we have:
$$ P(C, U, B, R) = P(C) P(U|C) P(B|C) P(R|U,B) $$
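One way to see what "specifying all probability distributions" means is to simulate from the factorization, drawing each variable after its parents. The sketch below uses only Python's standard `random` module and the distribution families suggested earlier (categorical, exponential, Bernoulli). Everything numeric is an assumption made up for illustration: the uniform choice of county, the per-county parameters `uranium_rate` and `basement_prob`, and the way the mean radon level is tied to uranium and basement. The function name `sample_house` is also hypothetical.

```python
import random

N_COUNTIES = 84  # number of counties, as stated in the question

def sample_house(rng, uranium_rate, basement_prob):
    """Draw one (C, U, B, R) by sampling each node after its parents."""
    # C ~ Categorical over the counties (uniform here, purely an assumption)
    c = rng.randrange(N_COUNTIES)
    # U | C ~ Exponential with a county-specific rate
    u = rng.expovariate(uranium_rate[c])
    # B | C ~ Bernoulli with a county-specific probability of a basement
    b = 1 if rng.random() < basement_prob[c] else 0
    # R | U, B ~ Exponential; the mean grows with uranium and with a basement.
    # This link is an illustrative assumption, not something from the data.
    mean_r = (1.0 + u) * (1.5 if b else 1.0)
    r = rng.expovariate(1.0 / mean_r)
    return c, u, b, r

rng = random.Random(0)                 # fixed seed for reproducibility
uranium_rate = [1.0] * N_COUNTIES      # placeholder parameters
basement_prob = [0.8] * N_COUNTIES     # placeholder parameters
samples = [sample_house(rng, uranium_rate, basement_prob) for _ in range(1000)]
print(samples[0])
```

In a real analysis the placeholder parameters would themselves get priors and be estimated from the table; the sketch only shows the order in which the conditionals are used.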