Why is the joint probability of a Bayesian Network multinomial?


As far as I know, the multinomial can be defined as:

Given a sequence of n independent trials each having identical probabilities $p = (p_1, \ldots , p_k)$ for $k$ possible outcomes, the vector of the associated counts $X = (X_1, \ldots ,X_k)$ is said to follow a multinomial distribution and it is denoted as $Mu(n, p)$.
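To make the count vector concrete, here is a small sketch (with a hypothetical sequence of outcomes) showing how $X = (X_1, \ldots, X_k)$ is tabulated from $n$ trials:

```python
from collections import Counter

# Hypothetical outcomes of n = 6 iid trials over k = 3 possible outcomes A, B, C
trials = ["A", "B", "A", "C", "A", "B"]

counts = Counter(trials)
# The multinomial random vector is the vector of counts per outcome
X = (counts["A"], counts["B"], counts["C"])
print(X)  # (3, 2, 1)
```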

In the context of Bayesian networks, we are interested in the joint probability distribution, say, for example, $p(X_1=A, X_2=B, \ldots, X_k=Z)$.

Bayesian Networks in R with Applications in Systems Biology, by R. Nagarajan, M. Scutari and S. Lèbre, says this is a multinomial distribution. Can someone explain why?


UPDATE: I post below the fragment where this is stated, in Bayesian Networks in R with Applications in Systems Biology, by R. Nagarajan, M. Scutari and S. Lèbre, 2013, Springer (US) (page $7$):

[image: quoted fragment from page 7 of the book]


Given a sequence of $n$ independent trials each having identical probabilities $\vec p=(p_1,\ldots,p_k)$ for $k$ possible outcomes, the vector of the associated counts $\vec X=(X_1,\ldots,X_k)$ is said to follow a multinomial distribution and it is denoted as $\mathcal{Mu}(n,\vec p)$.

In the context of bayesian networks, ...

This has nothing much to do with Bayesian Networks in particular.

Are you conflating the term multinomial with multivariate?   They are not the same word.

A multivariate distribution is any joint probability distribution of multiple random variables. That is all. Bayesian networks deal with multivariate distributions, since they concern the interrelation of multiple random variables.

A multinomial distribution is a rather specific family of multivariate distributions; one where the random variables have a particular definition, as described below. If you do not have this setup, then you are not dealing with a multinomial distribution.


Can someone explain why this distribution is a Binomial one?

Multinomial.

The Binomial Distribution is the special case of the family where $k=2$ (the two trial outcomes "fail" and "success"): then $\vec p=(1-p, p)$, $X_1$ is the count of failures, and $X_2$ the count of successes in $n$ independent and identically distributed (iid) trials.

$$\mathsf P\big(\vec X=(n-x, x)\big)~=~ \dbinom {n}{x}(1-p)^{n-x}p^x\quad\mathbf 1_{x\in[0;n]\cap\Bbb Z}$$

Where $\binom n x$ is the count of distinct arrangements for $n$ trials consisting of $x$ successes and $n-x$ failures, and $(1-p)^{n-x}p^x$ is the probability of obtaining those results for each arrangement.
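A quick numerical sanity check of this special case, using only the standard library (the values of $n$, $x$ and $p$ are arbitrary illustrations): the binomial pmf and the $k=2$ multinomial pmf with counts $(n-x, x)$ agree.

```python
from math import comb, factorial

n, x, p = 10, 4, 0.3  # hypothetical: 10 trials, 4 successes, success prob 0.3

# Binomial pmf: C(n, x) (1-p)^(n-x) p^x
binom_pmf = comb(n, x) * (1 - p) ** (n - x) * p ** x

# Same quantity via the k = 2 multinomial coefficient n! / ((n-x)! x!)
multi_coef = factorial(n) // (factorial(n - x) * factorial(x))
multi_pmf = multi_coef * (1 - p) ** (n - x) * p ** x

assert abs(binom_pmf - multi_pmf) < 1e-12
```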

The Multinomial Distribution generalises this to a sequence of trials each resulting in exactly one of $k$ possible outcomes, and $\vec X$ is the multivariate vector of the counts of each of the $k$ outcomes occurring in $n$ such independent and identically distributed trials.

$$\mathsf P(\vec X=\vec x) ~=~ \mathsf P(X_1=x_1, \ldots, X_k=x_k) ~=~ \binom{n}{x_1,\ldots,x_k}\, p_1^{x_1}\cdots p_k^{x_k}\quad\mathbf 1_{\vec x\in \Bbb N^k,\ \sum_{i=1}^k x_i=n}$$

Where $\dbinom{n}{x_1,\ldots,x_k}$ is called the multinomial coefficient and equals $\dfrac{n!}{x_1!\cdots x_k!}$. Hence the name.
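The general pmf can be sketched directly from this formula (the parameter values below are hypothetical); summing it over all count vectors with $x_1+\cdots+x_k=n$ should give $1$, by the multinomial theorem.

```python
from math import factorial
from itertools import product

def multinomial_pmf(x, p):
    """P(X = x) for a count vector x = (x_1,...,x_k) and probabilities p."""
    n = sum(x)
    # Multinomial coefficient n! / (x_1! ... x_k!), computed by exact division
    coef = factorial(n)
    for xi in x:
        coef //= factorial(xi)
    # Multiply in p_1^{x_1} ... p_k^{x_k}
    prob = float(coef)
    for xi, pi in zip(x, p):
        prob *= pi ** xi
    return prob

# Hypothetical example: n = 5 trials over k = 3 outcomes
p = (0.5, 0.3, 0.2)
n = 5

# The pmf sums to 1 over all count vectors with x_1 + x_2 + x_3 = n
total = sum(multinomial_pmf(x, p)
            for x in product(range(n + 1), repeat=3) if sum(x) == n)
assert abs(total - 1.0) < 1e-12
```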