Best way to express the probability of a graph


Given a graph $G=(v, \varepsilon)$, where $v$ is the set of vertices and $\varepsilon$ is the set of edges, I would like to write the probability $p(G)$ of $G\in\mathcal{G}$, where $\mathcal{G}$ is a family of graphs whose nodes take one of 5 possible data types $\{A, B, C, D, \text{null}\}$ and whose edges take one of 4 possible types $\{I, II, III, \text{null}\}$. Under the assumption of independence, what would be the best way to express $p(G)$? My understanding is that the probability of a graph is the joint distribution of the probabilities of each of its nodes and edges. Is that correct?

My attempt: I define $f$ as a vector of dimension $N=|v|$ and $W$ as a matrix of dimension $N\times N$. So far, I can write

$$p(G)=p(f, W)$$

Then, for each element in $f$, I need to define a probability vector over the classes. So I define $\tilde{f}$ as a matrix of size $N\times 5$ and $\tilde{W}$ as a tensor of size $N\times N\times 4$. Any suggestions for the best mathematical expression for $p(f, W)$? Would $p(f_{i, 1\leq i\leq N}, W_{i,j, 1\leq i,j\leq N})$ be a valid notation?
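To make the setup concrete, here is a small sketch of what the independence assumption buys you: $p(G)$ factorizes into a product of per-node and per-edge marginals, read off from $\tilde{f}$ and $\tilde{W}$. All array names and the random probabilities are illustrative, not from the question itself.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4  # number of vertices (illustrative)

# f_tilde[i, k]: probability that node i has type k in {A, B, C, D, null}
f_tilde = rng.dirichlet(np.ones(5), size=N)        # shape (N, 5), rows sum to 1
# W_tilde[i, j, k]: probability that edge (i, j) has type k in {I, II, III, null}
W_tilde = rng.dirichlet(np.ones(4), size=(N, N))   # shape (N, N, 4)

# A concrete graph: one type label per node and per ordered node pair
f = rng.integers(0, 5, size=N)        # node labels
W = rng.integers(0, 4, size=(N, N))   # edge labels

# Under full independence the joint factorizes into a product of marginals:
#   p(G) = prod_i p(f_i) * prod_{i,j} p(W_{ij})
p_nodes = f_tilde[np.arange(N), f].prod()
p_edges = W_tilde[np.arange(N)[:, None], np.arange(N)[None, :], W].prod()
p_G = p_nodes * p_edges
```

Note that this treats every node and every edge label as mutually independent; the answer below discusses the more common case where the factorization follows the graph structure instead.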


I believe your question is best related to probabilistic graphical models. There are two major kinds of graphical models: the directed graphical models are called Bayesian Networks, and the undirected graphical models are called Markov Random Fields.

Just to give examples for computing probability of graphs in each of the models:

In Bayesian Networks (directed graphical models) $G=(V,E)$, any probability distribution that is consistent with a directed graph $G$ has to factorize according to "node given parents":$$P(X\mid G) = \prod_{i=1}^d P(x_i\mid x_{\pi_i})$$ where $\pi_i$ are the parents of $x_i$ and $d$ is the number of nodes in the graph.
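As a small worked instance of this factorization, here is a hypothetical three-node network $x_1 \to x_2$, $x_1 \to x_3$ over binary variables, with made-up conditional probability tables. The joint is just the product of each node's probability given its parents:

```python
import numpy as np

# Hypothetical CPTs for a network x1 -> x2, x1 -> x3 (binary variables)
p_x1 = np.array([0.6, 0.4])                 # P(x1)
p_x2_given_x1 = np.array([[0.7, 0.3],       # P(x2 | x1=0)
                          [0.2, 0.8]])      # P(x2 | x1=1)
p_x3_given_x1 = np.array([[0.9, 0.1],       # P(x3 | x1=0)
                          [0.5, 0.5]])      # P(x3 | x1=1)

def joint(x1, x2, x3):
    """P(x1, x2, x3) = P(x1) * P(x2 | x1) * P(x3 | x1)."""
    return p_x1[x1] * p_x2_given_x1[x1, x2] * p_x3_given_x1[x1, x3]

# Because the factors are valid conditional distributions,
# the factorized joint automatically sums to 1 over all assignments.
total = sum(joint(a, b, c)
            for a in (0, 1) for b in (0, 1) for c in (0, 1))
```

No normalization constant is needed here; that is the key practical difference from the undirected case below.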

In Markov Random Fields (undirected graphical models) $G=(V,E)$, the probability for an undirected graph can be expressed as a product of clique potentials:$$p(x_V)=\frac{1}{Z}\prod_{C\in\mathcal{C}}\psi_C(x_C)$$ where $Z$ is a normalization constant, each clique $C\in\mathcal{C}$ is a fully connected subset of $V$, and a non-negative potential function $\psi_C(x_C)$ is associated with each clique $C$ by definition.

To give an example of this factorization for a Markov Random Field, consider a six-node graph (figure omitted) whose cliques are $\{x_1,x_2\}$, $\{x_1,x_3\}$, $\{x_2,x_4\}$, $\{x_3,x_5\}$, and $\{x_2,x_5,x_6\}$.

Assuming every clique uses the same potential function $\psi$, the probability of the graph is:

$$p(x_V) = \frac{1}{Z}\psi(x_1,x_2)\psi(x_1,x_3)\psi(x_2,x_4)\psi(x_3,x_5)\psi(x_2,x_5,x_6)$$
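The factorization above can be sketched numerically. For six binary variables the partition function $Z$ is small enough to compute by brute force; the agreement-rewarding potential $\psi$ below is a made-up choice, since the answer leaves $\psi$ unspecified.

```python
import itertools
import numpy as np

# Hypothetical shared potential: rewards cliques whose variables all agree
def psi(*xs):
    return np.exp(1.0) if len(set(xs)) == 1 else 1.0

# The cliques from the example, 0-indexed (x1..x6 -> 0..5)
cliques = [(0, 1), (0, 2), (1, 3), (2, 4), (1, 4, 5)]

def unnormalized(x):
    """Product of clique potentials, before dividing by Z."""
    prod = 1.0
    for C in cliques:
        prod *= psi(*(x[i] for i in C))
    return prod

# Brute-force partition function over all 2^6 binary assignments
Z = sum(unnormalized(x) for x in itertools.product((0, 1), repeat=6))

def p(x):
    return unnormalized(x) / Z
```

Note how $Z$ couples all the cliques together: unlike the Bayesian-network case, the individual factors are not probabilities, so the global sum must be computed (or approximated) explicitly.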

For more information on this topic, consider *Probabilistic Graphical Models: Principles and Techniques* by Koller and Friedman.