Exact probability of random graph being connected

10.8k Views Asked by At

The problem: I'm trying to find the probability of a random undirected graph being connected. I'm using the model $G(n,p)$, where there are at most $n(n-1) \over 2$ edges (no self-loops or duplicate edges) and each edge has a probability $p$ of existing. I found a simple formula online where $f(n)$ is the probability of $G(n,p)$ being connected. But apparently it's too trivial for the writer to explain the formula (it was just stated briefly).

The desired formula: $f(n) = 1-\sum\limits_{i=1}^{n-1}f(i){n-1 \choose i-1}(1-p)^{i(n-i)}$

My method is: Consider any vertex $v$. Then there's a probability ${n-1 \choose i}p^i(1-p)^{n-1-i}$ that it will have $i$ neighbours. After connecting $v$ to these $i$ neighbours, we contract them ($v$ and its neighbours) into a single connected component, so we are left with the problem of $n-i$ vertices (the connected component plus $n-i-1$ other "normal" vertices.

Except that now the probability of the vertex representing connected component being connected to any other vertex is $1-(1-p)^i$. So I introduced another parameter $s$ into the formula, giving us:

$g(n,s)=\sum\limits_{i=1}^{n-1}g(n-i,i){n-1 \choose i}q^i(1-q)^{n-1-i}$,

where $q=1-(1-p)^s$. Then $f(n)=g(n,1)$. But this is nowhere as simple as the mentioned formula, as it has an additional parameter...

Can someone explain how the formula $f(n)$ is obtained? Thanks!

2

There are 2 best solutions below

0
On

Here is one way to view the formula. First, note that it suffices to show that $$ \sum_{i=1}^n {n-1 \choose i-1} f(i) (1-p)^{i(n-i)} = 1, $$ since ${n-1 \choose n-1}f(n)(1-p)^{n(n-n)}=f(n).$ Let $v_1, v_2, \ldots, v_n$ be $n$ vertices. Trivially, $$ 1= \sum_{i=1}^n P(v_1 \text{ is in a component of size }i). $$ Now let's consider why $$P(v_1 \text{ is in a component of size }i)={n-1\choose i-1}P(G(i,p) \text{ is connected}) (1-p)^{i(n-i)}.$$ If $\mathcal{A}$ is the set of all $i-1$ subsets of $\{v_2, v_3, \ldots, v_n\}$, then \begin{align*} P(v_1 \text{ in comp of size }i)&=\sum_{A \in \mathcal{A}}P(\text{the comp of }v_1\text{ is }\{v_1\}\cup A)\\&=\sum_{A \in \mathcal{A}}P(\{v_1\}\cup A \text{ is conn'd})P(\text{no edge btwn }A\cup\{v_1\} \& \ (A\cup \{v_1\})^c). \end{align*} This last equality is due to the fact that the edges within $\{v_1\}\cup A$ are independent of the edges from $A$ to $A^c$. There are precisely ${n-1 \choose i-1}$ elements in $\mathcal{A}$. The probability that $\{v_1\}\cup A$ is connected is equal to the probability that $G(1+|A|,p)$ is connected, which is $f(i)$. There are $i(n-i)$ missing edges from $\{v_1\}\cup A$ to $(\{v_1\}\cup A)^c$.

1
On

What happens is that he/she sums over all cuts of the graph. To avoid counting anything twice, you focus specifically on some specific vertex, calling it $1$.

If the graph is disconnected, there is going to be a connected component of some size, which contains $1$.

So you sum over the size this component might have, and with multiplicity $n-1 \choose i-1$ which is the number of ways to choose $1$'s fellow set members. You want the component containing $1$ to be connected, so you multiply by $f(i)$, but nothing in the component can be connected to something not there, so you multiply by $(1-p)^{i(n-i)}$. The $(n-i)$ part of the graph can be connected or not, it doesn't matter.

A slight problem with the formula is, that it is not stable for small $p$s and large $n$s.