Counting $P_k$ on random graphs

93 Views Asked by At

Let $n \in \mathbf{N}$ and $p \in [0,1]$. I want to compute the expected value of a random variable that counts the number of copies of $P_k$ in a random graph for some given $k \in \mathbf{N}$. Unfortunately I dont know how to count those, since I am very bad at combinatorics. I think for $k=2$ one could approach this as follows. Let $a,b,c$ be three vertices. Then there are $6$ paths of length $2$ that I can construct from those, namely $a-b-c, a-c-b,b-a-c,b-c-a,c-a-b,c-b-a.$ Since all of those have equal probability $p^2$, one should have for $$X_{a,b,c}:=\begin{cases} 1 & a,b,c \ \text{form a path of length $2$},\\ 0 & \text{otherwise}. \end{cases}$$ Then $X:=\sum_{a,b,c \in V} X_{a,b,c}$ counts the number of $P_2$ in a random graph, giving us $$E(X)=\sum_{a,b,c} E(X_{a,b,c})=\sum_{a,b,c}P(X_{a,b,c}=1)=\binom{n}{3}6p^2.$$ Is this correct? Would the number of $P_k$ be something like $(k+1)!$ and thus, assuming that $X$ denotes the case for $P_k$ $$E(X)=\binom{n}{k+1}(k+1)!p^k?$$ I am not sure if I need to also account for the possible edges that might occur this way: One can have $a-b-c$ and $a-b-c-a$, both containing a $P_2$ but being different. The second one would not be included in my calculation, does it have to be included? This would make things way more complicated, especially for abritrary $k$.

1

There are 1 best solutions below

3
On BEST ANSWER

If we replace the $3$ by $k+1$ to account for choosing $k+1$ vertices, then the answer $$\mathbb E[X] = \binom{n}{k+1} (k+1)! p^k$$ is one possible correct answer.

Why do I say "one possible correct answer"? Because there are two ways we can think about counting $P_k$.

In your work, you are considering a path like $a - b - c - d$ to be distinct from $d - c - b - a$: you are counting a path and its reverse separately. It is more common to consider these a single copy of $P_3$, since they have the same edges and always appear at the same time. If we take this into account, we should change $(k+1)!$ into $\frac12 (k+1)!$.

For many purposes, this difference does not matter (in particular, if you want to determine the threshold at which $P_k$ appears in the random graph, you will not care). You might need to be careful about it later if you want to know the number of copies of a graph at its threshold: for example, the number of copies of $P_k$ in $G_{n,p}$ when $p = c \cdot n^{-(k+1)/k}$. Here, the expected value is constant, and so we want to know the limiting distribution of the answer - and the factor of $2$ is important.

(It turns out that if we consider two copies of a graph $H$ to be the same if they have the same edge set, then the limiting distribution is often Poisson. If we count all $v(H)!$ possible rearrangements of the vertices of $H$, the limiting distribution will be the Poisson distribution, but multiplied by some constant factor, which is less nice. This is why we often make the first choice, which leads us to use $\frac12(k+1)!$ in this problem.)


You also ask:

I am not sure if I need to also account for the possible edges that might occur this way: One can have $a−b−c$ and $a−b−c−a$, both containing a $P_2$ but being different. The second one would not be included in my calculation, does it have to be included? This would make things way more complicated, especially for arbitrary $k$.

For counting copies of $P_k$, we do not have to worry about other edges between the vertices. There are $\binom{n}{k+1} (k+1)!$, or possibly $\frac12 \binom{n}{k+1}(k+1)!$, ways to choose a copy of $P_k$ in $K_n$. Each of these appears in $G_{n,p}$ with exactly probability $p^k$: this is true whether or not any other edges between the same vertices are present.

There are two cases where this is different:

  • Suppose we want to count induced copies of $P_k$ in $G_{n,p}$. In this case, everything starts out the same, but for a fixed copy of $P_k$ in $K_n$, the probability that it turns out to be an induced copy of $P_k$ in $G_{n,p}$ is $p^k (1-p)^{\binom{k+1}{2}-k}$. We ask for no other edges between these $k+1$ vertices to exist.
  • Suppose we want to count the number of sets of $k+1$ vertices that form at least one $P_k$. Then there are $\binom{n}{k+1}$ sets of vertices to consider, and here we do get the terrifyingly complicated calculation you fear: we want to multiply this by the probability that $G_{k+1,p}$ contains a copy of $P_k$. For example, when $k=2$, this probability is $3p^2(1-p) + p^3$: we get a copy of $P_2$ if there are at least $2$ edges between the $3$ vertices. It gets worse as $k$ gets larger; I don't see a clean or general solution.