The Expectation and Variance of the number of $k$ size sets containing exactly $m$ edges in $G(n, p)$

118 Views Asked by At

Let $X$ be the number of sets of size $k$ containing exactly $m$ edges in $G(n,p)$. Find $EX$ and $Var X$.

My Attempt

Given some indicators $X_1, \dots, X_{\binom{n}{k}}$ we have that $X = \sum X_i$. Where, $$X_i = P(X_i = m \hspace{0.2cm} edges) = \binom{\binom{k}{2}}{m} p^m (1-p)^{\binom{k}{2} -m}$$ $$EX = \sum E X_i = \binom{n}{k} \binom{\binom{k}{2}}{m} p^m (1-p)^{\binom{k}{2} -m}$$ To find the Variance I need to apply $$Var X = EX + \sum_{i \neq j} EX_i X_j - (EX)^2$$ But I am not quite sure on how to compute $\sum_{i \neq j} EX_i X_j $. Any help regarding computing and solving the variance is much appreciated.

1

There are 1 best solutions below

1
On BEST ANSWER

Calculating $\sum_{i \neq j} EX_i X_j$ requires estimating the probability that a pair of distinct $k$-sets in $V(G)$ both contain exactly $m$ edges. So, let $S_i, S_j \subseteq V(G)$ be distinct sets of size $k$. To calculate the probability that $S_i$ and $S_j$ each contain exactly $m$ edges, we need two pieces of information:

  1. How large is $S_i \cap S_j$?
  2. How many edges does $S_i \cap S_j$ contain?

First, let $t = |S_i \cap S_j|$. Our assumption on $S_i$ and $S_j$ means that $0 \leq t \leq k - 1$. Second, let $a$ be the number of edges that $S_i \cap S_j$ contains. (Observe that if $S_i$ and $S_j$ each contain exactly $m$ edges, then $S_i \setminus S_j$ and $S_j \setminus S_i$ must each contain exactly $m - a$ edges.) It is not hard to show that $$\operatorname{max}\biggl(0, m - \binom{k-t}{2}\biggr) \leq a \leq \operatorname{min}\biggl(m, \binom{t}{2}\biggr).$$ To make the summation below less ugly, let $a_{\ell}(m, k, t) = \operatorname{max}\bigl(0, m - \binom{k-t}{2}\bigr)$ and let $a_u(m, t) = \operatorname{min}\bigl(m, \binom{t}{2}\bigr)$.

Then, because there are a total of $2\binom{k}{2} - \binom{t}{2}$ potential edges between vertices of $S_i \cup S_j$, we have $$\sum_{i \neq j} EX_i X_j = \sum_{t=0}^{k-1}\sum_{a = a_{\ell}(m, k, t)}^{a_u(m, t)} \binom{\binom{k-t}{2}}{m-a}^2\binom{\binom{t}{2}}{a} p^{2m - a} (1-p)^{2\binom{k}{2} - \binom{t}{2} - 2m + a}.$$

The asymptotic value of the variance will of course depend on $p$, $m$, and $k$.