Incorrect proof: Number of edges in the unique giant component in a uniform random graph


I am working my way through random graph theory using the book "Introduction to Random Graphs" by Frieze and Karonski (which is available here: https://www.math.cmu.edu/~af1p/BOOK.pdf).

I have, however, stumbled upon a proof that seems incorrect to me: Theorem 2.14 and its proof on pages 34-38, which among other things states:

asymptotically, the number of edges in the unique giant component in a uniform random graph with $n$ vertices and $m=m(n)$ edges is $(1-(x/c)^2)m$,

where $c$ and $x$ are constants. The relevant part of the proof starts on line 12 of page 37: the idea is to calculate the asymptotic expected number of edges in the giant component $C_0$ and then show that the corresponding random variable concentrates.

Mathematically, we enumerate all the edges $e_1,\dots,e_m$ and then define $$ X = \sum_{j=1}^m 1_{\{e_j \in C_0 \}}. $$ Then I arrive, just as in the book, at $$ E[X] \approx m\left(1-\left(\frac{x}{c}\right)^2\right), $$ where $\approx$ denotes that the two quantities are asymptotically equivalent (their ratio converges to $1$).

Now, as the book mentions, we want to show concentration using Chebyshev's inequality: $$ P(\vert X - E[X] \vert > t) \leq \frac{V[X]}{t^2}. $$ So I presume this boils down to showing that $V[X] \rightarrow 0$ as $n \to \infty$ ($n$ being the number of vertices). The book computes that for $i\neq j$, $$ P(e_i \in C_0, e_j \in C_0)=(1+o(1))P(e_i \in C_0)P(e_j \in C_0), $$ where $o(1)$ denotes a sequence converging to $0$. Hence, as I see it, we get \begin{align*} V[X] &= E[X^2]-E[X]^2 \\ &= \sum_{j=1}^m \sum_{i=1}^m P(e_i \in C_0, e_j \in C_0) - E[X]^2 \\ &= \underbrace{ \sum_{i=1}^m P(e_i \in C_0)}_{= E[X]} + \sum_{i\neq j} P(e_i \in C_0, e_j \in C_0) - E[X]^2 \\ &\approx m\left(1-\left(\tfrac{x}{c}\right)^2\right) + (m^2-m)\left(1-\left(\tfrac{x}{c}\right)^2\right)^2 - m^2\left(1-\left(\tfrac{x}{c}\right)^2\right)^2 \\ &= m\left(1-\left(\tfrac{x}{c}\right)^2\right)-m\left(1-\left(\tfrac{x}{c}\right)^2\right)^2 \rightarrow \infty \end{align*} as $m(n) \rightarrow \infty$. This estimate is then too crude for concentration, and one would need something else. Am I doing something wrong, or is the proof in the book wrong? Does anyone know whether the result is even true, and of another reference/proof? Any clarification would be much appreciated.

Note: $1-(x/c)^2$ is a real number strictly between 0 and 1.

Update: Is it possible the joint probabilities $P(e_i \in C_0, e_j \in C_0)$ are not the same for all terms in the sum?
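For what it's worth, the asymptotic formula itself checks out numerically. Below is a quick stdlib-only Python sketch (the function names and parameters are my own, not from the book): it samples $G_{n,m}$ with mean degree $c = 2m/n > 1$, finds the giant component with union-find, and compares the fraction of edges inside it against $1-(x/c)^2$, where, if I read Theorem 2.14 correctly, $x \in (0,1)$ is the root of $x e^{-x} = c e^{-c}$.

```python
import math
import random
from collections import Counter

def predicted_fraction(c):
    """Solve x * exp(-x) = c * exp(-c) for the root x in (0, 1) by
    bisection, then return the predicted edge fraction 1 - (x/c)^2."""
    target = c * math.exp(-c)
    lo, hi = 1e-12, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if mid * math.exp(-mid) < target:  # x * e^{-x} is increasing on (0, 1)
            lo = mid
        else:
            hi = mid
    x = (lo + hi) / 2
    return 1 - (x / c) ** 2

def giant_edge_fraction(n, m, seed=0):
    """Sample G(n, m), find the largest component via union-find,
    and return the fraction of the m edges lying inside it."""
    rng = random.Random(seed)
    edges = set()
    while len(edges) < m:  # rejection-sample m distinct edges
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            edges.add((min(u, v), max(u, v)))
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    sizes = Counter(find(v) for v in range(n))
    giant_root = max(sizes, key=sizes.get)
    # an edge lies in the giant iff both endpoints do; both endpoints of
    # an edge always share a root, so checking one endpoint suffices
    return sum(1 for u, v in edges if find(u) == giant_root) / m

if __name__ == "__main__":
    n, c = 20000, 2.0          # mean degree c = 2m/n
    m = int(c * n / 2)
    print(predicted_fraction(c))      # about 0.9587
    print(giant_edge_fraction(n, m))  # should be close to the prediction
```

With $n = 20000$ and $c = 2$ the empirical fraction lands within a percent or so of the predicted $\approx 0.9587$, consistent with the statement of the theorem (though of course this says nothing about the concentration argument itself).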

Best answer:

We do not need the variance to go to $0$ to apply Chebyshev's inequality; that's asking for far too much. We have $$ \mathbb P(|X - \mathbb E(X)| \ge \delta \mathbb E(X)) \le \frac{\text{Var}(X)}{\delta^2 \mathbb E(X)^2} $$ and so, provided $\text{Var}(X) = o(\mathbb E(X)^2)$, we know that $X = (1+o(1)) \mathbb E(X)$ with high probability.

For example, if we take $\delta = \left(\frac{\text{Var}(X)}{\mathbb E(X)^2}\right)^{1/3}$, then $\delta \to 0$ as $n \to \infty$, and $X$ is in the range $(1 \pm \delta) \mathbb E(X)$ with probability at least $1-\delta$.
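Spelling out that choice of $\delta$ (my own arithmetic, not from the answer): substituting $t = \delta\,\mathbb E(X)$ into Chebyshev's inequality and then using $\delta^3 = \text{Var}(X)/\mathbb E(X)^2$ gives $$ \mathbb P\bigl(|X - \mathbb E(X)| \ge \delta\,\mathbb E(X)\bigr) \le \frac{\text{Var}(X)}{\delta^2\,\mathbb E(X)^2} = \frac{\delta^3}{\delta^2} = \delta, $$ so the failure probability and the relative error are both at most $\delta$, and $\delta \to 0$ precisely because $\text{Var}(X) = o(\mathbb E(X)^2)$.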

For more details, see Chapter 20 of Frieze and Karonski's book.


For this particular proof, choose $\epsilon = \epsilon(n)$ such that $\mathbb P(e_i \in C_0, e_j \in C_0) = (1+\epsilon)\mathbb P(e_i \in C_0)\mathbb P(e_j \in C_0)$ for $i \ne j$; we know that $\epsilon \to 0$ as $n \to \infty$. (Note that these probabilities are the same for every $i$ and $j$: we let $e_1, \dots, e_m$ be the edges of $G$ in random order, so any two edges of $G$ are equally likely to be $e_i$ and $e_j$.)

We have $$ \mathbb E(X^2) - \mathbb E(X)^2 = \sum_{i=1}^m \sum_{j=1}^m \mathbb P(e_i \in C_0, e_j \in C_0) - \sum_{i=1}^m \sum_{j=1}^m \mathbb P(e_i \in C_0) \mathbb P(e_j \in C_0). $$ The diagonal terms where $i=j$ contribute $\mathbb E(X)$ to the first sum and some nonnegative amount $N \le \mathbb E(X)$ to the second sum. Each term where $i \ne j$ contributes $\mathbb P(e_i \in C_0, e_j \in C_0)$ to the first sum and $\mathbb P(e_i \in C_0)\mathbb P(e_j \in C_0)$ to the second sum, so their difference contributes $\epsilon \cdot \mathbb P(e_i \in C_0)\mathbb P(e_j \in C_0)$, and altogether these contribute $\epsilon \cdot \left(\mathbb E(X)^2 - N\right)$. Therefore $$ \mathbb E(X^2) - \mathbb E(X)^2 = \mathbb E(X) - N + \epsilon \cdot \left(\mathbb E(X)^2 - N\right) = \epsilon \cdot\mathbb E(X)^2 + \mathbb E(X) - (1+\epsilon) N $$ which is $o(\mathbb E(X)^2)$. That's enough to apply Chebyshev's inequality.
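To see why the last expression is $o(\mathbb E(X)^2)$ term by term (my own spelling-out of the final step): since $0 \le N \le \mathbb E(X)$ and $\mathbb E(X) \to \infty$, $$ \epsilon \cdot \mathbb E(X)^2 + \mathbb E(X) - (1+\epsilon) N = o(\mathbb E(X)^2) + O(\mathbb E(X)) = o(\mathbb E(X)^2), $$ using $\epsilon = o(1)$ for the first term and $\mathbb E(X) = o(\mathbb E(X)^2)$ for the other two.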