Frieze and Karonski - Exercise 1.4.3 -- What is the significance of the $2$ here?

130 Views Asked by At

Exercise 1.4.3 in Frieze and Karonski's text Random Graphs is:

Suppose that $p = d/n$ where $d = O(1)$. Prove that w.h.p, in $G_{n,p}$, for all $S\subseteq [n], |S| \leq n/\log n$, we have $e(S) \leq 2|S|$, where $e(S)$ is the number of edges contained in $S$.

My proof goes as follows:

Proof: Let $X$ be a random variable that counts the number of edges between vertices in $S$.
For each pair of vertices $\{i,j\}$ in $S$ let $X_{i,j}$ be an indicator random variable for when there is an edge connecting $i$ and $j$. That is $$ X_{i,j} = \begin{cases} 1 & \text{if $ij \in E(G_{n,p})$} \\ 0 & \text{otherwise} \end{cases} $$ We have $$ \mathbb{E}[X] = \binom{|S|}{2}p \leq \frac{|S|^2}{2} p. $$ By Markov's inequality we have $$ \mathbb{P}[X > 2|S|] < \frac{\mathbb{E}[X]}{2|S|} \implies \mathbb{P}[X \leq 2|S|] = 1 - \mathbb{P}[X > 2|S|]. $$ Hence, $$ \mathbb{P}[X \leq 2|S|] \geq 1 - \frac{\mathbb{E}[X]}{2|S|} \geq 1 - \frac{d|S|^2}{4|S|n } \geq 1 - \frac{d}{4\log n}. $$ Since $d = O(1)$, as $n$ goes to infinity $\frac{d}{4\log n}$ goes to $0$.

Q.E.D.

However, unless I'm missing an error in my proof (I may very well be making an obvious error), couldn't one replace the $2$ in $2|S|$ with $O(1)$ ? Why was $2$ the constant that was chosen here?

Thank you in advance for your help and your time.

2

There are 2 best solutions below

8
On BEST ANSWER

Let $p = d/n$ where $d = O(1)$ and let $E(S)$ be the number of edges contained in $S$.

You proved that for each $S\subseteq [n]$ such that $|S| \leq n/\log n$, w.h.p, in $G_{n,p}$, we have $E(S) \leq 2|S|$. Then you observed correctly that the bound could be improved.

However, the exercise has the order of quantifiers reversed: You are asked to show that w.h.p, the bound $E(S) \leq 2|S|$ holds simultaneously for all $S\subseteq [n]$ such that $|S| \leq n/\log n$. For this, Markov's inequality will not suffice.


Given $b>1$ and $S$ of size $k$, $$P_p(E(S) >b|S|)\le {{k \choose 2} \choose \lceil bk \rceil} \Bigl(\frac d n\Bigr)^{\lceil bk \rceil} \,.$$ In particular, for $b>1$ the RHS vanishes if $k<4$. Using the standard inequality ${m \choose k} \le (me/k)^k$, we infer that $$P_p(E(S) >b|S|)\le \Bigl(\frac {ke} {2b} \Bigr)^{ bk } \cdot \Bigl(\frac d n\Bigr)^{ bk } = \Bigl(\frac {kde}{2b n}\Bigr)^{ bk }\,.$$

Consider the event $$A_k(b):=\bigcup_{|S|=k} \{E(S) \ge b|S|\}\, .$$ We have $$P_p(A_k(b)) \le {n \choose k} \Bigl(\frac {kde}{2b n}\Bigr)^{ bk } \le \Bigl(\frac {ne} {k} \Bigr)^{k} \Bigl(\frac {kde}{2b n}\Bigr)^{ bk }=\Bigl(\frac {d^b e^{b+1}}{ (2b)^b} \cdot \frac {k^{b-1} }{ n^{b-1}}\Bigr)^{ k } \,. $$ If $b>1$ is fixed and $k\le \frac n{\log n}$, then the RHS tends to zero, and this remains true if we sum the RHS over all $4 \le k \le \frac n{\log n}$.

Thus in the original exercise, the factor 2 can indeed be replaced by any fixed $b>1$.

1
On

You have not proved that w.h.p. for all $S \subseteq [n]$ with $|S| \le n/\log n$, the property holds. You have only shown that if we pick any particular $S \subseteq [n]$, then w.h.p. the property you want will hold for that $S$. This is the difference between saying "w.h.p. I will not win the lottery" and "w.h.p. nobody will win the lottery" - you have shown that sets $S$ with $e(S) > 2|S|$ are rare, but not that they don't exist.

With the correct statement, the value of the constant does matter. For example:

  • We know that the statement does not hold if we replace "$e(S) \le 2|S|$" by "$e(S) \le 0.499|S|$". Just take $S$ to be the set of any two adjacent vertices.
  • With more work, we can show that the statement does not hold if we replace "$e(S) \le 2|S|$" by "$e(S) \le 0.999|S|$". For this, it's enough to let $S$ be the vertex set of a short cycle in $G_{n,p}$ (which does exist when $p = \frac dn$, but is harder to find).