I'm trying to solve Exercise 1.4.2 in this book. Here $G_{n,p}$ is a random graph on $n$ vertices where each edge is present independently with probability $p$.
Suppose that $p=d/n$ where $d>1$. Show that w.h.p. $G_{n,p}$ contains an induced path of length $\sqrt{\log n}$.
The only technique for such problems introduced in the book at the point is the second moment method, so I tried that. Recall this method uses the inequality $$P(X=0)\le \frac{\operatorname{Var X}}{(\mathbb E X)^2}.$$
So, if $X$ counts the number of induced paths of length $\sqrt{\log n}$, and we show that the square of its expectation dominates its variance, we're set.
Write $X = \sum_{i\in K} I_i$, where $K$ is the set of possible paths and $I_i$ is an indicator function for the presence of the induced path.
We compute $$|K| = \frac{1}{2} \frac{n!}{(n - \sqrt{\log n})!}$$
and the probability of each $I_i$ occurring is $$\mathbb E I_i =p^{\sqrt{\log n}}(1-p)^{\binom{\sqrt{\log n}}{2} - \sqrt{\log n}}.$$
The $(1-p)$ factor tends to a constant, and we get asymptotically that the first moment of $X$ is $\mathbb E X= d^{\sqrt{\log n}}$, which explains why the problem requires $d>1$.
However, it seems that the variance won't be small enough to give the result. Since all the $I_i$ are positively correlated, we can get a lower bound by asking what happens in the case where all the $I_i$ are independent. Recall the variance of a binomial variable (such as $I_i$) with expectation $a$ is $a(1-a),$ and here $a$ is $o(1)$ so we get that the variance is approximately $a|K|$. Then
$$ \frac{\operatorname{Var X}}{(\mathbb E X)^2} \approx \frac{ |K| a (1-a)}{|K|^2 a^2} \approx \frac{1}{K|a|} = \frac{1}{\mathbb E X}.$$
So ignoring all correlations, we would just barely get the result, since $\mathbb EX$ grows slowly. However, since it grows so slowly, it seems trying to account for all the correlations will inevitably spoil the proof by making the variance too large.
Am I on the right track here? Or is another method needed? I'm also curious why $\sqrt{\log n}$ works but something larger like $\log{n}$ doesn't. This restriction doesn't come out of the above calculations.
I'll begin by assuming we consider paths on $t$ vertices until we see the difference $t = \sqrt{\log n}$ makes. In general, I think it's good habit to do such a calculation for paths of arbitrary length and only then figure out the value of $t$ that will go.
Your calculation of $\mathbb E I_i$ is not quite right: you should have a factor of $p^{t-1}$ rather than $p^t$, since a $t$-vertex path has $t-1$ edges, which should mean $$\mathbb E X \sim \frac{n}{2} \cdot d^{t-1}.$$ This should be valid for $t \ll \sqrt n$, but also explains the reason for taking $t = \sqrt{\log n}$ or similar. For this value of $t$ (in fact, for any $t = o(\log n)$, but $\sqrt{\log n}$ is everyone's favorite function in this range) the linear factor of $\mathbb EX$ dominates the $d^{t-1}$ factor.
Let's see the difference this makes. When calculating $\mathbb E(X^2)$, we are counting ordered pairs of induced paths. These come in three types:
You should probably do the calculation carefully for practice, but the underlying reason why things work out is this: the third type of pair corresponds to a very specific kind of tree component on $O(t)$ vertices. We can count these in more or less the same way as paths to get that there's $n \cdot d^{O(t)}$ of these, in expectation. When $t = o(\log n)$, the $n$ term is still dominant, and we get $\operatorname{Var} X = o(n^2)$. In fact, $\operatorname{Var} X = o(n^{1+\epsilon})$ for any $\epsilon>0$.