I have read in a book that the following should hold, but I can't seem to prove it.
Let $ X_i $ be i.i.d. random variables with $ P(X_i > t) = e^{-t}$. Define $$ Y_i = 1 \text{ if }|X_{i+1} - X_i| > \log(n) \text{ for } i\leq n$$ and $ Y_i = 0 $ otherwise. Let $ W_n = \sum_{i=1}^{n}Y_i$. Then $ W_n $ is asymptotically distributed as a compound Poisson variable.
I really can't figure out how to get this result. I assume the underlying proof resorts to something like Stein's method, but I can't see it myself.
Any help appreciated! Thanks
We claim that, as $n \to \infty$,
$$ W_n \xrightarrow{d} \sum_{j=1}^{N} \xi_j, \qquad N \sim \operatorname{Poisson}(2/3), \qquad \xi_1, \xi_2, \ldots \overset{\text{i.i.d.}}{\sim} \nu := \frac{\delta_1 + \delta_2}{2}, \tag{*} $$
where $N$ and $(\xi_j)_{j \geq 1}$ are independent; that is, $W_n$ converges in distribution to a compound Poisson law with rate $2/3$ and jump distribution $\nu$.
Let $(X_i)_{i\geq 1}$ be a sequence of i.i.d. $\operatorname{Exp}(1)$ variables, and define the events $A_{n,i}$ and random variables $S_{n,k}$ by
$$ A_{n,i} = \{ \left| X_{i+1} - X_i \right| > \log n\} \qquad\text{and}\qquad S_{n,k} = \sum_{i=1}^{k} \mathbb{I}[A_{n,i}]. $$
Lemma. Let $k = k(n)$ satisfy $1 \leq k \leq n$ and $1 \ll k \ll n$ as $n \to \infty$. Then
$$ \mathbf{P}(S_{n,k} \geq 1) = \frac{2k}{3n}\,(1 + o(1)) \qquad\text{and}\qquad \mathbf{P}( S_{n,k} \in E \mid S_{n,k} \geq 1 ) \longrightarrow \nu(E) = \frac{\delta_1(E) + \delta_2(E)}{2} $$
for every $E \subseteq \mathbb{Z}_{>0}$.

Here is the intuition: conditioned on $S_{n,k} \geq 1$, the value of $S_{n,k}$ is typically realized by a single 'outlier' $X_i$ satisfying $X_i > \log n$. Closely inspecting the contribution from that single outlier, we obtain the desired assertion. The actual proof of the lemma is postponed to the end.
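Before the proof, here is a quick Monte Carlo sanity check of this picture (a rough numerical illustration only; the value of $n$, the trial count, and the seed are arbitrary choices):

```python
# Sanity check: the empirical law of W_n = sum_{i<=n} 1[|X_{i+1} - X_i| > log n]
# should be close to the compound Poisson law with rate 2/3 and jumps uniform
# on {1, 2}. The parameters below are arbitrary illustrative choices.
import math
import random

rng = random.Random(0)

def sample_W(n):
    """One sample of W_n from i.i.d. Exp(1) variables X_1, ..., X_{n+1}."""
    t = math.log(n)
    xs = [rng.expovariate(1.0) for _ in range(n + 1)]
    return sum(1 for i in range(n) if abs(xs[i + 1] - xs[i]) > t)

n, trials = 500, 20000
ws = [sample_W(n) for _ in range(trials)]
emp_p0 = sum(w == 0 for w in ws) / trials
emp_mean = sum(ws) / trials

# Compound Poisson predictions: P(W = 0) = e^{-2/3} and E[W] = (2/3)(3/2) = 1.
print(f"P(W=0): empirical {emp_p0:.4f}, predicted {math.exp(-2/3):.4f}")
print(f"E[W]:   empirical {emp_mean:.4f}, predicted 1.0000")
```

Both quantities should land within ordinary Monte Carlo fluctuations of the predictions.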
Now fix a sequence $(k_n)$ in $\mathbb{Z}_{>0}$ so that $2 \leq k_n \leq n$ and $1 \ll k_n \ll n$ as $n\to\infty$ (for instance, $k_n = \lceil \sqrt{n}\, \rceil$). We introduce
$$ \tilde{W}_n = \sum_{q=0}^{\lfloor n/k_n\rfloor-1} \tilde{S}_{n,q}, \qquad \text{where} \qquad \tilde{S}_{n,q} = \sum_{r=1}^{k_n-1} \mathbb{I}[A_{n,k_nq + r}]. $$
It is clear that $\tilde{S}_{n,0}, \tilde{S}_{n,1}, \tilde{S}_{n,2}, \ldots$ are identically distributed. Moreover, since each $\tilde{S}_{n,q}$ depends only on $X_{k_n q + 1}, \ldots, X_{k_n q+ k_n}$, they are independent. From this, the m.g.f. of $\tilde{W}_n$ is computed by
\begin{align*} \mathbf{E}[e^{s \tilde{W}_n}] &= \prod_{q=0}^{\lfloor n/k_n\rfloor-1} \mathbf{E}\left[ e^{s\tilde{S}_{n,q}} \right] = \mathbf{E}\left[ e^{s\tilde{S}_{n,0}} \right]^{\lfloor n/k_n\rfloor} \\ &= \left( 1 + \mathbf{E}\left[ e^{s\tilde{S}_{n,0}} - 1 \, \middle|\, \tilde{S}_{n,0} \geq 1 \right]\mathbf{P}(\tilde{S}_{n,0} \geq 1) \right)^{\lfloor n/k_n\rfloor}. \end{align*}
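Here, the last equality is simply obtained by conditioning on whether $\tilde{S}_{n,0} = 0$: writing $S = \tilde{S}_{n,0}$ for brevity,
\begin{align*} \mathbf{E}[e^{sS}] = \mathbf{E}\left[e^{sS}\mathbb{I}[S = 0]\right] + \mathbf{E}\left[e^{sS}\mathbb{I}[S \geq 1]\right] = \big(1 - \mathbf{P}(S \geq 1)\big) + \mathbf{E}\left[e^{sS} \,\middle|\, S \geq 1\right]\mathbf{P}(S \geq 1). \end{align*}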
So, letting $n \to \infty$ and applying the lemma (note that $\tilde{S}_{n,0} = S_{n,k_n-1}$),
$$ \mathbf{E}\left[ e^{s\tilde{S}_{n,0}} - 1 \, \middle|\, \tilde{S}_{n,0} \geq 1 \right] \xrightarrow{n\to\infty} \int_{\mathbb{R}} (e^{sx} - 1) \, \nu(\mathrm{d}x) \qquad\text{and}\qquad \left\lfloor \frac{n}{k_n} \right\rfloor \mathbf{P}\left(\tilde{S}_{n,0} \geq 1\right) \xrightarrow{n\to\infty} \frac{2}{3}, $$
and hence
$$ \lim_{n\to\infty} \mathbf{E}[e^{s \tilde{W}_n}] = \exp\left( \frac{2}{3} \int_{\mathbb{R}} (e^{sx} - 1) \, \nu(\mathrm{d}x) \right). $$
This is precisely the m.g.f. of the compound Poisson distribution in $\text{(*)}$, hence the claim holds with $\tilde{W}_n$ in place of $W_n$.
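For reference, the pmf of the limiting law in $\text{(*)}$ can be tabulated exactly by summing over the number of jumps $j$ and the number $m$ of size-$2$ jumps; the short script below (illustrative only, with arbitrary truncation bounds) confirms that the limit law has mass $e^{-2/3}$ at $0$ and mean $\frac{2}{3} \cdot \frac{3}{2} = 1$:

```python
# Tabulate the pmf of the claimed limit: xi_1 + ... + xi_N with N ~ Poisson(2/3)
# and xi_j i.i.d. uniform on {1, 2}. Truncation bounds w_max, j_max are
# arbitrary but far beyond where the Poisson tail matters.
import math

RATE = 2 / 3

def limit_pmf(w_max=40, j_max=80):
    pmf = [0.0] * (w_max + 1)
    for j in range(j_max + 1):                  # number of jumps
        p_j = math.exp(-RATE) * RATE**j / math.factorial(j)
        for m in range(j + 1):                  # m jumps of size 2, j - m of size 1
            w = j + m                           # total: (j - m)*1 + m*2
            if w <= w_max:
                pmf[w] += p_j * math.comb(j, m) / 2**j
    return pmf

pmf = limit_pmf()
mean = sum(w * p for w, p in enumerate(pmf))
print(f"P(limit = 0) = {pmf[0]:.6f} (e^(-2/3) = {math.exp(-RATE):.6f})")
print(f"mean = {mean:.6f} (should be (2/3)*(3/2) = 1)")
```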
Finally, noting that $\mathbf{E}[\mathbb{I}[A_{n,1}]] = \mathbf{P}(|X_2 - X_1| > \log n) = \frac{1}{n}$ (the difference of two independent $\operatorname{Exp}(1)$ variables is standard Laplace) and
$$ W_n - \tilde{W}_n = \sum_{q=1}^{\lfloor n/k_n\rfloor -1} \mathbb{I}[A_{n,k_n q}] + \sum_{i = \lfloor n/k_n\rfloor k_n}^{n} \mathbb{I}[A_{n,i}], $$
we have
$$ \mathbf{E}[|W_n - \tilde{W}_n|] \leq \frac{\lfloor n/k_n\rfloor}{n} + \frac{k_n}{n} \xrightarrow{n\to\infty} 0. $$
In particular, $W_n - \tilde{W}_n \to 0$ in probability, so $W_n$ and $\tilde{W}_n$ converge to the same limit in distribution, and the desired claim follows. $\square$
Proof of Lemma. In order to have $S_{n,k} \geq 1$, at least one of $X_1, \ldots, X_{k+1}$ must exceed $\log n$, since $|X_{i+1} - X_i| > \log n$ forces $\max(X_i, X_{i+1}) > \log n$. Let $\mathcal{I}_{n,k}$ be the random index set defined by
$$ \mathcal{I}_{n,k} = \{ i \in [k+1] : X_i > \log n\}, $$
where $[k+1] = \{1,2,\ldots,k+1\}$. Then for each $E \subseteq \mathbb{Z}_{>0}$,
\begin{align*} \mathbf{P}( S_{n,k} \in E ) = \mathbf{P}( S_{n,k} \in E , \ \mathcal{I}_{n,k} \neq \varnothing ) = \sum_{\substack{I \subseteq [k+1] \\ I \neq \varnothing}} \mathbf{P}( S_{n,k} \in E , \ \mathcal{I}_{n,k} = I ). \end{align*}
However, for each non-empty $I \subseteq [k+1]$, we have
\begin{align*} \mathbf{P}( S_{n,k} \in E , \ \mathcal{I}_{n,k} = I ) \leq \mathbf{P}( I \subseteq \mathcal{I}_{n,k} ) = \mathbf{P}(\cap_{i \in I} \{ X_i > \log n \}) = \frac{1}{n^{|I|}}. \end{align*}
Plugging this bound into the sum above, for $1 \leq k \leq n$, we obtain
\begin{align*} \left| \mathbf{P}( S_{n,k} \in E ) - \sum_{i=2}^{k} \mathbf{P}( S_{n,k} \in E , \ \mathcal{I}_{n,k} = \{i\} ) \right| &\leq \frac{2}{n} + \sum_{j=2}^{k+1} \binom{k+1}{j}\frac{1}{n^j} \\ &\leq C\left(\frac{1}{n}+\frac{k^2}{n^2}\right) \tag{L.1} \end{align*}
for some absolute constant $C > 0$; the last step uses $\binom{k+1}{j} \leq \frac{(k+1)^j}{j!}$, which for $k \leq n$ gives $\sum_{j=2}^{k+1} \binom{k+1}{j} n^{-j} \leq e^{(k+1)/n} - 1 - \frac{k+1}{n} = \mathcal{O}(k^2/n^2)$. On the other hand, when $\mathcal{I}_{n,k} = \{i\}$ is a singleton, only $\mathbb{I}[A_{n,i-1}]$ and $\mathbb{I}[A_{n,i}]$ can contribute to the sum $S_{n,k}$, since the event $A_{n,j}$ forces $\max(X_j, X_{j+1}) > \log n$. So, if $2 \leq i \leq k$, then by translation invariance,
\begin{align*} &\mathbf{P}( S_{n,k} \in E , \ \mathcal{I}_{n,k} = \{i\} ) \\ &= \mathbf{P}( S_{n,k} \in E , \ \mathcal{I}_{n,k} = \{2\} )\\ &= \mathbf{P}( \mathbb{I}[A_{n,1}] + \mathbb{I}[A_{n,2}] \in E, \ \mathcal{I}_{n,k} = \{2\} ) \\ &= \mathbf{P}( \mathbb{I}[A_{n,1}] + \mathbb{I}[A_{n,2}] \in E , \ X_1, X_3 \leq \log n \mid X_2 > \log n ) \\ &\quad \times \mathbf{P}(X_2 > \log n) \mathbf{P}(\cap_{j=4}^{k+1} \{ X_j \leq \log n\}). \end{align*}
Writing $\tilde{X}_2 = X_2 - \log n$, the memorylessness property implies that, conditioned on $\{X_2 > \log n\}$, the law of $\tilde{X}_2$ is again $\operatorname{Exp}(1)$, independent of $(X_1, X_3)$. Moreover, on the event $\{X_1, X_3 \leq \log n, \ X_2 > \log n\}$ we have $A_{n,1} = \{\tilde{X}_2 > X_1\}$ and $A_{n,2} = \{\tilde{X}_2 > X_3\}$, and so,
\begin{align*} &\mathbf{P}( S_{n,k} \in E , \ \mathcal{I}_{n,k} = \{i\} ) \\ &= \mathbf{P}( \mathbb{I}[\tilde{X}_2 > X_1] + \mathbb{I}[\tilde{X}_2 > X_3] \in E , \ X_1, X_3 \leq \log n ) \frac{(1 - \frac{1}{n})^{k-2}}{n}. \tag{L.2} \end{align*}
So, using $\text{(L.1)}$ and $\text{(L.2)}$ and noting that, since $\tilde{X}_2, X_1, X_3$ are i.i.d. (so each is equally likely to be the largest, the middle, or the smallest of the three),
\begin{align*} \mathbf{P}( \mathbb{I}[\tilde{X}_2 > X_1] + \mathbb{I}[\tilde{X}_2 > X_3] = s , \ X_1, X_3 \leq \log n ) = \begin{cases} \frac{1}{3} + \mathcal{O}\left(\frac{1}{n}\right), & \text{if } s = 1, 2, \\ 0, & \text{if } s \geq 3, \end{cases} \end{align*}
we conclude
\begin{align*} \mathbf{P}( S_{n,k} = s ) = \begin{cases} \frac{k-1}{n}\left(1 - \frac{1}{n}\right)^{k-2} \left( \frac{1}{3} + \mathcal{O}\left(\frac{1}{n}\right) \right) + \mathcal{O}\left(\frac{1}{n}+\frac{k^2}{n^2}\right), & \text{if } s = 1, 2, \\ \mathcal{O}\left(\frac{1}{n}+\frac{k^2}{n^2}\right), & \text{if } s \geq 3. \end{cases} \end{align*}
Now, combining everything, we find that
$$ \mathbf{P}( S_{n,k} \in E \mid S_{n,k} \geq 1) = \frac{ \delta_{1}(E) + \delta_2(E) + \mathcal{O}(\varepsilon_{n,k})}{2 + \mathcal{O}(\varepsilon_{n,k})}, $$
where $\varepsilon_{n,k}$ takes the form
$$ \varepsilon_{n,k} = \frac{1}{n} + \frac{\frac{1}{k}+\frac{k}{n}}{\left(1 - \frac{1}{n}\right)^{k-2}}. $$
Summing the cases $s = 1, 2$ also gives $\mathbf{P}(S_{n,k} \geq 1) = \frac{2k}{3n}(1 + o(1))$ whenever $1 \ll k \ll n$. Since $\varepsilon_{n,k} \to 0$ as $k, n \to \infty$ in such a way that $1 \ll k \ll n$, the desired claim follows. $\square$
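As with the main claim, the lemma itself is easy to probe numerically (illustrative parameters and seed, not a substitute for the proof above):

```python
# Check the lemma numerically: for 1 << k << n, P(S_{n,k} >= 1) ~ 2k/(3n),
# and conditioned on S_{n,k} >= 1 the value is approximately uniform on {1, 2}.
import math
import random

rng = random.Random(1)

def sample_S(n, k):
    """One sample of S_{n,k} from i.i.d. Exp(1) variables X_1, ..., X_{k+1}."""
    t = math.log(n)
    xs = [rng.expovariate(1.0) for _ in range(k + 1)]
    return sum(1 for i in range(k) if abs(xs[i + 1] - xs[i]) > t)

n, k, trials = 2000, 100, 30000
samples = [sample_S(n, k) for _ in range(trials)]
hits = [s for s in samples if s >= 1]
p_hit = len(hits) / trials
p_one = sum(s == 1 for s in hits) / len(hits)

print(f"P(S>=1):       empirical {p_hit:.4f}, 2k/(3n) = {2*k/(3*n):.4f}")
print(f"P(S=1 | S>=1): empirical {p_one:.3f}, predicted 0.5")
```

With these (finite) parameters the first estimate sits slightly below $2k/(3n)$, in line with the $(1-\frac{1}{n})^{k-2}$ factor in $\text{(L.2)}$.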