Deriving Concentration of Measure Given Conditional Concentration


I have a question about combining concentration bounds for a pair of random variables. I have tried abstracting the original setting in order to ask the question here, reducing the details to what I believe is all that is necessary.

Consider a filtration $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \dots$ (where $\mathcal{F}_0$ is the trivial $\sigma$-field), a sequence of (subgaussian or subexponential) random variables $(X_i)_{i \geq 0}$, a sequence $(Y_i)_{i \geq 0}$ such that $Y_i = \mathbb{E}[X_i | \mathcal{F}_{i-2}]$ for every $i \geq 2$, and $\varepsilon \in (0, 1]$ such that the following holds: \begin{align} (i) \quad \mathbb{P}(|X_{i+1} - Y_{i+1}| \geq \varepsilon | \mathcal{F}_{i-1}) &\leq \exp{\left(-C \min\left\{\frac{\varepsilon}{X_i}, \frac{\varepsilon^2}{X_i^2}\right\}\right)} \\ (ii) \quad \mathbb{P}(X_i \geq \varepsilon) &\leq \exp{(-\tilde{C} \varepsilon^2)} \end{align} for $i \geq 0$. The goal is to combine these facts to show $$ \mathbb{P}(|X_{i+1} - Y_{i+1}| \geq \varepsilon) \leq \exp{(-\bar{C} \varepsilon^2)}, $$ where $C, \tilde{C}, \bar{C}$ are positive constants.

I suspect this may work inductively, although I cannot see why the conditioning can apparently be disregarded. Is there a way to relate the second condition to the first so that the conditioning disappears? Any help would be greatly appreciated.
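As a numerical sanity check, the claimed implication at least holds in a toy instance. All distributional choices below are my own assumptions, picked only so that conditions (i) and (ii) hold: take $X_i \sim N(0, \sigma^2)$ with small $\sigma$ (so (ii) holds with $\tilde{C} = 1/(2\sigma^2)$), and let $X_{i+1} - Y_{i+1} = X_i G$ with $G \sim N(0,1)$ independent of $X_i$, so that the conditional deviation is Gaussian with standard deviation $|X_i|$, i.e. the Gaussian regime of (i).

```python
import numpy as np

# Toy instance of the abstract setting (all distributional choices are
# assumptions, chosen only so that (i) and (ii) hold):
#   X_i ~ N(0, sigma^2) with small sigma   -> subgaussian tail, condition (ii)
#   X_{i+1} - Y_{i+1} = X_i * G, G ~ N(0,1) independent of X_i
#   -> conditionally Gaussian with standard deviation |X_i|, condition (i)
rng = np.random.default_rng(0)
n = 200_000
sigma = 0.2

x_i = sigma * rng.standard_normal(n)
g = rng.standard_normal(n)
dev = x_i * g  # samples of X_{i+1} - Y_{i+1}

# Compare the empirical unconditional tail with the claimed bound
# exp(-Cbar * eps^2), here with Cbar = 1, over eps in (0, 1].
for eps in (0.25, 0.5, 1.0):
    tail = np.mean(np.abs(dev) >= eps)
    bound = np.exp(-eps ** 2)
    print(f"eps={eps:.2f}: empirical tail {tail:.4f}, bound {bound:.4f}")
```

Note that the product $X_i G$ is only subexponential in general, with a tail of order $\exp(-c\varepsilon/\sigma)$ for large $\varepsilon$; but on the bounded range $\varepsilon \in (0,1]$ one has $\varepsilon \geq \varepsilon^2$, so this is dominated by a bound of the form $\exp(-c\varepsilon^2/\sigma)$. The restriction to $\varepsilon \leq 1$ in the problem statement therefore seems essential.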


Edit: I did realize that by substituting $\varepsilon X_i$ for $\varepsilon$ in (i), one gets (for $\varepsilon \leq 1$, so that the minimum is the squared term) $$ \mathbb{P}(|X_{i+1} - Y_{i+1}| \geq \varepsilon X_i | \mathcal{F}_{i-1}) \leq \exp{(-C\varepsilon^2)}. $$

Additionally, using Mr. Gandalf Sauron's hint below, one can remove the conditioning by taking expectations (the tower property) together with the monotonicity of expectation: \begin{align} \mathbb{P}(|X_{i+1} - Y_{i+1}| \geq \varepsilon) &= \mathbb{E}[\mathbb{P}(|X_{i+1} - Y_{i+1}| \geq \varepsilon \,|\, \mathcal{F}_{i-1})] \\ &\leq \mathbb{E}\left[\exp{\left(-C \min\left\{\frac{\varepsilon}{X_i}, \frac{\varepsilon^2}{X_i^2}\right\}\right)}\right] \\ &= \int_0^{\infty} \mathbb{P}\left(\exp{\left(-C \min\left\{\frac{\varepsilon}{X_i}, \frac{\varepsilon^2}{X_i^2}\right\}\right)} \geq t\right) dt. \end{align} Now, if I could say that $1/X_i$ is again subgaussian and that the squared term in the minimum is the smaller one, then this would follow from the properties of the MGF of subgaussian random variables. Can I say this, though? I don't think it is clear that the inverse is again subgaussian, is it?
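Alternatively, splitting on the size of $X_i$ might sidestep the $1/X_i$ issue entirely. A sketch, under the extra assumptions that $X_i$ is $\mathcal{F}_{i-1}$-measurable and positive, with a threshold $\delta > 0$ to be chosen:

```latex
\begin{align*}
\mathbb{P}(|X_{i+1} - Y_{i+1}| \geq \varepsilon)
  &\leq \mathbb{P}\left(|X_{i+1} - Y_{i+1}| \geq \varepsilon,\ X_i \leq \delta\right)
     + \mathbb{P}(X_i > \delta) \\
  &\leq \mathbb{E}\left[\mathbf{1}_{\{X_i \leq \delta\}}\,
        \mathbb{P}\left(|X_{i+1} - Y_{i+1}| \geq \varepsilon \,\middle|\, \mathcal{F}_{i-1}\right)\right]
     + \exp{(-\tilde{C} \delta^2)} \\
  &\leq \exp{\left(-C \min\left\{\frac{\varepsilon}{\delta},
        \frac{\varepsilon^2}{\delta^2}\right\}\right)}
     + \exp{(-\tilde{C} \delta^2)}.
\end{align*}
```

Taking $\delta = \sqrt{\varepsilon}$ makes the first exponent $C \min\{\sqrt{\varepsilon}, \varepsilon\} = C\varepsilon$ and the second $\tilde{C}\varepsilon$, so both terms are of order $\exp(-c\varepsilon)$; since $\varepsilon \leq 1$ implies $\varepsilon \geq \varepsilon^2$, the sum is at most $2\exp(-c\varepsilon^2)$, which is the claimed form after adjusting the constant.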


Edit 2: I realized the $Y_i$'s should not be deterministic but are conditional expectations themselves; I have changed the statement above accordingly.