Lower bounding the variance over a finite alphabet


Let alphabet $\mathcal{X} = \{0, 1, x_1, x_2, x_3,\ldots, x_n \}$, where $x_1,x_2,x_3, \ldots, x_n$ are free parameters which can be chosen in any way I want.

Let $p$ be any full-support distribution over $\mathcal{X}$ such that $p(x) \geq \sigma$ for all $x \in \mathcal{X}$, for some fixed $\sigma > 0$ (necessarily $\sigma \leq \frac{1}{n + 2}$, since the $n+2$ masses sum to $1$).

Let $Y$ be the random variable taking values in $\mathcal{X}$ with distribution $p$.

I need to uniformly lower bound the variance of $Y$ for any choice of $x_1, x_2, \ldots, x_n$ and any choice of $p$. The lower bound should only depend on $\sigma$. Any crude nonzero lower bound can work.

Intuitively, I feel the minimum is achieved by setting $x_1 = \cdots = x_n = \frac{1}{2}$, $p(0) = p(1) = \sigma$, and $p(1/2) = 1 - 2 \sigma$.

I verified this is the right answer numerically for $n = 1$ and $n = 2$.
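The candidate configuration is easy to check directly. A minimal Python sketch (the `variance` helper is just the textbook definition, and `sigma = 0.05` is an arbitrary choice):

```python
def variance(xs, ps):
    """Variance of a finite random variable with support xs and pmf ps."""
    mean = sum(x * p for x, p in zip(xs, ps))
    return sum(p * (x - mean) ** 2 for x, p in zip(xs, ps))

sigma = 0.05
# candidate minimizer: all free points at 1/2, p(0) = p(1) = sigma
xs = [0.0, 1.0, 0.5]
ps = [sigma, sigma, 1.0 - 2.0 * sigma]
print(variance(xs, ps))  # ~0.025, i.e. sigma / 2
```

Here $\mathbb{E}[Y] = 1/2$, so the two endpoints each contribute $\sigma \cdot (1/2)^2$ and the middle point contributes nothing, giving variance exactly $\sigma/2$.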

Any ideas how to prove it, assuming it is correct?

Best answer:

The best possible lower bound given the criteria in the problem is $\sigma/2$: \begin{align} \mathbb{V}[Y] &= p(0)(0 - \mathbb{E}[Y])^2 + p(1)(1 - \mathbb{E}[Y])^2 + \sum_{x \in \mathcal{X}: x \neq 0,1} p(x)(x - \mathbb{E}[Y])^2 \\ &\geq \sigma(0 - \mathbb{E}[Y])^2 + \sigma(1 - \mathbb{E}[Y])^2 \\ &\geq \frac{\sigma}{2} \end{align} The first line just writes out the variance of $Y$, the second uses the fact that the omitted summation is non-negative, and the third uses \begin{equation} \min_{z} \left[\sigma z^2 + \sigma(1-z)^2\right] = \min_z \left[2\sigma z^2 - 2\sigma z + \sigma\right] = \frac{\sigma}{2}, \end{equation} attained at $z = 1/2$, which can be shown with a little calculus (take the derivative, set it to zero, and solve for $z$).
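As a numerical sanity check on this bound (not part of the proof), one can sample random admissible instances and confirm the inequality holds on each; the choices of `n`, `sigma`, and the sampling range for the free points below are arbitrary:

```python
import random

def variance(xs, ps):
    """Variance of a finite random variable with support xs and pmf ps."""
    mean = sum(x * p for x, p in zip(xs, ps))
    return sum(p * (x - mean) ** 2 for x, p in zip(xs, ps))

random.seed(0)
sigma, n = 0.05, 3
for _ in range(10_000):
    # support {0, 1, x_1, ..., x_n} with the x_i arbitrary reals
    xs = [0.0, 1.0] + [random.uniform(-5.0, 5.0) for _ in range(n)]
    # pmf with p(x) >= sigma: give each point sigma, split the rest randomly
    w = [random.random() for _ in range(n + 2)]
    rest = 1.0 - (n + 2) * sigma
    ps = [sigma + rest * wi / sum(w) for wi in w]
    assert variance(xs, ps) >= sigma / 2 - 1e-12
```

Every sampled instance satisfies $\mathbb{V}[Y] \geq \sigma/2$, as the derivation above guarantees.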

To see why this is the best possible, let $p(0) = p(1) = \sigma$ and let $x_i \in (1/2-\epsilon,1/2+\epsilon)$ for some $\epsilon > 0$. A short calculation shows that $\mathbb{E}[Y] \in (1/2-\epsilon,1/2+\epsilon)$, and therefore \begin{align} \mathbb{V}[Y] &= p(0)(0 - \mathbb{E}[Y])^2 + p(1)(1 - \mathbb{E}[Y])^2 + \sum_{x \in \mathcal{X}: x \neq 0,1} p(x)(x - \mathbb{E}[Y])^2 \\ &\leq \sigma(0 - \mathbb{E}[Y])^2 + \sigma(1 - \mathbb{E}[Y])^2 + \sum_{x \in \mathcal{X}:x\neq 0,1} p(x)(2\epsilon)^2 \\ &\leq \sigma(0 - \mathbb{E}[Y])^2 + \sigma(1 - \mathbb{E}[Y])^2 + 4\epsilon^2 \end{align} Here the first inequality uses that $x,\mathbb{E}[Y] \in (1/2-\epsilon,1/2+\epsilon)$ implies $|x-\mathbb{E}[Y]| \leq 2\epsilon$, then squares that bound. The second inequality uses $\sum_{x\in \mathcal{X}:x\neq 0,1} p(x) \leq 1$.

Setting the $x_i$ and $p(x_i)$ so that $\mathbb{E}[Y] = 1/2$ (e.g. placing the $x_i$ symmetrically about $1/2$) shows that for any $n$ and any $\epsilon > 0$ we can construct random variables with the required properties such that \begin{equation} \frac{\sigma}{2} \leq \mathbb{V}[Y] \leq \frac{\sigma}{2} + 4\epsilon^2 \end{equation} Letting $\epsilon \rightarrow 0$ makes this sandwich arbitrarily tight, showing that $\sigma/2 \leq \mathbb{V}[Y]$ is the best possible bound.
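The tightness construction can also be illustrated numerically (a sketch, with `sigma` fixed arbitrarily): placing two free points symmetrically at $1/2 \pm \epsilon$ keeps $\mathbb{E}[Y] = 1/2$, and the variance approaches $\sigma/2$ as $\epsilon \to 0$:

```python
def variance(xs, ps):
    """Variance of a finite random variable with support xs and pmf ps."""
    mean = sum(x * p for x, p in zip(xs, ps))
    return sum(p * (x - mean) ** 2 for x, p in zip(xs, ps))

sigma = 0.05
for eps in (0.1, 0.01, 0.001):
    # n = 2 free points, placed symmetrically about 1/2 so that E[Y] = 1/2
    xs = [0.0, 1.0, 0.5 - eps, 0.5 + eps]
    ps = [sigma, sigma, (1 - 2 * sigma) / 2, (1 - 2 * sigma) / 2]
    v = variance(xs, ps)
    assert sigma / 2 - 1e-12 <= v <= sigma / 2 + 4 * eps ** 2 + 1e-12
    print(eps, v)  # v shrinks toward sigma/2 = 0.025 as eps -> 0
```

In this symmetric case the variance works out to exactly $\sigma/2 + (1-2\sigma)\epsilon^2$, comfortably inside the sandwich $[\sigma/2,\ \sigma/2 + 4\epsilon^2]$.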