Maximum of the Variance Function for Given Set of Bounded Numbers


Let $ \boldsymbol{x} $ be a vector of $n$ numbers in the range $ \left[0, c \right] $, where $ c $ is a positive real number.

What is the maximum of the variance of these $ n $ numbers?
That is, which arrangement of the numbers maximizes the variance?
And what tighter bounds hold under additional assumptions on how the numbers are spread?

The variance of the vector $ \boldsymbol{x} $ is given by:

$$ \operatorname{var} (\boldsymbol{x}) = \frac{1}{n} \sum_{i = 1}^{n} {\left( {x}_{i} - \overline{\boldsymbol{x}} \right)}^2 $$

Where the mean $\overline{\boldsymbol{x}}$ is given by:

$$ \overline{\boldsymbol{x}} = \frac{1}{n} \sum_{i = 1}^{n} {x}_{i} $$

Best answer

Since $0 \leq x_i \leq c$, $\displaystyle \sum_i x_i^2 = \sum_i x_i\cdot x_i \leq \sum_i c\cdot x_i = cn\bar{x}.$ Note also that $0 \leq \bar{x} \leq c$. Then,

$$ \begin{aligned} n\cdot \text{var}(\mathbf{x}) &= \sum_i (x_i - \bar{x})^2= \sum_i \left(x_i^2 - 2x_i\bar{x} + \bar{x}^2\right)\\ &= \sum_i x_i^2 - 2\bar{x}\sum_i x_i + n\bar{x}^2= \sum_i x_i^2 - n\bar{x}^2\\ &\leq cn\bar{x} - n\bar{x}^2 = n\bar{x}(c-\bar{x}) \end{aligned} $$

and thus $$\text{var}(\mathbf{x}) \leq \bar{x}(c-\bar{x}) \leq \frac{c^2}{4},$$ where the last inequality holds because $\bar{x}(c-\bar{x}) = \frac{c^2}{4} - \left(\bar{x} - \frac{c}{2}\right)^2$.
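As a sanity check rather than a proof, the chain $\text{var}(\mathbf{x}) \leq \bar{x}(c-\bar{x}) \leq c^2/4$ can be tested numerically on random vectors. The sketch below is only illustrative: the helper names `variance` and `check_bounds`, the value `c = 3.0`, the tolerance, and the random seed are all my own choices, not part of the argument above.

```python
import random


def variance(xs):
    """Population variance (divides by n), matching the definition in the question."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def check_bounds(xs, c, tol=1e-12):
    """Return True if var(x) <= mean*(c - mean) <= c^2/4 for xs in [0, c].

    tol is an assumed slack for floating-point rounding.
    """
    mean = sum(xs) / len(xs)
    middle = mean * (c - mean)
    return variance(xs) <= middle + tol and middle <= c * c / 4 + tol


random.seed(0)  # arbitrary seed, only for reproducibility
c = 3.0         # arbitrary positive upper bound

# Draw many random vectors of random length with entries in [0, c].
all_ok = all(
    check_bounds([random.uniform(0, c) for _ in range(random.randint(1, 20))], c)
    for _ in range(10_000)
)
print(all_ok)
```

A random search like this cannot establish the bound, but a single failure would indicate a mistake in the derivation.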

Added note: (second edit)
The result $\text{var}(X) \leq \frac{c^2}{4}$ also applies to random variables taking on values in $[0,c]$, and, as my first comment on the question says, putting half the mass at $0$ and the other half at $c$ gives the maximal variance of $c^2/4$. For the vector $\mathbf x$, if $n$ is even, the maximal variance $c^2/4$ occurs when $n/2$ of the $x_i$ have value $0$ and the rest have value $c$. Someone else posted an answer -- it has since been deleted -- which said the same thing and added that if $n$ is odd, the variance is maximized when $(n+1)/2$ of the $x_i$ have value $0$ and $(n-1)/2$ have value $c$, or vice versa. This gives a variance of $(c^2/4)\cdot(n^2-1)/n^2$, which is slightly smaller than $c^2/4$. Putting the "extra" point at $c/2$ instead of at an endpoint gives an even smaller variance of $(c^2/4)\cdot(n-1)/n$, since $(n-1)/n < (n^2-1)/n^2$, but both choices have variance approaching $c^2/4$ as $n \to \infty$.
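The even and odd cases can be checked with exact rational arithmetic. The following is a minimal sketch under stated assumptions: it only tries configurations in which every point sits at $0$ or at $c$ (it does not search the interior of $[0,c]^n$), and the names `variance` and `best_split` as well as the choice $c = 1$ are hypothetical, introduced here for illustration.

```python
from fractions import Fraction


def variance(xs):
    """Exact population variance (divides by n) for ints or Fractions."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** 2 for x in xs) / n


def best_split(n, c):
    """Place k points at c and n - k points at 0 for every k in 0..n.

    Returns (largest variance found, list of k values attaining it).
    Only endpoint configurations are tried.
    """
    results = {k: variance([c] * k + [Fraction(0)] * (n - k)) for k in range(n + 1)}
    best = max(results.values())
    return best, [k for k, v in results.items() if v == best]


c = Fraction(1)  # assumed value; the variance scales with c^2
for n in (4, 5, 7):
    best, ks = best_split(n, c)
    # Closed forms quoted above: c^2/4 for even n, (c^2/4)(n^2-1)/n^2 for odd n.
    expected = c**2 / 4 if n % 2 == 0 else c**2 / 4 * Fraction(n * n - 1, n * n)
    print(n, best, ks, best == expected)

# Odd n = 5 with the "extra" point at c/2 instead of at an endpoint.
midpoint_var = variance([0, 0, Fraction(1, 2), 1, 1])
print(midpoint_var, midpoint_var == c**2 / 4 * Fraction(5 - 1, 5))
```

For $n = 5$ and $c = 1$ the best endpoint split gives $6/25$, while the midpoint placement gives $1/5$, matching the two closed forms.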