Question. Is it possible for most (all?) items of a list to lie within one standard deviation of the mean of that list?
Motivation. Fast computation of approximate medians of lists of integers is considered an important problem in computer science. My personal interest in the problem comes from the fact that if a quicksort implementation uses good approximations to the median when choosing its pivot elements, its worst-case time complexity will be $O(n \log n)$, which is considered fast.
With that motivation established, recall that the nonparametric skew of a distribution is defined as $$\frac{\mu - \nu}{\sigma},$$ where $\mu$ is the mean, $\nu$ is the median, and $\sigma$ is the standard deviation. According to the relevant Wikipedia page:
The nonparametric skew ... lies between $-1$ and $+1$ for any distribution
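As a quick numerical sanity check of that bound (the helper name `nonparametric_skew` and the sample lists below are my own):

```python
import statistics

def nonparametric_skew(xs):
    """(mean - median) / population standard deviation of a list."""
    mu = statistics.mean(xs)
    nu = statistics.median(xs)
    sigma = statistics.pstdev(xs)  # assumes sigma > 0
    return (mu - nu) / sigma

# The bound holds even for heavily skewed data.
for xs in [[1, 2, 3, 4, 100], [-5, 0, 0, 0, 5], [7, 7, 7, 7, 8]]:
    assert -1 <= nonparametric_skew(xs) <= 1
```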
Obviously, this applies to lists as well, and not just probability distributions. Therefore, I'm interested in using this to quickly find the median of a list $L$ of not-necessarily-positive integers. Here's the idea: if the standard deviation of $L$ is zero, the first item of the list is the median. Otherwise, we can omit the elements that aren't within one standard deviation of the mean, recalculate $\mu$ and $\sigma$, and repeat the process. A related idea is to stop after a certain portion of the list has been omitted by this process, in the hope that a randomly chosen element of the remainder will be a good approximation to the median.
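In Python, the idea might be sketched like this (the names `approx_median` and `stop_fraction` are my own; this is an illustrative sketch of the strategy just described, not a vetted implementation):

```python
import statistics

def approx_median(xs, stop_fraction=0.5):
    """Repeatedly discard elements lying more than one standard deviation
    from the mean; stop when the standard deviation is zero, when nothing
    gets trimmed, or when the list has shrunk below stop_fraction of its
    original size.  Returns an element hoped to approximate the median."""
    remaining = list(xs)
    target = max(1, int(len(xs) * stop_fraction))
    while len(remaining) > target:
        mu = statistics.mean(remaining)
        sigma = statistics.pstdev(remaining)
        if sigma == 0:
            break  # all remaining elements are equal
        kept = [x for x in remaining if abs(x - mu) <= sigma]
        if len(kept) == len(remaining):
            break  # nothing trimmed; further passes would loop forever
        remaining = kept
    return remaining[0]  # or a randomly chosen remaining element
```

For instance, `approx_median([0, 0, 0, 0, 1000])` trims the outlier on the first pass and returns `0`, the exact median.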
Here's the problem: the process of omitting elements needs to be guaranteed to omit a decent portion of the list at each iteration. Otherwise, the worst-case time complexity of this strategy will be very bad. Indeed, if there exist lists in which all the points are within one standard deviation of the mean, the aforementioned algorithm will never terminate for those lists.
In a degenerate case, it's possible: in a list of integers which has only two values $a$ and $b$, appearing equally often, the mean will be $\frac{a+b}{2}$ and the (population) standard deviation will be $\frac{|a-b|}{2}$, and both values will be within a standard deviation of the mean.
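This is easy to check numerically (with, say, $a = 3$ and $b = 11$; the values are arbitrary):

```python
import statistics

a, b = 3, 11
xs = [a] * 5 + [b] * 5          # the two values appear equally often
mu = statistics.mean(xs)        # (a + b) / 2 = 7
sigma = statistics.pstdev(xs)   # |a - b| / 2 = 4
assert mu == (a + b) / 2 and sigma == abs(a - b) / 2
assert all(abs(x - mu) <= sigma for x in xs)
```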
We can prove that this is the only situation in which this happens (for a probability distribution). If $\Pr[|X-\mu| \le \sigma] = 1$, then $\Pr[(X-\mu)^2 \le \sigma^2] = 1$, so the random variable $(X-\mu)^2$ is almost surely at most its expected value, which by definition is $\sigma^2$. A random variable that is almost surely at most its expected value must almost surely equal that value, so $(X-\mu)^2 = \sigma^2$ almost surely; that is, $X \in \{\mu-\sigma, \mu+\sigma\}$ with probability $1$.
More concerning are situations where most values lie within a standard deviation of the mean, which can happen easily when the data has extreme outliers. We might hope that in the outlined median-finding algorithm, one phase will get rid of outliers, and then the next phase will work much better. But I can imagine scenarios where that won't happen.
For example, consider the following data set of $2n+1$ elements, with mean and median $0$: $$ -2^{2^n}, -2^{2^{n-1}}, -2^{2^{n-2}}, \dots, -2^{2^1}, 0, 2^{2^1}, \dots, 2^{2^{n-2}}, 2^{2^{n-1}}, 2^{2^n}. $$ (Here, $f(x) = 2^{2^x}$ is just one possible extremely fast-growing function.)
The extreme values $\pm 2^{2^n}$ are so large that they make the standard deviation enormous, so every other value falls within one standard deviation of the mean. The algorithm therefore drops only those two extremes from the data set, leaving the other $2n-1$ values. But then we're in the $n-1$ case of the same pattern, and the same behavior repeats; it takes $n$ phases to identify the median of $0$.
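To see the linear-in-$n$ phase count concretely, here is a small simulation (the helper names `worst_case` and `trimming_phases` are mine; the construction follows the dataset above):

```python
import statistics

def worst_case(n):
    """Build the dataset above: +/- 2**(2**k) for k = 1..n, plus a central 0."""
    tail = [2 ** (2 ** k) for k in range(1, n + 1)]
    return [-v for v in reversed(tail)] + [0] + tail

def trimming_phases(xs):
    """Count one-sigma trimming passes until the standard deviation is zero."""
    remaining = list(xs)
    phases = 0
    while True:
        mu = statistics.mean(remaining)
        sigma = statistics.pstdev(remaining)
        if sigma == 0:
            return phases, remaining[0]
        remaining = [x for x in remaining if abs(x - mu) <= sigma]
        phases += 1

# Each pass removes only the two current extremes: n phases for 2n + 1 points.
phases, median = trimming_phases(worst_case(6))
assert (phases, median) == (6, 0)
```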
Whether this is likely or not depends on the kind of distribution your numbers come from.