Given array of numbers $n$, what is the probability that at least k elements will have smaller value than arithmetic average?

353 Views Asked by At

Let's say we have algorithm like quicksort. I want to analyze its average time complexity based on choosing pivot as arithmetic average. But I am struggling to find the answer for - What is the probability that at least k elements in the array of size $n$ would have lesser value than arithmetic average of this array ?

$n > 0, n \ge k \ge 0$

My basic thoughts are that expected value of random variable - number of elements which are lesser than arithmetic average of the array - is $n/2$. But how to find the actual probabilities for fixed $k$? Maybe we could say that every option of $k$ would have the same probability because there is infinite numbers of combinations how we can construct this array for arithmetic average $a$ and some fixed $k$. But basic intuitions says it is not right and I somehow can not come with the proof.

1

There are 1 best solutions below

2
On BEST ANSWER

Consider the probability mass distribution with $p(0) = 0.1$ and $p(1) = 0.9$. If you draw $n$ items from this distribution, then on average, the mean will be $0.9$. And (again on average) $10$ percent of the $n$ items will be less than this mean.

So for this distribution, with $k = n/2$, and $n$ large, the probability that $k$ of the items will be less than the mean will be vanishingly small.

Now let $q = 1- p$. That's a different probability mass distribution, but the same analysis shows that the probability that $k = n/2$ of the items will be less than the mean, for $n$ large, approaches $1$.

In short: without knowing the distribution from which the items in your array are drawn, you will not be able to determine the probability that you seek.

When we do probabilistic analysis of algorithms, we almost always run into this problem, and you'll find that almost all analyses instead base their probabilistic work on the distribution of variables over which we have control. For instance, in quicksort (or at least one version of it), we uniformly randomly pick one of the $n$ numbers in the array to use as the pivot. The probability that at least $n/2$ of the numbers are less than this uniformly-randomly-chosen item is then (roughly) $1/2$. But that's because we chose uniformly-randomly, rather that using something that's input-distribution-dependent like the arithmetic mean of the values.