n tasks assigned to n computers: what is the expected number of computers getting 5 or more tasks?


Say a central server assigns each of n tasks uniformly and independently at random to n computers connected to it on a network. Say a computer is ‘overloaded’ if it receives 5 or more tasks.

Q: Calculate the expected number of overloaded computers.

I thought of doing [1 − Pr(a computer is not overloaded)], but that leads me to a complicated expression: $$1 - \Pr(\text{NotOver}) = 1 - \sum_{i=0}^4 \left( \frac{1}{n} \right)^{i} { \left( \frac{n-1}{n} \right)}^{n-i}$$ Multiplying this by n would (hopefully) give the expected value. But the answer seems not very elegant at all; is there something I'm missing, or an easier way to tackle this? Thanks!


There are 2 best solutions below

BEST ANSWER

Hagen von Eitzen's suggestion of using the Poisson distribution in fact provides an upper bound.

For large $n$, with an expected number of tasks per computer of $\frac{n}{n} = 1$, the probability that a single computer is not overloaded is about $e^{-1}\left(1 + \frac{1}{1!}+ \frac{1}{2!}+ \frac{1}{3!}+ \frac{1}{4!}\right) \approx 0.99634015317$ and so the probability a single computer is overloaded is about $0.00365984683$.
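The Poisson tail above is easy to check numerically; a quick sketch in Python:

```python
import math

# Poisson(1) approximation: P(a computer gets <= 4 tasks) = e^{-1} * sum_{k=0}^{4} 1/k!
p_not_over = math.exp(-1) * sum(1 / math.factorial(k) for k in range(5))
p_over = 1 - p_not_over
print(p_not_over)  # ≈ 0.99634015317
print(p_over)      # ≈ 0.00365984683
```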

For smaller $n$, the probability a single computer is overloaded is less than this, because receiving many more tasks than the expectation is less likely for smaller $n$.

So $0.00365984683$ is an upper bound on a single computer being overloaded, and by linearity of expectation $0.00365984683n$ is an upper bound on the expected number of computers being overloaded.

For what it is worth, the probabilities a single computer is overloaded for different $n$ are

n      probability 
1      0
2      0
3      0
4      0
5      0.00032
6      0.0006644376
7      0.0009701983
8      0.0012300611
9      0.0014492806
10     0.0016349374
20     0.0025739403
30     0.0029210503
40     0.0031003767
50     0.0032097420
100    0.0034323216
1000   0.0036368780
10000  0.0036575478
100000 0.0036596171
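These exact probabilities come from the Binomial$(n, 1/n)$ distribution of tasks per computer; a short Python sketch (the function name is mine) reproduces the table:

```python
from math import comb

def p_overloaded(n):
    """Exact P(a fixed computer receives >= 5 of the n tasks), X ~ Binomial(n, 1/n)."""
    return 1 - sum(comb(n, k) * (1 / n) ** k * ((n - 1) / n) ** (n - k) for k in range(5))

for n in (5, 10, 100, 1000):
    print(n, p_overloaded(n))  # matches the table, e.g. p_overloaded(5) = 0.00032
```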

I believe your formula for P(overloaded) is wrong. It looks like you are trying to calculate

1 - (P(0 tasks) + P(1 task) + ... + P(4 tasks))

However the probability of k tasks is ${n \choose k}\left(\frac{1}{n}\right)^k\left(\frac{n-1}{n}\right)^{(n-k)}$, not simply $\left(\frac{1}{n}\right)^k\left(\frac{n-1}{n}\right)^{(n-k)}$. You are missing the binomial coefficient.

So the exact formula for E[overloaded] is:

$n\left(1-\sum_{i=0}^4 {n \choose i}\left(\frac{1}{n}\right)^i\left(\frac{n-1}{n}\right)^{(n-i)}\right)$

To answer your question, I don't believe there are any tricks to simplify this any further.
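As a sanity check, the exact expectation is straightforward to evaluate numerically (a minimal Python sketch; the function name is mine):

```python
from math import comb

def expected_overloaded(n):
    """E[# overloaded] = n * (1 - sum_{i=0}^{4} C(n,i) (1/n)^i ((n-1)/n)^(n-i))."""
    p_not = sum(comb(n, i) * (1 / n) ** i * ((n - 1) / n) ** (n - i) for i in range(5))
    return n * (1 - p_not)

print(expected_overloaded(1000))  # ≈ 3.637 overloaded computers out of 1000
```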

Consider using the Multiplicative Chernoff Bound to estimate an upper bound on the probability of one computer being overloaded and then multiplying by $n$ to get an upper bound on the expected number of overloaded computers.

Let $X_i$ be a random variable describing the number of tasks assigned to computer $i$. We have $E(X_i) = 1$, and the Chernoff bound of the probability of $X_i$ being overloaded is given by:

$$P(X_i > 4) = P \Big( X_i > (1+3) \cdot 1 \Big) < \frac {\textrm e^3} {(1+3)^{1+3}}$$

and so, writing $Y_i$ for the indicator that computer $i$ is overloaded, the expected number of overloaded computers can be bounded by:

$$E \left( \sum Y_i \right) = \sum P(X_i > 4) < n \frac {\textrm e^3} {(1+3)^{1+3}} \approx 0.078\, n .$$
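A quick numeric check of the bound's value:

```python
import math

# Multiplicative Chernoff bound with mu = 1 and 1 + delta = 4: e^3 / 4^4
chernoff_bound = math.exp(3) / 4 ** 4
print(chernoff_bound)  # ≈ 0.0785
```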

This seems to be a loose bound, however. For example, simulating in Python for n = 1000000 gives about $0.0037\,n$:

>>> from random import randint
>>> from collections import Counter
>>> n = 1000000; t = [randint(1, n) for i in range(n)]; c = Counter(t); sum(1 for v in c.values() if v > 4)/n
0.003653