I need to calculate the following expression:
$$\sum_{k=1}^N a_k b_k$$
I know the average values of $a_k$ , defined as $\overline {a_k} = {\sum_{k=1}^N a_k \over N } $ and $b_k$ , defined as $\overline {b_k} = {\sum_{k=1}^N b_k \over N } $.
I don't know the standard deviation, but one extra piece of information I have is that, to some accuracy, every member of the population $k=1,\dots,N$ is in one of three different states, and I know what fraction is in each state. In terms of numbers, this means that $a_k$ can take only 3 values. I don't know those values, but I know that, for instance, 80% of the $N$ terms have the first value, $a_1$, 19% have the value $a_2$, and 1% have the value $a_3$. The same kind of information is available for $b_k$.
If I have to make an approximation knowing only these quantities, I would like to know how much error that approximation introduces. $N$ is relatively large.
Any help is appreciated. :)
Narj
One possibility is the crude estimate $\sum a_k b_k \approx N \, \overline{a}\,\overline{b}$.
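To see where the error of this estimate comes from, it may help to expand the product of the deviations from the means. Writing $\overline{a}$ and $\overline{b}$ for the averages defined in the question, and using $\sum a_k = N\,\overline{a}$ and $\sum b_k = N\,\overline{b}$:
$$\sum_{k=1}^N (a_k-\overline{a})(b_k-\overline{b}) = \sum_{k=1}^N a_k b_k - \overline{b}\sum_{k=1}^N a_k - \overline{a}\sum_{k=1}^N b_k + N\,\overline{a}\,\overline{b} = \sum_{k=1}^N a_k b_k - N\,\overline{a}\,\overline{b}$$
So the error of the crude estimate is exactly $N$ times the (population) covariance of the two sequences. It vanishes when they are uncorrelated, and it cannot be pinned down from the averages alone.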
If $A_k, B_k$ denote the sequences $a_k, b_k$ arranged in ascending order, we have the bounds (by the rearrangement inequality), $$\sum A_k B_{N-k+1} \le \sum a_k b_k \le \sum A_k B_k$$
We also have the following bounds (by Chebyshev's sum inequality), $$\sum A_k B_{N-k+1} \le \frac1N \left(\sum a_k \right) \left(\sum b_k \right) = N \, \overline{a}\,\overline{b} \le \sum A_k B_k$$
So both numbers lie in the same (albeit possibly large) interval. Unfortunately, the two can sit at opposite ends of that interval unless you have some measure of how the values are spread and correlated. I'm not sure you can do any better with the information at hand.
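As a rough numerical illustration of these bounds, here is a minimal Python sketch. The three state values for each sequence are made up purely for the example (the question says they are unknown), and the helper names `product_sum_bounds` and `three_state_sample` are hypothetical. Only the 80% / 19% / 1% split comes from the question, and the pairing of the two sequences is assumed to be random.

```python
import random


def product_sum_bounds(a, b):
    """Return (lower, crude, upper) for sum(a[k] * b[k]).

    lower: ascending a paired with descending b (rearrangement inequality)
    upper: ascending a paired with ascending b
    crude: N * mean(a) * mean(b), which Chebyshev's sum inequality
           places between lower and upper
    """
    n = len(a)
    A = sorted(a)
    B = sorted(b)
    lower = sum(x * y for x, y in zip(A, reversed(B)))
    upper = sum(x * y for x, y in zip(A, B))
    crude = sum(a) * sum(b) / n
    return lower, crude, upper


def three_state_sample(values, counts, rng):
    """Build a shuffled sequence in which values[i] appears counts[i] times."""
    sample = [v for v, c in zip(values, counts) for _ in range(c)]
    rng.shuffle(sample)
    return sample


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed state values; N = 1000 with an 80% / 19% / 1% split.
a = three_state_sample([1.0, 2.0, 10.0], [800, 190, 10], rng)
b = three_state_sample([0.5, 3.0, 7.0], [800, 190, 10], rng)

exact = sum(x * y for x, y in zip(a, b))
lower, crude, upper = product_sum_bounds(a, b)

print("lower bound   :", lower)
print("crude estimate:", crude)
print("exact sum     :", exact)
print("upper bound   :", upper)
```

Note that the bounds need the sorted values themselves, so with only the averages and the state fractions they cannot be evaluated. The sketch only shows how wide the interval can be once some values are assumed.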