Similar to:
Average of products VS. product of averages
I have
$$ R = \{(s_1, t_1), (s_2, t_2), ..., (s_n, t_n)\} $$
With the property that, for all pairs $(s_i, t_i)$, $0 \le s_i < t_i$.
$$ \forall (s_i, t_i) : 0 \le s_i < t_i $$
The value of one pair, $(s_i, t_i)$ is $s_i \over t_i$
I want to select an $m$ sized subset where $m < n$ that maximizes value.
One way to compute the value of a subset $P \subseteq R$ is to take the mean of the values of the individual pairs:
$$p(P) = {1 \over m} {\sum_{i=1}^m {s_i \over t_i}}$$
Another way is to compute the value of the summed individual elements from the pairs:
$$q(P) = {\sum_{i=1}^m {s_i} \over \sum_{i=1}^m {t_i}}$$
Let $P'$ be the subset $P$ of size $m$ such that $p(P)$ is maximized. (This is easy. Select the $m$ elements of $R$ with highest value.)
Let $Q'$ be the subset $Q$ of size $m$ such that $q(Q)$ is maximized. (I think this requires combinatorial search.)
What is the relationship between $p(P')$ and $q(Q')$?
Will $p(P') \le q(Q')$ always?
Which is the truer maximum value?
An example with $p(P') > q(Q')$ is this: Let $n=3,m=2$, and define the set: $$ \{(1,1), (1,200), (10^{-6}, 10^6)\} $$ where the pair $(10^{-6}, 10^6)$ was thrown in as an undesirable option to ensure $m<n$. We get: \begin{align} p(P') &= \frac{1}{2}\left[\frac{1}{1} + \frac{1}{200}\right] = 0.5025 \\ q(Q') &= \frac{1 + 1}{1+200} \approx 0.00995 \end{align}
Now, your question about "which is the truer maximum value" does not really make sense without a definition of "truer." The $p()$ and $q()$ functions are just different functions and so they can have different max values.
To find an $\epsilon$-approximation to the optimal set $Q'$ very quickly, you can do the following. (This is similar to what I did for stochastic ratio problems in Section V of this paper: http://ee.usc.edu/stochastic-nets/docs/renewal-optimization.pdf)
Problem setup:
-Fix $n$ and $m$ as a positive integers with $m \leq n$.
-Fix real numbers $\{s_1, ..., s_n\}$.
-Fix positive real numbers $\{t_1, ..., t_n\}$.
Define $\mathcal{A}_m$ as the set of all subsets of $\{1, ..., n\}$ that contain $m$ elements. Define: \begin{align} Q' &= \arg \max_{Q \in \mathcal{A}_m} \left[\frac{\sum_{i\in Q} s_i}{\sum_{i\in Q} t_i} \right] \\ M &= \max_{Q \in \mathcal{A}_m} \left[\frac{\sum_{i\in Q} s_i}{\sum_{i\in Q} t_i} \right] \end{align}
Fix $\epsilon>0$. We want to find a set $Q \in \mathcal{A}_m$ that gives $\frac{\sum_{i\in Q} s_i}{\sum_{i\in Q} t_i} \geq M-\epsilon$.
Solution approach:
Define the following function $g:\mathbb{R}\rightarrow\mathbb{R}$ by: $$ g(\mu) = \max_{Q \in \mathcal{A}_m} \left[\sum_{i \in Q} s_i - \mu \sum_{i\in Q} t_i\right] $$
It is not difficult to show:
1) For a given $\mu \in \mathbb{R}$, the value $g(\mu)$ can be easily found by selecting the $m$ indices in $\{1, ..., n\}$ with the largest values of $(s_i -\mu t_i)$.
2) The function $g(\mu)$ is strictly decreasing in $\mu$.
3) $g(\mu)=0$ if and only if $\mu=M$.
So finding an $\epsilon$-approximation reduces to approximating the zero of the decreasing function $g(\mu)$. We can just run a simple bisection search.