Say that I have multiple sets such as:
[0 0.1 0.2 -0.1 -0.001 0.5 1.0 2.0]
[0.1 0.3 0.5 0.1 0.4 -0.2 0.5]
......
There can be as many as 50 separate sets. My question is: is the set with the highest sum always guaranteed to have the lowest CV (coefficient of variation)?
CV = std_dev/mean
I have analyzed many different sets and found that the one with the highest sum usually has the lowest CV. BUT intuitively I think that a high sum should not guarantee the lowest CV, for instance if there are significant outliers in the data set. Can anybody prove or disprove that the set with the highest sum also has the lowest CV?
- assume sets are same length and are of somewhat low variance
The sets represent profits and losses of an algorithmic stock trading strategy. The strategy has top-level parameters which lead to corresponding profits and losses. I analyze these sets to determine the top-level parameters with the highest performance. I ask the question because, in evaluating the sets, I choose either the one with the highest sum or the one with the lowest CV. The highest sum is much less computationally expensive, whereas the CV requires me to store the sets while the algorithm is running, which takes up more memory and CPU. The advantage of a low CV is that it indicates steadier returns, whereas a set with a high sum can still behave sporadically.
No - there are many counter-examples. Suppose you have the data
Then the first has mean $0.8$ and standard deviation $0.2$ (using the $\frac1{n-1}$ method), so a coefficient of variation of $\frac{\text{standard deviation}}{\text{mean}}=0.25$.
The second has mean $0.5$ and standard deviation $0.1$, so a coefficient of variation of $0.2$, which is lower despite the lower mean (and, since the sets have the same length, the lower sum).
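The two data sets themselves do not appear in the text above, so as a rough check here is a small Python sketch using two hypothetical sets picked only because they reproduce the quoted means and standard deviations. The values in `first` and `second` and the helper name `cv` are assumptions for illustration, not necessarily the data originally intended.

```python
from statistics import mean, stdev

def cv(values):
    """Coefficient of variation: sample standard deviation (n-1 method) over the mean."""
    return stdev(values) / mean(values)

# Hypothetical data, chosen only to match the statistics quoted in the answer.
first = [0.6, 0.6, 0.8, 1.0, 1.0]   # mean 0.8, sample standard deviation 0.2
second = [0.4, 0.4, 0.5, 0.6, 0.6]  # mean 0.5, sample standard deviation 0.1

for name, data in (("first", first), ("second", second)):
    print(f"{name}: sum={sum(data):.2f} mean={mean(data):.2f} "
          f"sd={stdev(data):.2f} cv={cv(data):.2f}")
```

With these values the first set has the higher sum and also the higher CV, which is all a counter-example needs.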
Meanwhile $-0.3,-0.3,-0.1,0.1,0.1$ might be an example of Ross Millikan's objection: it has mean $-0.1$ and standard deviation $0.2$, so a ratio of $-2$, which is even lower. Some people might think a negative value is not suitable for a measure describing itself as a coefficient of variation.
An even worse behaved example could be $-0.1,-0.1,0.0,0.1,0.1$, with mean $0$ and standard deviation $0.1$, suggesting a ratio which might be $\pm\infty$; this cannot be meaningfully described as higher or lower than the others.
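On the computational concern in the question: the CV does not strictly require storing each set, because the mean and variance can be updated one value at a time. Below is a minimal sketch using Welford's online method, which also declines to report a CV when the mean is negative or effectively zero, as in the last two examples. The class name `RunningCV` and the `min_mean` threshold are hypothetical choices for illustration, not part of the original strategy code.

```python
import math

class RunningCV:
    """Tracks count, mean and squared deviations in O(1) memory (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def total(self):
        """Sum of all values seen so far."""
        return self.mean * self.n

    def cv(self, min_mean=1e-12):
        """Sample CV (n-1 method), or None when it is not meaningful.

        min_mean is an assumed cutoff: a mean at or below it (zero, negative,
        or zero up to rounding error) yields None rather than a misleading ratio.
        """
        if self.n < 2 or self.mean <= min_mean:
            return None
        return math.sqrt(self.m2 / (self.n - 1)) / self.mean

acc = RunningCV()
for pnl in [0.6, 0.6, 0.8, 1.0, 1.0]:  # hypothetical profit/loss stream
    acc.add(pnl)
print(acc.total(), acc.cv())
```

Each parameter setting would then need only three numbers of state, so ranking by CV costs little more than ranking by sum.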