TL;DR
Given a finite set of data in which each data point is a vote expressing what the voter thinks is true, and given that collusion and manipulation are possible, is there an optimal formula for calculating what is most likely true from that data alone?
I know this is a really vague question, so I'll explain what I mean with an example, and you can tell me whether there is a proof somewhere showing that this kind of thing is or isn't possible. Although this is a specific example, the applications are general and endless, so I'm sure people have thought about this; I just don't know what language they use to describe it.
Pretend you have a company of 5 people. They decide to set their pay in an unusual way.
Each person states how much they think they should get paid and how much they think everyone else should get paid (as a percentage of the company's revenue, we'll say).
So the results might look something like this:
        Alice  Bob  Carly  Dan  Elmer
Alice     100    0      0    0      0
Bob        20   20     20   20     20
Carly      10   10     60   10     10
Dan        10   10     10   40     30
Elmer      10   10     10   30     40
Each row is one person's vote. Alice, as you can see, is greedy: she says she should take all the profits and the others should get nothing. Bob wants to share the profits evenly. Carly wants most of the profit but would give a little to everyone else. Dan and Elmer may or may not have colluded, thinking that if they vote each other up along with themselves, they'll both make more money.
My question is: if you were in charge of using these numbers, and only these numbers, to decide how much everyone should get paid, what would be the most balanced and even way of deciding, taking the possibility of collusion into account?
You might first decide to simply take the average for each person:
        Alice  Bob  Carly  Dan  Elmer
Alice     100    0      0    0      0
Bob        20   20     20   20     20
Carly      10   10     60   10     10
Dan        10   10     10   40     30
Elmer      10   10     10   30     40
-------------------------------------
total     150   50    100  100    100
avg        30   10     20   20     20
This doesn't seem fair, because the obvious optimal strategy is to vote 100% for yourself, and if everyone did that, no new information would be learned.
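To make the averaging step concrete, here is a minimal Python sketch using the numbers from the table above. The names `VOTES` and `column_averages` are just illustrative choices, not anything standard. It also shows the degenerate case where everyone votes 100% for themselves.

```python
# Rows are voters, columns are recipients, in the order
# Alice, Bob, Carly, Dan, Elmer (numbers copied from the table above).
NAMES = ["Alice", "Bob", "Carly", "Dan", "Elmer"]
VOTES = [
    [100, 0, 0, 0, 0],     # Alice's vote
    [20, 20, 20, 20, 20],  # Bob's vote
    [10, 10, 60, 10, 10],  # Carly's vote
    [10, 10, 10, 40, 30],  # Dan's vote
    [10, 10, 10, 30, 40],  # Elmer's vote
]


def column_averages(votes):
    """Mean share that all voters (including the recipient) assign to each recipient."""
    n = len(votes)
    return [sum(row[j] for row in votes) / n for j in range(len(votes[0]))]


averages = column_averages(VOTES)
print(dict(zip(NAMES, averages)))

# If every voter gives 100% to themselves, the vote matrix is diagonal
# and the average carries no information beyond an equal split.
selfish = [[100 if i == j else 0 for j in range(5)] for i in range(5)]
print(column_averages(selfish))
```

With the example votes this reproduces the `avg` row of the table, and the all-selfish case collapses to an equal 20% each.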
So perhaps a better way to balance the score would be to first ask: how far off was your estimate of what you should make, compared with what the rest of the group thought you should make?
        self eval   avg group eval   group eval (normalized)   error (normalized - self)
Alice     100          12.5             20.83                    -79.17
Bob        20           7.5             12.5                      -7.5
Carly      60          10               16.67                    -43.33
Dan        40          15               25                       -15
Elmer      40          15               25                       -15
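The columns in this table can be reproduced with a short Python sketch. The function name `leave_one_out` is my own label; it assumes the group's evaluations do not all sum to zero, since the normalization divides by that sum.

```python
# Rows are voters, columns are recipients: Alice, Bob, Carly, Dan, Elmer.
VOTES = [
    [100, 0, 0, 0, 0],
    [20, 20, 20, 20, 20],
    [10, 10, 60, 10, 10],
    [10, 10, 10, 40, 30],
    [10, 10, 10, 30, 40],
]


def leave_one_out(votes):
    """Compare each person's self-vote with what everyone else voted for them."""
    n = len(votes)
    # What each person voted for themselves (the diagonal).
    self_eval = [votes[i][i] for i in range(n)]
    # Average of what the other n - 1 voters gave to person j.
    group_eval = [
        sum(votes[v][j] for v in range(n) if v != j) / (n - 1)
        for j in range(n)
    ]
    # Rescale the group evaluations so they sum to 100%.
    total = sum(group_eval)
    normalized = [100 * g / total for g in group_eval]
    # Negative error means the person asked for more than the group would give.
    error = [norm - s for norm, s in zip(normalized, self_eval)]
    return self_eval, group_eval, normalized, error


self_eval, group_eval, normalized, error = leave_one_out(VOTES)
for row in zip(self_eval, group_eval, normalized, error):
    print([round(x, 2) for x in row])
```

Dropping each person's own vote removes the incentive to vote 100% for yourself, but as the numbers show, it does nothing about two people voting each other up.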
Notice that we haven't yet arrived at a final score, but so far it looks as though Dan and Elmer's alleged scheming may have given them more money than anyone else. If we keep going in this vein, they may have won via collusion. We don't want to turn the group into a dictator, because then anyone who uses collusion to control the group controls the allocation of value.
Here's where I start getting lost. Is what I'm attempting to do really impossible? Is there no optimal way to evaluate these numbers to come up with the fairest solution? Or has the formula for this already been discovered and is ubiquitous, though unknown to me?
Were I to continue, I'd probably repeat the above calculation for every pair, then every triplet, then every quartet, getting the error between what the rest of the group thought each subgroup should make and what the subgroup voted for itself.
Once I had all those differences, I'm not sure what I would do with them. I assume I would then combine them in some way, perhaps from largest to smallest (group size), to determine a balanced figure for each individual person. But that's about as detailed as my vision is.
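As a rough sketch of what that subgroup calculation might look like, here is one possible reading in Python. Everything about it is an assumption on my part: the name `coalition_errors`, the choice to sum the shares a subgroup receives, and the decision to skip the normalization step used for individuals. It only computes the differences; it says nothing about how to combine them.

```python
from itertools import combinations

# Rows are voters, columns are recipients: Alice, Bob, Carly, Dan, Elmer.
VOTES = [
    [100, 0, 0, 0, 0],
    [20, 20, 20, 20, 20],
    [10, 10, 60, 10, 10],
    [10, 10, 10, 40, 30],
    [10, 10, 10, 30, 40],
]


def coalition_errors(votes, size):
    """For every subgroup of `size` people (size must be less than the number
    of voters), compare the total share its members award to the subgroup
    with the total share the outsiders award to it."""
    n = len(votes)
    errors = {}
    for members in combinations(range(n), size):
        outsiders = [v for v in range(n) if v not in members]
        # Average, over members, of the total each member gives to the subgroup.
        inside = sum(votes[v][j] for v in members for j in members) / len(members)
        # Average, over outsiders, of the total each outsider gives to the subgroup.
        outside = sum(votes[v][j] for v in outsiders for j in members) / len(outsiders)
        # Negative means the subgroup claims more than outsiders would grant.
        errors[members] = outside - inside
    return errors


pair_errors = coalition_errors(VOTES, 2)
print(pair_errors[(3, 4)])  # Dan and Elmer, by index
```

In this unnormalized version Dan and Elmer together claim 70% while the other three would give them 20% on average, so their pair error is -50.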
Now that I think about it, this question really revolves around extracting information from the data. We don't know who's the boss, and we don't know what anyone does in this hypothetical company. All we know is how they voted. Given that information, is there an optimal formula for calculating a balance between those votes? What is most likely true given that set of data?
It seems like this would benefit from being an iterative process, like repeatedly applying Bayes' theorem.
Thank you for your patience; any thoughts, or literature I should read, would be greatly appreciated. As a layman, I often don't know the words or language that academics use for the topics and questions I have, so any direction is welcome.