I have a web application in which users can provide valuations for items. These valuations are currently unrestricted, and there is no non-arbitrary criterion by which I could restrict them, since valuations are purely subjective. For each item, I have an "average valuation", which is currently just a simple mean of all valuations for the item. This was a naive choice that has inevitably led to abuse: a small subset of users intentionally provide extreme valuations to manipulate this "average valuation" to their benefit.
I'd like to solve this problem mathematically, without imposing arbitrary restrictions on submitted valuations. As far as I can tell, it's commonplace for item valuations to follow a normal distribution, but I admittedly don't remember much of my statistics from decades ago. I have been attempting to brush up on standard deviations, but it seems I can't simply exclude valuations more than 2σ from the mean, because the extreme outliers themselves inflate the standard deviation (I could be misunderstanding this concept, though). So...
What statistical approach can I take to determine the "average valuation", while remaining resistant to this kind of abuse?
For reference, here is an example set of valuations for a particular item. The outliers here are clearly the [317, 318, 630, 630, 640, 6511] subset:
60, 63, 63, 63, 63, 63.5, 63.8, 63.8, 63.9, 63.9, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64.2, 64.5, 64.5, 64.5, 64.5, 64.5, 64.9, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66.5, 67, 67, 67, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 69, 69, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 71, 72, 74, 74, 75, 75, 75, 75, 78, 80, 85, 317, 318, 630, 630, 640, 6511
For this set of valuations, I would expect an "average valuation" somewhere around 65 or 66.
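To make the problem with a 2σ cutoff concrete, here is a quick Python sketch on synthetic data shaped like mine (the counts are invented for illustration, not my actual data): the single huge valuation inflates the standard deviation so much that the other manipulative valuations fall safely inside the 2σ band.

```python
import statistics

# Synthetic valuations shaped like my real data: 94 honest values near 65
# plus the abusive subset (counts are illustrative, not the actual data).
vals = [65] * 94 + [317, 318, 630, 630, 640, 6511]

mu = statistics.mean(vals)       # pulled far above 65 by the outliers
sigma = statistics.pstdev(vals)  # inflated by the same outliers
cutoff = mu + 2 * sigma

# Only the most extreme value exceeds the 2-sigma cutoff; the other
# manipulative valuations (317..640) survive the filter.
print(f"mean={mu:.2f} sd={sigma:.2f} cutoff={cutoff:.2f}")
print("filtered out:", [v for v in vals if v > cutoff])
```

On this toy set the cutoff lands well above 640, so everything except the single 6511 survives the filter.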
The median, as suggested by @Henry, seems the simplest solution to your problem. You might also consider a 'trimmed mean'. Below I show the mean, a 5% trimmed mean (average the middle 90% of your data after discarding the top and bottom 5%), and the median. (In a sense, the median is a 50% trimmed mean.) You could try out various degrees of trimming to see what works best in your situation.
I pasted your data into R statistical software and computed these three summaries.
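If you'd rather not depend on R, the same three summaries are easy to compute by hand. A Python sketch on synthetic data shaped like yours (counts invented for illustration); the hand-rolled trimmed mean corresponds to R's `mean(x, trim = 0.05)`:

```python
import statistics

def trimmed_mean(xs, trim=0.05):
    """Mean of xs after dropping the lowest and highest `trim` fraction
    (the same idea as R's mean(x, trim = 0.05))."""
    xs = sorted(xs)
    k = int(len(xs) * trim)
    return statistics.mean(xs[k:len(xs) - k] if k else xs)

# Synthetic valuations: 94 honest values plus the abusive subset.
vals = [65] * 94 + [317, 318, 630, 630, 640, 6511]

print("mean      :", statistics.mean(vals))    # dragged up by outliers
print("5% trimmed:", trimmed_mean(vals))       # most outliers discarded
print("median    :", statistics.median(vals))  # essentially immune
```

On this toy set the mean comes out above 150, while the trimmed mean and the median both stay near 65.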
Both the trimmed mean and the median would also provide protection if there were users who purposely gave absurdly low values.
Another approach would be to use a 'boxplot' to detect 'outliers', ignore the outliers, and average the rest. The default outlier-detection method may include as outliers some values you would want to keep. (You could adjust the outlier rule to be less aggressive.)
This procedure ignores the possibility of low outliers and omits only the high ones, so it has some potential of giving a result biased on the low side.
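A Python sketch of that boxplot idea (my own code, again on synthetic data shaped like yours): values above the usual upper fence, Q3 + 1.5 × IQR, are dropped and the rest averaged.

```python
import statistics

# Synthetic valuations with some spread plus the abusive subset
# (counts invented for illustration).
vals = ([63] * 10 + [64] * 40 + [65] * 30 + [68] * 10 + [70] * 4
        + [317, 318, 630, 630, 640, 6511])

q1, _, q3 = statistics.quantiles(vals, n=4)   # quartiles
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr                  # standard boxplot rule

kept = [v for v in vals if v <= upper_fence]  # ignore high outliers only
print(f"upper fence = {upper_fence}")
print(f"average of kept values = {statistics.mean(kept):.2f}")
```

Here the fence lands at 66.5, so the legitimate 68s and 70s get thrown out along with the manipulative values — the caveat mentioned above. Raising the 1.5 multiplier (to 3, say) makes the rule less aggressive.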
My guess is that you want something simple and automatic. I would probably use the ordinary mean, 5% trimmed mean, and median in tandem for a while and then pick one of the latter two, depending on track record.