In my database, each item has a total_votes and total_ranking. When a user vote, I add +1 to the total votes and add its rating (1 to 5, 5 is the highest) to the total_ranking (total_ranking = total_ranking + user_vote).
This is all the data that I have. I want a better algorithm to calculate the "best" item. Right now I just do an average, but if one item has rating of 5, it can beat another item with 5000 rating and average of 4.9.
Is there any good alternative consider that I only have those two columns.
Update: I thought about using this formula
total_votes^(average_rating / 10) + (average_rating)
Look around various retail websites and see how they treat the situation. Often times, they also will list items with very few but very good ratings above those with many ratings but slightly lower average rating. A way to combat this in practice, is to provide ways to filter results based on minimum number of votes, or perhaps use a sample standard deviation argument and compare based on one sample standard deviation lower than the true average.
Take for example Amazon.
As you can see, Amazon also follows the pattern of when sorting by rating first, the book with only 133 ratings but average rating of 4.8 appears before the book with over 2500 ratings but average rating of 4.6.
It should be mentioned that 'better rating first' sort is not the default sort. They use a 'relevancy' sort as default, the specifics of which I am unsure of.