Scoring Items Based on Their Rating Distribution

Each item has multiple ratings in the range [1, 5], with the distribution given as [N1, N2, N3, N4, N5] and the review count as RC.

N1, N2, N3, N4, N5 are known and equal the number of 1-ratings, 2-ratings, 3-ratings, 4-ratings and 5-ratings, respectively.

How should a ranking system based on the rating distribution be defined for, say, n items?

A few things to consider:

  1. A rating of 1 or 2 usually means the customer didn't like the item at all.

So, a weighted average would look like this, where R1, R2, R3, R4, R5 are the weights given to each rating value: (-N1*R1 - N2*R2 + N3*R3 + N4*R4 + N5*R5) / (N1+N2+N3+N4+N5)

  2. Let's take RC as an amplifying factor on the final score.
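A minimal sketch of that formula in Python, to make the sign convention concrete. The function name `weighted_average_score`, the sample counts, and the choice to return 0.0 for an item with no ratings are assumptions for illustration, not part of the question.

```python
def weighted_average_score(counts, weights):
    """Signed weighted average of a rating distribution.

    counts  = [N1, N2, N3, N4, N5]  (number of 1- to 5-ratings)
    weights = [R1, R2, R3, R4, R5]  (importance given to each rating value)
    The 1- and 2-rating terms are subtracted, as in the formula above.
    """
    signs = (-1, -1, 1, 1, 1)
    total = sum(counts)
    if total == 0:
        return 0.0  # assumption: an unrated item scores 0
    signed_sum = sum(s * n * r for s, n, r in zip(signs, counts, weights))
    return signed_sum / total


# Hypothetical item: 2 one-ratings, 1 two-rating, 3 three-ratings,
# 4 four-ratings and 10 five-ratings, with Ri = i as placeholder weights.
example_score = weighted_average_score([2, 1, 3, 4, 10], [1, 2, 3, 4, 5])
print(example_score)
```

The weights are passed in separately from the counts so that different candidate values of R1 to R5 can be compared on the same data.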

So, what mathematical approach should we use to find R1, R2, R3, R4, R5 from customer data, so that we can clearly rank products and give the right importance to each rating?

Example:

Item1: {3: 4} => N3 = 4

Item2: {4: 3} => N4 = 3

So, if we keep R3=3 and R4=4, then Item1 and Item2 will have equal weighted sums (4*3 = 3*4 = 12). But if we give R4 any value greater than 4, then Item2 will have a higher score than Item1. So, how do we find from customer data how much greater 4 is than 3?
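The tie can be checked directly. This is only a sketch: it compares the weighted sums (the numerator of the formula, before dividing by the number of ratings), and the helper name `weighted_sum` and the trial value R4 = 4.5 are assumptions chosen for illustration.

```python
SIGNS = (-1, -1, 1, 1, 1)  # 1- and 2-ratings count against the item


def weighted_sum(counts, weights):
    """Numerator of the score: signed sum of Ni * Ri over the five rating values."""
    return sum(s * n * r for s, n, r in zip(SIGNS, counts, weights))


item1 = [0, 0, 4, 0, 0]  # Item1: {3: 4}
item2 = [0, 0, 0, 3, 0]  # Item2: {4: 3}

equal_weights = [1, 2, 3, 4, 5]     # R3 = 3, R4 = 4
bumped_weights = [1, 2, 3, 4.5, 5]  # hypothetical R4 slightly above 4

tie = weighted_sum(item1, equal_weights) == weighted_sum(item2, equal_weights)
item2_wins = weighted_sum(item2, bumped_weights) > weighted_sum(item1, bumped_weights)
print(tie, item2_wins)
```

With Ri = i both sums are 12, so the ranking between the two items depends entirely on how far R4 is placed above R3.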

Also, the issue with the weighted average is that a popular item with, say, ten 5-ratings would score the same as a new item with a single 5-rating if we take R1 to be 1 and R5 to be 5.
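A small sketch of this problem, assuming Ri = i. Both items contain only 5-ratings, so the negative signs on the 1- and 2-rating terms play no role here. Multiplying the average by RC is just one possible reading of "amplifying factor", shown to make the contrast visible; the helper name `mean_rating` and that linear scaling are assumptions, not a recommended formula.

```python
def mean_rating(counts, weights):
    """Plain weighted average of a distribution that has at least one rating."""
    total = sum(counts)
    return sum(n * r for n, r in zip(counts, weights)) / total


weights = [1, 2, 3, 4, 5]    # Ri = i
popular = [0, 0, 0, 0, 10]   # ten 5-ratings
newcomer = [0, 0, 0, 0, 1]   # a single 5-rating

popular_avg = mean_rating(popular, weights)
newcomer_avg = mean_rating(newcomer, weights)
print(popular_avg == newcomer_avg)  # the average alone cannot separate them

# Hypothetical amplification: scale the average by the review count RC.
popular_scaled = popular_avg * sum(popular)
newcomer_scaled = newcomer_avg * sum(newcomer)
print(popular_scaled > newcomer_scaled)
```

Linear scaling by RC separates the two items, but it also lets volume dominate the score, which is why the form of the amplifying factor is itself part of the question.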