Formula to generate "Score" that ranges between 0 and 100 based on many inputs


Suppose you have a set of data looking like this:

views: 1500
likes: 23
avgRating: 7
ratings: 2
purchases: 6

Note that these numbers may vary dramatically. I want to create a formula that outputs a "score" between 0 and 100 based on the given data. I also want to set a different "weight" for each variable, to give it an "importance factor".

P.S.: The purpose is to create a ranking system which sorts based on the score.


Best answer

Obviously, there is no end to the number of ways this might be done. Here is one idea to get you started thinking. I know almost nothing about your business and your objectives, so I suppose you will want to modify my suggestions.

Put each measure on a scale of 0 to 100. Start by putting each type of data on a scale between 0 and 100, where 0 is safely below the smallest value and 100 is safely above the largest.

For example, maybe your 10,000 recorded view counts run from 150 to 3819. You know views can't be less than 0, and suppose they will stay below 5000. Then your score for views could be the number of views divided by 50, so that 5000 views would score 100.

summary(views)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    150    1659    1996    1994    2330    3819 
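
As a rough illustration of this linear rescaling, here is a minimal Python sketch (Python is used only for illustration; the summaries in this answer come from R). The function name `linear_score` and the clamping of out-of-range values are assumptions; the ceiling of 5000 is the one supposed in the example above.

```python
def linear_score(value, lo=0.0, hi=5000.0):
    """Map value from [lo, hi] onto [0, 100], clamping anything outside."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scaled = 100.0 * (value - lo) / (hi - lo)
    # Clamp so an unexpectedly large or small value cannot leave the 0-100 range.
    return max(0.0, min(100.0, scaled))


# Min, median and max from the summary above, plus one value over the ceiling.
for v in (150, 1996, 3819, 6000):
    print(v, linear_score(v))
```

With the defaults this is just views divided by 50, except that anything above the supposed ceiling is held at 100 instead of overshooting.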

If you absolutely can't imagine a top value from your data, maybe take the square root or the log of views and see if you can guess a top value of the modified data.

summary(sqrt(views))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  12.25   40.73   44.68   44.29   48.27   61.80 

Maybe you could use 100 as the max, and sqrt(views) as the score.

summary(log(views))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  5.011   7.414   7.599   7.563   7.754   8.248 

Maybe you could use 10 as the max and multiply log(views) by 10 to get a score.
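
The two transformed scores could be sketched in Python as follows (again only an illustration, not the R session used for the summaries). The function names, the caps at 100, and the choice to score fewer than 1 view as 0 in the log version are assumptions; `math.log` is the natural log, which matches the `log(views)` values shown above.

```python
import math


def sqrt_score(views):
    """Score = sqrt(views), capped at 100 (reached at 10,000 views)."""
    return min(100.0, math.sqrt(max(views, 0)))


def log_score(views):
    """Score = 10 * ln(views), held within [0, 100]."""
    if views < 1:
        # Assumption: log is undefined at 0 and negative below 1, so score 0.
        return 0.0
    return min(100.0, 10.0 * math.log(views))


# Min and max view counts from the summaries above.
for v in (150, 3819):
    print(v, round(sqrt_score(v), 2), round(log_score(v), 2))
```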

But use transformations with care. If you use transformations you might change symmetrical data into skewed data (as here). If you start with right-skewed data, then transformations might make scores more nearly symmetrical. Look at some plots to see the effects of any transformations.


Moving along, maybe you know avgRating has to be between 0 and 10, so the score for that could be found by multiplying by 10.

In this way you will get five scores $s_1, s_2, \dots, s_5.$ Over time, you may find you have to change some of these because traffic is becoming much heavier or much sparser.

Decide on weights that add to 1: Then decide how important each of these measures is. You have five of them. Weights $w_1, w_2, \dots, w_5$ should be positive fractions adding to $1$: perhaps something like .1, .25, .25, .1, .3.

Find total score: Then the total score for each item would be $$T = \sum_{i=1}^5 w_is_i = w_1s_1 + w_2s_2 + \cdots + w_5s_5,$$ which will lie between 0 and 100, because each $s_i$ lies between 0 and 100 and the weights are positive and add to $1$.
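
To make the whole recipe concrete, here is a minimal Python sketch that scores the example item from the question. Several things in it are assumptions rather than part of the suggestion above: the ceilings for likes, ratings and purchases are invented purely for illustration, the example weights are assigned to the measures in the order the question lists them, and every measure is rescaled linearly from 0 to its ceiling.

```python
import math

# Ceilings used to rescale each measure to 0-100. The values for views and
# avgRating follow the examples above; the other three are hypothetical.
CEILINGS = {"views": 5000, "likes": 100, "avgRating": 10,
            "ratings": 50, "purchases": 20}

# Example weights from the text; their assignment to measures is an assumption.
WEIGHTS = {"views": 0.10, "likes": 0.25, "avgRating": 0.25,
           "ratings": 0.10, "purchases": 0.30}


def total_score(item):
    """Weighted sum T = sum(w_i * s_i) of per-measure scores on 0-100."""
    if not math.isclose(sum(WEIGHTS.values()), 1.0):
        raise ValueError("weights must add to 1")
    total = 0.0
    for name, weight in WEIGHTS.items():
        s = 100.0 * item[name] / CEILINGS[name]
        s = max(0.0, min(100.0, s))  # keep each s_i inside 0-100
        total += weight * s
    return total


item = {"views": 1500, "likes": 23, "avgRating": 7,
        "ratings": 2, "purchases": 6}
print(round(total_score(item), 2))  # 35.65 under the assumed ceilings
```

Here the per-measure scores are 30, 23, 70, 4 and 30, so $T = 3 + 5.75 + 17.5 + 0.4 + 9 = 35.65$. Sorting items by this value gives the ranking asked for in the question.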