Calculate new ratings based on averages and counts?

For a while, I've been scraping a website for an item's "ratings" data.

It seems to me that if I know (for any given timeframe):

  • the start and end count of submitted ratings, and,
  • the start and end average rating (to 16 decimal places)

...then I should be able to compute the average of the "new" ratings, like:

$$\text{Average New Rating} = \frac{( \text{New Average} \times \text{New Count})-(\text{Old Average} \times \text{Old Count})}{(\text{New Count} - \text{Old Count})}$$

...but I'm getting strange results: some periods give reasonable values, while others are off by thousands in either direction. Note that the possible ratings (on Google Play) are 1 to 5.
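In code, the formula is a straightforward rearrangement of rating sums. A minimal Python sketch (the function name and example values here are illustrative, not from the actual dataset):

```python
def average_new_rating(old_avg, old_count, new_avg, new_count):
    """Average of the ratings submitted between two snapshots.

    Derived from: new_avg * new_count = old_avg * old_count + (sum of new ratings).
    """
    delta = new_count - old_count
    if delta == 0:
        raise ValueError("no new ratings in this period")
    return (new_avg * new_count - old_avg * old_count) / delta

# 1000 ratings averaging 3.5, then 1050 ratings averaging 3.7:
print(average_new_rating(3.5, 1000, 3.7, 1050))  # 7.7 -- already outside 1..5
```

As the example shows, even with a growing count the inferred average can escape the valid range, which is the symptom described above.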

If you're curious to see the data, there's sample CSV data here.

(I apologize if I abused LaTeX... I'm new here!)

1 Answer

Your formula is correct, but your data needs more careful interpretation, because the count is not monotonically increasing with time. In other words, there are records in your dataset for which the total count is less than a previous record (where records are sorted chronologically). This suggests that occasionally, when the rating data is scraped, some reviews have been removed or undone, possibly because of some behind-the-scenes algorithm being applied to prevent rating manipulation.

When the rating count decreases, the denominator of your formula becomes negative, so the computed "new" average can be negative or otherwise land far outside the valid range. This also has ramifications for a moving average or other kind of smoothed average. If, for instance, your (average rating, count) data looked like this:

$$(3.5, 1000), \\ (3.7, 1050), \\ (3.5, 1020), \\ (3.8, 995), \\ (3.9, 1000),$$

you can see how even a $4$-step average yields a division by zero (an "infinite" average rating), since there is no net change in count from the first to the last entry in this list, yet the average rating itself has changed. This can occur because the reviews introduced in the intervening periods were not necessarily the same ones that were removed once the counts went back down. This is the essence of why your average "new" rating may occasionally fall outside the $1$- to $5$-star range.
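Applying the formula to consecutive pairs of these snapshots makes the failure mode concrete (a quick Python sketch; `snapshots` is just the example data above):

```python
# (average rating, count) snapshots from the example above
snapshots = [(3.5, 1000), (3.7, 1050), (3.5, 1020), (3.8, 995), (3.9, 1000)]

for (old_avg, old_n), (new_avg, new_n) in zip(snapshots, snapshots[1:]):
    delta = new_n - old_n
    # delta == 0 would make the formula undefined; flag it as infinite
    inferred = (new_avg * new_n - old_avg * old_n) / delta if delta else float("inf")
    print(f"count {old_n} -> {new_n}: inferred 'new' average = {inferred:.2f}")
```

The four inferred averages come out to 7.70, 10.50, -8.44, and 23.80: not a single one lies within the 1-to-5 range, even though every snapshot average does.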

To address this problem, you would need to capture better data, specifically for those ratings that occurred between consecutive scrapings. If you cannot obtain this, the example I illustrated above demonstrates that it is not generally possible to infer the true average of "new" ratings.

To mitigate the issue, smoothing may help. One approach is to compute the inferred average over longer time windows, which attenuates the volatility caused by removal of old ratings.