How do I calculate the accuracy of answers to my number-based quiz?


I'm making a number-based quiz. All answers are integers, and participants are given a score between 0 and 100 based on accuracy. If you're far off the mark, you should get zero points; if you are correct, you get 100 points.

(Sidenote: I tried a version where the goal is to get the least amount of negative points, and the score was simply the absolute difference between the correct and given answer. According to my family, this is "not fun" and "weird".)

So, I'm looking for a function that takes two arguments: the correct answer and the guessed value.

Consider the question: what is the temperature in Celsius on the surface of the Sun? The correct answer is 5505 °C. If the user guesses 5505, they get 100 points, and the further away from 5505 their answer is, the fewer points they should get.

So far I have tried the following function f, where g is the guessed answer, and c is the correct answer.

 f(g,c) = (1-(|g-c|)/c)*100

This has a number of problems. Consider the cases where the correct answer is negative (like the absolute zero temperature, -273), or when the answer is zero (which year was Jesus born?).
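The problems with this formula can be reproduced directly. The following is a minimal Python sketch of the function above; the name `naive_score` is chosen here for illustration and does not come from the question.

```python
def naive_score(g: float, c: float) -> float:
    """Relative-error score from the question: (1 - |g - c| / c) * 100."""
    return (1 - abs(g - c) / c) * 100


print(naive_score(5505, 5505))   # exact guess on a positive answer
print(naive_score(-270, -273))   # negative answer: a near miss scores above 100
print(naive_score(20000, 5505))  # large overshoot: the score drops below 0

try:
    naive_score(1, 0)            # correct answer of zero
except ZeroDivisionError as exc:
    print("c = 0 fails:", exc)
```

Besides the negative and zero cases, the sketch shows that a large overshoot produces a negative score, so the result is not confined to the 0 to 100 range either.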

(And it might be out of scope, but consider the case where the correct answer is a year. I think this should affect scoring somehow: you would expect better accuracy the closer the correct answer is to the current year. E.g. most people could be expected to know within a decade when the Soviet Union fell, but not so for when the siege of Carthage occurred.)

(Also, I'm not sure which fields of math this touches on. But I suppose it can be considered a geometry problem as I'm trying to figure out the "distance" between two points.)


1
On BEST ANSWER

Without more specifications, I cannot give you a particularly good answer for your exact needs, but I will at least outline my interpretation of a scoring function, and how my own specifications dictated my choice, as an example that you can either take or adapt yourself.

What I am personally looking for:

  • the scoring rewards educated guesses, not happening to know the exact answer by trivia recall
  • the scoring can handle all kinds of questions with numerical answers, as long as particular parameters are given
  • after many rounds, the person with the most consistent educated estimates should win

Base Function

Let's say the answer to a question is $100$. I want the person who guessed $96$ to have roughly the same score as the person who guessed exactly $100$ because if they managed to get $96$ by an educated guess, they got pretty close and should be rewarded. If we draw out this kind of function, it should look something like this:

[Figure: bell curve]

For me, this screams "bell curve", so I will be using this as the basis of my scoring function.

Our base scoring function is $$ f(d) = 100e^{-s^2d^2} $$

where $d$ is the difference between the guessed answer and the correct answer. The variable $s$ is a strictness factor: the higher $s$ is, the closer you have to be to the actual answer to get the same number of points. We square $s$ because this makes it proportional to the reciprocal of the standard deviation, which makes the parameter feel a bit more intuitive. For example, the jump from $s=4$ to $s=7$ should not be the same "strictness" jump as the one from $s=7$ to $s=50$.
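As a rough illustration, the base function can be sketched in Python as below. The name `base_score` and the strictness values in the loop are assumptions made for this example only. Comparing $e^{-s^2d^2}$ with the Gaussian form $e^{-d^2/(2\sigma^2)}$ gives $s = 1/(\sigma\sqrt{2})$, which is the reciprocal relationship with the standard deviation mentioned above.

```python
import math


def base_score(d: float, s: float) -> float:
    """Bell-curve score 100 * exp(-(s * d)^2) for a guess that is off by d."""
    return 100 * math.exp(-((s * d) ** 2))


# Hypothetical strictness values, chosen only to show the trend for a
# guess of 96 when the answer is 100 (d = 4).
for s in (0.05, 0.1, 0.5):
    print(s, round(base_score(4, s), 2))
```

The score depends only on the product $s \cdot d$, so without further scaling a suitable $s$ is tied to the size of the numbers in the question.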

This leads into our first question case:


Uniform Distribution

Let's say that the possible answers to a particular question range from $0$ to $100$, and there is a perception that any of those answers are equally likely. For example, "How old is Mary Jane?" Without knowing any specifics and randomly guessing, any integer is equally likely to be the answer. These are questions with answers we will describe as "uniformly distributed".

Our strictness depends on how big the range of values is. If our answers range from $0$ to $100$, then it is reasonable to guess within $10$ of the correct answer. If our answers range from $0$ to $1{,}000{,}000$, then it is unreasonable to guess within $10$ of the correct answer. Therefore, we need to scale our strictness depending on how big our answer range is. We will call this range $r$.

Uniform score function: $$ f(x) = 100e^{\left(-\left(\frac{40s^2}{r^{2}}\right)\left(x-c\right)^{2}\right)} $$ where $x$ is the guess, $c$ is the correct answer, and $r$ is the range of answers (or at least a decent approximation of the range if it is unbounded on both ends).

The $(x-c)$ is just a translation that puts it on top of the correct answer. Dividing the strictness by a factor of $r^2$ makes sure that the scaling is properly adjusted to the size of the range. The $40$ factor is very arbitrary, and is used here solely to put the strictness value in line with the next scoring function I am about to use.
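A minimal Python sketch of the uniform score function follows. The function name `uniform_score`, the default `s=1.0`, and the check on `r` are assumptions added for the example; the formula itself is the one above.

```python
import math


def uniform_score(x: float, c: float, r: float, s: float = 1.0) -> float:
    """Score 100 * exp(-(40 * s^2 / r^2) * (x - c)^2)."""
    if r <= 0:
        raise ValueError("range r must be positive")
    return 100 * math.exp(-(40 * s**2 / r**2) * (x - c) ** 2)


# The same relative miss (10% of the range) on two very different ranges.
print(uniform_score(60, 50, 100))
print(uniform_score(600_000, 500_000, 1_000_000))

# A negative or zero correct answer needs no special handling.
print(uniform_score(-270, -273, 500))
print(uniform_score(3, 0, 100))
```

Because the exponent depends only on $(x-c)/r$, the first two calls should give the same score up to floating-point rounding.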


Logarithmic Distribution

What if the question was such that even the magnitude of the answer is difficult to determine? For example, "How many atoms are there in the universe?" Essentially, we have a starting point, a minimum amount for an answer, but the sky's the limit when guessing the proper answer here.

We want being under by $5$ orders of magnitude to be given the same score as being over by $5$ orders of magnitude. If the answer is $10^{40}$, then $10^{35}$ and $10^{45}$ should be given the same score. We update our scoring function by taking the difference of magnitudes rather than the difference of the pure values. This is why we will call the answers to these kinds of questions "logarithmically distributed".

We get something along the lines of: $$ f(x) = 100e^{\left(-s^2\left(\ln\left(c-m\right)-\ln\left(x-m\right)\right)^{2}\right)} $$

where $x$ is the guess, $c$ is the correct answer, and $m$ is the minimum possible answer.
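This first version can be sketched in Python as follows. The name `log_score_unscaled`, the default `s=1.0`, and the handling of guesses at or below the minimum are assumptions for illustration; returning zero for $x \le m$ matches the limit of the formula as $x$ approaches $m$ from above.

```python
import math


def log_score_unscaled(x: float, c: float, m: float, s: float = 1.0) -> float:
    """Score 100 * exp(-s^2 * (ln(c - m) - ln(x - m))^2)."""
    if c <= m:
        raise ValueError("correct answer c must exceed the minimum m")
    if x <= m:
        return 0.0  # assumption: guesses at or below the minimum score zero
    log_diff = math.log(c - m) - math.log(x - m)
    return 100 * math.exp(-(s**2) * log_diff**2)


# Five orders of magnitude under and over the answer 10^40.
print(log_score_unscaled(1e35, 1e40, 0, s=0.1))
print(log_score_unscaled(1e45, 1e40, 0, s=0.1))

# One order of magnitude off, at two very different scales.
print(log_score_unscaled(1e5, 1e6, 0, s=0.1))
print(log_score_unscaled(1e99, 1e100, 0, s=0.1))
```

The score depends only on the ratio $(x-m)/(c-m)$, so a miss of one order of magnitude is scored the same at every scale.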

However, there's a major problem with this function. If we are talking about an answer on the scale of $10^6$, it may be reasonable to expect a guess to be off by one order of magnitude (between $10^5$ and $10^7$). However, if our answer is in the range of $10^{100}$, then it may be unreasonable to expect a guess to be off by only one order of magnitude (between $10^{99}$ and $10^{101}$). We expect that if the magnitude of the answer is $p$, then the number of orders of magnitude a guess can be off by while earning the same score should be proportional to $p$.

Logarithmic score function: $$ f(x) = 100e^{\left(-\left(\frac{10s^{2}}{\left(c-m\right)^{0.1}}\right)\left(\ln\left(c-m\right)-\ln\left(x-m\right)\right)^{2}\right)} $$

where $x$ is the guess, $c$ is the correct answer, and $m$ is the minimum possible answer.

We divide by $(c-m)^{\frac{1}{10}}$ to get the proper variance in magnitude we expect from our answer. The factor of $10$ is once again arbitrary, but now we can use the same strictness factor $s$ on both of our scoring functions and expect roughly the same score for the same strength of guesses.
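A minimal Python sketch of the scaled logarithmic score function is given below. The name `log_score`, the default `s=1.0`, and the zero score for guesses at or below the minimum are assumptions made for the example, not part of the formula.

```python
import math


def log_score(x: float, c: float, m: float, s: float = 1.0) -> float:
    """Score 100 * exp(-(10 s^2 / (c - m)^0.1) * (ln(c - m) - ln(x - m))^2)."""
    if c <= m:
        raise ValueError("correct answer c must exceed the minimum m")
    if x <= m:
        return 0.0  # assumption: guesses at or below the minimum score zero
    scale = 10 * s**2 / (c - m) ** 0.1
    log_diff = math.log(c - m) - math.log(x - m)
    return 100 * math.exp(-scale * log_diff**2)


# Under and over by five orders of magnitude score the same.
print(log_score(1e35, 1e40, 0))
print(log_score(1e45, 1e40, 0))

# The same one-order miss is penalized less as the answer grows.
print(log_score(1e5, 1e6, 0))
print(log_score(1e99, 1e100, 0))
```

Python floats cover these magnitudes, but answers above roughly $10^{308}$ would overflow and would need the logarithms to be supplied directly.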

If the answer does not have a minimum, but rather only has a maximum, it is rather trivial to just mirror flip the function.


Summary

These scoring functions meet the goals set out above. The strictness of the score is tunable, and the functions apply to any question with a numerical answer, whether the answer is uniformly or logarithmically distributed. The function can also be translated to wherever the answer range begins, even if that is negative.

Uniformly distributed answer: $$ f(x) = 100e^{\left(-\left(\frac{40s^2}{r^{2}}\right)\left(x-c\right)^{2}\right)} $$ where $x$ is the guess, $c$ is the correct answer, and $r$ is the range of answers (or at least a decent approximation of the range if it is unbounded on both ends).

Logarithmically distributed answer: $$ f(x) = 100e^{\left(-\left(\frac{10s^{2}}{\left(c-m\right)^{0.1}}\right)\left(\ln\left(c-m\right)-\ln\left(x-m\right)\right)^{2}\right)} $$

where $x$ is the guess, $c$ is the correct answer, and $m$ is the minimum possible answer.

A specific range of $s$ I would recommend is from $1$ to $5$. If this is more of a family game and you would like everyone to get a decent chunk of points with each question, lean towards $1$. If you want the answers to be more strict and the game to be more competitive, I would lean towards $5$.

If you don't like decimals, you could always just treat each of these with a floor/ceiling function. You can play around with these functions here.
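To see how the recommended strictness range and the rounding interact, here is a small Python sketch using the uniform score function. The question parameters (range $100$, correct answer $50$, guess $60$) are hypothetical, and `uniform_score` is a name chosen for this example.

```python
import math


def uniform_score(x: float, c: float, r: float, s: float) -> float:
    """Score 100 * exp(-(40 * s^2 / r^2) * (x - c)^2)."""
    return 100 * math.exp(-(40 * s**2 / r**2) * (x - c) ** 2)


# Hypothetical question: range 100, correct answer 50, guess 60.
for s in range(1, 6):
    raw = uniform_score(60, 50, 100, s)
    print(s, math.floor(raw), math.ceil(raw))
```

For this example the floored scores fall from 67 at $s=1$ to 0 at $s=4$ and $s=5$, which gives a feel for how quickly the strict end of the range stops rewarding near misses.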

0
On

The fact that you use a function suggests that a computer program is in use. Therefore, you could use something like the absolute value of a hyperbolic tangent to bound your points neatly, without worrying about negative or zero values of $c$. This way, you could also easily add an accuracy parameter $a$, which addresses your latter point. It could look something like this:

$$ \text{score} = 100\,(1-|\tanh((g-c)\cdot a)|) $$

It is up to you to work out which values of the accuracy $a$ make the scores sensible for each question.
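A minimal Python sketch of this formula is shown below. The helper `accuracy_for_half_score` is a hypothetical addition, not part of the answer: it is one possible way to pick $a$, by choosing the miss distance that should earn exactly 50 points and solving $\tanh(a \cdot \text{distance}) = 0.5$.

```python
import math


def tanh_score(g: float, c: float, a: float) -> float:
    """Score 100 * (1 - |tanh((g - c) * a)|), always between 0 and 100."""
    return 100 * (1 - abs(math.tanh((g - c) * a)))


def accuracy_for_half_score(distance: float) -> float:
    """Hypothetical helper: the a for which a miss of `distance` scores 50."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return math.atanh(0.5) / distance


# Assumed tolerance: a guess 500 degrees off the Sun's temperature earns 50.
a_sun = accuracy_for_half_score(500)
print(tanh_score(5505, 5505, a_sun))
print(tanh_score(6005, 5505, a_sun))

# Negative and zero correct answers work without special cases.
print(tanh_score(-270, -273, accuracy_for_half_score(10)))
print(tanh_score(4, 0, accuracy_for_half_score(10)))
```

Setting a different half-score distance per question is also one way to handle the year example from the question, with a smaller distance for recent events and a larger one for ancient ones.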

In the case that you don't use a computer, then you are going far beyond what regular people would do with pen and paper when playing a quiz game.

0
On

I would suggest that having one function for all questions is a mistake. If you ask the temperature of the sun, nobody should be expected to get the value you have in mind, as there are many ways to define it. If you ask what Fahrenheit temperature water freezes at, any answer other than $32$ is badly wrong. If you ask the population of the world, it is changing with time. You need to think about what is close on each question.