Does introducing penalties for getting true/false questions incorrect result in higher skill penetration (less luck/variance)?


A student is asked to answer 50 true/false questions. If he had to put down his best guess for every question, he would get 35 right and 15 wrong.

Now, for each question he has a certain confidence that his answer is correct. If we start imposing penalties for wrong answers, he can easily adapt by answering only the questions where his confidence exceeds a threshold set by the ratio [point penalty if wrong]/[point reward if right], and leaving all the others blank.
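To make that threshold concrete, here is a minimal sketch, assuming the student simply maximizes expected points and that a blank answer scores zero. The function names are made up for illustration. With reward $r$ and penalty $q$, answering has expected value $pr-(1-p)q$, which is positive exactly when $p > q/(r+q)$, or equivalently when the odds $p/(1-p)$ exceed the ratio $q/r$.

```python
from fractions import Fraction


def break_even_probability(reward, penalty):
    """Probability of being right at which answering has zero expected value.

    Answering earns +reward with probability p and -penalty with
    probability 1 - p, so the expected gain p*reward - (1 - p)*penalty
    is zero at p = penalty / (reward + penalty).
    """
    reward, penalty = Fraction(reward), Fraction(penalty)
    if reward <= 0 or penalty < 0:
        raise ValueError("reward must be positive and penalty non-negative")
    return penalty / (reward + penalty)


def should_answer(p, reward, penalty):
    """True if a student who is right with probability p gains by answering."""
    return Fraction(p) > break_even_probability(reward, penalty)
```

For example, `break_even_probability(1, 1)` is `Fraction(1, 2)` and `break_even_probability(1, 2)` is `Fraction(2, 3)`, so under a double penalty a student who is 60% sure should leave the question blank.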

Is there a way for the test designer to design a reward/penalty scheme that maximizes skill penetration and reduces the role of luck and, if so, what factors does the designer have to look out for?

Assume that the questions increase in difficulty in such a way that they cover the student's full spectrum of confidence.

EDIT: Another idea would be to let the student gamble with points, with a wager limit on each question and a global gambling limit.

Is there any reasonable scheme to reduce the role of luck in general on true and false questions?


There are 2 best solutions below

1
On BEST ANSWER

To maximize skill penetration, award 1 point for each correct answer and -1000 for each incorrect answer. Note: this is not a joke; it is how to get only the answers that the student is entirely confident of.

In your example, the student's knowledge is net zero, getting the same score as random guessing. Such a student is unlikely to adapt his or her test-taking strategy to subtle rewards and penalties.

Additional commentary regarding gambling: Suppose you let each student give each question a weight, any real number in $[0,1]$, along with the T/F answer. If right, the student gains that many points; if wrong, he loses that many. Naively one might think that this would encourage students to weight questions more heavily when they are more confident, and less otherwise. In fact this is not true. If I am right with probability $p$ and choose weight $w$, my expected score is $pw-(1-p)w=w(2p-1)$, which is linear in $w$. So if I am 51% confident that the answer is true, I maximize my expected score by weighting it 1 rather than anything else.

Suppose instead you allow students to weight as above, but this time if they are wrong they lose DOUBLE their wager. If I am right with probability $p$, my expected score is $pw-2(1-p)w=w(3p-2)$. All this change does is move the threshold from $p=\frac{1}{2}$ to $p=\frac{2}{3}$. If I believe that $p>\frac{2}{3}$, I should bet the maximum; otherwise I should bet 0.

Result: under any gambling scheme like these, where the expected score is linear in the wager, there is never any reason to bet anything other than the maximum (or zero, i.e. leave the question blank).
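A quick numerical check of this all-or-nothing behaviour, as a sketch only: it assumes the student maximizes expected score, and the names `expected_score`, `best_wager` and `loss_multiplier` are invented for illustration. It scans a grid of wagers and returns the one with the highest expected score.

```python
def expected_score(p, wager, loss_multiplier=1.0):
    """Expected points: win `wager` with probability p,
    lose `loss_multiplier * wager` with probability 1 - p."""
    return p * wager - (1 - p) * loss_multiplier * wager


def best_wager(p, loss_multiplier=1.0, max_wager=1.0, steps=100):
    """Wager on an evenly spaced grid in [0, max_wager] with the highest
    expected score. The score is linear in the wager, so the optimum sits
    at one end of the interval (exactly at the threshold every wager ties)."""
    candidates = [max_wager * i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda w: expected_score(p, w, loss_multiplier))
```

With even stakes, `best_wager(0.51)` returns `1.0`. With a double loss, `best_wager(0.6, 2.0)` returns `0.0` and `best_wager(0.7, 2.0)` returns `1.0`, matching the $p=\frac{2}{3}$ threshold.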

0
On

Generally the default reward/penalty scheme for a true/false exam is to award 1 point for a correct answer and take away 1 point for an incorrect answer, so that the expected value for answering randomly is zero (as answering randomly is generally the lowest-valued optimal strategy for answering that one can choose, and test-makers want to make this strategy equivalent to not answering at all).

In your case, you're asking whether there's any way one can try and shift the rewards and penalties up or down so that only people with a certain confidence in their answer will bother answering the question (to maximize "skill-based answers" and minimize "luck-based answers"). In the default scheme above, any confidence above "I don't know this at all so I'm just going to guess randomly" will earn a positive reward, but the expected value for answering randomly is zero. Any scheme that maximizes "skill-based answers" is going to need to set this expected value below zero.

For true/false questions and multiple choice questions in general, the only thing you can really do is to modify the penalty for wrong answers to change the expected value of answering randomly. Depending on the confidence level you require for an answer to be a "skill-based answer", this value will be different.

In my probability class two terms ago, we formulated the "skill-based answer" model as follows: If a student is $x$% confident in his answer, he will choose the correct answer with $x$% probability and guess randomly otherwise. In your situation, you would set the expected value for this minimum confidence level to zero by imposing the appropriate penalty.

For example, if you require a 70% chance that the student knows the answer (guessing randomly otherwise), then with 1 point for a correct answer, set the penalty for a wrong answer to $-85/15 \approx -5.67$. With 70% confidence, the student believes he will answer correctly $70\% + \frac{1}{2}\cdot 30\% = 85\%$ of the time and incorrectly 15% of the time. If you require 99.9% confidence, set the penalty to $-99.95/0.05 = -1999$.
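The arithmetic can be reproduced with a short sketch, assuming the "know it or guess at random" model described above and a reward of 1 point by default. The function names are hypothetical. Exact fractions are used so the results are not blurred by rounding.

```python
from fractions import Fraction


def probability_correct(confidence):
    """Chance of a correct answer when the student knows the answer with
    probability `confidence` and otherwise guesses true/false at random."""
    c = Fraction(confidence)
    if not 0 <= c < 1:
        raise ValueError("confidence must be in [0, 1)")
    return c + (1 - c) / 2


def break_even_penalty(confidence, reward=1):
    """Size of the penalty that makes answering worth zero on average
    at the given confidence level: reward * p / (1 - p)."""
    p = probability_correct(confidence)
    return Fraction(reward) * p / (1 - p)
```

Here `break_even_penalty("0.7")` gives `Fraction(17, 3)`, which is 85/15 or about 5.67, and `break_even_penalty("0.999")` gives `Fraction(1999, 1)`.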

What you'd need to test experimentally is what required confidence level is high enough to block out the luck-based answers, but low enough to keep in the skill-based answers. I'm not sure how you'd actually go about testing that, though, as every student's strategy is different, and it is always possible that a student is confident in the wrong answer.

Keep in mind that with such a scheme, scores will be a lot lower than you'd expect (as there will be a lot of blank answers from disgruntled test-takers), and you'll need to adjust the passing grade accordingly. Also keep in mind that due to the large number of blank answers, you will need many more questions just to keep the resolution of the test results above a certain level. No matter how you slice the scheme, a test where only an average of one question was answered will not have the low variance you seek.
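To get a feel for the resolution problem, here is a rough sketch under a strong simplifying assumption that is mine, not part of the scheme: every answered question is an independent trial with the same probability of being correct. Under that assumption the fraction of correct answers has the usual binomial standard error $\sqrt{p(1-p)/n}$, where $n$ is the number of questions actually answered. The function name is made up.

```python
import math


def fraction_correct_std_error(p_correct, n_answered):
    """Standard error of the observed fraction of correct answers,
    assuming n_answered independent questions that are each answered
    correctly with probability p_correct."""
    if n_answered <= 0:
        raise ValueError("at least one question must be answered")
    return math.sqrt(p_correct * (1 - p_correct) / n_answered)
```

The error shrinks only with the square root of the number of answered questions, so a student who answers a handful of questions gets a much noisier score than one who answers fifty, however the penalty is set.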


In general, though, true/false questions with penalty schemes are not a good way to evaluate the knowledge of students in a test, because they discourage students from trying to answer a question, even though they might know the answer and not know that they know. They are also terrible at discerning a lucky guess from a correctly approached answer (and an unlucky guess from an incorrectly approached answer), no matter how you try and slice the penalties, since at the end of the day there are only three possible results (true / false / no answer).