Does it pay to know what you know?


Let's play a game. I ask you a yes/no question, and you answer. You don't answer with a yes or no though; you answer with the probability of it being yes ($P \in (0,1)$). For example, I might ask "does Boston have more people than Dallas?" and you answer 1%. Note that this number isn't really the probability of Boston having more people than Dallas (that is simply true or false), but a measure of how confident you are that the answer is yes.

If you say $P$, the score you get is $\log_2(P)$ if the answer is yes and $\log_2(1-P)$ if the answer is no. This derives somewhat from information theory: $-\log_2(P)$ is the information, or surprise, you get from an event to which you assigned probability $P$. Your score is the negative of that surprise, $-(-\log_2(P))=\log_2(P)$, so maximizing your score minimizes your surprise.
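As a minimal sketch of this scoring rule in Python (the function name `log_score` is made up here for illustration, it is not part of the game's definition):

```python
import math

def log_score(p: float, answer: bool) -> float:
    """Score for stating probability p of "yes" when the true answer is `answer`."""
    if not 0.0 < p < 1.0:
        # The game only allows probabilities strictly between 0 and 1.
        raise ValueError("p must lie strictly between 0 and 1")
    # log2(p) if the answer is yes, log2(1 - p) if the answer is no.
    return math.log2(p) if answer else math.log2(1.0 - p)
```

For instance, answering 50% scores $-1$ whichever way the question turns out, while an answer close to 0% or 100% scores close to $0$ when it is right and very negatively when it is wrong.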

My question is: is it in fact rational to answer honestly? Namely, is it rational to pick the actual probability, according to your level of belief?


This actually has a very practical application. The name of the game is Credence Calibration. By getting good at it, when people ask you "how sure are you?" you can give an accurate answer: 7 out of 10 times that you say "70%", you'll be right, and 99999 out of 100000 times that you say "99.999%", you'll be right. People will learn this and will be able to learn more from you, especially if you train with a variety of questions (since then you'll be consistent with percentages across domains). This is the connection to information theory: you are maximizing the information you give people. It will also improve your own decision making.

The above paragraph is only true, though, if the scoring mechanism actually rewards honest answers.


Yes

We can model a question as a random variable. Let's say it has probability $R$ of being yes. Your expected payoff for saying $P$ is

$$R\log_2(P)+(1-R)\log_2(1-P)$$

For $0<R<1$, this approaches $-\infty$ as $P$ approaches $0$ or $1$, and it is continuous in $P$ on the interval $(0,1)$. Therefore the maximum is attained at an interior critical point, and we just need to find the largest local maximum.
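Before doing the calculus, a quick numerical scan can suggest where that maximum sits. This is only a sketch: the function names and the grid resolution are my own choices, and the values of $R$ tried are arbitrary.

```python
import math

def expected_score(p: float, r: float) -> float:
    """Expected log score for reporting p when the true chance of "yes" is r."""
    return r * math.log2(p) + (1.0 - r) * math.log2(1.0 - p)

def best_report_on_grid(r: float, steps: int = 999) -> float:
    """Return the grid point in (0, 1) with the highest expected score."""
    # Evenly spaced candidates strictly inside (0, 1); endpoints are excluded
    # because the score is undefined (minus infinity) there.
    grid = [k / (steps + 1) for k in range(1, steps + 1)]
    return max(grid, key=lambda p: expected_score(p, r))

if __name__ == "__main__":
    for r in (0.1, 0.5, 0.7, 0.9):
        print(f"R = {r}: best P on the grid = {best_report_on_grid(r)}")
```

Under these assumptions the scan should land on the grid point closest to $R$ each time, which is the pattern the derivative makes precise.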

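The derivative with respect to the move $P$ is, up to the positive constant factor $1/\ln 2$ (which comes from differentiating $\log_2$ and does not change where the derivative vanishes),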

$$\frac RP-\frac{1-R}{1-P}$$

Setting this equal to $0$ and solving for $P$, we get:

$$0=\frac RP-\frac{1-R}{1-P}$$

$$\frac RP=\frac{1-R}{1-P}$$

$$\frac PR=\frac{1-P}{1-R}$$

$$P-PR=R-RP \text{ (cross-multiplied)}$$

$$P=R$$

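Since this is the only critical point, and the expected payoff tends to $-\infty$ at both ends of the interval, $P=R$ maximizes the expected value. So answering with your honest probability is the rational choice.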
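As an extra check that this critical point is a maximum rather than a minimum, one can look at the second derivative with respect to $P$ (this time keeping the factor $1/\ln 2$ that comes from $\log_2$):

$$\frac{d^2}{dP^2}\Big[R\log_2(P)+(1-R)\log_2(1-P)\Big]=-\frac{1}{\ln 2}\left(\frac{R}{P^2}+\frac{1-R}{(1-P)^2}\right)<0$$

This is negative for every $P \in (0,1)$ whenever $0 \le R \le 1$, so the expected payoff is strictly concave in $P$ and the critical point $P=R$ is its unique global maximum.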