Calculating amount of information gained by getting partial data

30 Views Asked by At

Set up: Let's say I have a bag with 12 notes containing strings of text made up of the letters A-L.

Action: Someone pulls one note out from that hat, then at random they choose 2 letters written on their note, and tell those letters to me.

Goal: I'm trying to calculate the average amount of information I get by knowing any 2 letters off their note.

Case study:

To make it super flushed out, let's assume these are the notes in the hat:

  1. ACEFHJL
  2. ABDFGIK
  3. BCEGHJL
  4. ACDFHIK
  5. BDEGIJL
  6. ACEFHJK
  7. BDFGIKL
  8. ACEGHJL
  9. ABDFHIK
  10. BCEGIJL
  11. ACDFHJK
  12. BDEGIKL

Example 1:

If for instance, this person tells me their note contains the letters A and B, I know they are either holding note number 2, or note number 9, since those are the two notes containing both A and B. This therefore, cuts the space of possibilities from 12 down to 2, equivalent to about 2.58 bits of information.

Example 2:

If on the other hand, they tell me their note contains the letters A and F, it would mean that the note they are holding is number 1, 2, 4, 6, 9, or 11, since they all contain both A and F. This in turn only cuts the space in possibilities in half, thus providing only 1 bit of information.

Recap:

So, we see that knowing some 2-letter combinations (like AB) provide more information than others (like AF). I would like calculate the average amount of information I get by knowing any 2 letters? That is to say, if we repeated this exercise a million times, each time, this person would shuffle the notes in the hat and pull out one. Then at random tell me 2 letters off of that note. How much information will I gain on average?

Flushing it all out:

Here's another example. Let's say this time I have a bag with the following notes in it:

  1. ACEG
  2. ABDF
  3. BCEG
  4. ACDF
  5. BDEG
  6. ACEF
  7. BDFG

Below is a table with the information gained by knowing the selected note contains any 2 letters

Letter combo p(E) I(E) Appears in:
AB 0.14 (1/7) 2.84 2
AC 0.43 (3/7) 1.22 1, 4, 6
AD 0.29 (2/7) 1.79 2, 4
AE 0.29 (2/7) 1.79 1, 6
AF 0.43 (3/7) 1.22 2, 4, 6
AG 0.14 (1/7) 2.84 1
BC 0.14 (1/7) 2.84 3
BD 0.43 (3/7) 1.22 2, 5, 7
BE 0.29 (2/7) 1.79 3, 5
BF 0.29 (2/7) 1.79 2, 7
BG 0.43 (3/7) 1.22 3, 5, 7
CD 0.14 (1/7) 2.84 4,
CE 0.43 (3/7) 1.22 1, 3, 6
CF 0.29 (2/7) 1.79 4, 6
CG 0.29 (2/7) 1.79 1, 3
DE 0.14 (1/7) 2.84 5
DF 0.43 (3/7) 1.22 2, 4, 7
DG 0.29 (2/7) 1.79 5, 7
EF 0.14 (1/7) 2.84 6
EG 0.43 (3/7) 1.22 1, 3, 5
FG 0.14 (1/7) 2.84 7

How do I compute the average information gained by selecting a random row from this table?