Encoding the answers to questions somewhere in a binary tree


I have a sequence of binary questions $(U_1,\dots, U_N)$ with some distribution. I know the answer to $n\leq N$ (mod-)adjacent questions, and want to convey this knowledge with as few bits as possible.

If I do a large number of trials, is there a way to send negligibly more than the amount of information in the answers on average?

If so, what is this method called? I have been thinking about Huffman codes and variable length codes, but am not sure where to turn.


$$\mathbf{Example}$$

Say $N=4, n=2$. We are interested in a vector $(U_1,U_2,U_3,U_4)$ that has some known distribution. With equal probability we know one of the following sets: $$A_1=\{U_1,U_4\}, \ A_2=\{U_1,U_2\},\ A_3=\{U_2,U_3\}, \ \mathrm{or} \ A_4=\{U_3,U_4\}.$$ Say we happen to know $A_1$, in particular that $U_1=1, \ U_4=0$. How do we convey $(1,?,?,0)$ in as few bits as possible?
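To make the example concrete, here is a small sketch enumerating all possible knowledge states. It assumes a particular distribution not fixed in the question (the $U_i$ are i.i.d. fair coins and the known set $A_k$ is drawn uniformly and independently), under which all $4\times 4=16$ states are equally likely:

```python
import math
from itertools import product

# Assumed distribution (not given in the question): i.i.d. fair bits,
# uniform independent choice of the known set A_k.
KNOWN = {1: (0, 3), 2: (0, 1), 3: (1, 2), 4: (2, 3)}  # 0-based positions known under A_k

# A "message" is (k, value at first known position, value at second).
messages = [(k, b1, b2) for k in KNOWN for b1, b2 in product((0, 1), repeat=2)]
entropy_bits = math.log2(len(messages))  # 4.0 bits: 2 for k plus 2 for the values
```

Under this uniform assumption the message carries exactly $4$ bits; for a general distribution the relevant quantity is the entropy of the pair $(k, \text{known values})$.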

On BEST ANSWER

We can apply a ternary Huffman code. If this is not clear enough, here is how it works for the example given in the question:

We have a sequence of $N=4$ "binary" questions $(U_1,U_2,U_3,U_4).$ I put binary in quotes because we can answer yes, no, or I don't know. Take $0$ as no, $1$ as yes, and $e$ as erased or I don't know.

As in the question statement, we know the answers to everything in one of the following sub-vectors: $$A_1=(U_1,U_4),\ A_2=(U_1,U_2),\ A_3=(U_2,U_3),\ A_4=(U_3,U_4).$$ (We also know which sub-vector we know.)

Form a full ternary tree with $N=4$ levels, where each node at level $i$ asks "What is the value of $U_i$?" and has three branches, labeled $0$, $1$, and $e$. The leaves represent the symbols of our Huffman code; we now need to weight them.

Because we are guaranteed to know exactly one of $A_1,A_2,A_3,A_4$, we can delete (or zero-weight) much of the tree. In fact, we can zero-weight any path not leading to a leaf of one of the following forms:

  • $(b_1,e,e,b_2)$, corresponding to $A_1$
  • $(b_1,b_2,e,e)$, corresponding to $A_2$
  • $(e,b_1,b_2,e)$, corresponding to $A_3$
  • $(e,e,b_1,b_2)$, corresponding to $A_4$

where $b_1,b_2$ are $0$ or $1$. Weight the remaining leaves by their probabilities under the joint distribution of $(K,U_1,\dots,U_4)$, where $K$ is the index of the sub-vector you know.
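The pruned construction can be sketched in code. This is a minimal, hypothetical implementation (function names are my own), again assuming the uniform distribution, so each of the 16 surviving leaves gets weight $1/16$; it uses the standard ternary Huffman trick of padding with zero-weight dummy symbols until the symbol count is $\equiv 1 \pmod{2}$:

```python
import heapq
from itertools import count, product

# Assumed distribution: i.i.d. fair bits, uniform choice of A_k,
# so all 16 surviving leaves are equally likely.
KNOWN = {1: (0, 3), 2: (0, 1), 3: (1, 2), 4: (2, 3)}  # 0-based known positions

def surviving_leaves():
    """Yield (pattern, weight) for the leaves that match some A_k, e.g. '1ee0'."""
    for k, (i, j) in KNOWN.items():
        for b1, b2 in product('01', repeat=2):
            word = ['e'] * 4
            word[i], word[j] = b1, b2
            yield ''.join(word), 1 / 16

def ternary_huffman(weighted):
    """Ternary Huffman: pad with zero-weight dummies so the symbol count
    is congruent to 1 mod 2, then repeatedly merge the three lightest subtrees."""
    tick = count()  # tie-breaker so heapq never compares subtree lists
    heap = [(w, next(tick), sym) for sym, w in weighted]
    while len(heap) % 2 == 0:        # need count ≡ 1 (mod r-1) with r = 3
        heap.append((0.0, next(tick), None))
    heapq.heapify(heap)
    while len(heap) > 1:
        children = [heapq.heappop(heap) for _ in range(3)]
        merged = (sum(c[0] for c in children), next(tick),
                  [c[2] for c in children])
        heapq.heappush(heap, merged)
    code = {}
    def walk(node, prefix):
        if node is None:              # dummy pad symbol: no codeword
            return
        if isinstance(node, str):     # a real leaf pattern
            code[node] = prefix
            return
        for digit, child in enumerate(node):
            walk(child, prefix + str(digit))
    walk(heap[0][2], '')
    return code

code = ternary_huffman(list(surviving_leaves()))
```

The resulting `code` maps each knowable pattern (such as `'1ee0'`) to a prefix-free string over $\{0,1,2\}$; for a non-uniform distribution you would only change the weights in `surviving_leaves`.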

We have just constructed a ternary Huffman code; its expected codeword length, and the entropy of the source that lower-bounds it, can both be computed from the probability tree.

The same idea generalizes to arbitrary $N$, $n$, and any collection of sub-vectors $A_k$. As $N$ and $n$ grow (say, by coding many trials jointly), the Huffman codewords get longer and the per-trial overhead becomes arbitrarily small.
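The vanishing overhead is just the standard Huffman block-coding bound, sketched here in source-coding terms: let $X=(K,U_{A_K})$ be the message from one trial and code $m$ i.i.d. trials jointly with a $D$-ary Huffman code ($D=3$ here). Then

$$H_D(X^m) \le \bar L_m < H_D(X^m) + 1 \quad\Longrightarrow\quad H_D(X) \le \frac{\bar L_m}{m} < H_D(X) + \frac{1}{m},$$

where $\bar L_m$ is the expected codeword length in $D$-ary symbols, $H_D$ is entropy in base $D$, and $H_D(X^m)=mH_D(X)$ by independence. So the per-trial overhead is at most $1/m$.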