Minimum number of bits required to encode a subset of sequences of length 200

491 Views Asked by At

The typical set $S$ is given as consisting of all sequences $x$ with at most three 1s.

Here $x \in ${0,1}$ ^{200} $ and $P(0) = 0.99$

We can use $\log_{2}|S|$ to encode such sequences. And that is the best one can do in case of equal length codewords I think.

But how to get the minimum expected no. of bits required for encoding the sequences in $S$ ?

Should I think in terms of the ceil of the entropy of the least probable one of these sequences times the number of such sequences?

1

There are 1 best solutions below

1
On BEST ANSWER

You should have specified you were interested in minimizing the expected codeword length. Lets look at arbitrary $n,$ and consider $\{0,1\}^n.$

In that case you can assign a 1 bit codeword, say $0$, to the all zero sequence which has probability ($p=P(0)$) $$ p^n, $$ In general, you can specify all weight $w$ codewords with $$ W_w = \left\lceil \log_2 \binom{n}{w}\right\rceil $$ bits for $w=1,2,3.$ You have 4 classes of typical sequences $W_0,W_1,W_2,W_3$ and you can specify which class you have with a prefix of 2 bits [note for $p=0.99$ there may be a slightly more efficient use of the prefix but I don't think so since $.99^{200}$ is already a small number]. For the all zero codeword just use the prefix 00 and no extra bits are necessary.

So there exists a good code with small expected codeword length $$ \overline{L}=2+\sum_{w=1}^3 W_w \binom{n}{w} p^{(n-w)}(1-p)^w. $$

Using $\log_2 |S|$ I get 20.347 bits are needed so 21 bits are needed.

The expectation above is approximately 9.64 bits so you can get an expected codeword length of 10+2=12 bits, if I have calculated correctly (please check).