I have a channel where the probability of sending a $0$ and receiving a $1$ is $P_e$, sending a $0$ and receiving a $0$ is $1-P_e$, sending a $1$ and receiving a $0$ is $P_e$, and sending a $1$ and receiving a $1$ is $1-P_e$. Assume $P_e=0.25$.
Thing is: I have $3000$ bits to send (already compressed) and, after the encoder (COD), these bits can be coded with up to $16000$ bits. This doesn't mean that I have to take the $3000$ bits at once; I can split them and code them in "parts" (for example, $3$ bits $\to 16$ bits). The goal is, of course, to minimize the probability of error.
Which kind of linear code (COD+DECOD) would you use? I just want a hint of which one and why, not a complete simulation demonstrating that the code is nearly optimal.
There is a Hadamard code which uses 32 bits to transmit 6, which is the ratio you want. It is the dual of an extended Hamming code. It has a minimum distance of 16, i.e. it can certainly decode a block correctly if it has 7 errors or fewer, and it can certainly detect an ambiguity if the block has 8 errors. It is not a perfect code, meaning that it can sometimes correctly decode a message that has more than 7 errors. This code is not really good enough for a bit-error rate of 0.25, but it is fairly simple, so let's look at it anyway.
Let `count_bits` be a function that counts the number of ones in the binary representation of its integer argument. In Python:

```python
def count_bits(n):
    """Count the set bits of a non-negative integer."""
    ans = 0
    while n != 0:
        ans += 1
        n &= n - 1  # clear the lowest set bit
    return ans
```

We can then define the code as follows:

```python
valid_messages = [[count_bits(i & n) & 1 for i in range(32, 64)]
                  for n in range(64)]
```

It is a linear code: XORing two message indices XORs the corresponding codewords, so its six basis vectors are the codewords for $n = 1, 2, 4, 8, 16, 32$.
We can easily verify the distances between the codewords of the code:

```python
def distance(message1, message2):
    return sum(x ^ y for x, y in zip(message1, message2))

for m in range(64):
    for n in range(m):
        d = distance(valid_messages[m], valid_messages[n])
        print("Distance from %d to %d is %d" % (m, n, d))
        assert d >= 16
```

The code is small enough that we can decode it by dumb exhaustive search:
```python
def decode(message):
    """Return the indices of all nearest codewords."""
    best = []
    best_count = 1000000
    for n, candidate in enumerate(valid_messages):
        count = sum(x != y for x, y in zip(message, candidate))
        if count < best_count:
            best = []
            best_count = count
        if count == best_count:
            best.append(n)
    return best
```

Note that this function returns a list of possible answers; the list can contain more than one answer if the decoding is ambiguous.
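As a quick sanity check of the decoder (the definitions are restated so the snippet runs on its own): since the minimum distance is 16, any pattern of 7 or fewer errors leaves the true codeword strictly closer than every other one, so decoding must be unique and correct. The seven flipped positions below are arbitrary.

```python
def count_bits(n):
    ans = 0
    while n:
        ans += 1
        n &= n - 1
    return ans

valid_messages = [[count_bits(i & n) & 1 for i in range(32, 64)]
                  for n in range(64)]

def decode(message):
    best, best_count = [], 1000000
    for n, candidate in enumerate(valid_messages):
        count = sum(x != y for x, y in zip(message, candidate))
        if count < best_count:
            best, best_count = [], count
        if count == best_count:
            best.append(n)
    return best

# Flip 7 of the 32 bits of codeword 5: distance 7 to the true codeword,
# at least 16 - 7 = 9 to every other one, so decoding is unique.
received = list(valid_messages[5])
for pos in (0, 3, 7, 12, 20, 25, 31):
    received[pos] ^= 1
assert decode(received) == [5]
```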
We can now measure the performance of the code experimentally. It does not matter what message you are trying to transmit, so let's transmit the zero message. We will corrupt it at random, and then try to decode it. We will count the number of successes, the number of detected ambiguities, and the weight of the worst error that was successfully corrected.
```python
from random import random

p_error = 0.25
NUM_TRIALS = 100000
num_successes = 0
num_ambiguous = 0
best_weight = 0
best_example = valid_messages[0]
for _ in range(NUM_TRIALS):
    message = [int(random() < p_error) for _ in range(32)]
    decodings = decode(message)
    if len(decodings) > 1:
        num_ambiguous += 1
    elif decodings == [0]:
        num_successes += 1
        weight = sum(message)
        if weight > best_weight:
            best_weight = weight
            best_example = message
```

The results are as follows:
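To reproduce figures like these, here is a self-contained version of the experiment (same logic, definitions restated, and a smaller trial count so it finishes quickly); the exact numbers naturally vary from run to run.

```python
from random import random

def count_bits(n):
    ans = 0
    while n:
        ans += 1
        n &= n - 1
    return ans

valid_messages = [[count_bits(i & n) & 1 for i in range(32, 64)]
                  for n in range(64)]

def decode(message):
    best, best_count = [], 1000000
    for n, candidate in enumerate(valid_messages):
        count = sum(x != y for x, y in zip(message, candidate))
        if count < best_count:
            best, best_count = [], count
        if count == best_count:
            best.append(n)
    return best

p_error, NUM_TRIALS = 0.25, 2000
num_successes = num_ambiguous = best_weight = 0
for _ in range(NUM_TRIALS):
    # Transmit the zero codeword through the BSC and try to decode it.
    message = [int(random() < p_error) for _ in range(32)]
    decodings = decode(message)
    if len(decodings) > 1:
        num_ambiguous += 1
    elif decodings == [0]:
        num_successes += 1
        best_weight = max(best_weight, sum(message))

print("successes: %d / %d" % (num_successes, NUM_TRIALS))
print("ambiguous: %d / %d" % (num_ambiguous, NUM_TRIALS))
print("worst error corrected: weight %d" % best_weight)
```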
For comparison, another code that meets your bound on the rate is the repetition code of order 5. That code correctly decodes a bit with probability 0.897 (which is better) and incorrectly decodes it with probability 0.103 (which is worse), but it reports no ambiguities (which is likely to be undesirable).
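The per-bit figures for the repetition code follow directly from the binomial distribution: a bit is decoded correctly exactly when at most 2 of its 5 copies are flipped. A quick check of the arithmetic:

```python
from math import comb

p = 0.25
# P(at most 2 of the 5 copies flipped) = per-bit success probability
p_correct = sum(comb(5, k) * p**k * (1 - p)**(5 - k) for k in range(3))
print(p_correct)  # 0.896484375, i.e. roughly 0.897 correct / 0.103 incorrect
```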
The theoretical rate limit for a bit error rate of 0.25 (the capacity of this channel) is 0.1887..., which is only just above your required rate of $3000/16000 = 0.1875$. It will be difficult to construct a code that meets your specification. It will certainly require a block length much longer than 32 bits; even if you use all 16000 bits, the probability of an incorrect decoding will be greater than $10^{-6}$. There are $2^{3000}$ possible messages, so it will probably be rather difficult to decode. The suggestions in the comments look sound to me.
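The limit quoted above is the capacity of a binary symmetric channel, $C = 1 - H(p)$ with $H$ the binary entropy function; a sketch of the arithmetic:

```python
from math import log2

def binary_entropy(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

capacity = 1 - binary_entropy(0.25)  # about 0.18872 bits per channel use
rate = 3000 / 16000                  # 0.1875, just below capacity
print(capacity, rate)
```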
Update: I have done a more accurate calculation for codes with a block length of 16000.
We want at least $2^{3000}$ distinct codewords. There are only $2^{16000}$ possible received messages, so at most $2^{13000}$ distinct received messages can decode to each codeword.
A calculation indicates that there are more than $2^{13000}$ binary strings of length 16000 that contain 4017 ones or fewer; that is, the Hamming ball of radius 4017 around a codeword already contains more strings than a decoding region can hold. Therefore, we cannot hope to make a code that corrects more than 4016 errors.
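This is a sphere-packing calculation, and it can be checked exactly with Python's big integers (each binomial coefficient is built incrementally from the previous one to keep the loop fast):

```python
n, radius = 16000, 4017
ball = 0
c = 1  # C(n, 0)
for k in range(radius + 1):
    ball += c                    # ball = sum of C(n, 0..k)
    c = c * (n - k) // (k + 1)   # advance to C(n, k+1)

# The radius-4017 ball exceeds 2**13000 strings, so decoding regions of
# at most 2**13000 strings each cannot cover it: 4017 errors are too many.
print(ball > 2**13000)  # True
```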
Another calculation indicates that the probability of 4016 errors or fewer is about 0.61891365. Therefore, the probability of an incorrect or ambiguous decoding is at least about 0.38108635. This applies to any error-correcting code and there is no way around it.
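The probability of at most 4016 flips in 16000 transmitted bits is a binomial tail. Naive floating point underflows here ($0.75^{16000}$ is far below the smallest double), but with $p = 1/4$ the tail can be evaluated exactly in integer arithmetic and converted to a float only at the end; a sketch:

```python
n, r = 16000, 4016
# P(X <= r) for X ~ Binomial(n, 1/4):
#   sum over k <= r of C(n, k) * 3**(n-k) / 4**n, in exact integers.
numerator = 0
c = 1          # C(n, 0)
pow3 = 3**n    # 3**(n-0)
for k in range(r + 1):
    numerator += c * pow3
    c = c * (n - k) // (k + 1)  # advance to C(n, k+1)
    pow3 //= 3                  # advance to 3**(n-k-1)
prob = numerator / 4**n  # big-int true division rounds correctly to float
print(prob)  # approximately 0.6189
```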
This illustrates how close to impossibility your specification is. In practice, we will not be able to approach this theoretical bound.