Confusion over the definition of "source code" in information theory


This is from Cover and Thomas's textbook, Elements of Information Theory, 2nd edition.

Def. A source code $C$ for a random variable $X$ is a mapping from $\mathcal{X}$, the range of $X$, to $\mathcal{D}^\star$, the set of finite length strings of symbols from a $D$-ary alphabet.

Ok. Then the example immediately following is:

$C(red) = 00$, $C(blue) = 11$ is a source code for $\mathcal{X} = \{red, blue\}$ with alphabet $\mathcal{D} = \{0, 1\}.$

  1. What is an "alphabet" and a "symbol" in this definition?
  2. Since the range of a random variable is a subset of $\mathbb{R}$, how can $C$ map from red and blue when these are not elements of $\mathbb{R}$?

There is 1 best solution below


What is an "alphabet" and a "symbol" in this definition?

In general, an alphabet is the set of possible values, and a symbol is a single element of that set.

In this case, the source can take two values, red and blue, so $\mathcal{X} = \{red, blue\}$ is the source alphabet.

The code is binary ($D = 2$), so its symbols are $0$ and $1$, and $\mathcal{D} = \{0, 1\}$ is the code alphabet. Each codeword, such as $00$ or $11$, is a finite string of these symbols.
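
To make the two alphabets concrete, here is a minimal Python sketch of the textbook's example. The names `source_alphabet`, `code_alphabet`, and `is_source_code` are illustrative choices of mine, not notation from the book.

```python
# Source alphabet: the set of values X can take.
source_alphabet = {"red", "blue"}

# Code alphabet: the D-ary symbol set, here binary (D = 2).
code_alphabet = {"0", "1"}

# The source code C maps each source value to a finite string of code symbols.
C = {"red": "00", "blue": "11"}


def is_source_code(code, source, alphabet):
    """Check that `code` assigns every source value a string over `alphabet`."""
    covers_source = set(code) == set(source)
    uses_alphabet = all(
        all(symbol in alphabet for symbol in codeword)
        for codeword in code.values()
    )
    return covers_source and uses_alphabet


print(is_source_code(C, source_alphabet, code_alphabet))
```

Note that nothing here requires the source values to be numbers: the keys of `C` are just labels.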

Since the range of a random variable is a subset of $\mathbb{R}$...

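What? No. The range of a random variable can be any set. Real-valued random variables are the most common case in probability texts, but in information theory a discrete random variable is routinely allowed to take values in an arbitrary finite or countable alphabet, such as $\{red, blue\}$ here.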
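
As a rough illustration, a random variable with a non-numeric range can be sampled and encoded directly. The probabilities in `pmf` are made up for this sketch (the textbook example does not specify a distribution), and `sample_x` is a hypothetical helper.

```python
import random

# Hypothetical distribution for X over a non-numeric range (assumed values).
pmf = {"red": 0.25, "blue": 0.75}

# The source code from the textbook example.
C = {"red": "00", "blue": "11"}


def sample_x(rng):
    """Draw one value of X according to pmf."""
    values = list(pmf)
    weights = [pmf[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_x(rng) for _ in range(5)]

# Encode the sequence by concatenating the codewords.
encoded = "".join(C[x] for x in draws)
print(draws, encoded)
```

The outcomes are labels, not real numbers, yet the distribution and the encoding are both well defined.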