Data Processing Inequality for Different Deterministic Channels


Let ${\bf x}$ be a random vector defined over a finite alphabet. Suppose I have two different channels, each applying deterministic filters, with outputs $f_1\circ f_2\circ f_1 ({\bf x})$ and $f_1\circ f_1({\bf x})$, where $f_2$ is bijective. Can we say that $H(f_1\circ f_2\circ f_1 ({\bf x})) \le H(f_1 \circ f_1({\bf x}))$? Note that the data processing inequality does not apply directly here. The claim seems intuitive, but I could not prove it; for it to hold, we might need some assumption on the supports of $f_1$ and $f_2$. Any ideas?


Answer:


It is not true in general. Consider the alphabet $\{1,2,3\}$. Assume that $f_1$ sends $1\to 2,2 \to 3, 3 \to 3$, while $f_2$ sends $1\to 2$, $2\to 1$, and $3 \to 3$.

Consider the random variable $X$ with $P(X=1)=0.5$ and $P(X=3)=0.5$. Then, $H((f_1 \circ f_1)(X)) = 0$ bits, while $H((f_1 \circ f_2 \circ f_1)(X)) = 1$ bit.
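The counterexample is easy to check numerically. Here is a minimal sketch (all names are my own, not from the answer) that pushes the distribution of $X$ through each composition of maps and compares the resulting entropies:

```python
from math import log2
from collections import Counter

# The maps from the counterexample, on the alphabet {1, 2, 3}.
f1 = {1: 2, 2: 3, 3: 3}  # deterministic, not injective
f2 = {1: 2, 2: 1, 3: 3}  # bijective (swaps 1 and 2)

# X takes the values 1 and 3, each with probability 0.5.
p_x = {1: 0.5, 3: 0.5}

def pushforward(p, f):
    """Distribution of f(X) when X is distributed according to p."""
    q = Counter()
    for x, px in p.items():
        q[f[x]] += px
    return dict(q)

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(px * log2(px) for px in p.values() if px > 0)

# f1(f1(X)) is constant (everything lands on 3), so its entropy is 0.
h_f1f1 = entropy(pushforward(pushforward(p_x, f1), f1))

# Inserting the bijection f2 in the middle keeps two equiprobable outcomes.
h_f1f2f1 = entropy(pushforward(pushforward(pushforward(p_x, f1), f2), f1))

print(h_f1f1)    # 0.0
print(h_f1f2f1)  # 1.0
```

So inserting the bijective $f_2$ between the two applications of $f_1$ strictly increases the output entropy, confirming that the proposed inequality fails in general.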