How to show this equivalence between two definitions of channel capacity

Question

How to show this equivalence between two definitions of channel capacity

63 Views Asked by Bumbble Comm At 02 Apr 2026 - 8:56

Let us consider a random variable $X$ on alphabet $\mathcal{X}$ with probability distribution $p_X$ and a conditional probability distribution (channel) $W_{Y|X}$ that generates a random variable $Y$ on alphabet $\mathcal{Y}$. Let the distribution

$$q_Y(y) = \sum_{x\in\mathcal{X}}p_X(x)W_{Y|X}(y|x)$$

The capacity of a channel is given by

$$C(W) = \max_{p_X} I(X:Y)_p = \max_{p_X}\sum_{x, y \in\mathcal{X, Y}}p_X(x)W_{Y|X}(y|x)\log\frac{W_{Y|X}(y|x)}{q_Y(y)}$$

I have seen an alternate version of channel capacity given by

$$C(W) = \min_{q_Y}\max_{x\in\mathcal{X}} D(W_{Y|X}(\cdot|x)\|q_Y) = \min_{q_Y}\max_{x\in\mathcal{X}} \sum_{y\in\mathcal{Y}} W_{Y|X}(y|x)\log\frac{W_{Y|X}(y|x)}{q_Y(y)}$$

I don't really see how these two expressions are equal and would appreciate any pointers that lead in the right direction

Original Q&A

There are 1 best solutions below

**Bumbble Comm** · Accepted Answer

To avoid confusion ($q_Y$ appeared twice in the post, while with different usages), I will use $P_X$ and $P_Y$ to denote the input and output distributions, respectively, and they are related through $$ P_{Y}(y) = \sum_{x \in \mathcal{X}} W_{Y|X}(y|x) P_X(x).\tag{*}$$ Then, the first definition can be written as $$C(W) = \max_{P_X} D(P_{X}W_{Y|X}\| P_X P_Y),$$ where $P_Y$ is related to $P_X$ through (*).

The second definition can be rewritten as $$ C(W) = \min_{Q_Y} \max_{x\in \mathcal{X}} D(W_{Y|X = x} \| Q_Y). \tag{#} $$ Note that since the maximum is taken over a collection of non-negative terms, we can rewrite $$ \max_{x\in \mathcal{X}} D(W_{Y|X = x} \| Q_Y) = \max_{P_X} \sum_{x \in \mathcal{X}} P_X(x) D(W_{Y|X = x} \| Q_Y) = \max_{P_X} D(P_X W_{Y|X} \| P_X Q_Y). $$ As a result, RHS of (#) can be rewritten as

\begin{align*} \min_{Q_Y} \max_{x\in \mathcal{X}} D(W_{Y|X = x} \| Q_Y) &= \min_{Q_Y} \max_{P_X} D(P_X W_{Y|X} \| P_X Q_Y)\\ &\overset{(\rm a)}{=} \max_{P_X} \min_{Q_Y} D(P_X W_{Y|X} \| P_X Q_Y)\\ &\overset{(\rm b)}{=} \max_{P_X} D(P_X W_{Y|X} \| P_X P_Y), \end{align*} which coincides with the first definition.

Here,

$P_Y$ in the last equation is also the output distribution associated with $P_X$ defined by (*)
step (a) comes from minimax theorem, (you may check the conditions apply here)
step (b) is due to the fact that, for any distribution $P_{X, Y}$ with marginal distributions $P_X$, $P_Y$, $$D(P_{X, Y}\| P_{X}Q_{Y}) - D(P_{X, Y}\|P_X P_Y) = D(P_Y \| Q_Y) \geq 0.$$ Therefore, $$\min_{Q_Y} D(P_{X, Y}\| P_{X}Q_{Y}) = D(P_{X, Y}\|P_X P_Y).$$

How to show this equivalence between two definitions of channel capacity

There are 1 best solutions below

Related Questions in PROBABILITY

Related Questions in INFORMATION-THEORY

Trending Questions

Popular # Hahtags

Popular Questions