Two expressions of Entropy


Let $X$ and $Y$ be two discrete random variables taking values in $\{x_1, \dots, x_n\}$ and $\{y_1, \dots, y_m\}$ respectively. We define:

$p_j = P(X = x_j), 1 \leq j \leq n$

$q_k = P(Y = y_k), 1 \leq k \leq m$

$p_{jk} = P((X = x_j) \cap (Y = y_k)), 1 \leq j \leq n, 1 \leq k \leq m$

With these notations we can express the entropy of $X$ as $H(X) = -\sum_{j} p_j \log(p_j)$.
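As a quick numeric illustration of this definition (the distribution below is an assumed example, not from the question), the entropy of a two-valued $X$:

```python
import math

# Assumed example distribution: X takes two values with p = (0.25, 0.75)
p = [0.25, 0.75]

# Entropy H(X) = -sum_j p_j log(p_j), here in bits (base-2 logarithm)
H = -sum(pj * math.log(pj, 2) for pj in p)
print(H)
```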

From the definition of the entropy we can derive two different expressions:

1.

$H(X) = -\sum_{j} \sum_{k} p_{jk} \log(p_j)$ by replacing $p_j$ by $\sum_{k} p_{jk}$

2.

$H(X) = 1 \times (-\sum_{j} p_j \log(p_j))$

$H(X) = (\sum_{k} q_{k}) (-\sum_{j} p_j \log(p_j))$, using $\sum_{k} q_k = 1$

$H(X) = -\sum_{j} \sum_{k} p_j q_k \log(p_j)$

Thus I arrive at two similar-looking expressions for the entropy of $X$, but I suspect one of them must be false. Where is the error in the reasoning?

Thanks for your help!

1 Answer

The first expression is

$$H(X) =-\sum_{j} \sum_{k} p_{jk} \log(p_j) =-\sum_{j} \Big(\sum_{k} p_{jk}\Big) \log(p_j) =-\sum_{j} p_{j} \log(p_j), $$

since marginalizing the joint distribution over $k$ gives $\sum_{k} p_{jk} = p_j$.

The second is

$$ H(X)= -\sum_{j} \sum_{k} p_j q_k \log(p_j) = -\sum_{j} p_j \Big(\sum_{k} q_k\Big) \log(p_j) = -\sum_{j} p_{j} \log(p_j), $$

since $\sum_{k} q_k = 1$.

Hence, yes, both are equivalent (and correct). Note that this does not require $p_{jk} = p_j q_k$ (i.e. independence of $X$ and $Y$): the individual terms of the two double sums differ in general, but both sums collapse to the same single sum over $j$.
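A numeric sanity check of both identities, using an assumed joint distribution in which $X$ and $Y$ are deliberately *not* independent:

```python
import math

# Assumed joint distribution p_{jk}: rows index x_j, columns index y_k.
# X and Y are dependent here: p_joint[0][0] = 0.3 != p_0 * q_0 = 0.16.
p_joint = [[0.3, 0.1],
           [0.1, 0.5]]

# Marginals p_j = sum_k p_{jk} and q_k = sum_j p_{jk}
p = [sum(row) for row in p_joint]
q = [sum(p_joint[j][k] for j in range(2)) for k in range(2)]

# Direct definition: H(X) = -sum_j p_j log(p_j)
H = -sum(pj * math.log(pj) for pj in p)

# Expression 1: double sum with the joint p_{jk}
H1 = -sum(p_joint[j][k] * math.log(p[j])
          for j in range(2) for k in range(2))

# Expression 2: double sum with the product p_j q_k -- valid even though
# X and Y are dependent, because the inner sum contributes sum_k q_k = 1
H2 = -sum(p[j] * q[k] * math.log(p[j])
          for j in range(2) for k in range(2))

print(H, H1, H2)  # all three coincide
```

Changing the entries of `p_joint` (keeping them nonnegative and summing to 1) leaves the three values equal, which is exactly the point of the answer.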