Data processing inequality and mutual information


Suppose we have a family of probability mass functions ${f_\theta }\left( x \right)$ indexed by $\theta$, and let $x$ be a sample from this distribution. Information theory then gives the following relation (the data processing inequality):

$I\left( {\theta ;T\left( x \right)} \right) \le I\left( {\theta ;x} \right)$
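As a sanity check, the inequality can be verified numerically for a small discrete family. The distributions and the processing map below are made up purely for illustration:

```python
import math

# Hypothetical two-point family f_theta over x in {0, 1, 2};
# theta is uniform on {0, 1}. All numbers are illustrative assumptions.
f = {0: [0.7, 0.2, 0.1],   # f_0(x)
     1: [0.1, 0.2, 0.7]}   # f_1(x)
p_theta = {0: 0.5, 1: 0.5}

def mutual_information(cond):
    """I(theta; z) in nats, for p(z | theta) given as lists indexed by z."""
    n = len(next(iter(cond.values())))
    p_z = [sum(p_theta[t] * cond[t][z] for t in cond) for z in range(n)]
    mi = 0.0
    for t in cond:
        for z in range(n):
            joint = p_theta[t] * cond[t][z]
            if joint > 0:
                mi += joint * math.log(joint / (p_theta[t] * p_z[z]))
    return mi

# Deterministic processing T(x) = min(x, 1) merges the outcomes x = 1 and x = 2.
T = lambda x: min(x, 1)
g = {t: [sum(f[t][x] for x in range(3) if T(x) == z) for z in range(2)]
     for t in f}

I_x = mutual_information(f)    # I(theta; x)
I_Tx = mutual_information(g)   # I(theta; T(x))
print(I_x, I_Tx)               # here the inequality is strict: merging loses information
```

Because $T$ here merges two outcomes, the inequality is strict in this instance; a lossless $T$ would give equality.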

where ${T\left( x \right)}$ is any function of the sample. The relation says that by processing the sample we cannot gain any new information about $\theta$. Now my question is: is this claim always true? For example, consider the following scenario:

Again, assume we have a family of probability mass functions ${f_\theta }\left( x \right)$. We are also given a matrix that encodes some correlations among the dimensions of $x$. Now we process the sample $x$ with this matrix to obtain a new sample,

$y = Ax$

where $A$ is a similarity matrix encoding the given correlations between the dimensions of $x$. Does $y$ contain new information about $\theta$ in this example?
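One relevant special case: if the fixed matrix $A$ is invertible, then $y = Ax$ is merely a relabeling of $x$, so $I(\theta; Ax) = I(\theta; x)$ exactly. A minimal sketch checking this numerically, with a made-up two-dimensional binary family (all probabilities below are assumptions for illustration):

```python
import math

# Hypothetical family p(x | theta) over x in {0,1}^2, theta uniform on {0, 1}.
f = {0: {(0, 0): 0.4, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.1},
     1: {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}}
p_theta = {0: 0.5, 1: 0.5}

def mutual_information(cond):
    """I(theta; z) in nats, for p(z | theta) given as dicts keyed by z."""
    support = set().union(*(cond[t] for t in cond))
    p_z = {z: sum(p_theta[t] * cond[t].get(z, 0.0) for t in cond)
           for z in support}
    mi = 0.0
    for t in cond:
        for z, p in cond[t].items():
            joint = p_theta[t] * p
            if joint > 0:
                mi += joint * math.log(joint / (p_theta[t] * p_z[z]))
    return mi

# An invertible A: y = Ax is one-to-one on the support, i.e. a pure relabeling.
A = [[1, 1], [0, 1]]
Ax = lambda x: (A[0][0] * x[0] + A[0][1] * x[1],
                A[1][0] * x[0] + A[1][1] * x[1])
g = {t: {Ax(x): p for x, p in f[t].items()} for t in f}

I_x = mutual_information(f)   # I(theta; x)
I_y = mutual_information(g)   # I(theta; Ax)
print(I_x, I_y)               # equal: an invertible map neither loses nor creates information
```

If $A$ were singular (so that distinct $x$ values collapse onto the same $y$), the map could only lose information, consistent with the data processing inequality.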