Let $X$ and $Y$ be random variables on $n$ and $m$ points respectively, with $m>n$, and with joint probability distribution $p(x,y)$. The mutual information is $$ I(X ;Y) = H(X) + H(Y) - H(X,Y) $$ where $H(X)$ denotes the Shannon entropy of the marginal of $p$ over $X$ and $H(X,Y)$ is the Shannon entropy of the joint distribution $p$.
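For concreteness, the quantity in question can be computed directly from a joint probability table. Here is a minimal sketch (the example joint distribution is my own, chosen so the answer is easy to verify by hand):

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability vector, ignoring zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint table joint[x, y]."""
    hx = entropy(joint.sum(axis=1))   # marginal over X
    hy = entropy(joint.sum(axis=0))   # marginal over Y
    hxy = entropy(joint.ravel())      # joint entropy
    return hx + hy - hxy

# X uniform on {0,1} and Y = X (perfect correlation): I(X;Y) = 1 bit.
joint = np.array([[0.5, 0.0],
                  [0.0, 0.5]])
print(mutual_information(joint))  # → 1.0
```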
Is it possible to compress $Y$ to the size of $X$ whilst preserving mutual information? That is, does there exist a stochastic matrix $T: \mathbb{R}^m \rightarrow \mathbb{R}^n$, sending $p$ to $(I_n\otimes T)p$, such that the compressed variable $Y^\prime = T(Y)$ satisfies $$ I(X;Y) = I(X;Y^\prime)? $$
Intuitively this makes sense: the maximal amount of information the two variables can share should depend only on the smaller of the two dimensions. However, I couldn't find any result along these lines.
Here's my attempt at answering this interesting question.
The random variables $X,Y,T(Y)$ form a Markov chain $X\rightarrow Y \rightarrow T(Y)$. By the data processing inequality, $$ I(X;T(Y))\leq I(X;Y), $$ with equality if and only if $X\rightarrow T(Y) \rightarrow Y$ is also a Markov chain.
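The data processing inequality is easy to check numerically for discrete distributions. The following sketch (not part of the original argument; the random joint distribution and channel are arbitrary choices) draws a random joint $p(x,y)$ and a random stochastic map $T$, and verifies that compression never increases mutual information:

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy(p):
    """Shannon entropy (in bits) of a probability vector, ignoring zeros."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint table joint[x, y]."""
    return (entropy(joint.sum(axis=1)) + entropy(joint.sum(axis=0))
            - entropy(joint.ravel()))

n, m = 3, 5

# Random joint distribution p(x, y) on n*m points.
joint = rng.random((n, m))
joint /= joint.sum()

# Random column-stochastic map T: column T[:, y] is the distribution p(y'|y).
T = rng.random((n, m))
T /= T.sum(axis=0)

# Joint of (X, Y') where Y' = T(Y): p(x, y') = sum_y T[y', y] p(x, y).
joint_compressed = joint @ T.T

i_xy = mutual_information(joint)
i_xyp = mutual_information(joint_compressed)
print(i_xyp, "<=", i_xy)
assert i_xyp <= i_xy + 1e-12  # data processing inequality
```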
The latter condition is what defines a so-called sufficient statistic in estimation theory [Cover&Thomas, Ch. 2]. Therefore, your question may be equivalently posed as follows: is it always possible to find a sufficient statistic $T(Y)$ of dimension no greater than the dimension of the "parameter" $X$?
It turns out that this is not always possible. Consider the following example (taken from these slides). $X\in \mathbb{R}$ is a one-dimensional random variable (with some arbitrary distribution) and $Y\in \mathbb{R}^m$ is an $m$-dimensional random variable whose components are i.i.d. uniformly distributed over the interval $[X,X+1]$. It can be shown that the minimal sufficient statistic in this case is the two-dimensional vector $(\min_i\{Y_i\},\max_i\{Y_i\})$: the likelihood of $X$ given $Y$ is nonzero exactly when $\max_i\{Y_i\}-1 \leq X \leq \min_i\{Y_i\}$, so it depends on $Y$ only through these two values. Therefore, although "compression" of the observation is possible, the dimension of the minimal sufficient statistic is greater than that of $X$. Since the transform $T$ is nonlinear in this case, restricting our attention to linear transforms can only increase the dimension of the sufficient statistic.
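The sufficiency of $(\min_i Y_i, \max_i Y_i)$ in this example can be checked numerically: the likelihood of $X$ given $Y$ computed from all $m$ components agrees with the one computed from the two extreme order statistics alone. A small sketch (the sample size and grid of candidate $x$ values are arbitrary choices of mine):

```python
import numpy as np

def likelihood(y, x):
    """Likelihood of x given i.i.d. Uniform[x, x+1] samples y,
    i.e. the product of indicators 1[x <= y_i <= x+1]."""
    return float(np.all((x <= y) & (y <= x + 1)))

def likelihood_via_stat(y, x):
    """The same likelihood, expressed only through (min(y), max(y))."""
    lo, hi = y.min(), y.max()
    return float(hi - 1 <= x <= lo)

rng = np.random.default_rng(1)
y = rng.uniform(0.3, 1.3, size=10)  # samples generated with true X = 0.3

# The two expressions agree for every candidate x, so the likelihood
# depends on y only through (min(y), max(y)).
for x in np.linspace(-1, 2, 61):
    assert likelihood(y, x) == likelihood_via_stat(y, x)
```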