Given distributions $P_{X, Y}$ and $Q_{Y, U}$ over the respective finite alphabets $\mathcal{X}, \mathcal{Y}, \mathcal{U}$ and a positive constant $\rho$, consider the following maximization problem:
\begin{equation} \max_{Q_{X \mid Y, U}} \left[ \rho H(X \mid U) - D(Q_{X \mid Y, U} \mid\mid P_{X \mid Y}) \right] \end{equation}
The conditional Shannon entropy $H(X \mid U)$ is computed with respect to the distribution $Q_{Y, U} \cdot Q_{X \mid Y, U}$ and
\begin{equation} D(Q_{X \mid Y, U} \mid\mid P_{X \mid Y}) = \sum_{(x, y, u) \in \mathcal{X} \times \mathcal{Y} \times \mathcal{U}} (Q_{Y, U} \cdot Q_{X \mid Y, U})(x, y, u) \log \frac{Q_{X \mid Y, U}(x \mid y, u)}{P_{X \mid Y}(x \mid y)}. \end{equation}
My question is: Do we suffer no loss in generality in assuming that $Q_{X \mid Y, U} = Q_{X \mid Y}$? In other words, does the maximization over Markov chains $Q_{Y, U} \cdot Q_{X \mid Y}$ perform equally well?
\begin{equation} \max_{Q_{X \mid Y, U}} \left[ \rho H(X \mid U) - D(Q_{X \mid Y, U} \mid\mid P_{X \mid Y}) \right] \stackrel{?}{=} \max_{Q_{X \mid Y, U} : Q_{X \mid Y, U} = Q_{X \mid Y}} \left[ \rho H(X \mid U) - D(Q_{X \mid Y, U} \mid\mid P_{X \mid Y}) \right] \end{equation}
A few observations:
- $\max_{Q_X} \left[ \rho H(X) - D(Q_X \mid\mid P_X) \right] = \rho H_\frac{1}{1 + \rho}(X)$ where $H(X)$ is computed with respect to $Q_X$ and $H_\frac{1}{1 + \rho}(X)$ is the Rényi Entropy of order $\frac{1}{1 + \rho}$, computed with respect to $P_X$.
- $D(Q_{X \mid Y, U} \mid\mid P_{X \mid Y}) \geq D(Q_{X \mid Y} \mid\mid P_{X \mid Y})$ with equality iff $Q_{X \mid Y} = Q_{X \mid Y, U}$.
- If, for some given distribution $Q_{Y, U} \cdot Q_{X \mid Y, U}$, we recover a distribution $R_{X, Y, U} = Q_{Y, U} \cdot Q_{X \mid Y}$, we can in general not conclude that the following holds ($H(X \mid U)$ on the left-hand side is computed with respect to $R$, on the right-hand side with respect to $Q$):
\begin{equation} \rho H(X \mid U) - D(R_{X \mid Y} \mid\mid P_{X \mid Y}) \geq \rho H(X \mid U) - D(Q_{X \mid Y, U} \mid\mid P_{X \mid Y}) \end{equation}