This question is from Nielsen & Chuang, "Quantum Computation and Quantum Information", Chapter 11, Exercise 11.7.
Find an expression for the conditional entropy H(Y|X) as a relative entropy between two probability distributions.
I have been unsuccessful in my attempts to prove this result so far, and any help would be appreciated. I'm reading through this book by myself and have been stuck on this problem for quite some time. I have not managed to find a solution online, even after a substantial amount of searching.
(I work with $H(X|Y)$; the argument for $H(Y|X)$ is identical with the roles of $X$ and $Y$ swapped.) Since $H(X|Y)=I(X;X|Y)$, one possible way to express it as a relative entropy is $$H(X|Y)=D\big(p(x,x)\,\big\|\,p(x)^2 \,\big|\, Y\big).$$ Note that this is a conditional KL divergence, and $p(x,x)$ is defined such that $p(a,a)=p(a)$ and $p(a,b)=0$ for $a \neq b$.
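For what it's worth, this identity can be sanity-checked numerically. Below is a minimal sketch in Python/NumPy (the alphabet sizes, seed, and random joint distribution are arbitrary choices of mine): it computes $H(X|Y)$ directly from its definition and compares it with the conditional relative entropy $\sum_y p(y)\, D\big(p(x,x|y)\,\big\|\,p(x|y)^2\big)$, where $p(x,x|y)$ is the diagonal distribution defined above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary joint distribution p(x, y) over a 3 x 4 alphabet.
pxy = rng.random((3, 4))
pxy /= pxy.sum()

py = pxy.sum(axis=0)       # marginal p(y)
px_given_y = pxy / py      # column y holds p(x | y)

# H(X|Y) computed directly from its definition.
H_X_given_Y = -np.sum(pxy * np.log2(px_given_y))

# Conditional relative entropy D(p(x,x) || p(x)^2 | Y):
# for each y, compare the diagonal distribution p(a,a|y) = p(a|y)
# against the product p(a|y) p(b|y) restricted to the diagonal a = b
# (the off-diagonal terms contribute 0 to the KL sum, since 0 log 0 = 0).
D = 0.0
for y in range(py.size):
    p = px_given_y[:, y]
    D += py[y] * np.sum(p * np.log2(p / p**2))

print(H_X_given_Y, D)  # both print the same value
```

The two values agree, as expected: each inner sum reduces to $-\sum_a p(a|y)\log p(a|y) = H(X|Y=y)$. So the formula at least seems consistent, though I'm not sure it's the expression the exercise intends.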