Consider two systems $X$ and $Y$ described by probability distributions.
We define the conditional entropy of $X$ given $Y$ as:
$$S_{X|Y}=\sum_y p(y) \left( - \sum_{x} p(x|y) \log(p(x|y)) \right) $$
If I understand correctly, this quantity is supposed to quantify the entropy of $X$ once I know the state of my system $Y$.
This is how I read the definition: $-\sum_{x} p(x|y) \log(p(x|y))$ is the entropy of $X$ once I know for sure that $Y$ is in the state $y$.
Then I average over all the possible states of $Y$ via the weighted sum $\sum_y p(y) (\cdots)$.
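To make sure I am reading the formula correctly, here is a small numerical check I did on a made-up joint distribution $p(x,y)$ for two binary systems (the numbers are arbitrary, just to illustrate the definition):

```python
import numpy as np

# Made-up joint distribution p(x, y) for two binary systems X and Y.
# Rows index x, columns index y.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

p_y = p_xy.sum(axis=0)          # marginal p(y)
p_x_given_y = p_xy / p_y        # p(x|y) = p(x,y) / p(y), column by column

# Entropy of X for each fixed outcome y:  -sum_x p(x|y) log p(x|y)
h_x_given_y = -(p_x_given_y * np.log(p_x_given_y)).sum(axis=0)

# Conditional entropy: average over y with weights p(y)
S_X_given_Y = (p_y * h_x_given_y).sum()

# Marginal entropy of X, for comparison
p_x = p_xy.sum(axis=1)
S_X = -(p_x * np.log(p_x)).sum()

print(S_X_given_Y, S_X)   # here S_{X|Y} ≈ 0.50 < S_X = ln 2 ≈ 0.69
```

So numerically I do find $S_{X|Y} \leq S_X$, i.e. conditioning on $Y$ reduces the entropy of $X$ on average; my problem is with the interpretation, not the computation.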
So things are confused in my mind: $S_{X|Y}$ quantifies the lack of information about $X$ once we know $Y$, but at the same time, we don't know $Y$.
I could say that this quantity quantifies the lack of information about $X$ once we know $Y$ (which is lower than the lack of information without knowing $Y$), but since we don't actually know $Y$, we average over all its possible outcomes.
But to me this doesn't really mean anything: either we know $Y$ or we don't. This is why I am a little lost.
I would like a clear explanation of what is happening here; I think I am missing a step in the reasoning. What does the quantity $S_{X|Y}$ really mean?