I am trying to understand Example 4.16 from the book Introduction to Probability by Dimitri Bertsekas and John Tsitsiklis:
We are given a biased coin and we are told that because of manufacturing defects, the probability of heads, denoted by $Y$, is itself random, with a known distribution over the interval $[0, 1]$. We toss the coin a fixed number $n$ of times, and we let $X$ be the number of heads obtained. Then, for any $y \in [0, 1]$, we have $\mathbb{E}[X \mid Y = y] = ny$, so $\mathbb{E}[X \mid Y]$ is the random variable $nY$.
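To make the quoted statement concrete, here is a minimal simulation sketch. It assumes, purely for illustration, that $Y$ is uniform on $[0, 1]$ (the book only says the distribution is known); the function names `conditional_mean` and `overall_mean` are hypothetical and not from the book.

```python
import random


def conditional_mean(n, y, trials, rng):
    """Estimate E[X | Y = y]: the heads probability is held fixed at y."""
    total = 0
    for _ in range(trials):
        # One experiment: n tosses, each landing heads with probability y.
        total += sum(rng.random() < y for _ in range(n))
    return total / trials


def overall_mean(n, trials, rng):
    """Estimate the average of X when a fresh Y is drawn for every experiment.

    ASSUMPTION: Y ~ Uniform(0, 1), chosen only for illustration.
    """
    total = 0
    for _ in range(trials):
        y = rng.random()  # draw the coin's bias first
        total += sum(rng.random() < y for _ in range(n))
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    n, trials = 10, 20_000
    for y in (0.2, 0.5, 0.9):
        est = conditional_mean(n, y, trials, rng)
        print(f"y = {y}: estimate {est:.3f}, n*y = {n * y:.1f}")
    avg = overall_mean(n, trials, rng)
    print(f"averaging over Y: estimate {avg:.3f}")
```

With $y$ held fixed, the estimate should land near $ny$, a different number for each $y$, which is what $\mathbb{E}[X \mid Y = y] = ny$ says. The second function is included only for comparison: it averages over the randomness in $Y$ as well as over the tosses.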
Why is it necessary to condition on $Y$ here? Why can't we just write $\mathbb{E}[X] = nY$? My reasoning is that even if we don't know the value of $Y$, we can still write $\mathbb{E}[X]$ as a function of the random variable $Y$. I am not able to understand the flaw in my reasoning. I would be grateful if anyone could help me with that.