Most textbooks do not assert that the components of the output sequence $Y_{1},Y_{2},\ldots,Y_{n}$ of a discrete memoryless channel (DMC) are mutually independent, even when the components of the input sequence $X_{1},X_{2},\ldots,X_{n}$ are mutually independent.
For example, they do not claim that $Y_{1},Y_{2}$ are independent of each other whenever the inputs $X_{1},X_{2}$ are independent of each other for a DMC. But where does the mistake lie in the following proof?
$$\begin{aligned}
p(y_{1},y_{2}) &= \sum_{x_{1}}\sum_{x_{2}} p(y_{1},y_{2},x_{1},x_{2})\\
&= \sum_{x_{1}}\sum_{x_{2}} p(x_{1},x_{2})\,p(y_{1},y_{2}\mid x_{1},x_{2})\\
&= \sum_{x_{1}}\sum_{x_{2}} p(x_{1})p(x_{2})\,p(y_{1}\mid x_{1})\,p(y_{2}\mid x_{2})\\
&= \Bigl(\sum_{x_{1}} p(x_{1})\,p(y_{1}\mid x_{1})\Bigr)\Bigl(\sum_{x_{2}} p(x_{2})\,p(y_{2}\mid x_{2})\Bigr)\\
&= p(y_{1})\,p(y_{2})
\end{aligned}$$
There is no mistake.
Memoryless means that the output at a given time depends only on the input at that time. This implies that the conditional probability of the outputs given the inputs factorizes as $$p_{Y_1\cdots Y_n|X_1 \cdots X_n}(y_1\cdots y_n|x_1 \cdots x_n)=p_{Y_1|X_1}(y_1|x_1)p_{Y_2|X_2}(y_2|x_2)\cdots p_{Y_n|X_n}(y_n|x_n)$$
The given derivation follows from this factorization together with the assumed independence of the inputs and the law of total probability.
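As a sanity check, the derivation can be verified numerically. The sketch below (my own illustration, not from any textbook) computes the exact joint output distribution for a binary symmetric channel with an assumed crossover probability used twice on independent, arbitrarily chosen input distributions, and confirms that the joint factors into the product of the marginals:

```python
import itertools

# Illustrative check: a binary symmetric channel (BSC) with crossover
# probability eps, used twice with independent inputs. The values of
# eps, p_x1, and p_x2 are arbitrary assumptions for the example.
eps = 0.1
p_x1 = {0: 0.3, 1: 0.7}   # distribution of X1
p_x2 = {0: 0.6, 1: 0.4}   # distribution of X2, independent of X1

def p_y_given_x(y, x):
    """BSC transition probability: the bit flips with probability eps."""
    return 1 - eps if y == x else eps

# Joint output distribution via the law of total probability, using
# p(x1,x2) = p(x1)p(x2) and the memoryless factorization of the channel.
p_joint = {
    (y1, y2): sum(
        p_x1[x1] * p_x2[x2] * p_y_given_x(y1, x1) * p_y_given_x(y2, x2)
        for x1, x2 in itertools.product([0, 1], repeat=2)
    )
    for y1, y2 in itertools.product([0, 1], repeat=2)
}

# Marginal distributions of each output
p_y1 = {y1: sum(p_joint[(y1, y2)] for y2 in [0, 1]) for y1 in [0, 1]}
p_y2 = {y2: sum(p_joint[(y1, y2)] for y1 in [0, 1]) for y2 in [0, 1]}

# The joint equals the product of the marginals, as the proof claims.
for y1, y2 in itertools.product([0, 1], repeat=2):
    assert abs(p_joint[(y1, y2)] - p_y1[y1] * p_y2[y2]) < 1e-12
```

Note that the same check would fail if the inputs were made dependent (e.g. $X_2 = X_1$), which is exactly where the independence assumption enters the third line of the derivation.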