I have the following channel with $\mathcal X = \mathcal Y$:
$ p(y|x) = \begin{bmatrix} 1/2 & 1/2 & 0 \\ 0 & 1/2 & 1/2 \\ 1/2 & 0 & 1/2\end{bmatrix} $
Is it possible to decrease the capacity by adding a row to the channel matrix?
I think not. This is what I have so far.
I added the following last row (it could be any row, for that matter):
$ p(y|x) = \begin{bmatrix} 1/2 & 1/2 & 0 \\ 0 & 1/2 & 1/2 \\ 1/2 & 0 & 1/2 \\ 1/3 & 1/3 & 1/3 \\ \end{bmatrix} $
I showed that if I pick $P_X = \left( \frac{1}{3},\frac{1}{3}, \frac{1}{3}, 0\right)$, the mutual information is the same as before no matter what the added row is, meaning that by ignoring the symbol added to $\mathcal{X}$ the capacity is not reduced.
With this choice, $$ I(X;Y) = H(Y) - H(Y|X) = \log_2 3 - 1 \approx 0.585 \text{ bits}, $$ which equals the capacity $C = \max_{P_X} I(X;Y)$ of the channel before the row was added.
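As a quick numerical sanity check of the value above, here is a minimal sketch in plain Python. The helper name `mutual_information` and the variable names are my own choices for illustration, not part of any library.

```python
from math import log2

def mutual_information(p_x, channel):
    """I(X;Y) in bits, where channel[x][y] = p(y|x) and p_x is the input distribution."""
    n_out = len(channel[0])
    # Output marginal: p(y) = sum_x p(x) p(y|x)
    p_y = [sum(p_x[x] * channel[x][y] for x in range(len(p_x))) for y in range(n_out)]
    total = 0.0
    for x, px in enumerate(p_x):
        for y, pyx in enumerate(channel[x]):
            # Terms with zero probability contribute nothing (0 log 0 = 0)
            if px > 0 and pyx > 0:
                total += px * pyx * log2(pyx / p_y[y])
    return total

old_channel = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
new_channel = old_channel + [[1/3, 1/3, 1/3]]  # the added fourth row

i_old = mutual_information([1/3, 1/3, 1/3], old_channel)
i_new = mutual_information([1/3, 1/3, 1/3, 0.0], new_channel)
print(i_old, i_new, log2(3) - 1)  # all three are about 0.585
```

Because the fourth input has probability zero, its row never enters the sum, so the result does not depend on what the added row contains.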
Is there a way to formulate a more theoretical / general solution, using any of the coding channel related theorems?
It might have to do with the achievable rate, which after the row is added can only be greater than or equal to what it was before.
Thank you,
As I mentioned in the comments, adding a row increases the input symbol set by one. You can always choose to never use the new symbol, so adding it cannot hurt channel capacity. Formally, the new set of joint distributions on $(X,Y)$ includes all the old joint distributions on $(X,Y)$, so the best we can do with the new symbol available is at least as good as before.
Generally, if $\mathcal{X}=\{x_1, ..., x_n\}$ is a finite and nonempty input symbol set, we can define $S(\mathcal{X})$ as the probability simplex on $\mathcal{X}$: $$S(\mathcal{X}) = \left\{(p(x))_{x\in \mathcal{X}}: \sum_{x\in\mathcal{X}} p(x) = 1, p(x)\geq 0 \quad \forall x \in \mathcal{X}\right\}$$
Given:
Let $\mathcal{X}$, $\mathcal{Y}$, and $\tilde{\mathcal{X}}$ be finite and nonempty sets that satisfy $\mathcal{X}\subseteq\tilde{\mathcal{X}}$.
Suppose we are given channel probabilities $$ p(y|x)=P[Y=y|X=x] \quad \forall (x,y) \in \tilde{\mathcal{X}}\times \mathcal{Y}$$
Then:
$$\sup_{p \in S(\mathcal{X})} I(X;Y) = \sup_{p \in S(\tilde{\mathcal{X}}): p(x)=0 \: \forall x \notin \mathcal{X}}I(X;Y)\leq \sup_{p \in S(\tilde{\mathcal{X}})} I(X;Y)$$
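which just means that expanding the input symbol set from $\mathcal{X}$ to $\tilde{\mathcal{X}}$ cannot decrease channel capacity. The equality holds because a distribution on $\tilde{\mathcal{X}}$ that puts zero mass outside $\mathcal{X}$ is the same thing as a distribution on $\mathcal{X}$, and it gives the same $I(X;Y)$. The inequality holds trivially because the supremum on the right is taken over a larger set: $$ \{p \in S(\tilde{\mathcal{X}}): p(x)=0 \: \forall x \notin \mathcal{X}\}\subseteq S(\tilde{\mathcal{X}})$$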
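To see the inequality numerically on the example from the question, here is a rough sketch that estimates capacity with the Blahut-Arimoto iteration (a standard numerical method that I am bringing in for illustration; it is not needed for the proof). The function name `blahut_arimoto`, the tolerance, and the iteration cap are assumptions of this sketch.

```python
from math import log2

def blahut_arimoto(channel, tol=1e-10, max_iter=10_000):
    """Estimate capacity in bits of channel[x][y] = p(y|x); returns (estimate, input dist)."""
    n_in, n_out = len(channel), len(channel[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output marginal induced by the current input distribution
        q = [sum(p[x] * channel[x][y] for x in range(n_in)) for y in range(n_out)]
        # d[x] = KL divergence (bits) between row x of the channel and q
        d = [sum(w * log2(w / q[y]) for y, w in enumerate(channel[x]) if w > 0)
             for x in range(n_in)]
        weights = [p[x] * 2.0 ** d[x] for x in range(n_in)]
        total = sum(weights)
        # Usual stopping bounds of the iteration: lower <= C <= upper
        lower, upper = log2(total), max(d)
        p = [w / total for w in weights]  # multiplicative update of the input distribution
        if upper - lower < tol:
            break
    return lower, p

old_channel = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
new_channel = old_channel + [[1/3, 1/3, 1/3]]  # input alphabet enlarged by one symbol

c_old, p_old = blahut_arimoto(old_channel)
c_new, p_new = blahut_arimoto(new_channel)
print(c_old, c_new)  # both about 0.585
print(p_new)         # the mass on the added symbol shrinks toward 0
```

In this particular example the capacity stays the same because the added row is useless for communication, so the optimal input distribution puts (essentially) no mass on it. For other added rows the capacity could strictly increase, but by the argument above it can never go down.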