Show that the defined set is convex

26 Views Asked by At

Suppose we want to predict nominal data with $k$ classes. We identify each class with an integer $1,...,k$. Define $$P(y=i|z) = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}}$$ where $z=Wx \in \mathbb{R}^k$ for $W\in\mathbb{R}^{k\times d}$ and $x \in \mathbb{R}^d$.

Show that the region $$\mathcal{A_i} = \{x:\mathbb{P}(y=i|Wx) \geq \mathbb{P}(y=j|Wx) \forall j\in \mathcal{Y}\}$$ is convex.

So we want to show that for $a \in \mathcal{A}_i$ and $a' \in \mathcal{A}_i$, $\lambda a + (1-\lambda)a' \in \mathcal{A}_i$.

Let $z = Wa$ and $z' = Wa'$. Then $$P(y = i |\lambda z + (1-\lambda) z') = \frac{e^{\lambda z_i + (1-\lambda)z_i'}}{\sum_{m=1}^{k} e^{\lambda z_m + (1-\lambda)z_m'}} \hspace{5mm} (1)$$ So the goal would be to show $$(1) \geq \frac{e^{\lambda z_j + (1-\lambda)z_j'}}{\sum_{m=1}^{k} e^{\lambda z_m + (1-\lambda)z_m'}} = P(y = j |\lambda z + (1-\lambda) z') \forall j \in \mathcal{Y} \hspace{5mm} (2)$$

Since $a \in \mathcal{A}_i$ and $a' \in \mathcal{A}_i$, $$P(y = i |\lambda z) \geq P(y = j |\lambda z) \forall j\in \mathcal{Y} \hspace{5mm} (\dagger)$$ and $$P(y = i |(1-\lambda) z') \geq P(y = j |(1-\lambda) z') \forall j\in \mathcal{Y} \hspace{5mm} (*)$$ We can sum these two expressions to get $$\frac{e^{\lambda z_i }}{\sum_{m=1}^{k} e^{\lambda z_m }} + \frac{e^{(1-\lambda) z_i' }}{\sum_{m=1}^{k} e^{(1-\lambda) z_m' }} \geq \frac{e^{\lambda z_j }}{\sum_{m=1}^{k} e^{\lambda z_m }} + \frac{e^{(1-\lambda) z_j' }}{\sum_{m=1}^{k} e^{(1-\lambda) z_m' }} \forall j \in \mathcal{Y} \hspace{5mm} (3)$$ However, I am struggling to relate $(2)$ and $(3)$ due to the denominator sums. I am thus questioning whether my setup is correct to solve this problem.

Edit: Is it as straightforward as knowing $z_i \geq z_j$ from $(\dagger)$ and $z_i' \geq z_j$ from $(*)$ and just completely disregard $(3)$?