Topology and machine learning: Is the softmax a homeomorphism?


The softmax function is used as an activation function in neural networks:

http://www.cs.nyu.edu/~eugenew/asr13/lecture_14.pdf

It is defined as:

Given $\mathbf{z} = \begin{bmatrix} z_1, \ldots, z_K \end{bmatrix}^T$

$\sigma_j(\mathbf{z}) = \dfrac{e^{z_j}}{\sum_{k=1}^K e^{z_k}}$ for $j = 1, ..., K.$

From its graph, $\sigma_j: \mathbb{R}^K \to (0,1)$.

Let $\sigma(\mathbf{z})$ be the column vector with components $\sigma_j(\mathbf{z})$.

Then $\sigma: \mathbb{R}^K \to \operatorname{int}(\Delta^{K-1})$, the interior of the $(K-1)$-simplex.
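For concreteness, here is a minimal implementation of the map above (a plain-Python sketch; the max-subtraction before exponentiating is a standard numerical-stability device, not part of the definition):

```python
import math

def softmax(z):
    """Map z in R^K to the interior of the (K-1)-simplex."""
    m = max(z)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

p = softmax([1.0, 2.0, 3.0])
print(p)        # every component lies in (0, 1)
print(sum(p))   # the components sum to 1
```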


Is this function, which is obviously onto and one-to-one (hence a bijection between $\mathbb{R}^K$ and $\operatorname{int}(\Delta^{K-1})$), also a homeomorphism?

2 Answers

Accepted answer:

Contrary to what you say, this function is not injective: if you add a fixed number $a$ to each $z_k$, the numerator and denominator of $\sigma_j$ are both multiplied by $e^a$, so $\sigma(\mathbf{z})$ does not change. In fact, no continuous injective map from $\mathbb{R}^K$ to the interior of $\Delta^{K-1}$ exists at all, though this is harder to prove: the interior of $\Delta^{K-1}$ is homeomorphic to $\mathbb{R}^{K-1}$, and by invariance of domain there is no continuous injection of $\mathbb{R}^K$ into $\mathbb{R}^{K-1}$.
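The shift invariance is easy to verify numerically (a small sketch; the input vector and shift below are arbitrary choices):

```python
import math

def softmax(z):
    exps = [math.exp(v) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

z = [0.5, -1.2, 3.0]
a = 7.0
shifted = [v + a for v in z]

# z and z + a are different inputs, yet softmax sends them to
# (numerically) the same point of the simplex, so it is not injective.
diff = max(abs(p - q) for p, q in zip(softmax(z), softmax(shifted)))
print(diff)
```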

Another answer:

Notice that if we let $y_k := z_k + 1$ for each $k$, then $\mathbf{z}$ and $\mathbf{y}$ map to the same point. (Check this!) So this function is not injective.
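The suggested check, carried out on an arbitrary example vector: both the numerator and the denominator pick up a factor of $e$, which cancels.

```python
import math

def softmax(z):
    exps = [math.exp(v) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

z = [0.0, 1.0, -2.0]
y = [zk + 1 for zk in z]  # y_k := z_k + 1

# The shared factor of e cancels, so z and y land on the same point.
print(softmax(z))
print(softmax(y))
```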