Proof Regarding the Sigmoid Function and the Softmax Function


In machine learning, the sigmoid function is used to maximize the likelihood. [Right?]

$$a(x) = \frac{1}{1+e^{-x}}\quad\text{(sigmoid function)}$$

which gives me the probability of success. It is used when you only have two classes to classify, in which case it should be equivalent to the softmax function.
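As a quick numerical sanity check of that equivalence claim (my own sketch, not part of the question): the two-class softmax with scores $(x, 0)$ reproduces the sigmoid of $x$.

```python
import math

def sigmoid(x):
    # logistic sigmoid: 1 / (1 + e^{-x})
    return 1.0 / (1.0 + math.exp(-x))

def softmax2(z1, z2):
    # probability of class 1 under a two-class softmax
    return math.exp(z1) / (math.exp(z1) + math.exp(z2))

# Fixing the second score at 0 recovers the sigmoid exactly.
for x in [-3.0, -0.5, 0.0, 1.5, 4.0]:
    assert abs(softmax2(x, 0.0) - sigmoid(x)) < 1e-12
```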

Before defining the softmax function, suppose there is some linear function that gives every class a score based on some inputs; let's call it $L$:

$$p(\text{class}[i]) = \frac{e^{L(i)}}{e^{L(1)}+e^{L(2)}+e^{L(3)}+\cdots+e^{L(n)}}$$
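A minimal implementation of this formula (the max-subtraction is a standard numerical-stability trick, not part of the definition; it leaves the ratios unchanged):

```python
import math

def softmax(scores):
    # Subtract the max score before exponentiating to avoid overflow
    # for large scores; the resulting probabilities are identical.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# The outputs are nonnegative and sum to 1, so they form a
# probability distribution over the classes.
probs = softmax([2.0, 1.0, 0.1])
assert abs(sum(probs) - 1.0) < 1e-12
assert all(p > 0 for p in probs)
```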

Now I'm struggling to prove that they're the same.

Softmax function: if the linear function's scores are $Z_1, Z_2, \dots, Z_n$, then $$p(\text{class}[i]) = \frac{e^{Z_i}}{e^{Z_1}+e^{Z_2}+\cdots+e^{Z_n}}$$
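For what it's worth, one way to see the connection (a sketch, assuming $n = 2$ classes with scores $Z_1$ and $Z_2$): dividing the numerator and denominator of the softmax by $e^{Z_1}$ gives

$$p(\text{class}[1]) = \frac{e^{Z_1}}{e^{Z_1}+e^{Z_2}} = \frac{1}{1+e^{Z_2-Z_1}} = \frac{1}{1+e^{-(Z_1-Z_2)}} = a(Z_1 - Z_2),$$

i.e. the two-class softmax is the sigmoid applied to the score difference $Z_1 - Z_2$.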