Mathematical derivation from the Deep Learning book


In https://www.deeplearningbook.org/contents/mlp.html, p. 179, this derivation appears, but it omits the details I need in order to follow it:

$P(y) = \frac{\exp(yz)}{\sum_{y'=0}^{1}\exp(y'z)} = \sigma((2y-1)z)$

where $\sigma(x) = \frac{1}{1+\exp(-x)} = \frac{\exp(x)}{1+\exp(x)}$.

I just cannot see how they "magically" make this $(2y-1)z$ appear. Could someone kindly show me all the details of this derivation? I have tried tricks such as multiplying by fractions equal to $1$, e.g. $\exp(…)/\exp(…)$, but without success.

P.S. I think this expression can be expanded as:

$\frac{\exp(yz)}{\sum_{y'=0}^{1}\exp(y'z)} = \frac{\exp(yz)}{\exp(0 \cdot z)+\exp(1 \cdot z)} = \frac{\exp(yz)}{1+\exp(z)}$

However, I do not see how to manipulate this last expression so that it equals $\sigma((2y-1)z)$.


You only need to check the two cases. Since $y \in \{0,1\}$, the factor $2y-1$ is just a sign: it equals $-1$ when $y=0$ and $+1$ when $y=1$. Starting from your simplified expression $P(y)=\frac{\exp(yz)}{1+\exp(z)}$: $$ P(y=0) = \frac{1}{1+\exp(z)} = \frac{\exp(-z)}{\exp(-z)+1} = \sigma(-z) $$ $$ P(y=1) = \frac{\exp(z)}{1+\exp(z)} = 1-\sigma(-z) = \sigma(z) $$ So you can write 'magically' $P(y)=\sigma((2y-1)z)$ for $y=0$ or $1$: the factor $(2y-1)$ is simply a device that flips the sign of $z$ depending on the value of $y$.
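As a sanity check, here is a small Python sketch (function names `sigma` and `p` are my own) that verifies the identity numerically for both values of $y$ across a few values of $z$:

```python
import math

def sigma(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def p(y, z):
    """P(y) = exp(y*z) / (exp(0*z) + exp(1*z)) = exp(y*z) / (1 + exp(z))."""
    return math.exp(y * z) / (1.0 + math.exp(z))

# The two expressions should agree for y in {0, 1} and any real z.
for z in (-3.0, -0.5, 0.0, 1.3, 2.7):
    for y in (0, 1):
        assert math.isclose(p(y, z), sigma((2 * y - 1) * z))
```

The `(2 * y - 1)` factor maps $y=0$ to $-1$ and $y=1$ to $+1$, which is exactly the sign flip used in the derivation above.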