Difference between unnormalized and normalized probability.

I have an expression, e^y, which is called an unnormalized probability, and another expression, e^y/sum(e^y), which is called a normalized probability. I don't understand the difference between these two terms or what they mean.

Another thing: how does log(expression) differ from the expression itself? What does taking the log tell us?

There are 1 best solutions below

The probabilities of all possible events must sum to 1; this is essentially what "normalized" means in probability theory. In the example you have given, if we sum $e^y$ over all possible values of $y$, we can't in general say that the result is 1. In other words, $\sum e^y \neq 1$ in general.

To make this a bit clearer, let $y_1, y_2, \dots, y_n$ be all the possible values that $y$ can take, and let the "probability" of each of these values be $e^{y_1}, e^{y_2}, \dots, e^{y_n}$. I've put "probability" in quotation marks because in general $\sum_{i=1}^{n} e^{y_i}$ is not 1. As a simple example, take $n=2$ and let $y_1=1, y_2=2$; then $$\sum_{i}e^{y_i}=e^1+e^2 \approx 10.1 \neq 1.$$ So we say that $e^y$ is not normalized, because it doesn't sum to 1.

However, we can use a simple trick to "normalize" it. The usual trick for normalizing quantities (and this applies to much more than just probabilities) is to divide each one by the sum over all possible values. Going back to the simple $n=2$ example, if I assign each event $y_j$, $j=1,2$, the probability $e^{y_j}/\sum_i e^{y_i}$, then the probabilities become normalized. Their sum is $$ \frac{e^{y_1}}{\sum_{i}e^{y_i}} + \frac{e^{y_2}}{\sum_{i}e^{y_i}} = \frac{e^{y_1}+e^{y_2}}{\sum_{i}e^{y_i}} = 1.$$

Notice that the ratios between the values are the same as before; the only difference is that I've divided each of them by the total. Since they now sum to 1, we say that $e^{y}/\sum e^{y}$ is normalized.

The notation you've used isn't very easy to follow; it's better to write $$ \frac{e^{y_j}}{\sum_{i} e^{y_i}},$$ to highlight that $y$ takes different values and that the sum in the denominator runs over all of them.
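If it helps to see the numbers, here is a minimal Python sketch of the $n=2$ example above, with $y_1=1$ and $y_2=2$. The function name `normalize` and the variable names are my own choices for illustration, not standard terminology; the code simply divides each $e^{y_j}$ by the total, exactly as in the formula.

```python
import math

def normalize(scores):
    """Map raw values y_j to e^{y_j} / sum_i e^{y_i} (hypothetical helper)."""
    unnormalized = [math.exp(y) for y in scores]
    total = sum(unnormalized)  # the normalizing constant
    return [u / total for u in unnormalized]

ys = [1.0, 2.0]                            # y_1 = 1, y_2 = 2
unnormalized = [math.exp(y) for y in ys]   # e^1, e^2
probs = normalize(ys)

print(sum(unnormalized))    # e + e^2, roughly 10.107, so not 1
print(probs)                # roughly [0.269, 0.731]
print(sum(probs))           # 1 up to floating-point rounding
print(probs[1] / probs[0])  # ratio is still e^2 / e^1 = e
```

The last line shows that dividing by the total rescales the values without changing their ratios, which is why the normalized values carry the same relative information as the unnormalized ones.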