How can I compute $\frac{\exp(\lambda v_j)}{\sum_{i=1}^n \exp(\lambda v_i)}$ in a stable way?

101 Views Asked by At

Given an $n$ vector and $\lambda$ > 1e4 I wish to compute this sum $$\frac{ \exp( \lambda v_j)}{ \sum_{i=1}^n \exp( \lambda v_i)}$$ for a fixed $j \in \{1, \dots, n\}$.

The sum should be less than one, however if I try to compute it directly on matlab I get error because exp(\lambda) = Inf according to matlab. The entries $v_j$ are all greater or equal than $0$ and less than one.

Also is there a quick way to compute the sum on the denominator? I used tic toc and it seems that for big $n$ just this computation takes about $0.1$ second. This amount of time is a lot. For comparison, for the same $n$ an eigenvalue decomposition costs less than $0.01$ second.

Are there any good ways of computing this number in matlab?

1

There are 1 best solutions below

1
On BEST ANSWER

It depends a little bit on the specifics of the vector. One place such expressions come up is in statistical mechanics, where what you have written is the Gibbs distribution on $n$ states with energy $-v_i$ at a cold temperature proportional to $\frac{1}{\lambda}$.

In this physical situation, often one state has significantly less energy than the others. This effect gets amplified when the temperature is cold, so that for small enough temperatures, one of the terms in the sum is much larger than the others.

In this mathematical situation, it is useful (both for analysis and for numerics) to factor out the dominant term from the numerator and denominator. To do this, define $v_M=\max \{ v_1,\dots,v_n \}$ and rewrite your ratio as

$$\frac{\exp(\lambda (v_j-v_M))}{1+\sum_{i=1,i \neq M}^n \exp(\lambda(v_i-v_M))}.$$

Now if $v_M$ is quite dominant, then the denominator will be essentially $1$, but the ratio should nevertheless be estimated accurately.

If the terms are more or less comparable, then this approach is not very helpful. In this case you can instead factor out the average $v_A = \frac{1}{n} \sum_{i=1}^n v_i$ (just as I factored out $v_M$ above).

As for the speed in Matlab, you might just be running a loop instead of vectorizing. Vectorizing (in this case just doing sum(exp(lam*v))) makes a huge difference in performance in Matlab.