I have some difficulties with understanding the schema to prevent underflow, which is very often mentioned as The log-sum-exp trick, the partial decription is The log-sum-exp trick.
In short, I will describe the general idea.
Assume you need to calculate $w_i=\frac{\prod_{j}^{n}p_{ij}}{\sum_{i}^{n}\prod_{j}^{n}p_{ij}}$, where $p_{ij}$ might be very small value, therefore the overall product is very small and might cause underflow when calculating on computer.
Let's apply $\log$ and $\exp$ to the enumerator and denominator and $z_i=\sum_{j}^{} \log p_{ij}$, then
$w_i = \frac{e^{z_i}}{\sum_{j}^{n} e^{z_j}}$, of course, ensure that $p_i \neq 0$.
The problem with resulting formula is still $e^{z_i}$ might have small value and might cause underflow.
So far everything is ok.
Then $m = \max_i(z_i)$
$w_{i} = \frac{e^{z_i}}{\sum_{j}^{}e^{z_j}}= \frac{e^{z_i-m}}{\sum_{j}^{}e^{z_j-m}}=$
$=0\ if\ z_i-m<-k$
$=\frac{e^{z_i-m}}{\sum_{j:z_j-m \geq -k}^{}e^{z_j-m}}$, otherwise
,when $k$ is some value such that $e^{-k}$ doesn't cause underflow.
The question is why do we need to use $m$ value.
In my opinion, $w_{i} = \frac{e^{z_i}}{\sum_{j}^{}e^{z_j}}$ with the $k$ value, would work just fine, positive values of $z_j$ don't cause the problem, and the negative are filtered out by $-k$, what the reason to artificially decrease the value of $z_j$ by m.
If you decrease all $z_i$ by $m$, you essentially multiply numerator and denominator by $e^{-m}$, which leaves the fraction unchanged - except that the computations now work without underflow. More precisely, the $m$ is chosen in such a manner that at least one $z_i-m$ is $zero$ (and the others still negative), so that one $e^{z_i-m}$ is simpy $1$; all values that are still very small are ignored (chopped off by comparing with $-k$), which is of no harm as values causing underflow are negligible compared to at least the one $1$ value.