I am reading this article, whose first sections discuss the geometry of classical parameter estimation. The Fisher–Rao metric and the statistical distance are introduced, and the metric is given as:
$ds_{FR}^2=\sum_{j}\frac{dp^j dp^j}{p_j}$
An explanation is given in the next paragraph:
Note that the statistical distance in this equation diverges when one of the probabilities $p_j$ tends toward zero. This gives us a clue how to interpret the distance between two distributions: when the probability of one of the measurement outcomes is strictly zero, then obtaining that measurement outcome will allow us to infer with certainty that the system is governed by the other probability distribution.
A figure is given for further clarification:
with the following caption:
The distance between probability distributions $P_A$ and $P_B$ diverges when one of them $(B)$ lies on the hull of the simplex.
I have difficulty understanding what the authors are trying to convey. When one of the probabilities $p_j$ becomes zero, the expression for the Fisher-Rao metric is undefined (division by zero). If we exclude zero itself and only approach it, the expression clearly grows without bound. But I fail to see how the "certainty" described in the text is connected to this diverging distance, nor do I see any infinite distance on the probability simplex in the provided figure. The explanation says that if one of the probabilities is zero, then observing the corresponding outcome tells us the system is governed by the other probability distribution. Fair enough, but I don't understand how this relates to the undefined or diverging distance, or to the figure and its caption.
The Fisher-Rao metric is a distance measure in the space of probability distributions. As explained in the same section of the article, this metric is closely related to the Euclidean metric, for which we have an intuitive understanding. For any pair of distinct probability distributions NOT containing zero, we can find a positive real number as a quantifier for their distance and draw it on the probability simplex. When we have two probability distributions and one of them contains zero, it is clear that they are distinct so intuitively they must have a positive distance in the space of probabilities and in the picture this intuitive idea is respected since $P_A$ and $P_B$ appear to have a finite distance in the figure. However, the equation suggests that we have a diverging distance in this case and I fail to connect this fact with the intuitive picture.
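To make the Euclidean relation concrete, here is a short sketch of how I understand it, assuming the substitution $p_j = x_j^2$ (the symbol $x_j$ is my own notation and may differ from the article's):

$dp_j = 2 x_j \, dx_j$

$ds_{FR}^2 = \sum_j \frac{(2 x_j \, dx_j)^2}{x_j^2} = 4 \sum_j dx_j \, dx_j$

So in the $x_j$ coordinates the line element is four times the Euclidean one, and the normalization $\sum_j p_j = 1$ becomes $\sum_j x_j^2 = 1$.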
What kind of distance does the Fisher-Rao metric give? Can its definition be extended to two distinct probability distributions that contain a zero probability (exactly zero, not merely approaching zero)? How should we interpret the metric when probabilities approach zero, and what exactly are the authors trying to communicate?

In barycentric coordinates on the simplex, a point lies on the hull exactly when one of its coordinates is zero. The diagram shows $P_B$ on the hull, which means its $p_2$ component is zero: $P_B$ is specified entirely by $p_1$ and $p_3$. This zero is why the division by zero in the metric is tied to the hull. In some sense the hull of the simplex plays the role of infinity, in the same way that the point at infinity is constructed in stereographic projection, so points on the hull are infinitely far from the interior.
By "diverges" the author means "does not converge to a real number", which in this case happens because the quantity approaches infinity. Divergence in general does not require approaching infinity: the sequence $1,-2,3,-4,...$ diverges but does not approach anything. In this context, however, it specifically means diverging to infinity.
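As a rough numerical illustration (a sketch of my own, not taken from the article), the snippet below evaluates the line element $ds_{FR}^2=\sum_j dp_j^2/p_j$ from the question for a fixed displacement direction while $p_2$ shrinks toward zero. The function name `fisher_rao_ds2`, the three-outcome distribution, and the direction `dp` are all assumptions chosen for the example; `dp` is treated as a tangent direction whose components sum to zero, so it stays within the simplex.

```python
def fisher_rao_ds2(p, dp):
    """Line element ds^2 = sum_j dp_j^2 / p_j at the point p for displacement dp."""
    return sum(d * d / q for q, d in zip(p, dp))


# Hypothetical displacement direction; components sum to zero so that
# the total probability is preserved.
dp = [1.0, -2.0, 1.0]

values = []
for p2 in [1e-1, 1e-2, 1e-3, 1e-4, 1e-6]:
    rest = (1.0 - p2) / 2.0      # split the remaining probability evenly
    p = [rest, p2, rest]         # point moving toward the hull p_2 = 0
    ds2 = fisher_rao_ds2(p, dp)
    values.append(ds2)
    print(f"p_2 = {p2:.0e}  ->  ds^2 = {ds2:.3e}")
```

The printed values grow roughly like $4/p_2$, which is the unbounded growth of the metric coefficient as the point approaches the hull. At $p_2 = 0$ exactly the function raises `ZeroDivisionError`, matching the "undefined" case raised in the question.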