For $0 < u < 1$, I have seen two definitions of the quantile function:
$F_X^{-1}(u) = \sup \{x \in \mathbb{R} : F_X(x) \leq u \}$
$F_X^{-1}(u) = \inf \{x \in \mathbb{R} : F_X(x) \geq u \}$
Are they mathematically equivalent? In most places, I have seen the second one. Is there a reason why?
If $F_X=u$ at exactly one point then the two agree (you get the unique solution to $F_X=u$).
If $F_X$ is never equal to $u$ at all, then they agree (you get the position of the jump that passes through $u$, which is a unique point).
A difference occurs when $F_X=u$ at more than one point, that is, when $F_X$ is flat at height $u$. Then the first one gives you $\sup \{ x : F_X(x)=u \}$ while the second one gives you $\inf \{ x : F_X(x)=u \}$.
The main thing you want to do with the quantile function is inverse transform sampling: feed the output of a U(0,1) generator through the quantile function to sample from the distribution whose CDF is $F_X$. For that purpose, this discrepancy does no harm. The set of all such $u$'s is at most countable, because they are exactly the points where the (monotone) quantile function jumps, and a monotone function has at most countably many discontinuities (each jump skips an open interval of values, these intervals are disjoint, and each contains a rational number). A countable set has probability zero under U(0,1), so the two definitions produce the same sample almost surely.
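As a rough illustration of why the discrepancy does not matter for sampling, here is a minimal Python sketch. The three-point distribution, the seed, and the helper names `quantile_inf` and `quantile_sup` are assumptions made up for this example. It pushes the same uniform draws through both definitions and counts how often the results differ.

```python
import random
from bisect import bisect_left, bisect_right

# Hypothetical discrete distribution: P(X=0)=0.2, P(X=1)=0.5, P(X=2)=0.3.
xs = [0, 1, 2]                # support, sorted
cdf_vals = [0.2, 0.7, 1.0]    # F_X at each support point (last entry exactly 1)

def quantile_inf(u):
    """inf{x : F_X(x) >= u}: first support point whose CDF value is >= u."""
    return xs[bisect_left(cdf_vals, u)]

def quantile_sup(u):
    """sup{x : F_X(x) <= u}: first support point whose CDF value is > u."""
    return xs[bisect_right(cdf_vals, u)]

rng = random.Random(0)        # fixed seed, chosen arbitrarily
n = 50_000
us = [rng.random() for _ in range(n)]   # uniform draws in [0, 1)

draws_inf = [quantile_inf(u) for u in us]
draws_sup = [quantile_sup(u) for u in us]

# The two definitions only differ if a draw lands exactly on 0.2 or 0.7.
mismatches = sum(a != b for a, b in zip(draws_inf, draws_sup))
freq = [draws_inf.count(x) / n for x in xs]   # empirical frequencies

print(mismatches)
print(freq)
```

The two lists of draws can only differ when a uniform draw hits one of the two flat levels exactly, so the mismatch count should be zero and the empirical frequencies should be close to 0.2, 0.5 and 0.3.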
For example, take a Bernoulli($p$) variable with $P(X=1)=p$. Its CDF equals $1-p$ on $[0,1)$, so the first definition of the quantile function maps $u=1-p$ to $1$ while the second maps it to $0$. In general, the choice determines whether the quantile function is right-continuous (first case) or left-continuous (second case).
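The Bernoulli case can be checked directly. The sketch below assumes the convention $P(X=1)=p$, so that $F_X$ is flat at height $1-p$ on $[0,1)$. The value $p=1/4$ and the helper names are hypothetical, and exact fractions are used so that the tie $u = 1-p$ is hit exactly.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction

def quantile_inf(xs, cdf_vals, u):
    """inf{x : F(x) >= u} for a finite discrete distribution, 0 < u < 1."""
    return xs[bisect_left(cdf_vals, u)]

def quantile_sup(xs, cdf_vals, u):
    """sup{x : F(x) <= u} for a finite discrete distribution, 0 < u < 1."""
    return xs[bisect_right(cdf_vals, u)]

p = Fraction(1, 4)                  # assumed convention: P(X = 1) = p
xs = [0, 1]                         # support, sorted
cdf_vals = [1 - p, Fraction(1)]     # F(0) = 1 - p, F(1) = 1

# At the flat level u = 1 - p the two definitions disagree.
u_flat = 1 - p
at_flat = (quantile_sup(xs, cdf_vals, u_flat),
           quantile_inf(xs, cdf_vals, u_flat))

# Away from the flat level they agree.
off_flat = [(quantile_sup(xs, cdf_vals, u), quantile_inf(xs, cdf_vals, u))
            for u in (Fraction(1, 2), Fraction(9, 10))]

print(at_flat)    # (1, 0)
print(off_flat)   # [(0, 0), (1, 1)]
```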
The "elegance" of the second definition is that $F_X^{-1}(u) \leq x$ if and only if $u \leq F_X(x)$. The first definition lacks this property when the preimage of $u$ under $F_X$ contains more than one point.
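A brute-force check of the equivalence $F_X^{-1}(u) \leq x \iff u \leq F_X(x)$ on a small grid may make this concrete. The three-point distribution, the grids, and the function names below are assumptions chosen for illustration; the sketch lists every pair $(u, x)$ at which the equivalence fails for each definition.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction

# Hypothetical distribution: P(X=0)=1/4, P(X=1)=1/2, P(X=3)=1/4.
xs = [0, 1, 3]
cdf_vals = [Fraction(1, 4), Fraction(3, 4), Fraction(1)]

def cdf(x):
    """F_X(x) = P(X <= x), a right-continuous step function."""
    k = bisect_right(xs, x)              # number of support points <= x
    return cdf_vals[k - 1] if k else Fraction(0)

def quantile_inf(u):
    """inf{x : F_X(x) >= u}."""
    return xs[bisect_left(cdf_vals, u)]

def quantile_sup(u):
    """sup{x : F_X(x) <= u}."""
    return xs[bisect_right(cdf_vals, u)]

us = [Fraction(k, 8) for k in range(1, 8)]        # 1/8, ..., 7/8 (includes 1/4, 3/4)
grid_x = [Fraction(k, 2) for k in range(-2, 9)]   # -1, -1/2, ..., 4

def violations(quantile):
    """Pairs (u, x) where 'quantile(u) <= x' and 'u <= F_X(x)' disagree."""
    return [(u, x) for u in us for x in grid_x
            if (quantile(u) <= x) != (u <= cdf(x))]

bad_inf = violations(quantile_inf)
bad_sup = violations(quantile_sup)
bad_sup_levels = sorted({u for u, _ in bad_sup})

print(bad_inf)          # []
print(bad_sup_levels)   # [Fraction(1, 4), Fraction(3, 4)]
```

The failures of the sup version occur only at the two levels where $F_X$ is flat, which matches the condition that the preimage contains more than one point.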