Are these two forms of Hoeffding's inequality equivalent?
I am reading https://arxiv.org/abs/1606.04838, and the paper mentions
However, I have not seen this form of Hoeffding's inequality before. I checked Shai Ben-David's textbook, where the form is as follows:

I cannot reconcile these two forms. Are they equivalent? If so, how can this be proven? If not, why are there two forms?


Let's start by rewriting the second variant of Hoeffding's Inequality in your post (Lemma B.6) to use the same symbols as the symbols used in the first variant for the empirical and expected risks:
\begin{equation} \mathbb{P} \left[ \left| R_n(h) - R(h) \right| > \epsilon \right] \leq 2 \exp \left( -2n \epsilon^2 / (b - a)^2 \right). \end{equation}
We have also replaced $m$ with $n$ (to denote the number of observations).
Note that, in the Lemma from the textbook, $a$ and $b$ are the minimum and maximum possible values, respectively, for the individual observations. In the paper you're reading, the observations are (mis)classification probabilities, so we can take $a = 0$ and $b = 1$ (probabilities never lie outside that range). This gives
\begin{equation} \mathbb{P} \left[ \left| R_n(h) - R(h) \right| > \epsilon \right] \leq 2 \exp \left( -2n \epsilon^2 \right), \end{equation}
because the denominator $(b - a)^2 = (1 - 0)^2 = 1$ drops out.
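As a quick numerical sanity check (a sketch, not from the paper; the values of $p$, $n$, and $\epsilon$ below are arbitrary assumptions), we can simulate i.i.d. Bernoulli losses and compare the empirical tail probability against the bound $2 \exp(-2 n \epsilon^2)$:

```python
import math
import random

random.seed(0)
p = 0.3        # true risk R(h) (assumed value for illustration)
n = 100        # number of observations
eps = 0.1
trials = 20000

exceed = 0
for _ in range(trials):
    # empirical risk R_n(h): mean of n Bernoulli(p) losses
    r_n = sum(random.random() < p for _ in range(n)) / n
    if abs(r_n - p) > eps:
        exceed += 1

empirical = exceed / trials
bound = 2 * math.exp(-2 * n * eps ** 2)
print(f"empirical tail P = {empirical:.4f}, Hoeffding bound = {bound:.4f}")
```

The simulated tail probability should come out well below the bound, which is expected: Hoeffding's inequality is a worst-case guarantee and is typically loose.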
Intuitively, this inequality says that the probability that the absolute difference between the two risk quantities exceeds $\epsilon$ is at most the right-hand side. To make it more similar to the inequality from the paper, we rephrase it as "with probability at least $x$, the absolute difference is at most some quantity". Taking the complement of both sides gives:
\begin{equation} \mathbb{P} \left[ \left| R_n(h) - R(h) \right| \leq \epsilon \right] \geq 1 - 2 \exp \left( -2n \epsilon^2 \right) \end{equation}
This now says "with probability greater than or equal to $1 - 2 \exp \left( -2n \epsilon^2 \right)$, the absolute difference is less than or equal to $\epsilon$":
\begin{equation} \left| R_n(h) - R(h) \right| \leq \epsilon \qquad \text{w.p. } \geq 1 - 2 \exp \left( -2n \epsilon^2 \right) \end{equation}
Now suppose we choose to redefine $\epsilon$ as:
\begin{equation} \epsilon \doteq \sqrt{\frac{1}{2n} \log \left( \frac{2}{\eta} \right)} \end{equation}
This then leads to:
\begin{equation} \left| R_n(h) - R(h) \right| \leq \sqrt{\frac{1}{2n} \log \left( \frac{2}{\eta} \right)} \qquad \text{w.p. } \geq 1 - 2 \exp \left( -2n \left( \sqrt{\frac{1}{2n} \log \left( \frac{2}{\eta} \right)} \right)^2 \right) \end{equation}
\begin{equation} \left| R_n(h) - R(h) \right| \leq \sqrt{\frac{1}{2n} \log \left( \frac{2}{\eta} \right)} \qquad \text{w.p. } \geq 1 - 2 \exp \left( -2n \frac{1}{2n} \log \left( \frac{2}{\eta} \right) \right) \end{equation}
\begin{equation} \left| R_n(h) - R(h) \right| \leq \sqrt{\frac{1}{2n} \log \left( \frac{2}{\eta} \right)} \qquad \text{w.p. } \geq 1 - 2 \exp \left( - \log \left( \frac{2}{\eta} \right) \right) \end{equation}
\begin{equation} \left| R_n(h) - R(h) \right| \leq \sqrt{\frac{1}{2n} \log \left( \frac{2}{\eta} \right)} \qquad \text{w.p. } \geq 1 - \eta \end{equation}
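To double-check the algebra above, we can plug this choice of $\epsilon$ into the bound $2 \exp(-2 n \epsilon^2)$ numerically (a sketch; the values of $n$ and $\eta$ are arbitrary assumptions) and confirm that it collapses to $\eta$:

```python
import math

n = 50          # assumed sample size, for illustration
eta = 0.05      # assumed confidence parameter

# eps = sqrt(log(2/eta) / (2n)), as defined above
eps = math.sqrt(math.log(2 / eta) / (2 * n))
bound = 2 * math.exp(-2 * n * eps ** 2)
print(f"eps = {eps:.4f}, bound = {bound:.4f}")  # bound equals eta up to rounding
```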
On the left-hand side, the order of subtraction of the risk quantities can safely be swapped, because we take its absolute value:
\begin{equation} \left| R(h) - R_n(h) \right| \leq \sqrt{\frac{1}{2n} \log \left( \frac{2}{\eta} \right)} \qquad \text{w.p. } \geq 1 - \eta \end{equation}
This final inequality is precisely what is given in the paper, so the conclusion is: yes, they are equivalent in the context where the observations are (misclassification) probabilities, so that $a = 0$ and $b = 1$.
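As a final sanity check (a sketch; the values of $p$, $n$, and $\eta$ are arbitrary assumptions), simulating Bernoulli losses confirms that the deviation $\left| R(h) - R_n(h) \right|$ stays within $\sqrt{\log(2/\eta)/(2n)}$ with frequency at least $1 - \eta$:

```python
import math
import random

random.seed(1)
p = 0.4        # true risk R(h), assumed for illustration
n = 200        # number of observations
eta = 0.1      # confidence parameter

# radius of the final bound: sqrt(log(2/eta) / (2n))
radius = math.sqrt(math.log(2 / eta) / (2 * n))

trials = 10000
covered = 0
for _ in range(trials):
    # empirical risk R_n(h): mean of n Bernoulli(p) losses
    r_n = sum(random.random() < p for _ in range(n)) / n
    if abs(p - r_n) <= radius:
        covered += 1

print(f"coverage = {covered / trials:.4f}, guaranteed >= {1 - eta}")
```

The observed coverage should exceed the guaranteed $1 - \eta$, again because the bound is a worst-case guarantee.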