Let $S_n = \{ (x_i, y_i)\}^n_{i=1}$ be a data set. Let $\mathcal H$ be a hypothesis class and let $\mathcal H' = \{ h \in \mathcal H: \hat {er}_{S_n}(h) \leq \beta\}$, where $\hat {er}_{S_n}(h) = \sum^n_{i=1}\mathcal I[y_i \neq h(x_i)]$ is the empirical error of $h$ on $S_n$ (here an unnormalized count of mistakes) and $\beta \geq 0$.
I need to prove that $$ 3 \sqrt {\frac{\log (2 |\mathcal H'(S_n)|)}{n}} = \mathcal O\left(\sqrt {\beta \frac{\text{VC}(\mathcal H) \log (\frac{en}{\text{VC}(\mathcal H)})}{n}}\right), $$ where $\mathcal H'(S_n) = \{ (h(x_1), \dots, h(x_n)): h \in \mathcal H'\}$ is the set of labelings of $S_n$ realized by $\mathcal H'$, $|\mathcal H'(S_n)|$ is its cardinality, and VC denotes the VC dimension.
This is my attempt, using $\mathcal H' \subseteq \mathcal H$ and Sauer's lemma: $$ 3 \sqrt {\frac{\log (2 |\mathcal H'(S_n)|)}{n}} \leq 3 \sqrt {\frac{\log (2) +\log (|\mathcal H(S_n)|)}{n}}\leq 3\sqrt {\frac{\log 2 + \text{VC}(\mathcal H)\log (\frac{en}{\text{VC}(\mathcal H)})}{n}}.$$
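For reference, here is the form of Sauer's lemma I believe applies. The shorthand $d = \text{VC}(\mathcal H)$ and the condition $n \geq d \geq 1$ are my own assumptions, not part of the original statement:

$$ |\mathcal H'(S_n)| \leq |\mathcal H(S_n)| \leq \sum_{i=0}^{d} \binom{n}{i} \leq \left(\frac{en}{d}\right)^{d}, \qquad \text{so} \qquad \log (2 |\mathcal H'(S_n)|) \leq \log 2 + d \log \left(\frac{en}{d}\right). $$

The first inequality only uses $\mathcal H' \subseteq \mathcal H$, so the restriction $\hat {er}_{S_n}(h) \leq \beta$ is discarded at that step and the resulting bound does not depend on $\beta$.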
How can I make $\beta$ appear?