This question is most likely related to calculus/algebra and tricks regarding supremums rather than actual understanding of large deviations and the Legendre transform, but anyway. For a random variable $X_1$, we define a convex function
$$ H(\alpha) = \log M(\alpha), $$
where $M(\alpha) = E[e^{\alpha X_1}]$ is the moment generating function of $X_1$. Let $L$ be the Legendre transform of $H(\alpha)$, i.e.
$$ L(\beta) = \sup_{\alpha} [\alpha \beta - H(\alpha)]. $$
As an example of $H(\alpha)$ and $L(\beta)$, suppose that $X_1$ is Bernoulli with $P(X_1 = 0) = 1 - p, P(X_1 = 1) = p$ for $p \in (0, 1)$. Then $H(\alpha) = \log((1 - p) + pe^{\alpha})$ and
$$ L(\beta) = \begin{cases} \beta \log(\frac{\beta}{p}) + (1 - \beta) \log(\frac{1 - \beta}{1 - p}) & \beta \in [0, 1] \\ \infty & \beta \notin [0, 1] \end{cases}. $$
My question is: while I can see how we get $H(\alpha)$ quite easily, I cannot figure out how we arrive at $L(\beta)$.
Note that the function $\alpha\mapsto\alpha\beta-H(\alpha)$ is concave (since $H$ is convex), so it is maximal when the derivative $\beta-H'(\alpha)$ is zero. This allows one to deduce an expression for the optimal $\alpha$, say $\alpha_\beta$, in terms of $\beta$, and to compute $L(\beta)$ as $L(\beta)=\alpha_\beta\beta-H(\alpha_\beta)$. If that derivative is never $0$, one considers the limit as $\alpha\to\pm\infty$.

In the present case, $H'(\alpha)=\dfrac{pe^{\alpha}}{(1-p)+pe^{\alpha}}$, and the equation $H'(\alpha_\beta)=\beta$ has a solution exactly when $\beta\in(0,1)$, namely
$$ e^{\alpha_\beta}=\frac{\beta(1-p)}{p(1-\beta)}, \qquad \alpha_\beta=\log\frac{\beta(1-p)}{p(1-\beta)}. $$
Substituting back, $H(\alpha_\beta)=\log\left((1-p)+\frac{\beta(1-p)}{1-\beta}\right)=\log\frac{1-p}{1-\beta}$, hence
$$ L(\beta)=\alpha_\beta\beta-H(\alpha_\beta)=\beta\log\frac{\beta}{p}+(1-\beta)\log\frac{1-\beta}{1-p}. $$
When $\beta\notin[0,1]$, the derivative is never $0$ and $\alpha\beta-H(\alpha)\to\infty$: as $\alpha\to+\infty$ if $\beta>1$ (since $H(\alpha)=\alpha+\log p+o(1)$), and as $\alpha\to-\infty$ if $\beta<0$ (since $H(\alpha)\to\log(1-p)$); hence $L(\beta)=\infty$ there. The endpoint values $L(0)=\log\frac{1}{1-p}$ and $L(1)=\log\frac{1}{p}$ are obtained as limits.
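As a sanity check, here is a minimal Python sketch (the names `H`, `L_closed_form`, `L_numeric` are my own) that approximates $\sup_\alpha[\alpha\beta-H(\alpha)]$ by a brute-force grid search and compares it with the closed-form $L(\beta)$ stated in the question:

```python
import math

def H(alpha, p):
    # cumulant generating function log E[e^{alpha X}] of a Bernoulli(p) variable
    return math.log((1 - p) + p * math.exp(alpha))

def L_closed_form(beta, p):
    # rate function from the question, valid for beta in (0, 1)
    return beta * math.log(beta / p) + (1 - beta) * math.log((1 - beta) / (1 - p))

def L_numeric(beta, p, lo=-20.0, hi=20.0, steps=20000):
    # brute-force approximation of sup_alpha [alpha*beta - H(alpha)] on a grid;
    # for beta in (0, 1) the optimal alpha lies well inside [lo, hi]
    return max(a * beta - H(a, p)
               for i in range(steps + 1)
               for a in [lo + i * (hi - lo) / steps])

p = 0.3
for beta in (0.1, 0.3, 0.5, 0.9):
    assert abs(L_numeric(beta, p) - L_closed_form(beta, p)) < 1e-4
```

Note that $L(p)=0$, consistent with the large-deviation heuristic that the sample mean concentrates at $\beta=p$.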