The lemma that is proved for proving Hoeffding's inequality is:
If $a\leq X\leq b$ and $E[X]=0$, $E[e^{tX}] \leq e^{\frac{t^2(b-a)^2}{8}}$ Here's a link to the proof: http://www.stat.cmu.edu/~larry/=stat705/Lecture2.pdf
There's a particular step in the proof the motivation of which I don't understand. Equation (3) in the pdf is $$Ee^{tX} \leq \frac{-a}{b-a}e^{tb}+\frac{b}{b-a}e^{ta}=e^{g(u)}$$ where $u=t(b-a)$, $g(u)=-\gamma u+\log (1-\gamma+\gamma e^u)$ and $\gamma=\frac{-a}{b-a}$.
I can understand that somebody would try to get the inequality in the form $E[e^{tX}] \leq e^{g(u)}$, but I don't see why one would choose $u=t(b-a)$. Put another way, I don't think I would have tried the above step. Is there a motivation behind why someone would think of defining $u$ the above way and get the complicated expression of $g(u)$, and hope that this will lead to something useful?
The definition $u:=t(b-a)$ naturally arises from the fact that $a\le X\le b$. $t$ is a reduced variable, independent of the interval length.
The unexpected form of $g(u)$ is a rewrite of $\log(\frac{-a}{b-a}e^{tb}+\frac{b}{b-a}e^{ta})=\log(e^{ta}(\frac{-a}{b-a}e^{t(b-a)}+\frac{b}{b-a}))$, with the intent to bound it from its Taylor's expansion.
The main "trick" in this derivation is the use of the convexity property. The rest is commonplace function approximation.