Does anyone have a good intuitive explanation for why the Fenchel-Legendre transform appears all over the theory of large deviations?
The Fenchel-Legendre transform appears in both Sanov's theorem and Freidlin-Wentzell theory. From my reading, it shows up in these two areas because of the role it plays in Cramér's theorem, and the role that Cramér's theorem plays in each of them.
I guess the Fenchel-Legendre transform of the log-moment generating function is what's really playing a role. Maybe some insight into the log-moment generating function would be of use.
$\newcommand{\l}{\lambda} \newcommand{\E}{\mathbb{E}} \newcommand{\P}{\mathbb{P}} \newcommand{\L}{\Lambda}$ Let $X$ be a random variable, let $\L(\l) = \log(\E \exp(\l X))$ be the log-moment generating function, and consider the Chernoff method, applied to $X$:
$\P(X \geq x) \leq \exp(-\l x) \cdot \E\exp(\l X) = \exp(-\l x + \L(\l))$ for every $\l \geq 0$.
(Looks familiar?) Optimizing over $\l$, we get
$\P(X \geq x) \leq \exp [- \sup_{\l \geq 0}(\l x - \L(\l))] = \exp (-\L^*(x))$
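where $\L^*(x) = \sup_{\l \geq 0}(\l x - \L(\l))$. This is not exactly the Fenchel-Legendre transform, which takes the supremum over all $\l \in \mathbb{R}$ rather than over $\l \geq 0$. The two agree whenever $x \geq \E X$ (assuming $\E X$ is finite), though: by Jensen's inequality $\L(\l) \geq \l \E X$, so for $\l < 0$ we get $\l x - \L(\l) \leq \l (x - \E X) \leq 0$, which is the value attained at $\l = 0$. That probably explains why this form occurs in the theory. In the notes linked below you can find further motivation.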
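To make the bound concrete, here is a small numerical sketch. It assumes $X$ is standard normal (my choice for illustration, not something from the notes), so that $\Lambda(\lambda) = \lambda^2/2$ and the optimized exponent should come out as $x^2/2$ for $x \geq 0$. The function names and the grid parameters are hypothetical, and the supremum is only approximated on a grid.

```python
import math

def log_mgf(lam):
    # Log-moment generating function of a standard normal (assumed example):
    # Lambda(lam) = lam^2 / 2.
    return 0.5 * lam * lam

def rate(x, lam_max=20.0, steps=20001):
    # Grid approximation of sup over lam >= 0 of (lam * x - Lambda(lam)).
    # The grid covers [0, lam_max]; lam = 0 is included, so the result is >= 0.
    grid = (lam_max * i / (steps - 1) for i in range(steps))
    return max(lam * x - log_mgf(lam) for lam in grid)

def gaussian_tail(x):
    # Exact P(X >= x) for X ~ N(0, 1), via the complementary error function.
    return 0.5 * math.erfc(x / math.sqrt(2.0))

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0):
        bound = math.exp(-rate(x))
        print(f"x={x}: tail={gaussian_tail(x):.3e}  bound={bound:.3e}  "
              f"rate={rate(x):.4f}  x^2/2={x * x / 2:.4f}")
```

In this example the computed exponent matches $x^2/2$ up to grid error, and the exact tail stays below $\exp(-x^2/2)$, as the Chernoff argument says it must. For $x < 0$ the restricted supremum is just $0$, which is where it differs from the full transform.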
Note: This discussion is borrowed from these notes. Look at page 21.