Calculus of Variations in Probability Theory


Are there any places where the Calculus of Variations shows up (i.e. is used) in probability?

Functional optimization seems like it should appear naturally here, as it does in statistics (where it shows up in more applied areas such as regression and machine learning) and in stochastic optimal control theory. However, I do not know of interesting probabilistic problems on the more theoretical side that can be cast in a variational light, or of proofs of theorems in probability that use variational techniques.

I suppose the Malliavin Calculus is related, but it seems more like an answer to the "reverse" question :)


I found a neat example. It answers the question:

Given constraints on its expected values, which distribution has maximum entropy?

The answer can be shown to be the exponential family, using the calculus of variations (see here).


The exponential family is the set of distributions with density: $$ f_X(x|\lambda) = h(x)g(\lambda)\exp\left(\vec{\eta}(\lambda)\cdot\vec{T}(x)\right) $$ for parameter vector $\lambda$.

Then, the following is (a slightly weaker version of) a theorem due to Boltzmann:

Suppose we consider $n$ measurable functions $\{f_j\}_{j=1}^n$ and numbers $a_j\in\mathbb{R}\;\forall\;j=1,\ldots,n$. Then among all probability distributions with support on $S$ that satisfy the constraints $$ \mathbb{E}[f_j(x)]=a_j\;\forall\;j=1,\ldots,n $$ the one with probability density $$ p(x) = c \exp\left( \sum_j \lambda_j f_j(x) \right) $$ has maximum entropy.
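As a sanity check of the theorem, here is a minimal numerical sketch. The finite support $S=\{0,\ldots,10\}$, the single constraint function $f_1(x)=x$, and the target value $a_1=3$ are illustrative choices; `scipy` performs the constrained optimization. If the theorem holds, the entropy-maximizing $p$ should satisfy $p(x)=c\,e^{\lambda_1 x}$, i.e. $\ln p$ should be affine in $x$.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative setup: support S = {0, 1, ..., 10}, one constraint E[x] = 3.
xs = np.arange(11, dtype=float)
a1 = 3.0

def neg_entropy(p):
    # Negative entropy: sum of p ln p (the discrete analogue of the integral).
    return np.sum(p * np.log(p))

constraints = [
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},        # normalization
    {"type": "eq", "fun": lambda p: (xs * p).sum() - a1},  # moment constraint
]

# Start from the uniform distribution and minimize negative entropy
# (= maximize entropy) subject to the two constraints.
p0 = np.full(len(xs), 1.0 / len(xs))
res = minimize(neg_entropy, p0, method="SLSQP",
               bounds=[(1e-9, 1.0)] * len(xs), constraints=constraints)
p = res.x

# The theorem predicts ln p(x) = ln c + lambda_1 * x, an affine function of x.
coeffs = np.polyfit(xs, np.log(p), 1)
residual = np.log(p) - np.polyval(coeffs, xs)
print("max |residual| of affine fit to ln p:", np.abs(residual).max())
```

The fitted slope is the multiplier $\lambda_1$ (negative here, since the target mean $3$ lies below the uniform mean $5$), and the near-zero residual confirms that the optimizer lands on an exponential-family member.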

The proof works as follows. Define the variational functional: $$ J[p]=\int_S p(x)\ln p(x)dx - \lambda_0 \left[ \int_S p(x)dx-1 \right] - \sum_j\lambda_j\left[ \int_S f_j(x)p(x)dx - a_j \right] $$ with Lagrange multipliers $\lambda_i$. Notice that the second term forces $p$ to integrate to 1 and the third term enforces the moment constraints, while the first term is the negative entropy, so extremizing $J$ extremizes the entropy subject to the constraints.

Then, the vanishing of the first variation $$ \frac{\delta J}{\delta p} = 0 $$ can be shown to imply $$ p(x) = c \exp\left( \sum_j \lambda_j f_j(x) \right) $$ which is an exponential family member.
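The omitted computation is short. Since $J$ contains no derivatives of $p$, the first variation is just the derivative of the integrand with respect to $p$: $$ \frac{\delta J}{\delta p} = \ln p(x) + 1 - \lambda_0 - \sum_j \lambda_j f_j(x) = 0, $$ which gives $$ p(x) = e^{\lambda_0 - 1}\exp\left( \sum_j \lambda_j f_j(x) \right) = c \exp\left( \sum_j \lambda_j f_j(x) \right), $$ with $c = e^{\lambda_0 - 1}$ and the $\lambda_j$ determined by the normalization and moment constraints.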