Represent $f(x) - f(y)$ as an integral

599 Views Asked by At

Description

I've come across the following transition in a textbook of Convex Optimisation. I couldn't figure out what's going on so that I'd appreciate if anyone hits me with any hint!

Problem

Suppose $x, y \in \mathbb{R}^n$ and $f$ be a $\beta$-smooth convex function on $\mathbb{R}^n$;

Then the transition of interest goes as

$$f(x) - f(y) = \int^1_0 \nabla f(y + t(x - y))^{T} (x - y) dt $$

Additional Comment

This transition appears in the proof of the convergence rate of Gradient Descent for a smooth convex objective function $f$.

2

There are 2 best solutions below

2
On BEST ANSWER

This is just the fundamental theorem of line integrals, which says for (sufficiently smooth) functions $f$, $$f(x) - f(y) = \int_{C} \overrightarrow{\nabla f(r)} \cdot \vec{dr}$$ where $x$ and $y$ are the endpoints of $C$. Think of it as a generalization of the fundamental theorem calculus to higher dimensions, which roughly says that the difference in a function evaluated at two points $x$ and $y$ can be calculated as the integral of its derivative over the range $[x, y]$. Here, the derivative is the gradient (since we have multiple variables now).

For your problem, $\beta$-smoothness is more than sufficient to use the theorem above, so just let $r(t) = y + t(x - y)$ be a parameterization of a straight line $L$ from $x$ to $y$ such that $r(0) = x$ and $r(1) = y$. Then from the fundamental theorem for line integrals it follows that \begin{align*} f(x) - f(y) &= \int_{L} \overrightarrow{\nabla f}(r) \cdot \overrightarrow{dr} = \int_0^1 \overrightarrow{\nabla f (r(t))} \cdot \overrightarrow{r'(t)} dt \\ &= \int_{0}^1 \overrightarrow{\nabla f (y + t(x - y))} \cdot \overrightarrow{(x - y)} dt \end{align*}

1
On

This is just the one variable fundamental theorem of calculus.

Let $\phi(t) = f(y+t(x-y))$.

Then $\phi(1) =\phi(0) + \int_0^1 \phi'(t)dt$, or equivalently, $f(x) = f(y) + \int_0^1 D f(y+ s(x-y)) (x-y) ds $, and since $Df(x)h = \langle \nabla f(x) h \rangle$, you have the desired result.

(The result is true for any real valued convex function defined on an open set containing the segment $[y,x]$ since it is then locally Lipschitz and differentiable ae.)