Why am I getting exactly same expression for the affine underestimator?

94 Views Asked by At

In Boyd & Vandenberghe's Convex Optimization, it is proved that (except for a technical condition) a convex function can be represented as the pointwise supremum of a family of affine functions. To show that they consider the epigraph approach and the supporting plane for that epigraph. In their work they write that if $(z,t)\in \textbf{epi}~f$, where $f$ is a convex function, then there exist a supporting plane for point $(x,f(x))$ such that $$\left[ {\begin{array}{c}a\\b \\ \end{array} } \right]^T\left[ {\begin{array}{c} x-z\\f(x)-t \end{array} } \right]\leq 0.$$ With this they come up with the following expression for the affine underestimator of $f(z)$ $$f(z)\geq f(x)+\frac{a^T}{b}(x-z)=g(z).$$ In my attempt I start by $$\left[ {\begin{array}{c}a\\b \\ \end{array} } \right]^T\left[ {\begin{array}{c} x-z\\f(x)-t \end{array} } \right]\geq 0.$$ which can be written as $$a^T(x-z)+b(f(x)-t)\geq 0\\ a^T(x-z)+b(f(x)-f(z)-s)\geq 0$$ where $t=f(z)+s$ is assumed for $s\geq 0$. Now this expression is valid only if $b<0$ because otherwise it at least becomes less than zero at $x=z$ (is this reasoning right to say that $b<0$?). Therefore we can write, for $s=0$ case $$a^T(x-z)-|b|(f(x)-f(z))\geq 0$$ which result in $$f(z)\geq g(z)=f(x)-\frac{a^T}{|b|}(x-z)=f(x)+\frac{a^T}{b}(x-z)$$ where I used that $b<0$ in last equality. But with this my expression for $g(z)$ is exactly same as that in the book. So is it right or wrong?

1

There are 1 best solutions below

3
On BEST ANSWER

I'll assume that $a,x,z\in\mathbb R^d$, $b,t\in\mathbb R$ and $f : \mathbb R^d\to\mathbb R$.


You use $$\left[\begin{array}{c} a\\b\end{array}\right]^T\left[\begin{array}{c}x-z\\f(x)-t\end{array}\right]\ge 0$$ but notice that if you pick any negative real number $\alpha<0$, and let $a'=\alpha \times a$, $b'=\alpha\times b$, then $$\left[\begin{array}{c} a'\\b'\end{array}\right]^T\left[\begin{array}{c}x-z\\f(x)-t\end{array}\right] = \alpha \times \left[\begin{array}{c} a\\b\end{array}\right]^T\left[\begin{array}{c}x-z\\f(x)-t\end{array}\right] \le 0$$ Using your book's estimator you obtain $$f(z)\ge f(x) +\frac{(a')^T}{b'}(x-z)=f(x) +\frac{\alpha\times a^T}{\alpha\times b}(x-z)=f(x) +\frac{a^T}{b}(x-z)$$

So yes it is right. The choice of $\ge$ over $\le$ is nothing more than a choice of "direction" for $a$ and $b$, so taking the ratio eventually makes that choice a trivial one.

Also, your estimator is in essence a hyperplane that goes through point $\left[\begin{array}{c}x\\f(x)\end{array}\right]$, with normal vector $\left[\begin{array}{c}a\\b\end{array}\right]$. So it is reassuring that just flipping the opposite/scaled vector of $\left[\begin{array}{c}a\\b\end{array}\right]$ has no significance on the end result.