How to show this function is convex? is there any simpler way than Hessian computation?

229 Views Asked by At

i need to show $f(x)=-\log{\frac{exp(C_{\{k,:\}}x)}{\sum_j exp(C_{\{j,:\}}x)}}$ is convex. $x \in R^n$ and $exp(v)$ is the element-wise exponential of $v$ and $C \in R^{d \times n}$. Notation $C_{\{k,:\}}$ means the $i$th row of $C$.

In fact it is the intersection of 3 functions, $\{-log(p),\frac{exp(p_k)}{\sum_j exp(p_j)}, Cx\}$. I tried to calculate Hessian, but i obtained a complicated matrix with too many terms to show it is PSD.

I know it is PSD because i used matlab Hessian approximation and tried that with all kinds of $x$ and the result was PSD.

Besides proving Hessian being PSD, is there any other way easier to prove its convexity?

2

There are 2 best solutions below

0
On BEST ANSWER

If I understand your question correctly, you're asking how to show the function $f(x) = -\log\left(\frac{e^{\theta_i^T x} }{\sum_{j=1}^N e^{\theta_j^T x}} \right)$ is convex in $x\in \mathbf{R}^n$? If so, you can easily rewrite this as

$$f(x) = -\theta_i^T x - \log\left(\sum_{j=1}^N e^{\theta_j^T x}\right)$$

The first term is obviously convex in $x$ (specifically linear), and the second term is the negation of the log-sum-exp function. I think what you meant to say about the intersection above, is that this function is actually the composition of three functions.

There are some conditions (section 3.2.4 of Boyd) under which compositions of functions are sufficient to yield convex/concave functions but the log-sum-exp is a standard example of a function that cannot be proved concave by these rules (they are sufficient, not necessary).

You can compute the Hessian of this and use the Cauchy-Schwarz inequality to prove concavity. The negation is then obviously convex. See page 74 of Stephen Boyd's book on convex optimization

0
On

I am not sure what you mean by "intersection of 3 functions". I also don't know if your $f$ is a function since you talk about element-wise exponentials.

The class of convex functions is closed under addition, multiplication by nonnegative scalars, and taking the supremum, having first an affine function etc.

As a first step, I would simplify using properties of the log. Next, there is a term appeared of the form log-sum-exp (from the denominator). This function is known to be convex (and yes, the cheapest proof is through the Hessian). Together with some calculus rules, this should give your convexity.