Prove Convexity of Recursively Defined Function


Let $\mathbf{x}=[x_1, x_2, \dots, x_K]\in\mathbb{R}^K_{++}$ and let $E_1>E_2>\dots>E_K>0$ be positive constants. If

$$f_i:\mathbb{R}^K_{++}\rightarrow\mathbb{R}_{++}\quad\forall1\leq i\leq K$$ are mappings recursively defined as

$$f_1(\mathbf{x})=\frac{e^{x_1}-1}{E_1}~~\text{and}~~f_k(\mathbf{x})=e^{x_k}\left(f_{k-1}(\mathbf{x})+\frac{1}{E_k}\right)-\frac{1}{E_k}\quad\forall1<k\leq K$$

Prove that all $f_k(\mathbf{x})$ are convex in $\mathbf{x}$.
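Before any proof, the claim can be sanity-checked numerically. The sketch below (not a proof; the dimension $K$, the constants $E_k$, and the random sample points are all arbitrary choices) evaluates $f_K$ via the recursion and tests midpoint convexity along random pairs of points:

```python
# Numerical sanity check of the claim, not a proof. K, the constants E_k,
# and the sample points are arbitrary choices.
import math
import random

random.seed(3)
K = 5
# E_1 > E_2 > ... > E_K > 0
E = sorted([random.uniform(0.5, 5.0) for _ in range(K)], reverse=True)

def f_K(x):
    # Evaluate f_K(x) directly from the recursive definition.
    val = (math.exp(x[0]) - 1) / E[0]
    for k in range(1, K):
        val = math.exp(x[k]) * (val + 1 / E[k]) - 1 / E[k]
    return val

for _ in range(1000):
    u = [random.uniform(0.01, 3.0) for _ in range(K)]
    v = [random.uniform(0.01, 3.0) for _ in range(K)]
    mid = [(a + b) / 2 for a, b in zip(u, v)]
    # Midpoint convexity: f((u+v)/2) <= (f(u)+f(v))/2
    assert f_K(mid) <= (f_K(u) + f_K(v)) / 2 + 1e-12
print("midpoint convexity holds on all sampled pairs")
```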

P. S. Obviously $f_1(\mathbf{x})$ is convex. To prove convexity of the others, I thought induction would be a straightforward method, particularly since the function is defined recursively. Additionally, deriving the Hessian matrix of $f_k(\mathbf{x})$ from that of $f_{k-1}(\mathbf{x})$ gives it a nice block structure. However, I am at a loss to prove that the final determinant is positive.


BEST ANSWER

For functions of this particular structure, we can show convexity:

Let $g_m(\mathbf{x}) = \frac{e^{x_m} - 1}{E_m}$ and $h_m(\mathbf{x}) = \exp\left(\sum_{i=m+1}^K x_i\right)$ (so $h_K(\mathbf{x}) = 1$, the sum being empty), and for convenience, let's write $b_m(\mathbf{x}) = g_m(\mathbf{x})\cdot h_m(\mathbf{x})$.

Then

$$f_K(\mathbf{x}) = \sum_{m = 1}^K b_m(\mathbf{x}).\tag{1}$$

To see that, write $h_m^n(\mathbf{x}) = \exp \left(\sum_{i = m+1}^n x_i\right)$. We have $f_1(\mathbf{x}) = g_1(\mathbf{x}) = g_1(\mathbf{x})\cdot h_1^1(\mathbf{x})$. Assuming $f_k(\mathbf{x}) = \sum\limits_{m=1}^k g_m(\mathbf{x})\cdot h_m^k(\mathbf{x})$, the recursion yields

$$\begin{align} f_{k+1}(\mathbf{x}) &= e^{x_{k+1}}\cdot f_k(\mathbf{x}) + \frac{e^{x_{k+1}}}{E_{k+1}} - \frac{1}{E_{k+1}}\\ &= e^{x_{k+1}}\sum_{m=1}^k g_m(\mathbf{x})\cdot h_m^k(\mathbf{x}) + g_{k+1}(\mathbf{x})\\ &= \sum_{m=1}^k g_m(\mathbf{x}) \cdot\left(h_m^k(\mathbf{x})e^{x_{k+1}}\right) + g_{k+1}(\mathbf{x})\\ &= \sum_{m=1}^k g_m(\mathbf{x})\cdot h_m^{k+1}(\mathbf{x}) + g_{k+1}(\mathbf{x})\cdot h_{k+1}^{k+1}(\mathbf{x})\\ &= \sum_{m=1}^{k+1} g_m(\mathbf{x})\cdot h_m^{k+1}(\mathbf{x}), \end{align}$$

and hence, using $h_m(\mathbf{x}) = h_m^K(\mathbf{x})$, we have $(1)$.
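As a quick numerical check (purely illustrative, not part of the argument), the identity $(1)$ can be verified against the recursive definition at an arbitrary sample point; the constants and point below are arbitrary choices:

```python
# Verify identity (1): f_K(x) = sum_m g_m(x) * h_m(x), against the
# recursive definition, at an arbitrary sample point.
import math
import random

random.seed(0)
K = 5
# E_1 > E_2 > ... > E_K > 0
E = sorted([random.uniform(0.5, 5.0) for _ in range(K)], reverse=True)
x = [random.uniform(0.1, 2.0) for _ in range(K)]

def f_recursive(x, E):
    f = (math.exp(x[0]) - 1) / E[0]
    for k in range(1, len(x)):
        f = math.exp(x[k]) * (f + 1 / E[k]) - 1 / E[k]
    return f

def f_closed(x, E):
    total = 0.0
    for m in range(len(x)):
        g = (math.exp(x[m]) - 1) / E[m]   # g_m(x)
        h = math.exp(sum(x[m + 1:]))      # h_m(x) = exp(sum_{i=m+1}^K x_i)
        total += g * h                    # b_m(x)
    return total

assert abs(f_recursive(x, E) - f_closed(x, E)) < 1e-9 * max(1.0, abs(f_closed(x, E)))
print("identity (1) holds at the sample point")
```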

Differentiation gives

$$\begin{align} \frac{\partial g_m}{\partial x_i}(\mathbf{x}) &= \begin{cases}g_m(\mathbf{x}) + \frac{1}{E_m} &, i = m\\ 0 &, i \neq m \end{cases}\\ \frac{\partial h_m}{\partial x_i}(\mathbf{x}) &= \begin{cases}0 &, i \leqslant m\\ h_m(\mathbf{x}) &, i > m \end{cases}\end{align}$$

and hence $$\begin{align} \frac{\partial b_m}{\partial x_i}(\mathbf{x}) &= \begin{cases}0 &, i < m\\ b_m(\mathbf{x}) + \frac{1}{E_m}h_m(\mathbf{x}) &, i = m\\ b_m(\mathbf{x}) &, i > m \end{cases}\\ \frac{\partial^2 b_m}{\partial x_i \partial x_j}(\mathbf{x}) &= \frac{\partial b_m}{\partial x_i}(\mathbf{x}), \quad i \leqslant j \end{align}$$

and

$$\frac{\partial f_K}{\partial x_i} = \sum_{m = 1}^i \frac{\partial b_m}{\partial x_i} = \sum_{m=1}^i b_m + \frac{1}{E_i}\cdot h_i; \quad \frac{\partial^2 f_K}{\partial x_i \partial x_j} = \frac{\partial f_K}{\partial x_{\min\{i,j\}}}.$$

Now let's define $A_k \in M_K(\mathbb{R})$ by

$$a^k_{i,j} =\begin{cases}0 &, \min \{i,\,j\} < k\\ 1 &, \min \{i,\,j\} \geqslant k,\end{cases}$$

so $A_k$ is a block matrix with $1$s in the lower right $(K-(k-1))\times (K-(k-1))$ block and $0$s everywhere else.

The results of the differentiation above yield

$$H_{f_K} (\mathbf{x}) = \sum_{k = 1}^K \left(\frac{\partial f_K}{\partial x_k}(\mathbf{x}) - \frac{\partial f_K}{\partial x_{k-1}}(\mathbf{x})\right)\cdot A_k,$$

where we symbolically used $\dfrac{\partial f_K}{\partial x_0} = 0$ for uniformity. Our expression for the partial derivatives of $f_K$ above yields

$$\begin{align} \frac{\partial f_K}{\partial x_k} - \frac{\partial f_K}{\partial x_{k-1}} &= \left(\sum_{m=1}^k b_m +\frac{1}{E_k}h_k\right) - \left(\sum_{m=1}^{k-1} b_m + \frac{1}{E_{k-1}}h_{k-1}\right)\\ &= b_k + \frac{1}{E_k}h_k - \frac{1}{E_{k-1}}h_{k-1}\\ &= \left(g_k + \frac{1}{E_k}\right)h_k - \frac{1}{E_{k-1}}h_{k-1}\\ &= \frac{1}{E_k}(e^{x_k}\cdot h_k) - \frac{1}{E_{k-1}}h_{k-1}\\ &= \left(\frac{1}{E_k} - \frac{1}{E_{k-1}}\right)h_{k-1}\\ &> 0, \end{align}$$

where the penultimate step uses $h_{k-1} = e^{x_k}\cdot h_k$ and the final inequality holds because $E_k < E_{k-1}$. (For $k = 1$, the convention $\partial f_K/\partial x_0 = 0$ makes the difference equal $\frac{1}{E_1}h_0 > 0$.)

Since all $A_k$ are positive semidefinite, and $\dfrac{\partial f_K}{\partial x_k}(\mathbf{x}) - \dfrac{\partial f_K}{\partial x_{k-1}}(\mathbf{x}) > 0$, it follows directly that $H_{f_K}$ is positive semidefinite, and since

$$\langle v \mid A_k v\rangle = \left(\sum_{j = k}^K v_j\right)^2,$$

you have $\langle v \mid H_{f_K} v\rangle = 0 \Rightarrow \bigl(\forall m\bigr)\left(\sum\limits_{j = m}^K v_j = 0\right) \Rightarrow v = 0$, i.e. $H_{f_K}$ is positive definite. Hence $f_K$ is strictly convex.
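The decomposition $H_{f_K} = \sum_k \left(\frac{\partial f_K}{\partial x_k} - \frac{\partial f_K}{\partial x_{k-1}}\right) A_k$ can also be checked numerically. The sketch below (an illustration only; the sample point, constants, and tolerances are arbitrary choices) assembles the Hessian from this decomposition, compares it to a finite-difference Hessian of $f_K$, and confirms positive definiteness:

```python
# Assemble H_{f_K} from the decomposition sum_k (D_k - D_{k-1}) A_k,
# compare against a finite-difference Hessian, and check it is
# positive definite. Sample point and constants are arbitrary.
import math
import numpy as np

K = 4
rng = np.random.default_rng(1)
E = np.sort(rng.uniform(0.5, 5.0, K))[::-1]   # E_1 > ... > E_K > 0
x = rng.uniform(0.1, 1.0, K)

def f(x):
    val = (math.exp(x[0]) - 1) / E[0]
    for k in range(1, K):
        val = math.exp(x[k]) * (val + 1 / E[k]) - 1 / E[k]
    return val

# D_k = partial f_K / partial x_k = sum_{m<=k} b_m + h_k / E_k
b = np.array([(math.exp(x[m]) - 1) / E[m] * math.exp(x[m + 1:].sum())
              for m in range(K)])
h = np.array([math.exp(x[m + 1:].sum()) for m in range(K)])
D = np.cumsum(b) + h / E

# H = sum_k (D_k - D_{k-1}) A_k, with D_0 = 0;
# A_k has 1s in the lower-right block, 0s elsewhere.
H = np.zeros((K, K))
prev = 0.0
for k in range(K):
    A = np.zeros((K, K))
    A[k:, k:] = 1.0
    H += (D[k] - prev) * A
    prev = D[k]

# Central finite-difference Hessian for comparison.
eps = 1e-4
H_fd = np.zeros((K, K))
for i in range(K):
    for j in range(K):
        xpp = x.copy(); xpp[i] += eps; xpp[j] += eps
        xpm = x.copy(); xpm[i] += eps; xpm[j] -= eps
        xmp = x.copy(); xmp[i] -= eps; xmp[j] += eps
        xmm = x.copy(); xmm[i] -= eps; xmm[j] -= eps
        H_fd[i, j] = (f(xpp) - f(xpm) - f(xmp) + f(xmm)) / (4 * eps ** 2)

assert np.allclose(H, H_fd, rtol=1e-3, atol=1e-2)
assert np.all(np.linalg.eigvalsh(H) > 0)      # positive definite
print("Hessian decomposition verified; all eigenvalues positive")
```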

ANOTHER ANSWER

So there are a few basic properties of convex functions that are good to know:

First, the composition of a 1-dimensional convex function and a linear function is convex. That is, if $\phi(t)$ is a 1-dimensional convex function, and $\mathbf{v}\cdot\mathbf{x}$ is a linear function, then the composition $\phi(\mathbf{v}\cdot\mathbf{x})$ is a convex function of $\mathbf{x}$.

Second, positive combinations of convex functions are convex. That is, if $h_i(\mathbf{x})$ are convex functions and $c_i$ are positive coefficients, then the positive combination $\sum_i c_i h_i(\mathbf{x})$ is convex.

The general formula for $f_k$ is easy to check by induction: $$f_k(\mathbf{x})= \frac{1}{E_1} e^{\sum_{i=1}^k x_i} + \sum_{j=2}^k \left(\frac{1}{E_j} -\frac{1}{E_{j-1}} \right) e^{\sum_{i=j}^k x_i}-\frac{1}{E_k} $$ Notice that the terms $e^{\sum_{i=j}^k x_i}$ are convex since they are the composition of a 1-dimensional convex function (the exponential) with a linear function. Then, since the coefficients $\left(\frac{1}{E_j} -\frac{1}{E_{j-1}} \right)$ are positive (because $E_j < E_{j-1}$), the function $f_k$ is a positive combination of convex functions plus a constant, so it must be convex.
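A quick numerical check (illustration only; the constants and the test point are arbitrary choices) confirms that this closed form agrees with the recursive definition for every $k$:

```python
# Verify the closed form above against the recursive definition of f_k
# for all k = 1..K. Constants and test point are arbitrary.
import math
import random

random.seed(2)
K = 6
E = sorted([random.uniform(0.5, 5.0) for _ in range(K)], reverse=True)
x = [random.uniform(0.1, 2.0) for _ in range(K)]

def f_rec(k):
    val = (math.exp(x[0]) - 1) / E[0]
    for i in range(1, k):
        val = math.exp(x[i]) * (val + 1 / E[i]) - 1 / E[i]
    return val

def f_formula(k):
    # (1/E_1) e^{x_1+...+x_k}
    #   + sum_{j=2}^k (1/E_j - 1/E_{j-1}) e^{x_j+...+x_k} - 1/E_k
    val = math.exp(sum(x[:k])) / E[0]
    for j in range(1, k):  # 0-based index j corresponds to E_{j+1}
        val += (1 / E[j] - 1 / E[j - 1]) * math.exp(sum(x[j:k]))
    val -= 1 / E[k - 1]
    return val

for k in range(1, K + 1):
    assert abs(f_rec(k) - f_formula(k)) < 1e-9 * max(1.0, abs(f_formula(k)))
print("closed form matches the recursion for all k")
```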