Derive the closed-form solution for the argmax of $x^Ty - x^T\ln(x)$


Let $y \in \mathbb{R}^n, \ln(x) = \begin{bmatrix} \ln(x_1) \\ \vdots \\ \ln(x_n) \end{bmatrix} \in \mathbb{R}^n$

Show that

$$x^* = \text{argmax}_{x \in \mathcal{D}} \quad x^Ty - x^T\ln(x)$$

where $$\mathcal{D} = \{x \in \mathbb{R}^n| \sum\limits_{i = 1}^n x_i = 1, x_i \geq 0\}$$

has the closed-form solution

$x^*_i = \dfrac{\exp(y_i)}{\sum\limits_{j = 1}^n \exp(y_j)}$

I have tried to solve this problem with duality, but it seems overly complicated: it introduces Lagrange multipliers from both the equality and the inequality constraints that I cannot easily eliminate. Can someone show me how to prove this equality more easily?


Consider the optimization problem:

$$\text{max}_{x \in \mathcal{D}} \quad x^Ty - x^T\ln(x)$$

$$\text{subject to} \quad \sum\limits_{i = 1}^n x_i = 1, \quad x_i \geq 0$$

This is the maximization of a concave function, so it is equivalent to the convex minimization

$$\text{min}_{x \in \mathcal{D}} \quad -x^Ty + x^T\ln(x)$$

$$\text{subject to} \quad \sum\limits_{i = 1}^n x_i = 1, \quad -x_i \leq 0$$
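(The objective $-x^Ty + x^T\ln(x)$ is indeed convex on the interior of $\mathcal{D}$: its Hessian is diagonal and positive definite,

$$\nabla^2\left(-x^Ty + x^T\ln(x)\right) = \operatorname{diag}\!\left(\frac{1}{x_1},\dots,\frac{1}{x_n}\right) \succ 0 \quad \text{for all } x_i > 0.)$$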

----edited following A.G.'s suggestion to eliminate inactive constraints----

Then the Lagrangian of the convex optimization problem is:

$$L(x, \nu) = x^T(-y + \ln(x))+ \nu(\mathbf{1}^Tx - 1)$$

Then KKT first order condition states:

$$\nabla L(x^*,\nu^*) = 0$$

Taking the gradient we have:

$$(-y + \ln(x)) + \mathbf{1}+\nu\mathbf{1}= 0$$

In components we have

$$-y_i + \ln(x_i) + 1 + \nu = 0 \implies x_i = \exp(y_i - 1 -\nu)$$
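(For completeness, the $+1$ in each component comes from the product rule applied to the entropy term:

$$\frac{\partial}{\partial x_i}\, x^T\ln(x) = \frac{\partial}{\partial x_i}\sum_{j=1}^n x_j\ln(x_j) = \ln(x_i) + x_i\cdot\frac{1}{x_i} = \ln(x_i) + 1.)$$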

So

$$x^* = \exp(y - \mathbf{1} - \nu\mathbf{1})= \begin{bmatrix} \exp(y_1-1-\nu) \\ \vdots \\ \exp(y_n-1-\nu) \end{bmatrix}$$


We form the dual function:

$$g( \nu) = {x^*}^T(-y + \ln(x^*))+ \nu(\mathbf{1}^Tx^* - 1)$$

$$\implies g(\nu) = -(1+\nu)\mathbf{1}^T\exp(y - \mathbf{1} - \nu\mathbf{1}) + \nu(\mathbf{1}^T\exp(y - \mathbf{1} - \nu\mathbf{1}) - 1)$$


Then we proceed to solve the dual problem:

$$\max g(\nu) = -(1+\nu)\mathbf{1}^T\exp(y - \mathbf{1} - \nu\mathbf{1}) + \nu(\mathbf{1}^T\exp(y - \mathbf{1} - \nu\mathbf{1}) - 1)$$

with no constraints

Let $\exp(y - \mathbf{1} - \nu\mathbf{1}) = \phi$

Then $$g(\nu) = -(1+\nu)\mathbf{1}^T\phi + \nu(\mathbf{1}^T\phi - 1)$$ $$\implies g(\nu) = -\mathbf{1}^T\phi - \nu$$

Proceed by solving the maximization problem: differentiate $g$ with respect to $\nu$ (recalling that $\phi = \exp(y - \mathbf{1} - \nu\mathbf{1})$ depends on $\nu$, with $d\phi/d\nu = -\phi$) and set the derivative to zero:

$$\frac{dg}{d\nu} = \frac{d}{d\nu}\left[-\mathbf{1}^T\exp(y - \mathbf{1} - \nu\mathbf{1}) - \nu\right] = 0$$

$$\implies \mathbf{1}^T(\exp(y - \mathbf{1} - \nu^*\mathbf{1})) - 1 = 0$$

$$\implies \mathbf{1}^T\exp(y - \mathbf{1}) = \exp(\nu^*)$$ $$\implies \nu^* = \ln\left[\mathbf{1}^T\exp(y - \mathbf{1})\right]$$


Finally

$$x^* = \exp(y - \mathbf{1} - \nu^*\mathbf{1}) = \exp\left(y - \mathbf{1} - \ln\left[\mathbf{1}^T\exp(y - \mathbf{1})\right]\mathbf{1}\right)$$

$$\implies x^* = \dfrac{\exp(y - \mathbf{1})}{\mathbf{1}^T\exp(y - \mathbf{1})}$$

$$\implies x^* = \dfrac{\exp(y - \mathbf{1})}{\sum\limits_{i=1}^n\exp(y_i - 1)}$$

$$\implies x_i^* = \dfrac{\exp(y_i -1)}{\sum\limits_{j=1}^n\exp(y_j - 1)} = \dfrac{\exp(y_i)}{\sum\limits_{j=1}^n\exp(y_j )}$$ Phew!
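As a sanity check (purely illustrative, not part of the derivation), a small pure-Python experiment can compare the softmax candidate against many random feasible points; the helper names `objective` and `softmax` below are my own:

```python
import math
import random

def objective(x, y):
    """Computes x^T y - x^T ln(x), with 0*ln(0) treated as 0."""
    return sum(xi * yi - (xi * math.log(xi) if xi > 0 else 0.0)
               for xi, yi in zip(x, y))

def softmax(y):
    """Closed-form candidate x*_i = exp(y_i) / sum_j exp(y_j)."""
    m = max(y)  # subtract the max for numerical stability
    exps = [math.exp(yi - m) for yi in y]
    s = sum(exps)
    return [e / s for e in exps]

random.seed(0)
n = 5
y = [random.gauss(0, 1) for _ in range(n)]
x_star = softmax(y)
best = objective(x_star, y)

# The softmax point should dominate every random feasible point
for _ in range(10_000):
    u = [random.random() for _ in range(n)]
    s = sum(u)
    x = [ui / s for ui in u]  # a random point on the simplex
    assert objective(x, y) <= best + 1e-12
```

This does not prove optimality, of course, but it is a quick way to catch an algebra slip in the derivation.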

Can someone check correctness?


Best answer:

Two comments:

  1. When you find the KKT point for the Lagrangian minimization - how do you know that it is actually the minimizer? A KKT point is a generalization of a stationary point in unconstrained minimization, and those as we know can be minima, maxima or saddle points. (Hint: the problem is convex).
  2. When you are done maximizing the dual function - how do you know that the point you have found is actually the solution? (Hint: no duality gap).

Easier solution: since the original problem is convex (verify!), any KKT point is a global minimizer. One KKT point is $$ x_i=\exp(y_i-1-\nu)=\exp(y_i)\cdot \exp(-1-\nu)=\exp(y_i)\cdot C. $$ To find $C$, substitute it into the constraint $$ 1=\sum_{i=1}^nx_i=\sum_{i=1}^n\exp(y_i)\cdot C\quad\Leftrightarrow\quad C=\frac{1}{\sum_{i=1}^n\exp(y_i)}. $$
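A minimal numerical check of this shortcut (an illustrative sketch, not part of the answer): verify that $x_i = C\exp(y_i)$ satisfies both the constraint and the stationarity condition $-y_i + \ln(x_i) + 1 + \nu = 0$ with a single $\nu$ shared by all components.

```python
import math

y = [0.3, -1.2, 0.8, 2.0]  # an arbitrary test vector
C = 1.0 / sum(math.exp(yi) for yi in y)
x = [C * math.exp(yi) for yi in y]

# Constraint: the components sum to 1
assert abs(sum(x) - 1.0) < 1e-12

# Stationarity: -y_i + ln(x_i) + 1 equals the same constant (-nu) for every i
vals = [-yi + math.log(xi) + 1.0 for yi, xi in zip(y, x)]
assert max(vals) - min(vals) < 1e-12
```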