Minimizing $\frac{\|p\|_2^2}{\|p\|_\infty}$ for probability vector $p$

153 Views Asked by At

Suppose $p$ is a discrete probability distribution over $d$ outcomes. I need to minimize the following

$$J=\frac{\|p\|_2^2}{\|p\|_\infty}$$

For $d=4$, Mathematica suggests the minimum is $\frac{2}{3}$ achieved at $\left(\frac{1}{2},\frac{1}{6},\frac{1}{6},\frac{1}{6}\right)$. Is there a formula for general $d$?

Notebook

Motivation

This provides the shape of quadratic $H$ for which a single step of gradient descent with $1/\|H\|$ step size makes the least progress. Hence it gives a bound on progress of gradient descent in general.

2

There are 2 best solutions below

0
On BEST ANSWER

We need to find the minimum of the function $${x_1^2+x_2^2+\dots +x_d^2\over \max x_k},\quad x_1+x_2+\dots +x_d=1,\quad x_k\ge 0$$ We may assume $x_d\ge x_k$ for $k<d.$ Then the function takes the form $$f(x_1,x_2,\dots,x_d)={x_1^2+x_2^2+\dots +x_d^2\over x_d}$$ We apply the Lagrange multiplier method to get the system of equations $$\begin{eqnarray*}{2x_k\over x_d}&= &\lambda , \quad 1\le k\le d-1\\x_d-{x_1^2+\dots +x_{d-1}^2\over x_d^2}& =& \lambda \end{eqnarray*}$$ Thus $$x_1=x_2=\dots =x_{d-1}$$ Denote $x_d={t\over d}.$ Then $x_k={d-t\over (d-1)d}.$ The problem reduces to minimizing the one variable function $$h(t)={{(d-t)^2\over (d-1)d}+{t^2\over d}\over t},\quad 1\le t\le d$$ We have $$(d-1)d\,h(t)=dt+{d^2\over t}-2d$$ The minimal value $m$ is attained at $t=\sqrt{d},$ i.e. $x_d={1\over \sqrt{d}},$ and $m={2\over\sqrt{d}+1}.$

0
On

Consider the homogenized version of the problem $$\min_{p\geq0,\ p\neq0}\frac{\|p\|_2^2}{\|p\|_\infty\|p\|_1}.\tag{1}$$ Assume that the largest element of $p$ is $p_d=1$. Then problem $(1)$ is equivalent to $$\min_{0\leq v\leq1,\ v\in\mathbb R^{d-1}}\frac{v^Tv+1}{v^Te+1}\tag{2}$$ where $e=(1,1,\ldots,1)^T\in\mathbb R^{d-1}$ and $v=(p_1,p_2,\ldots,p_{d-1})^T$. Now consider the similar problem with a relaxed constraint: $$\min_{v\geq 0,\ v\in\mathbb R^{d-1}}\frac{v^Tv+1}{v^Te+1}.\tag{3}$$ When $\|v\|_2$ is fixed, the denominator is minimized when $v=ce$ where $c=\|v\|_2/\|e\|_2$. Therefore problem $(3)$ can be reformulated as $$\min_{c\geq0}\frac{c^2(d-1)+1}{c(d-1)+1}.\tag{4}$$ Using the first and the second derivatives of the objective function, we find that the optimal solution is $c=\frac{1}{\sqrt{d}+1}$. Since this $c$ is $\leq1$, $v=ce$ is also the optimal solution to the unrelaxed problem $(2)$. Therefore $p=(\frac{1}{\sqrt{d}+1},\ldots,\frac{1}{\sqrt{d}+1},1)$ is the optimal solution to $(1)$ and $$p=\frac{1}{\sqrt{d}}\left(\frac{1}{\sqrt{d}+1},\ldots,\frac{1}{\sqrt{d}+1},1\right)$$ is the optimal solution to your original problem.