Lower bounding over probability distributions


For some probability distribution $p_\chi$ over an alphabet $\chi$ and some $n$, consider the following: $$ f(p_\chi) = \sum_{x \in \chi} p_{\chi}(x)^2(1-p_{\chi}(x))^{n-2}$$ My goal is to lower bound $f(p_\chi)$ over all probability distributions $\{ p_\chi : p_\chi(x) \ge \frac{1}{k}, \forall x \in \chi\}$ for some fixed $k$. I tried lower bounding each of the terms in the summation (which gets rid of the constraint that $\sum_{x \in \chi} p_{\chi}(x) = 1 $), but the resultant lower bound is very loose. Is there any general methodology I can use to approach this problem?
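For intuition, the objective can be probed numerically. Here is a minimal Python sketch (the values of $s = |\chi|$, $k$, and $n$ are arbitrary examples) that samples random feasible distributions, i.e. distributions with $p(x) \ge 1/k$ summing to one, and records the smallest value of $f$ found:

```python
import random

def f(p, n):
    """Evaluate f(p) = sum_x p(x)^2 * (1 - p(x))^(n - 2)."""
    return sum(px**2 * (1 - px)**(n - 2) for px in p)

# Example values (hypothetical): alphabet size s, floor 1/k, exponent n.
s, k, n = 4, 10, 4

random.seed(0)
best = float("inf")
for _ in range(20000):
    # Distribute the slack 1 - s/k above the floor 1/k at random,
    # so every sample satisfies p(x) >= 1/k and sums to one.
    raw = [random.random() for _ in range(s)]
    slack = 1 - s / k
    tot = sum(raw)
    p = [1 / k + slack * r / tot for r in raw]
    best = min(best, f(p, n))

print("smallest f found by random search:", best)
```

Such a search only gives an upper estimate of the true minimum, but it is useful for checking how loose a candidate lower bound is.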


BEST ANSWER

Define $s=|\mathcal{X}|$ and assume $s/k\leq 1$, $n>2$. Define $h=1-(s-1)/k$ and note that, by assumption, $0<1/k\leq p(x) \leq h$ for all $x \in \mathcal{X}$.

This is a partial answer showing that, in some cases, the optimal solution allocates probability $1/k$ to $s-1$ of the symbols in $\mathcal{X}$ and $h$ to the remaining symbol. Define $(p^*(x))_{x \in \mathcal{X}}$ as this mass function. In other cases this allocation is not optimal, and I suspect the equal allocation $p_{equal}(x) = 1/s$ for all $x \in\mathcal{X}$ is then optimal. Overall, Lagrange multipliers can help with this problem. Below I show one use; another approach is via the Karush-Kuhn-Tucker conditions, see here: https://en.wikipedia.org/wiki/Karush%E2%80%93Kuhn%E2%80%93Tucker_conditions

Constrained problem:

\begin{align} \mbox{Minimize:} \quad & \sum_{x \in \mathcal{X}} p(x)^2(1-p(x))^{n-2} \\ \mbox{Subject to:} \quad & \sum_{x \in \mathcal{X}} p(x) = 1\\ \quad & 1/k \leq p(x) \leq h \quad \forall x \in\mathcal{X} \end{align}

Unconstrained problem:

Fix $\lambda \in \mathbb{R}$ and call it a "Lagrange multiplier."

\begin{align} \mbox{Minimize:} \quad & \sum_{x \in \mathcal{X}} p(x)^2(1-p(x))^{n-2} + \lambda \sum_{x \in \mathcal{X}} p(x) \\ \mbox{Subject to:} \quad & 1/k \leq p(x) \leq h \quad \forall x \in\mathcal{X} \end{align}

Claim (Lagrange multipliers):

Fix $\lambda\in \mathbb{R}$. If $(p(x))_{x \in \mathcal{X}}$ is a solution to the unconstrained problem, and if $\sum_{x \in \mathcal{X}} p(x)=1$, then $(p(x))_{x \in \mathcal{X}}$ is also a solution to the constrained problem.

Proof: Let $(p(x))_{x \in \mathcal{X}}$ be a solution to the unconstrained problem that satisfies $\sum_{x \in \mathcal{X}} p(x)=1$. Then it satisfies all constraints of the constrained problem. Let $(w(x))_{x \in \mathcal{X}}$ be another vector that satisfies all constraints of the constrained problem. We want to show that $p$ yields an objective value for the constrained problem that is less than or equal to that of $w$. Since $1/k \leq w(x) \leq h$ for all $x \in \mathcal{X}$ we have: $$ \sum_{x \in \mathcal{X}} p(x)^2(1-p(x))^{n-2} + \lambda\underbrace{\sum_{x \in \mathcal{X}} p(x)}_{1} \leq \sum_{x \in \mathcal{X}} w(x)^2(1-w(x))^{n-2} + \lambda\underbrace{\sum_{x \in \mathcal{X}} w(x)}_{1}$$ and so $$\sum_{x \in \mathcal{X}} p(x)^2(1-p(x))^{n-2} \leq \sum_{x \in \mathcal{X}} w(x)^2(1-w(x))^{n-2} $$ Thus, $p$ is optimal for the constrained problem. $\Box$


Define $\lambda \in \mathbb{R}$ to satisfy: $$ (1/k)^2(1-(1/k))^{n-2} + \lambda (1/k) = h^2(1-h)^{n-2} + \lambda h$$ The unconstrained minimization separates over each $x \in \mathcal{X}$. For a given $x \in \mathcal{X}$ the unconstrained minimization is: \begin{align} \mbox{Minimize:} \quad & p(x)^2(1-p(x))^{n-2} + \lambda p(x) \\ \mbox{Subject to:} \quad & 1/k \leq p(x) \leq h \end{align} The function to be minimized is differentiable in $p(x)$, so the minimum is attained either at an endpoint ($1/k$ or $h$) or at an interior point where the derivative vanishes. I chose the above value of $\lambda$ so that both endpoints $p(x)=1/k$ and $p(x)=h$ achieve the same value for the expression: $$p(x)^2(1-p(x))^{n-2} + \lambda p(x)$$ In certain cases, these two endpoints $1/k$ and $h$ tie for minimizing this expression. Hence, in these cases, the mass function $(p^*(x))_{x \in \mathcal{X}}$, which uses only the values $1/k$ and $h$, solves the unconstrained problem and satisfies $\sum_{x \in \mathcal{X}} p^*(x) = 1$, so it also solves the constrained problem.
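This choice of $\lambda$ and the endpoint tie can be checked numerically. A Python sketch, using $(s,k,n)=(5,10,4)$ as an example (the grid resolution is an arbitrary choice of mine):

```python
def g(p, lam, n):
    """Per-symbol objective p^2 * (1 - p)^(n - 2) + lam * p."""
    return p**2 * (1 - p)**(n - 2) + lam * p

# Example values: s symbols, floor 1/k, exponent n.
s, k, n = 5, 10, 4
a = 1 / k                 # lower endpoint of the box constraint
h = 1 - (s - 1) / k       # upper endpoint h = 1 - (s - 1)/k

# Solve a^2(1-a)^(n-2) + lam*a = h^2(1-h)^(n-2) + lam*h for lam.
lam = (a**2 * (1 - a)**(n - 2) - h**2 * (1 - h)**(n - 2)) / (h - a)

# Scan [a, h] on a fine grid to see whether the endpoints minimize g.
grid = [a + i * (h - a) / 1000 for i in range(1001)]
interior_min = min(g(p, lam, n) for p in grid)
print("g at endpoints:", g(a, lam, n), g(h, lam, n))
print("minimum of g over [1/k, h]:", interior_min)
```

For these example values the grid minimum coincides with the common endpoint value, which is the situation in which $p^*$ solves the constrained problem.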

Specifically, $p^*$ is optimal when the expression $$p^2(1-p)^{n-2} + \lambda p$$ viewed as a function of $p$ over $1/k \leq p \leq h$, attains its minimum at the endpoints (the two endpoints always achieve the same value by the definition of $\lambda$).

I tested specific $(s,k,n)$ values and plotted $p^2(1-p)^{n-2}+\lambda p$ in MATLAB over the interval $[1/k,h]$. I get:

  • $(s,k,n)=(5,10,4)$: Picture shows endpoints optimal, suggesting $p^*$ optimal. $p^*$ beats equal allocation.

  • $(s,k,n)=(3,10,4)$: Picture shows endpoints optimal, suggesting $p^*$ optimal. $p^*$ beats equal allocation.

  • $(s,k,n)=(15,80,4)$: Picture shows endpoints optimal, suggesting $p^*$ optimal. $p^*$ beats equal allocation.

  • $(s,k,n) = (8,10,4)$: Picture shows endpoints not optimal. Equal allocation is better than $p^*$.

  • $(s,k,n) = (100,200,4)$: Picture shows endpoints not optimal. Equal allocation is better than $p^*$.
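These experiments can be reproduced without plotting. A Python sketch that, for each $(s,k,n)$, checks whether the endpoints minimize $g(p)=p^2(1-p)^{n-2}+\lambda p$ on $[1/k,h]$ and compares $f(p^*)$ with $f(p_{equal})$ (the grid resolution and tolerance are my own choices):

```python
def f(p, n):
    """f(p) = sum_x p(x)^2 * (1 - p(x))^(n - 2)."""
    return sum(px**2 * (1 - px)**(n - 2) for px in p)

def compare(s, k, n, grid=2000):
    a, h = 1 / k, 1 - (s - 1) / k
    # lam equalizes the per-symbol objective at the two endpoints.
    lam = (a**2 * (1 - a)**(n - 2) - h**2 * (1 - h)**(n - 2)) / (h - a)
    g = lambda p: p**2 * (1 - p)**(n - 2) + lam * p
    interior = min(g(a + i * (h - a) / grid) for i in range(grid + 1))
    endpoints_optimal = interior >= g(a) - 1e-12
    p_star = [a] * (s - 1) + [h]   # floor on s-1 symbols, rest on one
    p_equal = [1 / s] * s
    return endpoints_optimal, f(p_star, n), f(p_equal, n)

for s, k, n in [(5, 10, 4), (3, 10, 4), (15, 80, 4), (8, 10, 4)]:
    print((s, k, n), compare(s, k, n))
```

In the cases where the endpoint check fails, $f(p_{equal}) < f(p^*)$, matching the plots described above.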