Global and local maxima in a weighted sum of logarithms of linear functionals?

225 Views Asked by At

Is is possible to describe, and locate efficiently, the maxima of the function below in the parameters $\mathbf{x}$

$$\sum_{i} p_i \log( N + \sum_j x_j[B_j +(A_j-B_j)\delta_{ij} + min(A_j,B_j) ]) \quad [1] $$

There are 2 constraints.

Firstly:

$$ \forall j \quad 0 \le x_{j} \le N_j \quad [2] $$

$\mathbf{x}$ are non-negative real numbers in the first approximation and must only be non-negative integers in a version closer to the real-world problem.

Secondly:

$$-\sum_j x_jmin(A_j,B_j) \le N \quad [3]$$

where the constants, $\forall i,j$ are:

  • $A_j, B_j$ are rational numbers such that $A_jB_j <0$

  • $\delta_{ij}$ is the Kronecker delta (1 if $i=j$, $0$ otherwise)

  • $ p_i \ge 0$ are postive real numbers such that $\sum_ip_i=1$

  • $N_j$ and $N$ are positive integers.

The gradient function is such that the derivative wrt $x_k$ is

$$\sum_i \frac{p_i(B_k +[A_k-B_k]\delta_{ik} + min(A_k,B_k) )}{N + \sum_j x_j[B_j +(A_j-B_j)\delta_{ij} + min(A_j,B_j) ]} \quad [4]$$

and the hessian is defined so the derivative wrt $x_{k}$ and the $x_r$ is

$$-\sum_{i} \sum_{l} \frac{ p_ip_l(B_k +[A_k-B_k]\delta_{ik}+min(A_k,B_k))(B_r +[A_r-B_r]\delta_{lr} + min(A_r,B_r))}{(N + \sum_j x_j[B_j +(A_j-B_j)\delta_{ij} + min(A_j,B_j) ])^2} \quad [5]$$

Finding zeros of the gradient function doesn't help find the optima. (I don't know if some manipulation using partial fraction techniques may help.)

I would like to know of the uniqueness of global maxima and how it relates to local maxima.

How would things change if we replaced $\log()$ with $exp(-())$, a quadratic form or Kahneman and Tversky's value function,

$v(y) = y^a$ if $y>0$, $-\lambda(-y)^{\beta}$ otherwise.

I'm not sure how peaks and troughs are carried over in a weighted sum.

I'd like to use numerical optimizers, but I keep finding local maxima, which could be due to step size issues or maybe the function is very rough.

Meta-heuristics does reasonably well, but is computationally expensive and I believe better understanding of the problem will cause dramatic improvement.