What is an approach for optimizing the values of a matrix?

1.1k Views Asked by At

My apologies if I get some terminology wrong, I don't have a formal math background; half my problem is articulating what I'm trying to do and identifying the domain of math that deals with this kind of problem.

I am working on a problem where I would like to run an optimization algorithm to find values for an n-dimensional dense matrix (actually, the matrix is the conditional probability table of a node on a probabilistic graphical model, but I'm not sure if that's relevant)

I'm familiar with some basic types of optimization such as constraint logic programming and linear programming. However, I'm having a hard time formalizing my problem because it doesn't fall neatly into either category; it inherently has both one continuous dimension and any number of discrete dimensions.

I'll give a concrete example. Say I have three sets, each with three members:

X = {x₀, x₁, x₂}
Y = {y₀, y₁, y₂}
Z = {z₀, z₁, z₂}

If I take one element from each set, there are 27 possible permutations (or 27 elements, if you view the sets as axes of a matrix). For each of those permutations I would like to assign a real number p between 0 and 1, subject to certain constraints and an optimization function.

For example, say I wanted to give the following constraints:

  • For each x ∈ X and y ∈ Y, the sum of {x, y, z₀...z₂} = 1
  • For each x ∈ X and y ∈ Y, {x, y, z₀} ≥ {x, y, z₁} + 0.5
  • {x₂, y₁, z₂} = 0.5

Using an optimization function keep each p as close as possible, the following matrix population might be a solution (done by hand, so there might be errors):

|----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
|    |        x₀       |        x₁       |        x₂       |
|----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
|    |  y₀ |  y₁ |  y₂ |  y₀ |  y₁ |  y₂ |  y₀ |  y₁ |  y₂ |
|----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
| z₀ | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 |
| z₁ | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| z₂ | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 |
|----+-----+-----+-----+-----+-----+-----+-----+-----+-----|

What is a mathematical approach for the general version of this problem, with N discrete dimension, N constraints, and an arbitrary optimization function?

2

There are 2 best solutions below

1
On

According to the comments, I take the function to be minimized as $\max (p)-\min(p)$, with maximum and minimum taken over all $p$-values in the tables. If we did not have the constraint $p(x_2,y_1,z_2)= 0.5$, a natural solution for general $N$ would be: $$ p(x,y,z_0) =\frac{1}{2}+\frac{1}{2N}, \quad p(x,y,z_i) = \frac{1}{2N} \quad \text{ for } i\ge 1 $$ Indeed, the first two constraints are satisfied by the above, and the spread between the maximal and minimal $p$ is $1/2$, which is as small as it can be (due to the second constraint).

For $N=3$ the above gives $(2/3,1/6,1/6)$ in each column, which are closer together than the values in your example.

But then there's the constraint $p(x_2,y_1,z_2)=0.5$. It immediately determines the values in the $(x_2,y_1)$-column, because $p(x_2,y_1,z_0)\ge0.5$, and the sum of column must be $1$. Thus, $p(x_2,y_1,z_0)=0.5$, $p(x_2,y_1,z_2)=0.5$, and all other entries in that column are zero.

The presence of zero somewhere makes $\min (p)=0$. Then the optimal choice is to let all columns to be the same: $(1/2,0,1/2,0,0,0,\dots,0)$. Now all $p$ values are between $0$ and $1/2$. However, having most probabilities as zeros is probably not what you had in mind for your model. You may want to revise the optimization design.

0
On

For reference: I ended up solving my problem using linear programming to optimize the values in the matrix by turning each cell into its own variable.

As it happens, I could express all the constraints I was interested in terms of linear inequalities.

The optimization function was a bit more difficult because I didn't want to maximize or minimize on any axis. Rather, I wanted to find the center of the feasible region. However, it turns out that its possible to manipulate all the constraints to introduce a Chebychev hypersphere (a hypersphere constrained to fit within the feasible region). When you maximize the radius of the hypersphere, the center of the sphere will be somewhere near the center of the feasible region.