Pushing a certain amount of flow through a graph and vertice flow distributions close to some target vector

38 Views Asked by At

Let's say I'm designing fictional fruits. The fruits have two attributes, color and taste. There are $n$ possible colors and $m$ possible tastes. Further, we're given an $n \times m$ feasibility matrix dictating which colors are compatible with which tastes.

I want to create exactly $f$ fruits. Also, I have a vector of colors, $c$ such that $\sum_{i=1}^n c_i = f$ and a vector of tastes, $t$ such that $\sum_{i=1}^m t_i = f$.

I want the $f$ fruits to be created in a way that the final distribution of colors across the $f$ fruits is as close to $c$ as possible and that of tastes is as close to $t$ as possible. Concretely, let's say the actual distribution of colors across the $f$ fruits is given by $\hat{c}$ and that of tastes is $\hat{t}$, we want to minimize:

$$\sum_{i=1}^n (c-\hat{c})^2 + \sum_{j=1}^m (t-\hat{t})^2$$

What clever algorithms from graph theory or elsewhere can be used to do this efficiently?


My attempt: the colors and tastes form a bi-partite graph. So, we imagine a source node, connected to all colors with capacities dictated by the $c$ vector, then the colors are connected to tastes per the feasibility matrix (capacities $\infty$) and the tastes are connected to a sink node again with capacities dictated by the $t$ vector. We run a max flow. Some colors and tastes are covered; we eliminate them and repeat until all are covered. But this is probably not optimal either (and more importantly) in terms of objective function or efficiency.

1

There are 1 best solutions below

2
On

Let $E$ be the set of all feasible fruit-taste pairs. For each $(i,j) \in E$, let $x_{ij}$ be the amount of fruit $i$ with taste $j$ produced. Then the "ideal constraints" are that

  • for every fruit $i$, $\sum_{j : (i,j) \in E} x_{ij} = c_i$, and
  • for every taste $j$, $\sum_{i: (i,j) \in E} x_{ij} = t_j$.

This is a system of linear equations $Ax = b$, with a nonnegativity constraint $x \ge 0$. Of course, it's probably not feasible, so it's not our actual feasible region.

For the objective function in the question, we are minimizing $\|Ax - b\|_2$ over all $x \ge 0$, which is a non-negative least squares problem. There's a variety of methods to solve this.

(I originally forgot about the constraint $\sum_{(i,j) \in E} x_{ij} = f$. To make sure that this constraint is satisfied exactly and doesn't get "relaxed" in the least squares, we can solve for one of the $x_{ij}$ variables in terms of the others.)

Other variants to consider:

  • If we minimize $\|Ax-b\|_1$ instead (the sum of the absolute differences $|c_i - \hat{c}_i|$ and $|t_j - \hat{t}_j|$) then we can formulate the problem as a linear program: write $z_i \ge |c_i - \hat{c}_i|$ (a pair of linear constraints in the $x_{ij}$), $w_j \ge |t_j - \hat{t}_j|$ (another pair of linear constraints), and minimize $z_1 + \dots + z_n + w_1 + \dots + w_m$.
  • If we additionally want each $x_{ij}$ to be an integer, things become much more complicated. In both cases, we can probably use some kind of branch-and-bound technique for integer programming.