Graph optimization problem linked with maximal spanning tree.


Let $G$ be an undirected connected graph with $N$ nodes $B_i$. We assume there are no self-loops.

For every pair $(i,j) \in \{1, \dots, N \}^2$ with $i \neq j$, we denote by $\nu_{ij} \geq 0$ a nonnegative weight on the edge $ij$. If $\nu_{ij}=0$, this simply means that the nodes $B_i$ and $B_j$ are not actually connected by an edge.

For a collection of reals $(b_{ij})_{1 \leq i<j \leq N}$, consider the following optimization problem:

\begin{equation} \min_{(u_1,\dots,u_N) \in \mathbb{R}^N} \sum_{i=1}^N \sum_{j=1}^{i-1} |b_{ij}+u_i-u_j| \nu_{ij} \tag{1} \end{equation} Solving this problem amounts to cutting some edges out of the graph $G$: the idea is to choose the coefficients $u_i$ so as to cancel the terms with the largest contributions $\nu_{ij}$. I would like to prove that the edges which get 'erased' (i.e. the edges such that $|b_{ij}+u_i-u_j|=0$) are exactly the edges of a maximal spanning tree $T$ of $G$.

I recall here the definition of a maximal spanning tree:

A graph $T=((B_i)_{1 \leq i \leq N},A_T)$ is said to be a maximal spanning tree of $G$ if for every $(i,j)$ with $1 \leq i < j \leq N$, the edge $ij$ has weight $x_{ij} \nu_{ij}$, where $(x_{ij})_{1 \leq i < j \leq N}$ is a solution of:

\begin{align} \max_{(x_{ij}) \in \{0,1\}^{\frac{N(N-1)}{2}}} \sum_{i=1}^N \sum_{j=1}^{i-1} x_{ij} \nu_{ij} \tag{2} \end{align} subject to the following constraints:

  • the graph formed with the coefficients $x_{ij} \nu_{ij}$ has no cycle in it

  • $$\sum_{j=i+1}^N x_{ij} + \sum_{j=1}^{i-1} x_{ji} \geq 1, \quad \forall i \in \{1,\dots,N \}$$

These two conditions ensure that the graph obtained with the coefficients $x_{ij} \nu_{ij}$ is indeed a spanning tree.
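For reference, such a tree is usually computed greedily (Kruskal's algorithm) rather than by solving (2) as an integer program. A minimal sketch, with vertices $0,\dots,N-1$ and a made-up edge-list format:

```python
def max_spanning_tree(n, edges):
    """Kruskal's algorithm on edges = [(weight, i, j), ...], vertices 0..n-1.

    Returns the edge list of a maximum-weight spanning tree.
    """
    parent = list(range(n))

    def find(x):  # union-find root, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, i, j in sorted(edges, reverse=True):  # heaviest edges first
        ri, rj = find(i), find(j)
        if ri != rj:          # adding (i, j) creates no cycle
            parent[ri] = rj
            tree.append((w, i, j))
    return tree
```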

Note that I'm not sure the result I want to prove is true. I have tried this construction on several small graphs and found no counter-example, but I'm not very familiar with graph theory. Any remarks or advice are welcome.

2 Answers

BEST ANSWER

There is always going to be an optimal solution given by a spanning tree, but that spanning tree depends on the $(b_{ij})$ vector and not just the costs, and in general it is not going to be the spanning tree with maximum weight. I give an example at the end of this answer where it is not.

You can find the optimal spanning tree with linear programming; your problem becomes a linear program once you replace the absolute values by auxiliary variables and inequalities. I think that, similarly to a min-cost flow problem, you can first take the dual (so that you have variables for edges, not vertices, and a spanning tree is a basic solution) and then run a "tree simplex method" where you pivot on a non-tree edge, bringing it into the basis and removing one of the existing edges of the tree.
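Concretely, the standard trick is to introduce one variable $t_e \geq |b_{ij}+u_i-u_j|$ per edge and minimize $\sum_e \nu_e t_e$. A sketch using SciPy's `linprog` (the function name and data layout are my own):

```python
import numpy as np
from scipy.optimize import linprog

def solve_via_lp(n, edges):
    """Solve min_u sum nu_ij * |b_ij + u_i - u_j| as a linear program.

    edges = [(i, j, b, nu)].  One auxiliary variable t_e >= |b + u_i - u_j|
    is added per edge, and we minimize sum nu_e * t_e.
    """
    m = len(edges)
    c = np.zeros(n + m)
    A, rhs = [], []
    for e, (i, j, b, nu) in enumerate(edges):
        c[n + e] = nu
        #  b + u_i - u_j <= t_e   <=>   u_i - u_j - t_e <= -b
        row = np.zeros(n + m)
        row[i], row[j], row[n + e] = 1.0, -1.0, -1.0
        A.append(row); rhs.append(-b)
        # -(b + u_i - u_j) <= t_e  <=>  -u_i + u_j - t_e <= b
        row = np.zeros(n + m)
        row[i], row[j], row[n + e] = -1.0, 1.0, -1.0
        A.append(row); rhs.append(b)
    # u_0 pinned to 0 (the objective is invariant under common shifts),
    # the remaining u_i free, the auxiliary t_e nonnegative.
    bounds = [(0, 0)] + [(None, None)] * (n - 1) + [(0, None)] * m
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(rhs), bounds=bounds)
    return res.x[:n], res.fun
```

On a weight-1 triangle with $b_{12}=b_{23}=1$ and $b_{13}=5$, every spanning tree yields cost $3$, which is also the LP optimum.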


Fact 1. There is an optimal solution such that the edges with $|b_{ij} + u_i - u_j| = 0$ ("erased" edges) form a connected graph.

To see this, suppose you have an optimal solution whose erased edges do not form a connected graph: then we can split the vertices of $G$ into two sets $S, \overline{S}$ such that none of the edges between $S$ and $\overline{S}$ are erased. Now consider changing $u_i \to u_i + \delta$ at all vertices in $S$ only. The effect of this is to change $|b_{ij} + u_i - u_j|$ to $|b_{ij}+u_i - u_j + \delta|$ for all edges between $S$ and $\overline{S}$; as a function of $\delta$, the objective function looks like $$\sum_{i \in S} \sum_{j \in \overline{S}} |C_{ij} + \delta|\nu_{ij}$$ for some constants $C_{ij}$. This is piecewise linear in $\delta$, with slopes changing at the values $\delta = -C_{ij}$ only; by assumption, $\delta=0$ is not such a corner point, because we don't have any erased edges between $S$ and $\overline{S}$. So starting from $\delta=0$ (our original optimal solution), you can always either increase or decrease $\delta$ to hit one of those corner points without making the solution worse, so we retain optimality. (Such a corner exists: since $G$ is connected, some edge of positive weight crosses the cut, so the objective tends to $\infty$ as $\delta \to \pm\infty$ and there are corners in both directions.) This changes the solution to one with an additional erased edge between $S$ and $\overline{S}$; repeating the argument proves Fact 1.
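The argument can be turned into a small procedure: given a cut with no erased crossing edge, try the crossing breakpoints and keep one that does not increase the objective. A sketch (function name and edge-list format are my own):

```python
def improve_by_shift(u, S, edges):
    """Shift u_i by delta for all i in S so that some crossing edge becomes
    erased, without increasing the objective (the Fact 1 argument).

    edges = [(i, j, b, nu)]; S is a set of vertex indices.
    """
    def cost(u):
        return sum(nu * abs(b + u[i] - u[j]) for i, j, b, nu in edges)

    # Breakpoints: the values of delta at which a crossing edge gets erased.
    deltas = []
    for i, j, b, nu in edges:
        if (i in S) != (j in S):
            r = b + u[i] - u[j]
            deltas.append(-r if i in S else r)

    base = cost(u)
    for d in deltas:
        shifted = [ui + d if k in S else ui for k, ui in enumerate(u)]
        if cost(shifted) <= base + 1e-12:  # no worse: keep this breakpoint
            return shifted
    return u  # no non-worsening breakpoint (cannot happen at an optimum)
```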

Fact 2. For every spanning tree $T$, there is a solution $(u_i)$ - not necessarily optimal, and unique up to adding a constant to all $u_i$ - such that the erased edges are exactly the edges of $T$.

This is because we can "grow" $T$ by starting from a single vertex (whose value of $u_i$ we can choose arbitrarily) and adding a leaf vertex over and over. As we do so, the leaf vertex's value of $u_i$ is uniquely determined by the requirement that its edge to its neighbor in $T$ satisfies $b_{ij}+u_i-u_j=0$.
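This propagation is just a breadth-first traversal of $T$. A minimal sketch (names and edge format are my own; tree edges are stored as $(i, j, b_{ij})$ with the convention $b_{ij}+u_i-u_j=0$ on erased edges):

```python
from collections import deque

def potentials_from_tree(n, tree_edges, root=0):
    """Propagate the u_i values from the root (u_root = 0) over a tree
    whose edges (i, j, b_ij) must satisfy b_ij + u_i - u_j = 0."""
    adj = {k: [] for k in range(n)}
    for i, j, b in tree_edges:
        adj[i].append((j, b))    # seen from i: u_j = u_i + b_ij
        adj[j].append((i, -b))   # seen from j: u_i = u_j - b_ij
    u = [None] * n
    u[root] = 0.0
    queue = deque([root])
    while queue:
        i = queue.popleft()
        for j, b in adj[i]:
            if u[j] is None:     # each new leaf's value is forced
                u[j] = u[i] + b
                queue.append(j)
    return u
```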


Generically, for most $(b_{ij})$, it will not be possible to have a cycle of erased edges. A cycle is a path plus another edge; the path of erased edges already fixes the value of $(u_i)$ on all its vertices, up to translation. The last edge can only be erased by coincidence, if its value of $b_{ij}$ happens to be correct. Even a tiny perturbation of $(b_{ij})$ will eliminate all cycles in the optimal solution.

This tells us that it's enough to consider solutions where the graph of erased edges is acyclic and connected: a spanning tree of the graph.

However, in the following problem, taking the max-weight spanning tree is the wrong thing to do:

[Figure: a graph on four vertices $B_1, B_2, B_3, B_4$ arranged in a row; the three path edges $(1,2), (2,3), (3,4)$ have weight $\nu = 2$ and the three remaining edges have weight $\nu = 1$; the $b_{ij}$ values can be read off the cost computations below.]

The max-weight spanning tree is the path along the weight-$2$ edges. Going from left to right, the MST solution sets $u_1=0$, $u_2 = 4$, $u_3 = 11$, $u_4 = 14$, up to translation. The total cost of this solution is $|6+0-11|+|5+4-14|+|8+0-14|$ or $16$.

Another spanning tree is the star using edges from the leftmost vertex to all others. The spanning tree solution from the star sets $u_1=0$, $u_2 = 4$, $u_3 = 6$, $u_4 = 8$. The total cost of this solution is $2|7+4-6| + 2|3+6-8|+|5+4-8|$ or $13$.
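These two totals can be checked numerically. The instance below is reconstructed from the numbers shown above ($\nu = 2$ on the path edges, $\nu = 1$ on the chords; the $b_{ij}$ are inferred from the displayed costs, so treat them as an assumption):

```python
# Reconstructed instance: vertices 0..3, entries (i, j, b_ij, nu_ij).
# The three path edges (weight 2), then the three chords (weight 1).
edges = [(0, 1, 4, 2), (1, 2, 7, 2), (2, 3, 3, 2),
         (0, 2, 6, 1), (0, 3, 8, 1), (1, 3, 5, 1)]

def cost(u):
    return sum(nu * abs(b + u[i] - u[j]) for i, j, b, nu in edges)

print(cost([0, 4, 11, 14]))  # max-weight spanning tree (the path): 16
print(cost([0, 4, 6, 8]))    # star rooted at the leftmost vertex: 13
```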

ANSWER

Your objective function $f : u \mapsto \sum_i \sum_j |b_{ij} + u_i - u_j| \nu_{ij}$ has upper and lower partial derivatives with respect to each $u_i$. The upper partial derivative is $\frac {\partial f} {\partial u_i^+} = \sum_{j \in P_i} \nu_{ij} - \sum_{j \notin P_i} \nu_{ij}$, where $j \in P_i$ iff $b_{ij} + u_i - u_j \ge 0$; the lower partial $\frac {\partial f} {\partial u_i^-}$ is the same with a strict inequality.
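This formula is easy to check on a tiny instance. A sketch (names are my own), with the convention that each edge is stored once with $i < j$, so the sign flips when differentiating a term with respect to its second endpoint:

```python
def partials(u, i, edges):
    """Upper and lower partial derivatives of
    f(u) = sum nu * |b + u_a - u_c| with respect to u_i.

    edges = [(a, c, b, nu)] with a < c.
    """
    up = lo = 0.0
    for a, c, b, nu in edges:
        r = b + u[a] - u[c]
        if a == i:
            # d/du_i |r| = sign(r) * nu; at r = 0 the right derivative is +nu
            up += nu if r >= 0 else -nu
            lo += nu if r > 0 else -nu
        elif c == i:
            # the term decreases in u_i when r > 0: the sign flips
            up += nu if r <= 0 else -nu
            lo += nu if r < 0 else -nu
    return up, lo
```

At a kink (some $b_{ij} + u_i - u_j = 0$) the two values differ, which is exactly the non-differentiability discussed below.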

The global minimum has to be a local minimum, so at an optimal solution (which exists: modulo adding a common constant to all the $u_i$, we have $f(u) \to \infty$ as $\|u\| \to \infty$) the upper partial derivative needs to be $\ge 0$ and the lower partial derivative needs to be $\le 0$. Moreover, since your objective function is convex, this condition is sufficient.

A cool property is that the curvature is null (the second partial derivatives are zero wherever they are defined), so, as your intuition suggests, there exists a solution where each $u_i$ sits at a point where its partial derivative isn't defined (the upper and lower ones differ), which means that for each $i$ there is a $j$ such that $b_{ij} + u_i - u_j = 0$.

But that point isn't necessarily the one with the biggest $\nu_{ij}$. It's just a point where the positive and the negative $\nu_{ij}$ contributions in the partials compensate each other... It is more likely to occur around a "large" $\nu_{ij}$, but not necessarily.

To solve your problem, you should rather look towards convex optimization methods. I'm a little rusty on those so I won't give you direct references, but I'm sure someone else will be happy to do it. :)