I was studying the proof of $1.5$-Approximate Path TSP problem and found a tiny wrinkle that I just can't get over it.
In the proof it says that the minimum spanning tree found by standard MST algorithm satisfies
$$l(T)\leq l(y)$$
where $y \in P_{ST}$ and $P_{ST}$ is the spanning tree polytope of the same graph, which is defined as follows:
$P_{ST}=\{x\in \rm I\!R^E_{\geq0}\:|\:x(E)= |V|-1\:and\:x(E[S])\leq|S|-1\:\forall\: S\subsetneq V\:and\:S\neq \emptyset\}$
My question is, the spanning tree polytope seems to be a kind of "relaxation" of normal spanning tree, so does the MST algorithm always produce the optimal solution on $P_{ST}$?
If not, then why does the inequality hold true?
Ref:
[1] https://arxiv.org/pdf/1805.04131.pdf 1.5-Approx. Path TSP original paper
[2] http://www.birs.ca/events/2018/5-day-workshops/18w5088/videos/watch/201809260903-Naegele.html Video explaining it. My question appears around 34 minute.
Yes: all optimal solutions on $P_{ST}$ are minimum-weight spanning trees.
This is Theorem 4.25 in Integer programming by Conforti, Cornuéjols, and Zambelli. The gist of the argument is as follows.
For arbitrary weights $w \in \mathbb R^{E}$, we find the maximum-weight spanning tree by Kruskal's algorithm. (It is slightly nicer to maximize than to minimize here, for finding the dual.) To show that the corresponding $x \in P_{ST}$ is optimal, we find a dual solution $y$ that satisfies complementary slackness with $x$.
The dual here is $$\max\left\{\sum_{S\ne\emptyset} (|S|-1)y(S) \,\middle|\, \sum_{S : e \in E[S]} y(S) \ge w(e)\,\forall e \in E, \; y(S) \ge 0 \, \forall S \ne V\right\}.$$ Let $e_i$ be the edge added in step $i$ of Kruskal's algorithm, with $S_i$ the connected component formed by adding $e_i$. (At the end, $S_{n-1}=V$.) We set $y(S_{n-1}) = w(e_{n-1})$ and $y(S_i) = w(e_i) - w(e_j)$ if $S_j$ is the next connected component containing $S_i$.
For an edge $e$ first contained in $E[S_i]$, the sum $\sum_{S : e \in E[S]} y(S)$ telescopes to $w(e_i)$, which satisfies the constraint (Kruskal ensures that $w(e_i) \ge w(e)$) and complementary slackness in the dual. Meanwhile, the optimal solution $x$ found by Kruskal's algorithm satisfies $x(E[S_i]) = |S_i|-1$ because $S_i$ is a connected component after the $i^{\text{th}}$ step: primal complementary slackness also holds.
Therefore for any weights $w$, Kruskal's algorithm gives us the optimal point in $P_{ST}$; it follows that all vertices of $P_{ST}$ are spanning trees.