Question: How does making a spanning tree solution "strongly feasible" for the Network Simplex method assist in cycling prevention?
To start, I was reading An Implementation of Network Simplex Method (Big-M) for solving Minimum Cost Flow Problem (Hassan and Ghalawingy) where they say on page $4$:
Strongly Feasible Trees Property: A feasible tree $T$ with corresponding flow vector $x$ is said to be strongly feasible if every arc $(i, j)$ of $T$ with $x_{ij}=0$ is oriented away from the root.
The artificial starting solution described above generates a strongly feasible basis, for consider each node, $i\in N$.
This was presented as a definition rather than an explanation as to why this is beneficial, so I dug around in the references and located Theoretical Properties of the Network Simplex Method" (Cunningham), page $200$:
We define a strongly feasible tree to be a feasible tree $T$ whose associated tree solution $x^0$ satisfies: every edge $f \in T$ such that $x^0_f = 0$ is directed away from $r$ in $T$. (Here $f$ is directed away from $r$ in $T$ if f is an edge of the path in $T$ from r to $h(f)$; if $f \in T$ is not directed away from $r$, then $f$ is directed toward $r$ in $T$.) The refinement of the simplex method suggested in [6] is simply to maintain strongly feasible trees, and this is done by using the leaving-edge rule: Choose $f$ of Step $3$ to be the first edge of $C(T, e)$ which is an element of $F$. It is easy to show that this rule does preserve strong feasibility, and also to prove the following result; for further details, see [6].
Which is another definition, but at least it gave a source, "A Network Simplex Method" (Cunningham), so I looked at that and on page $108$:
Definition $1$. We say that $T$ is strongly feasible if every $f \in T$ with $x^0_f=0$ is directed away from $r$ in $T$.
Seems like it's just definitions all the way down. While I do understand that it's proposed as a way to deter cycling, I don't understand how this does that - we only really have control over how arcs are directed when they are initialized in the early stages of the problem, but cycling is mostly expected to occur in the later stages of solver. Is this just a way to handle zero-flow arcs coming in and out of the basis during the exchange process, initialization, or both?
For some reason, you have stopped reading Cunningham's "Theoretical Properties of the Network Simplex Method" just as it was about to answer your question. We read on:
In particular, this suggests that the pivoting rule mentioned earlier prevents cycling, because it ensures that our spanning tree solution remains strongly feasible.
Some questions you may have:
Q1: Why does this pivoting rule (i.e. the rule "Pick the first possible leaving edge on $C(T,e)$") ensure that we get a strongly feasible tree?
A1: The only edges of value $0$ whose status changes from $T$ to $T'$ are the edges in $F$, the set of possible leaving edges - and also $e$, in the case of a degenerate pivot. So these are the only ones we have to check to determine if $T'$ is still strongly feasible.
Let $f = (x,y)$ be the leaving edge. The $r-x$ path in $T$ contains no other edges of $F$, by our choice of $f$. On the $r-y$ path in $T$, all edges of $F$ are forward edges, because they are all reverse edges on $C(T,e)$, and the $r-y$ path has the opposite orientation.
In case of a degenerate pivot, $f$ must occur after $e$ in the cycle, because if it had been before $e$, it would have been a $0$-value edge of $T$ pointing toward the root. So $e$ is somewhere on the $r-x$ path in $T$, whose direction agrees with $C(T,e)$, and it is a forward edge of $C(T,e)$; hence it points away from the root in $T$.
Q2: Why is it true that "$\pi_v(T') = \pi_v(T) + c_e(T)$ if $f$ is an edge of the path in $T$ from $r$ to $v$, and $\pi_v(T') = \pi_v(T)$ otherwise"?
A2: If $f$ is not an edge of the path in $T$ from $r$ to $v$, then that path is also the (unique) path in $T'$ from $r$ to $v$, so the cost of that path does not change.
If $f$ is an edge of the $r-v$ path in $T$, we have to work a bit harder. To do this, we'll find an $r-v$ walk in $T'$ which has total cost $\pi_v(T) + c_e(T)$. A walk in a tree differs from a path only in one way: sometimes, it backtracks along some edges it's used. Segments of the walk that are backtracked don't affect the cost, so we'll conclude that the $r-v$ path in $T'$ also has cost $\pi_v(T) + c_e(T)$.
What's this $r-v$ walk, then? Well, as before, let $f = (x,y)$. First, walk along the $r-v$ path in $T$ from $r$ until you get to $y$. Then, follow the cycle $C(T,e)$ from $y$ back to $y$. Then, follow the remainder of the $r-v$ path in $T$ from $y$ to $v$.
A key detail here: $f$ is a reverse edge of $C(T,e)$, because that's always true of every leaving edge. On the other hand, because we're doing a degenerate pivot, $f$ has value $0$, and because $T$ is strongly feasible, it is oriented away from $r$. Therefore $f$ is a forward edge of the $r-v$ path in $T$. As a result, our walk will backtrack along $f$, and so it will really be a walk in $T'$.
Q3: Why does this mean that $\sum(\pi_v(T) : v\in V)$ is strictly decreasing in a sequence of degenerate pivots?
A3: The sum changes, in total, by $k \cdot c_e(T)$, where $k$ is the number of vertices $v$ for which the $r-v$ path in $T$ contained $f$. Here, $k>0$ (at least one such vertex is the endpoint of $f$ further from $r$) and $c_e(T) < 0$ (because that's how we choose the entering edge $e$), so the total change is negative.
Q4: Why does this imply that there cannot be any cycling?
A4: For a tree $T$, consider the value $M \cdot (\text{cost of }T) + \sum(\pi_v(T) : v\in V)$, where $M$ is a sufficiently large value. On a non-degenerate pivot, this value strictly decreases because the first term strictly decreases. On a degenerate pivot, this value strictly decreases because the first term stays the same and the second term strictly decreases. Therefore this value forms a strictly decreasing sequence through all the pivots we do. It follows that we can never see the same tree $T$ twice, because we never see the same value twice.