I am trying to proof the following algorithm to see if a there exists a path from u to v in a graph G = (V,E).
Reach := {u}
LastCardinality := 0
while LastCardinality < |Reach| do
LastCardinality := |Reach|
Next := null
foreach i <- Reach do
foreach j <- V do
if(i,j) exists in E then
Next := Next + {j}
Reach := Reach + Next
return Reach
So far I have got that the output needs to be Reach = Goal where Goal = {r | r exists V, there is a path from u to r}. I also have three invariants that I am pretty sure will help prove this.
- Reach is always a subset of Goal and Next is always a subset of Goal.
- After an arbitrary K times through the while loop, for every t that exsits in V, if t has a path of length k+1 from u, then t exists in Reach
- If Reach is a subset of Goal before the while loop then LastCardinality < |Reach|
I know that to finish up the proof, I need to prove termination, the invariants, and correctness but I have no idea how. I think I need to use induction on the while loop but I am not exactly sure how.
How do I proof those three characteristics about an algorithm?
Let's start with termination. You have an initial set $Reach$. Since you are traversing a connected component of $G$ and (we are assuming) $G$ is finite, only a finite number of nodes can exist in $Next$. And so $Reach := Reach + Next$ will eventually not add any additional nodes to $Reach$.
A better way to put that. Suppose $Reach \subset V(X)$ and $Reach \neq V(X)$, where $X$ is the component we are on. Then there exists a vertex in $Reach$ with unvisited neighbors. Your inner for loop will visit those neighbors and add them to $Next$, which will then be added to $Reach$. Otherwise, if $Reach = V(X)$, then $Reach = Next$ after the set of nested for loops terminates. And so on the next iteration of the loop, $LastCardinality = V(X)$. As $Reach$ will not increase in cardinality, the while loop will terminate, and so the algorithm will terminate.
To prove correctness, you need to show two things. The first is if $u, v \in V(X)$, then $v \in Reach$ upon termination of the algorithm. The second is that if $v \not \in V(X)$ but $u \in V(X)$, then the algorithm will not add $v$ to $Reach$.
So suppose $u, v \in V(X)$. We know there is such a $u-v$ path by the definition of connectivity. So you can prove by induction that such a path will be found. The proof of correctness Dijkstra's algorithm by induction is a good one to model.
Now consider if $u \in V(X)$ but $v \not \in V(X)$. Suppose the algorithm returns $v \in Reach$. This implies that there is a vertex $i$ with an edge connecting it to $v$ by your if statement in the second for loop. Since $i$ is being traversed, this implies $i$ is reachable from $u$ (ie., there is a $u-i$ path) by our previous inductive argument. And so by extension, there is a $u, v$ path, implying $v$ is part of component $X$. However, as $v \not \in V(X)$, there is no such $u-v$ path by definition of connectivity, a contradiction. Hence, the algorithm will not traverse $v$.