Given a digraph (directed graph) $G = (V, E)$, i'm looking for simply and reasonable ways of transforming $G$ into a DAG (directed acyclic graph) without "losing information".
One way of doing this is perhaps to partition $G$ into strongly connected components and consider the DAG induced on components (in a natural way...). I don't like this (for example, it would require the heavy-duty computation of the strongly connected components).
I've come up with something which goes a follows. For a positive integer $L$, define the digraph $G^{(L)} = (V, E^{(L)})$ by
$$E^{(L)} = \{(x, y) \in V^2 | \exists \text{ a path }x \rightarrow \ldots \rightarrow y \text{ of length } \ge L \}. $$ Thus $(x, y) \in E^{(L)}$ iff it is possible to go from $x$ to $y$ in $L$ or more steps. In particular, $G^{(1)} = \bar{G}$, the transitive closure of $G$, defined by saying $(x, y)$ is an edge of $\bar{G}$ iff there is a directed path from $x$ and $y$ in $G$.
Finally, define $l(G) = \inf \{L | G^{(L)}\text{ is a DAG}\}$ and $G^{(*)} := G^{(l(G))}$, the 'dagification' of $G$.
Now, for simplicity, assume that the set of nodes $V$ is finite. The following observations are immediate:
- $l(G)$ is always finite (in fact $l(G) \le |V|^2$)
- $G \subseteq G^{(1)} \supseteq G^{(2)} \supseteq \ldots \supseteq G^{(l(G))}$
- If $G = C_n$ is the directed cycle on $n$ nodes, then $l(G) = n$ and $G^{(*)} = \emptyset$.
Are there other interesting things which can be said ? For example,
- when do we have $l(G) = 1$ ? Or equivalently, when is $\bar{G}$ a DAG ?
- Can we (tightly) bound $l(G)$ in terms of $|V|$ and $|E|$ for a reasonably large / interesting family of digraphs ?
- Is there a simple relationship between $G$ and $G^{(*)}$.
- Is $G^{(*)}$ (conceptually) similar to any known objects in domain of markov chains ?
What do i mean by "information loss" ?
Well, I've not yet been able to precisely formulate what I mean by "information loss" in a reasonable way. However, I'll remark that squashing $C_n$ to $\emptyset$ is no loss of information since $C_n$ really contains no information except for the number $n$. Ultimately, for any digraph $G$, I'd want to transform it to a DAG in which cycles are shrunk to $\emptyset$, connected components are shrunk to single points, etc. As you see, it's not really yet clear in my mind... I hope to make this clearer very soon.
I think I can answer the first two parts of your question.
$l(G)=1$ if and only if $G$ is acyclic. Indeed, as you mentioned $G^{(1)}$ is the transitive closure of $G$. Strongly connected components in $G$ correspond to complete subgraphs in the transitive closure, so $G$ is acyclic if and only if the transitive closure of $G$ is acyclic.
For a relationship between $G^{L}$ and $G^{L+1}$, the best I’ve got is that $G^{L+1}$ is a subgraph of $G^{L}$. Indeed, if a path of length $\geq L+1$ exists between teo vertices, than surely a path of length $\geq L$ exists. I’m not sure this could be considered “nice” though.
You mention in your question that $l(G)\leq n^2$(where $n$ is the order of your graph), but it seems that we can get $l(G)\leq n$ since a path that uses every vertex is length $n-1$, and thus we do not have longer paths. Thus $G^{(n)}$ is empty for any digraph $G$ on $n$ vertices.