I want to perform graph isomorphism tests for a very long random walk with fixed windows. That is given a target graph, say a triangle, I want to find how many consecutive 3 nodes in the random walk induce a triangle.
Graph isomorphism test is very costly and there may be repetitive graph patterns appearing in the random walk. Thus, it is expensive to do the isomorphism test on-the-fly when the random walk is simulated.
Hence, I want to store the random walk first. Afterwards, I want to use some pruning techniques to reduce the number of isomorphism test.
So my question is that how to store a very long random walk with very low memory cost? The naive way is just to store the whole sequence of the random walk, which will cause high memory usage. Is there any better cache technique to do that?
How big is the graph where you perform the random walk? If it is a random walk you can't compress very much without loosing information. If you have an $n$-step random walk you need to store $n$ pieces of information.
The naive way would be to store the vertex numbers. Alternatively you could define an ordering of the adjacent vertices for every single vertex and then only store the position in the list of adjacent vertices. So instead of storing $n$ numbers of maximal size the total number of vertices you only store $n$ numbers of maximal size the maximal degree in the graph.
Edit: A simple example, suppose your graph is the complete graph on $4$ vertices, numbered $1$ to $4$. The naive way to write down a random walk would be to list the vertices you visited. So a two step random walk would consist of $2$ numbers in the range $1$ to $4$. Start at $1$, then the random walk is for example $4,2$.
What I proposed was to instead just write down the rank among the adjacent vertices. So for the first step, the neighbours are $2,3,4$, the random walk goes to vertex $4$, that is the third one in the ordering, write down $3$. Next you are at vertex $4$, neighbors are $1,2,3$, random walk goes to $2$, that is the second one, so write down $2$. The maximum degree in the graph is $3$, so instead of writing down numbers in the range $1$ to $4$ we only have to write down numbers in the range $1$ to $3$. That doesn't really save much in this example but for a large graph with small maximum vertex degree it could save a bit.