First of all, I am trying to calculate the Betweenness Centrality of a fully connected directional graph with edge weights with N=3015 nodes. Matlab can do this in about 30 seconds and Python igraph can do it in a minute while Networkx takes hours as it is written in python and so I have abandoned it.
Second, I need to get a list of all of the shortest paths (those that minimize weights along path) between all pairs of nodes. In matlab there is a function to do this called shortest path. There is also a similar function in igraph called get_shortest_paths. I can only get one path back from each call.
So I loop over all $N^2$ pairs of nodes and extract the shortest path, store it in an array and then continue.
The problem is that while I can get betweenness centrality in under a minute, extracting all of the shortest paths takes hours in both matlab and igraph.
This seems strange given that betweenness centrality is based on the calculation of all of the shortest paths. So why does the extraction of the shortest paths take so much longer ?
This is because you’re computing everything from scratch for each pair, whereas an efficient computation of the shortest paths between all pairs reuses a lot of information. In MATLAB you can use
shortestpathtreeto get all shortest paths from a given vertex in one go. In igraph, the methodget_shortest_pathsthat you’re using accepts the valueNonefor the parametertoto indicate that the shortest paths to all vertices should be calculated. Since you write that you need to get a list of all the shortest paths, you should also take a look atget_all_shortest_paths(which also acceptsNoneforto). The documentation isn’t a model of clarity; if I’m interpreting it correctly, the difference is thatget_shortest_pathscalculates a single shortest path per target vertex andget_all_shortest_pathscalculates all of them (so the plural ‘s’ inget_shortest_pathsseems to refer to the potentially plural target nodes, in contrast toget_shortest_path, which only accepts a single target node).I would expect that computing all shortest paths for all vertex pairs in one go would be even more efficient, but this should already significantly reduce the total computation time.