Shortest path between wikipedia articles

1.3k Views Asked by At

I'm trying to figure out whether it is possible (and if so how) to find the shortest path inside a network from one node to another. I know that there are different possible algorithms to do that the most prominent being probably the A* search algorithm. I know that this algorithm uses heuristics to make assumptions on the chances that one way would be shorter than the other. On every node the algorithm passes it needs a function that to get a probability of whether or not one path is shorter than the other ones. But what if no such assumptions can be made?

A bit more concrete, I want to find the shortest path from one wikipedia article to another one only using the links inside the articles. Is this possible without checking every possible order of articles? In my opinion every node has the same weight and every possible edge/link from a node has the same probability of ending in the shortest path. How can I possibly make any decisions which path to take?

I'm no mathematician but more of a designer/programmer so if you have any clue how to help me solve this problem please use a "maths-for-dummies" language.

2

There are 2 best solutions below

1
On

Breadth first search is how I would expect to handle this.

0
On

Thanks for your answers! I kept looking for solutions and found something very useful on stackoverflow. The answer is actually exactly what I expected. What I need is some function which makes an prediction for every node that is connected to the current on how long (relatively speaking) the total path will be. The post suggests keyword mapping or some other kind of guessing the similarity of two articles. Any other suggestions?