A* Algorithm to find the optimal route from Köln to Karlsruhe.

315 Views Asked by At

The A*-Algorithm helps you find the optimal route from point A to B on a given graph plane. Let's say that I have the following problem where I have to go from Köln to Karlsruhe.

enter image description here

Without actually writing an algorithm, I tried to find the optimal route by finding the minimum distance from point A to B. I went from Köln to Frankfurt to Kaiserlautern to Ludwigshafen to Karlsruhe, where the total distance ends up being $912$ units. The quesion then asks that if a hypothetical algorithm was designed, which cities would it check? I think it should check all cities that're connected directly to the starting city (and then the cities that're on the optimal route), so Metz, Saarbrücken and Frankfurt. My answer is correct except for Metz which is not a city that the algorithm would "visit"?

So I have $3$ questions:

  1. Why would Metz not be correct?
  2. Is my distance value correct?
  3. What does the $h$-value tell us?

P.S. - I know this is basic theoretical CS stuff but I'm just in my first semester so please have mercy. :)

2

There are 2 best solutions below

3
On BEST ANSWER
  • The A* algorithm uses actual distances between neighbors, along with estimated distances ($h$ values), to find the shortest path. If the estimates are good, A* is guaranteed to find the shortest path.

  • By hand, you found the shortest path between Köln to Karlsruhe. Given good estimates, A* is guaranteed to find that path (or another path of equal length). The length of your path is 430, which you can compute by adding up the lengths of the roads between cities. I believe you've added in $h$ values by mistake, but $h$ values do not affect length. They are used to guide search, but they are not included when you compute path lengths. (I explain $h$ values below.)

  • The main thing to know about A* is that it maintains a list of partial paths, and that it decides which path to consider (extend) next based on estimated total path length to the goal.

    Estimated total path length is the length of the path so far, plus the estimated distance remaining to the goal. The $h$ values tell you the estimated distance from each city to the goal.

  • Initially, A* will compute estimated total path length from Köln to its neighbors to Karlsruhe. Those estimates are:

    • Köln-Frankfurt 190 + 125 = 315
    • Köln-Saarbrücken 260 + 105 = 365
    • Köln-Metz 290 + 162 = 452
  • So naturally A* explores paths from Köln through Frankfurt, first. If the estimates are good, then paths with the smallest estimated total length will tend to have the smallest actual length when they reach the goal.

  • Next, A* computes costs for Köln-Frankfurt-Koblenz and the other neighbors. It also remembers the other paths it has computed: Köln-Saarbrücken and Köln-Metz. It will consider whichever of the paths has the shortest length among paths it has computed.
  • The reason A* never visits Metz is because the path to Köln-Metz never has the shortest estimated path length. You can see this by computing the estimated path lengths of the true shortest path:

    Köln-Frankfurt, Köln-Frankfurt-Kaiserlautern, Köln-Frankfurt-Kaiserlautern-Ludwigshafen, Köln-Frankfurt-Kaiserlautern-Ludwigshafen-Karlsruhe

    All of these have lower estimates than Köln-Metz, so the goal will be found before Köln-Metz will be visited.

0
On

The $h$-values are reasonable lower bounds for the distance to the final destinations. (The canonical example is distance measured along a straight line.) You do not include the $h$-values in the total cost, so the answer to your question 2 is no, your distance value is not correct. You get the actual total distance by adding together the distances between successive cities along your chosen path. The $g$-values are updated during the algorithm, and at each step, they are the shortest distances from the starting point found so far. So, once the value of $g(n)$ has been found, the value $f(n)=g(n)+h(n)$ is a lower bound for the total cost of a path going through node $n$. At each step of the algorithm, you "check" the node which has the smallest $f$-value among the nodes that haven't been checked yet. Checking a node means updating the $g$-values of its neighbors. When the destination is checked, the algorithm finishes. Metz is never checked because its $f$-value is larger than the cost of the optimal path, which equals the $f$-value of the destination.