3 Graph Distance and Path Finding In some other applications, graphs are used to model distances; e.g., as in road networks and in workflow diagrams. The basic task then is to find shortest paths or routes, and the optimal distance. Brief outline of this lecture • Distance in a graph, basic properties, BFS. • Weighted distance in digraphs; the problem of negative cycles and Bellman-Ford's algorithm. • Dijkstra's algorithm for the single-source shortest paths. • A sketch of some advanced ideas in practical path planning. Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths 3.1 Unit Distance in Graphs Recall that a walk of length n in a graph G is an alternating sequence of vertices and edges (vq, e\, f 1, e2, i>2, ■ ■ ■, en, vn) such that each has the ends t>i_i, t>i. Definition 3.1. Distance dc(u,v) between two vertices u,v of a graph G is defined as the length of the shortest walk between u and v in G. If there is no walk between u, v, then we declare dc(u, v) — oo. □ Naturally, the distance between u,v equals the least possible number of edges travelled from u to v, and it is always achieved by a path from Lemma 2.6. Spec. do{u,u) = 0. Remark: Distance can be analogously defined for digraphs, using directed walks or paths. A more general view in Section 3.3 will consider also non-unit lengths of edges in G. wall^of hngth = procházka déíky, distance = vzdálenost Petr Hlinený, Fl MU Brno, 2014 2/33 Fl: MA010: Graph Distance and Paths Triangle inequality u ■ l 1 V iv \ 'x — (Si i *■ — — <8J • W Lemma 3.2. The graph distance satisfies the triangle inequality: \fu,v,w G V(G) : dc(u,v) + dc(v,w) > dc(u,w) .n Proof. Easily; starting with a walk of length dc(u,v) from u to v, and appending a walk of length dc(v,w) from v to w, results in a walk of length dc(u,v) + dc(v,w) from u to w. This is an upper bound on the distance from u to w. □ C Fact: The distance in an undirected graph is symmetric, i.e. dc(u,v) — dc(v,u). Petr Hliněný, Fl MU Brno, 2014 d nerovnost Fl: MA010: Graph Distance and Paths ■r Other related terms Definition 3.3. Let G be a graph. We define, with resp. to G, the following notions: • The excentricity of a vertex exc(t>) is the largest distance from v to another vertex; exc(t>) — ma.yixev{G) da{v,x). □ • The diameter d ia m (G) of G is the largest excentricity over its vertices, and the radius rad(G) of G is the smallest excentricity over its vertices. exc(-) 2 2 It always holds diam(G) < 2 • rad(G). □ • The center of G is the subset U C V (G) of vertices such that their excentricity equals rad(G). -acCius = poComh Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths An excersise Example 3.4. What is the largest possible number of vertices a cubic (i.e., 3-regular) graph of radius 2 may have? □ Let G be the graph. First of all, the definition of radius tells us that, for some vertex u e V(G), all the vertices of G are at distance < 2 from u. □ Second, there can be < 10 such vertices by the degree-3 condition: And third, we are able (or lucky?) to fill in the remaining six edges (to get all the degrees equal 3) as in the picture. Hence, 10 vertices is possible, and this is the answer. □ C Remark: Note, moreover, that we have actually constructed a graph of diameter 2, which is a stronger requirement than radius 2. Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths ^3.2 Simple Computation of Distance (BFS) Computing the (unit) distance from a given vertex uo to any other vertex of a graph is a matter of an extremely simple algorithm, based on BFS: Algorithm 3.5. Computing all distances from a starting vertex uo e V(G). □ For a given graph (or digraph) G and any uo e V(G), we run Algorithm 2.1 with the implementation of PR0CESS(v; e) as follows (and with void PRQCESS(e)): initialize dist [uo, v] ^— oo, for all v e V(G); dist [u0,u0] ^—0; u0 PRQCESS(v;e) { u ^— the starting vertex of "e — uv"; dist[uo,v] -h- dist [uo,u] +1; } Petr Hliněný, Fl MU Brno, 2014 'jeínoíucký mjpočet vzídímosú přes IJS * Fl: MA010: Graph Distance and Paths BFS distance - the proof Theorem 3.6. Let uo,v,w be vertices of a connected graph G such that dc(uo, v) < dc(uo,w). Then the breadth-first search algorithm on G, starting from u$, finds the vertex v before w. □ Proof. We apply induction on the distance dc(uo,v): If dc(uo,v) — 0, i.e. uo — v, then it is trivial that v is found first. So let dc(uo,v) — d > 0 and v' be a neighbour of v closer to uo, which means dc(uo,v') — d— 1. Analogously choose w' a neighbour of w closer to uq. Then dG{u0,w') > dG(u0,w) - 1 > dG(u0,v) - 1 = dG(uo,v'). and so v' has been found before w' by the inductive assumption. Hence v' has been stored into U before w', and (cf. FIFO) the neighbours of v' (v among them, but not w) are found before the neighbours of w' (such as w). □ C Corollary 3.7. The search tree of the BFS Algorithm 2.1 on G determines the distances from uo e V(G) to all vertices of G. Hence, Alg. 3.5 is correct, meaning that dist(wo,f) — dG(uo,v) for all v e V(G). Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths 3.3 Weighted Distance in Digraphs Recall (Section 2.3): A weighted graph is a pair of a graph G together with a weighting w of the edges by real numbers w : E(G) —>• -R (edge lengths in this case). A positively weighted graph (G,w) is such that w(e) > 0 for all edges e. □ Definition 3.8. Weighted distance (length) in a weighted (di)graph (G,w). The length of a weighted (dir.) walk S — t>o, e±, t>i, e2, t>2, ■ ■ ■, en,vn in G is the sum rfg(S') — w(ei) + w(e2) +----h w(e„) .□ The weighted distance in (G, w) from a vertex h to a vertex v is dist [uo ,u]+w(e) ) output "Error; a negative cycle exists in (G,w)." } output "Distances from uo in dist [uo, •] ." □ (One can also easily store the predecessors for the computed distances on line (*)...) Petr Hliněný, Fl MU Brno, 2014 negative cycíe = záporný cyklus, predecessor = předchůdce Fl: MA010: Graph Distance and Paths Proof of the Bellman-Ford algorithm Proof. To claim that dist[uo,v] — c[q(uo,v) if there is no negative cycle in (G,w), and that a negative cycle is detected otherwise, we prove the following three steps. 1. At every step of Alg. 3.10, it is dist[uo,v] > c[q(uo,v): □ This holds at the beginning, and follows trivially by induction on the number of elementary steps "dist[un,v] <— min( dist [uo, v] , dist [un,u] + w(e) )".□ 2. Assume there is no negative dir. cycle in (G,w). Let (cf. Prop. 3.9) Vi C V(G) be the subset of vertices v for which c[q(uo,v) is achieved by a dir. uq-v path with < i edges. Then, after iteration no. k of "foreach (e=uvG E(G))", the value of dist [uo,v] equals dg(un, v) for all v G Vk- □ Again, this triv. holds for k — 0 and follows easily by induction. □ 3. Let C C G be any directed cycle. If "dist[uo,v]^ dist[uo,u]+w(e)" for all e — uv G E(C), then C is not a negative cycle in (G,w): □ We have dist(uo, v) — dist(uo,u) < w(e) and summing these over all e G E(C) we get 0 < 2~^ee_e(C) w(e)- Consequently, negative cycles in (G,w) are detected in the algorithm (but only detected, they cannot be easily constructed). C Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths 3.4 Positive-length Shortest Paths In contrast to previous Algorithm 3.10, shortest paths may be computed much faster when all the edge lengths are positive (which is true, e.g., in practical routing problems). For the general single-source positive-length shortest paths problem, a nearly optimal algorithm is the following traditional one. Dijkstra's algorithm: • For a given positively weighted digraph (G,w), and an arbitrary starting vertex uo e V(G), the algorithms computes dist[uo,v] for all v e V(G). □ • In the graph-search scheme of Algorithm 2.1, one simply implements - "choose (e,u) e U" by picking (e,u), e — tu, from U such that dist(uo, t) + w(e — tu) is minimized, □ - "PROCESS(u; e=tu)" as dist [u0,u] <- dist [u0, t] +w(tu) , □ - "PROCESS(e)" as void, and - the search tree T storing shortest paths from uq. □ • This algorithm works in the same way for undirected as for directed graphs. Petr Hliněný, Fl MU Brno, 2014 * nejfcrotší cesty z jednoho zdroje při kladných délkjích * Fl: MA010: Graph Distance and Paths A self-contained exposition of Dijkstra's algorithm is quite simple: Algorithm 3.11. Dijkstra's for single-source shortest paths. For a positively weighted digraph (G,w), and a vertex uo e V(G), compute shortest paths predec[-] and distances dist[uo, ■] in (G,w) from the source uo to all of G. initialize dist [uo, v] «— oo, for all v e V(G); dist [u0,u0] «— 0; U <- {u0>; □ while (U ^ 0) { choose u€U minimizing dist [uo,u] ; foreach (edge f starting in u) { v «— the opposite vertex of "f — uv"; if ( dist [uo,u] + w(uv) < dist [uo,v] ) { U <- UUW; predec [v] <— u; dist[uo,v] <— dist [uo,u] +w(uv) ; } } U ; } output ' distances in dist [. ], predecessors in shortest paths in predec []'; * Thjkstrilv aigorithmus pro tiej^ratsi cestij * Petr Hliněný, Fl MU Brno, 2014 14/33 Fl: MA010: Graph Distance and Paths Proposition 3.12. If the stack U is implemented as a minimum heap, then the number of steps performed by Algorithm 3.11 is 0(\E(G)\ + \V(G)\ ■ log\V(G)\). c A vertex u e V(G) is called "relaxed' after it is removed in "U «— U\{u}" above. Theorem 3.13. Every iteration of Algorithm 3.11 maintains an invariant that • dist [uo, v] is the length of a shortest path from uo to v using only those internal vertices which are relaxed, and such a shortest path is stored in predec [. ] .□ Consequently, all the distances and shortest paths to reachable vertices are correct. Proof: Briefly using mathematical induction: • In the first iteration of "while (U ^ 0 )", uo is chosen and the straight distances (edge lengths) to its neighbours are stored. □ • Subsequently, for every chosen vertex u in "u€U minimizing dist[uo,u]", the current value of dist[uo,u] is optimal since no negative edges exist in (G,w) (and so every possible detour via non-relaxed vertices would only be longer).□ Then, all working distances and the shortest-paths record are properly updated (wrt. u) while "relaxing" u: if ( dist [uo ,u]+w(uv) < dist [uo ,v] ) { predec[v] ^— u; dist[uo,v] ^— dist [uq ,u]+w(uv) ; Z Petr Hliněný, Fl MU Brno, 2014 15/33 Fl: MA010: Graph Distance and Paths Bidirectional Dijkstra's algorithm In some settings, the following improved variant may be significantly more efficient in the single-pair shortest path problem in a digraph (G,w): □ • To find a shortest Mo-fo path, run two instances of Algorithm 3.11 concurently: — A searches shortest paths from uo in (G,w), as usual, and □ — A*~ searches shortest paths from vq in (G^,w) where G^ results from G by reversing all edges; e G E(G) to G E(G^) such that w(e*~) — w(e).o • A and A*~ may simply alternate their iterations, or better; — minima u G U and u' G U*~ are chosen concurently, and the instance achieving smaller value among dist(uo,u) and dist^(vo,u') is run. □ • Termination condition; the whole algorithm stops when the search subtrees T and of A and A*~ meet each other. That is, whenever some vertex is relaxed in both A and A*~. Petr Hliněný, Fl MU Brno, 2014 lousmémý 'Dij^strďu olgoňtfimus * Fl: MA010: Graph Distance and Paths ^All-pairs Shortest Distances The last algorithm we are going to present in this section is extraordinarily simple and beautiful, although rather slow since it can compute only all-pairs distances at once. □ Algorithm 3.14. Floyd-Warshall's algorithm for all-pairs distances For a positively weighted digraph (G, w), compute distances dist[-, ■] between all pairs of vertices of G. initialize dist [u, v] «— oo, for all u,v e V(G); foreach (uvG-E(G)) dist [u, v] <— to(uv) ; foreach (te V(G) ) { foreach (u,v€ V(G) ) { dist[u,v] <— min( dist [u,v] , dist [u,t]+dist [t ,v] ) ; } } output ' The complete distance matrix of (G, w) in d [, ] '; The number of steps of this algorithm is 0(|y(G)|3), which is quite slow compared to repeated Dijkstra in the case of sparse graphs. □ Remark: Floyd-Warshall's algorithm has many shapes; it appears, e.g., in computation of the transitive closure and in the translation of a finite automaton to a regular expression. □ The algorithm is also related to matrix multiplication. Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths Algorithm 3.14 is based on the following beautifully simple dynamic-programming idea: Computing all-pairs distances dynamically • Given is a weighted (di)graph (G, w) on n vertices; V(G) — {to, ti,..., t„_i}.n Let distl(u,v) denote the length of a shortest u-v walk S in G such that all vertices of S except the ends u, v are from the subset {to, ■ ■ ■, U-i}. □ • For computing distl+1, the admissible walks are those as for dist1 plus those walks passing through ti ("u-ti-v"). □ Consequently disf+1(u,v) — min (dist{u,v), dist{u,ti) + dist{ti,v))n and dist[u,v] — distn(u,v).o • This algorithm works correctly also with negative edge lengths, as long as there is no negative cycle (same as Bellman-Ford): Proposition 3.15. Algorithm 3.14 correctly computes distances between all pairs of vertices in a weighted (di)graph (G, w), provided that there is no negative cycle. Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths 3.5 Some Advanced Ideas in Path Finding Based on the above comparison of approaches, Dijkstra's algorithm seems to be the ultimate tool for practical path finding (or route planning) problems. • Being quite fast and, actually, "almost optimal" for the shortest path problem in weighted graphs, Dijkstra's algorithm turns out to be too slow for, e.g., practical route planning applications in navigation devices containing map data of tens or hundreds millions of edges. □ • So, what can be done better? □ • An answer lies in preprocessing of the graph: It is quite natural to assume that the graph (of a road network) is relatively stable, and hence it can be thoroughly preprocessed on powerful computers. □ However, what of the preprocessing results can be stored? It is, say, completely unrealistic to store all the optimal routes in advance... □ • Two perhaps simplest practically usable approaches will be briefly sketched next. Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths First, an alternative to Dijkstra's alg. is the Algorithm A*, which uses a suitable potential function to direct the search "towards the goal". Whenever we have a good "sense of direction" (e.g. in a topo-map navigation), A* can perform way much better! Algorithm A* • In a basic setting, A* re-implements Dijkstra with suitably modified edge costs on digraphs. □ • Let pv(x) be a potential function giving an arbitrary lower bound on the distance from x to the destination v (i.e., pv is admissible). E.g., in a map navigation, pv(x) may be the Euclidean distance from x to v. □ • Each directed edge xy of the weighted graph (G,w) gets a new cost w'(xy) = w(xy) + pv(y) - pv(x). The potential pv is consistent when all w'(xy) > 0, i.e. w(xy) > pv(x) — pv(y). The above Euclidean potential is always consistent. □ • The modif. length of any u-v walk S then is cIq (S) — d,q(S) +pv(v) —pv(u), which is a constant difference from rfg(S'). □ Consequently, some S is optimal for the weighting w iff S is optimal for w'. Here the Euclidean potential "strongly prefers" edges in the destin. direction. Other (also preprocessed) potential functions are possible as well, though. Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths Second, The idea of a "reach" • It is based on a natural observation that for long-distance route planning, vaste majority of edges of real-world road maps are basically irrelevant.□ Definition: Let Su,v denote a shortest walk from u to v in weighted G. For e G E(SUtV) let prefix(SUtV,e), suffix(SUtV,e) denote the starting (ending) segment of Su,v up to (after) e. nThe reach of an edge e G E(G) is given as reacho(e) — max { min( d,Q(prefix(SUtV, e)), d,Q(suffix(SUtV, e))) : Vu,v G V(G) A e G E(SU,V) }.□ The reach of e mathematically quantifies (ir)relevance of e for route planning; the smaller reacha(e) is, the closer to the start or end of an optimal route e has to be. □ The immediate use of precomputed reach values is as follows: • We must use the bidirectional variant of Dijkstra or A*. □ • The line "foreach (edge f starting in u)" in Algorithm 3.11 (in each direction) now takes only those edges f — uv such that reacho(f) > dist[uo,u]. Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths 3.6 Appendix: An example of a run of Dijkstra's alg. Example 3.15. An illustration run of Dijkstra's Algorithm 3.11 from u to v in the following graph. OO 1 00 V ^----- 2 / \ ^ ' N / \ / oo «c 5 y / 2 \ 2 \ \ ^ \ 1 / v3 3 r \ \ 2 \ / \ / / \ / 2 \ 0 / \ / V--------^ ÍX 1 oc Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths Petr Hliněný, Fl MU Brno, 2014 Fl: MA010: Graph Distance and Paths