3 Graph Distance and Path Finding In some other applications, graphs are used to model distances; e.g., as in road networks and in workflow diagrams. The basic task then is to find shortest paths or routes, and the optimal distance. Brief outline of this lecture D • Distance in a graph, basic properties, BFS. • Weighted distance in digraphs; the problem of negative cycles and Bellman-Ford's algorithm. • Dijkstra's algorithm for the single-source shortest paths. • A sketch of some advanced ideas in practical path planning. Petr Hliněný, Fl MU Brno, 2017 1/23 Fl: MA010: Graph Distance and Paths ^3.1 Unit Distance in Graphs Recall that a walk of length n in a graph G is an alternating sequence of vertices and edges (vQ,e\,v\,e2,V2,... iCn^n) such that each ei has the ends i^-i, Vi. Definition 3.1. The distance dc{u,v) between two vertices u, v of a graph G is defined as the length of a shortest walk between u and v in G. If there is no walk between u,v, then we declare dc{u,v) = oo. □ Naturally, the distance between u, v equals the least possible number of edges travelled from u to v, and it is always achieved by a path, as shown in Lemma 2.6. Spec. dc(u,u) = 0. u »-•-—a v -•-—^ 3 Remark: Distance can be analogously defined for digraphs, using directed walks or paths. A more general view in Section 3.2 will consider also non-unit lengths of edges in G. Petr Hliněný, Fl MU Brno, 2017 2/23 ivaCf^of Cengtfi = procházka dítky, distance = vzdálenost Fl: MA010: Graph Distance and Paths ■r Triangle inequality u (m) t v w Lemma 3.2. The graph distance satisfies the triangle inequality: Vu,v,w£V(G) : dciu^v) + dc(v,w) > dc(u,w) .□ Proof. Easily; starting with a walk of length dc{u,v) from u to v, and appending a walk of length dc(v,w) from v to w, results in a walk of length dc{u,v) +dc(v,w) from u to w. This is an upper bound on the distance from u to w. □ □ Fact: The distance in an undirected graph is symmetric, i.e. dc{u,v) = dc(v,u) Petr Hliněný, Fl MU Brno, 2017 3/23 triangle inequality = trojúhelníková nerovnost Fl: MA010: Graph Distance and Paths Other related terms Definition 3.3. Let G be a graph. We define, with resp. to G, the following notions: • The excentricity of a vertex exc(v) is the largest distance from v to another vertex; exc(v) = max^y^) dc(v, x). □ • The diameter diam(G) of G is the largest excentricity over its vertices, and the radius rad(G) of G is the smallest excentricity over its vertices. 2 2 3 2 exc(-) It always holds diam(G) < 2 • rad(G). □ • The center of G is the subset U C V (G) of vertices such that their excentricity equals rad(G). diameter = průměr, radius = poloměr eccentricity = excentricita Petr Hlinený, Fl MU Brno, 2017 4/23 Fl: MAOIO: Graph Distance and Paths :fl—:- An excersise Example 3.4. What is the largest possible number of vertices a cubic (i.e., 3-regular) graph of radius 2 may have? □ Let G be the graph. First of all, the definition of radius tells us that, for some vertex u g V(G), all the vertices of G are at distance < 2 from u. □ Second, there can be < 10 such vertices by the degree-3 condition: □ And third, we are able (or lucky?) to fill in the remaining six edges (in order to get all the degrees equal to 3) as in the picture. Hence, 10 vertices is possible, and this is the answer. □ □ Remark: Note, moreover, that we have actually constructed a graph of diameter 2, which is a stronger requirement than radius 2. Petr Hliněný, Fl MU Brno, 2017 5/23 Fl: MA010: Graph Distance and Paths ^Simple Computation of Distance (BFS) Computing the (unit) distance from a given vertex uq to any other vertex of a graph is a matter of an extremely simple algorithm, based on BFS: Algorithm 3.5. Computing all distances from a starting vertex uq g V(G). □ For a given graph (or digraph) G and any uq g V(G), we run Algorithm 2.1 with the implementation of PROCESS(v; e) as follows (and with void PROCESS(e)): U as a fifo queue (BFS), and Petr Hliněný, Fl MU Brno, 2017 6/23 jednoduchý výpočet vzdálenosti přes (BfS Fl: MA010: Graph Distance and Paths ^BFS distance - the proof Theorem 3.6. Let uq,v,w be vertices of a connected graph G such that dc{uo, v) < dc{uo, w). Then the breadth-first search algorithm on G, starting from uq, discovers the vertex v before w. □ Proof. We apply induction on the distance dG(uo,v): If dG(uo,v) = 0, i.e. uq = v, then it is trivial that v is found first. So let dc(uo,v) = d > 0 and v' be a neighbour of v closer to uo, which means dc{uo, v') = d — 1. Analogously choose w' a neighbour of w closer to i^o- Then dG(uQ,w') = dG(u0,w) - 1 > dG(u0,v) - 1 = dG(uQ,v'), □ and so v' has been found before w' by the inductive assumption. Hence v' has been stored into U before w', and (cf. FIFO) the neighbours of v' (v among them, but not w) are discovered before the neighbours of w' (which include w). □ □ Corollary 3.7. The search tree of the BFS Algorithm 2.1 on G determines the distances from uo g V(G) to all vertices of G. Hence, Alg. 3.5 is correct, meaning that d±st(uo,v) = dG(uo,v) for all v g V(G). Petr Hliněný, Fl MU Brno, 2017 7/23 Fl: MA010: Graph Distance and Paths 3.2 Weighted Distance in Digraphs Recall (Section 2.3): A weighted (di)graph is a pair of a (di)graph G together with a weighting w of the edges by real numbers w : E(G) —>• R (edge lengths in this case). A positively weighted (di)graph (G,w) is such that w(e) > 0 for all edges e. □ Definition 3.8. Weighted distance (length) in a weighted (di)graph (G,w). The length of a weighted (dir.) walk S = (i>o, ei,vi, e2, i>2? • • • ? en, vn) in G is the sum ^g(^) = Mei) + Me2) H-----h w(en). The weighted distance in (G,w) from a vertex w to a vertex i; is dQ(u,v) = min{d^(S') : S is a (directed) walk from utov}.c u dist. 5 v For undir. graphs G, the definition considers the symmetric orientation of the edges. weighted distance (length) = vážená vzdálenost (dítka) Petr Hliněný, Fl MU Brno, 2017 8/23 Fl: MA010: Graph Distance and Paths ^ Basic facts • Weighted distance in a digraph (G,w) satisfies the triangle inequality. (The same statement and proof hold here as in Lemma 3.2.) □ • Ordinary graph distance is obtained for weights (G, w\) s.t. w\(e) = 1 for all e.u • If a weighted digraph (G,w) contains a cycle (a closed walk) of negative length, then the distance between a pair of vertices in G may not be defined ("-co"): -3 3 Proposition 3.9. If (G, w) is a weighted digraph containing no cycles of negative length (and hence no such closed walks), then □ • the weighted distance in (G, w) is always well defined, and □ • the weighted distance is achieved by a directed path in G. Petr Hliněný, Fl MU Brno, 2017 9/23 Fl: MA010: Graph Distance and Paths • By the previous facts, negative-length edges may cause huge problems with graph distance. So, why to consider them at all? (Do they make sense, anyway?) □ • For undirected graphs, the negative-length problem seems fatal, and hence we consider only positively weighted undirected graphs. For digraphs, though, negative-length edges might be useful to consider, as long as there is no cycle of negative length (Prop. 3.9). E.g., for DAGs. Petr Hlinený, Fl MU Brno, 2017 10/23 Fl: MA010: Graph Distance and Paths ^Bellman-Ford Algorithm Definition: A cycle of negative length in a weighted digraph is called a negative cycle.u Algorithm 3.10. Computing the distance or detecting a negative cycle. For a given weighted digraph (G,w), and a starting vertex uq G V(G), the task is to compute the distance dist[uo,v] = cIq(uo,v) from uq to any vertex v G V(G). initialize dist [u0, v] ^— oo, for all v G V(G); dist[uo,uo] ^—0; □ repeat |V(G)| — 1 times { foreach (e = tvG#(G)) { dist[uo,v] ^— min( dist [uq,v] , dist [uq,t] + w(e) ) ; (*) } } □ foreach (e = tvGE(G)) { if ( dist [uq , v] > dist [uq , t] +w (e) ) output c Error; a negative cycle exists in (G, w).J > output c Distances from uq in dist [uq, •] . j □ (One can also easily store the predecessors for the computed distances on line (*)...) negative cycle = záporný cyklus, predecessor = předchůdce Petr Hlinený, Fl MU Brno, 2017 11/23 Fl: MA010: Graph Distance and Paths Proof of the Bellman-Ford algorithm Proof. To claim that dist[uo,v] = d?Q(uQ,v) if there is no negative cycle in (G,w), and that a negative cycle is detected otherwise, we prove the following three steps. 1. At every step of Algorithm 3.10, it is dist[uo,v] > d^uq^v): This holds at the beginning, and follows trivially by induction on the number of elementary steps 'dist [uq,v] <— min( dist [uq,v] , dist [uo,t] + w(e)) '.□ 2. Assume there is no negative dir. cycle in (G,w). Let (cf. Prop. 3.9) Vk C V(G) be the subset of vertices v for which d?Q(uQ,v) is achieved by a dir. uq-v path with < k edges. Then, after iteration no. k of 'foreach (e = uvG E(G))\ the value of dist [u0,v] equals d^(^o, v) for all v G Vk'. □ Again, this trivially holds for k = 0 and follows easily by induction. □ 3. Let C C G be any directed cycle. If no negative cycle is reported at the end of Alg. 3.10, i.e. 'dist [uo,v] ^ dist [u0,u] +w(e)' for all e = uv G E(C) in the last phase, then C is not a negative cycle in (G,w): □ We have dist(uo,v) — dist(uo,t) < w(e) and summing these over all e G E(C) we get 0 < *52eeE(c) w(e)- Consequently, negative cycles in (G,w) are detected in the algorithm (but only detected, they cannot be easily constructed). □ Petr Hliněný, Fl MU Brno, 2017 12/23 Fl: MA010: Graph Distance and Paths ■r 3.3 Positive-length Shortest Paths In contrast to previous Algorithm 3.10, shortest paths may be computed much faster when all the edge lengths are positive (which is true, e.g., in practical routing problems). For the typical, so-called single-source positive-length shortest paths problem, a nearly optimal algorithm is the following traditional one. Dijkstra's algorithm: • For a given positively weighted digraph (G,w), and an arbitrary starting vertex uq £ V(G), the algorithms computes dist[uo,v] for all v G V(G). • In the graph-search scheme of Algorithm 2.1, one simply implements - 'choose (e,u) G U' by picking (e,u), e = tu, from U such that dist(uo,t) + w(tu) is minimized, □ - 'PROCESS(u; e=tu) ' as dist [u0,u] <- dist [u0,t] +w(tu) , □ - 'PROCESS(e)' as void, and - the search tree T then stores shortest paths from uq. □ • This algorithm works in the same way for undirected as for directed graphs. Petr Hliněný, Fl MU Brno, 2017 13/23 nejfcratší cesty z jednoho zdroje při kladných délkách Fl: MA010: Graph Distance and Paths ^"^^eh^ontair^ Algorithm 3.11. Dijkstra's for single-source shortest paths. For a positively weighted digraph (G,w), and a vertex uq G V(G), compute shortest paths predec[-] and distances dist[uo, •] in (G,w) from the source uq to all of G. initialize dist [u0, v] ^— oo, for all v G V(G); dist [u0,u0] ; □ while (U ^ 0) { choose u G U minimizing dist [uq ,u] ; for each (edge f starting in u) { v 4— the opposite vertex of '/ = uv'; if ( dist [uq,u] + u>(uv) < dist [uq,v] ) { U <- UU{v}; predec [v] <<— u; dist[uo,v] <<— dist [uq,u] +u>(uv) ; } > U <- U\{u>; > output ' distances in dist [. ], predecessors of shortest paths in predec [] '; * 'DijkstriLv a(goritkmus pro nejkratsi cesty Petr Hlinený, Fl MU Brno, 2017 14/23 Fl: MA010: Graph Distance and Paths Proposition 3.12. If the stack U is implemented as a minimum heap, then the number of steps performed by Algorithm 3.11 isO(\E(G)\ + \V(G) \ -\og\V(G)\). □ A vertex u G V(G) is called 11 relaxed1 after it is removed in 'U <— U\{u}' above. Theorem 3.13. Every iteration of Algorithm 3.11 maintains an invariant that • dist [uq , v] is the length of a shortest path from uq to v using only those internal vertices which are relaxed, and such a shortest path is stored in predec [.] .□ Consequently, all the distances and shortest paths to reachable vertices are correct. Proof: Briefly using mathematical induction: • In the first iteration of'while (U ^ 0 )', uq is chosen and the straight distances (edge lengths) to its neighbours are stored. □ • Subsequently, for every chosen vertex ^ in 'uGU minimizing dist [uq,u]', the current value of dist[u0,u] is optimal since no negative edges exist in (G,w) (and so every possible detour via non-relaxed vertices would only be longer).□ Then, all working distances and the shortest-paths record are properly updated (wrt. u) while "relaxing" u\ if ( dist [uo ,u] +w (uv) < dist [uo ,v] ) { predec [v] ^— u; dist[uo,v] ^— dist [uq ,u] +w (uv) ; Petr Hlinený, Fl MU Brno, 2017 15/23 Fl: MA010: Graph Distance and Paths □ In some settings, the following improved variant may be significantly more efficient in the single-pair shortest path problem in a digraph (G,w): □ • To find a shortest uq-vq path, run two instances of Algorithm 3.11 concurently: - A searches shortest paths from uq in (G,w), as usual, and □ - A*~ searches shortest paths from vq in (G^,w) where G^~ results from G by reversing all edges; e G E(G) to G E(G4~) such that w{e4~) = w(e).n • A and A*~ may simply alternate their iterations, or better; - minima u G U and u' G are chosen concurently, and the instance achieving smaller value among dist(uo,u) and dist^(vo,u') is run. □ • Termination condition; the whole algorithm stops when the search subtrees T and of A and A*~ meet each other. That is, whenever some vertex is relaxed in both A and A*~ .n At this moment, a shortest uq-vq path exists using only relaxed vertices. Though, this does not mean that every shortest uq-vq has to pass through the intersection of the two search trees - one still has to loop over the relevant edges uv to determine the minimum of dist(uo, v) + w(uv) + disť*~(vo,v). * obousměrný 'Dii fotrův algorithmus Petr Hlinený, Fl MU Brno, 2017 16/23 Fl: MA010: Graph Distance and Paths All-pairs Shortest Distances The last algorithm we are going to present in this section is extraordinarily simple, although rather slow since it has to compute all-pairs distances at once. □ Algorithm 3.14. Floyd-Warshall's algorithm for all-pairs distances For a positively weighted digraph (G, w), compute distances dist[-, •] between all pairs of vertices of G. initialize dist [u, v] ^— oo, for all u, v G V(G); foreach (uvG-E(G)) dist [u, v] 4— u>(uv) ; foreach ( t G V(G) ) { foreach (u,vG V(G) ) { dist[u,v] ^— min( dist [u,v] , dist [u,t]+dist [t ,v] ) ; > } output 5 The complete distance matrix of (G, w) /nd[,]'; The number of steps of this algorithm is 0(|F(G)|3), which is quite slow compared to repeated Dijkstra in the case of sparse graphs. □ Remark: Floyd-Warshall's algorithm has many shapes; it appears, e.g., in computation of the transitive closure and in the translation of a finite automaton to a regular expression. □ The algorithm is also related to matrix multiplication. Petr Hliněný, Fl MU Brno, 2017 17/23 Fl: MA010: Graph Distance and Paths Algorithm 3.14 is based on the following beautifully simple dynamic-programming idea: Computing all-pairs distances dynamically • Given is a weighted (di)graph (G, w) on n vertices; V(G) = {to, ti,..., tn_i}x Let distl(u,v) denote the length of a shortest im; walk S in G such that all vertices of S except the ends u, v are from the subset {to,..., U-i}. • For computing distl+1, the admissible walks are those as for dist1 plus those walks passing through ti ("u-ti-v"). □ Consequently distf^1 (u,v) = min (distl(u,v), distf(u,ti)-\-distf(ti,v))o and dist[u,v] = distn(u,v).L • This algorithm works correctly also with negative edge lengths, as long as there is no negative cycle (same as Bellman-Ford): Proposition 3.15. Algorithm 3.14 correctly computes distances between all pairs of vertices in a weighted (di)graph (G,w), provided that there is no negative cycle. Petr Hliněný, Fl MU Brno, 2017 18/23 Fl: MA010: Graph Distance and Paths ^3.4 Some Advanced Ideas in Path Finding Based on the above comparison of approaches, Dijkstra's algorithm seems to be the ultimate tool for practical path finding (or route planning) problems. • Being quite fast and, actually, "almost optimal" for the shortest path problem in weighted graphs, nDijkstra's algorithm turns out to be too slow for, e.g., practical route planning applications in navigation devices containing map data of tens or hundreds millions of edges. □ • So, what can be done better? □ • An answer lies in preprocessing of the graph: It is quite natural to assume that the graph (of a road network) is relatively stable, and hence it can be thoroughly preprocessed on powerful computers. □ However, what of the preprocessing results can be stored? It is, say, completely unrealistic to store all the optimal routes in advance... □ • Two perhaps simplest practically usable approaches will be briefly sketched next. Petr Hliněný, Fl MU Brno, 2017 19/23 Fl: MA010: Graph Distance and Paths ^^^irst^m^ilte^ia^ tial function to direct the search "towards the goal". Whenever we have a good "sense of direction" (e.g. in a topo-map navigation), A* can perform way much better! Algorithm A* • In a basic setting, A* re-implements Dijkstra with suitably modified edge costs on digraphs. □ • Letpv(x) be a potential function giving an arbitrary lower bound on the distance from x to the destination v (i.e., pv is admissible). E.g., in a map navigation, pv(x) may be the Euclidean distance from x to v. □ • Each oriented edge xy of the weighted graph (G,w) gets a new cost w'(xy) := w(xy) + pv(y) - pv(x) . The potential pv is consistent when all w'(xy) > 0, i.e. w(xy) > pv(x) — pv(y). The above Euclidean potential is always consistent. □ • The modif. length of any u-v walk S then is d^ (S) = d^S) -\-pv(v) - -pv(u), which is a constant difference from $q(S). □ Consequently, some S is optimal for the weighting w iff S is optimal for w'. Here the Euclidean potential "strongly prefers" edges in the destin. direction. Other (also preprocessed) potential functions are possible as well, though. Petr Hliněný, Fl MU Brno, 2017 20/23 Fl: MA010: Graph Distance and Paths Second, ... Idea of the "reach" parameter • It is based on a natural observation that for long-distance route planning, vaste majority of edges of real-world road maps are basically "irrelevant".□ Definition: Let SUiV denote a shortest walk from u to v in weighted G. For e G E(SUiV) let prefix(SUiV,e), suffix(SUiV,e) denote the starting (ending) segment of SUiV up to (after) e. nThe reach of an edge e G E(G) is given as reachG(e) = max { min( dQ(prefix(SUyV, e)), dQ(suffix(SUyV, e))) : Vu,v G V(G) A e G }.□ The reach of e mathematically quantifies (ir)relevance of e for route planning; the smaller reachc(e) is, the closer to the start or end of an optimal route e has to be. □ The immediate use of precomputed reach values is as follows: • We must use the bidirectional variant of Dijkstra or A*. □ • The line 'foreach (edge f starting in u)' in Algorithm 3.11 (in each direction) now takes only those edges f = uv such that reachc(f) > dist[uo,u . Petr Hliněný, Fl MU Brno, 2017 21/23 Fl: MA010: Graph Distance and Paths 3.5 Appendix: An example run of Dijkstra's alg. Example 3.15. An illustration run of Dijkstra's Algorithm 3.11 from u to v in the following graph. oo 1 00 V -----7*v / \ / \ oo f \ \ \ \ \ >- \ oo 3 v oo / \ \ \ \ 2 \ oo / \ / 2 \ / x n ' V--------V n i oo Petr Hliněný, Fl MU Brno, 2017 22/23 Fl: MA010: Graph Distance and Paths Petr Hliněný, Fl MU Brno, 2017 23/23 Fl: MA010: Graph Distance and Paths