5 MST and Greedy Algorithms One of the traditional and practically motivated problems of discrete optimization asks for a "minimal interconnection" of a given set of terminals (meaning that every pair will be connected via some path). Imagine, for instance, electric powerline or computer networks. _ This problem can be formally captured as finding the minimal connected subgraph of a given (weighted) graph — a minimum spanning tree of the graph. Brief outline of this lecture • Minimum spanning tree (MST) problem; Kruskal's greedy algorithm and Jarnfk's algorithm. • Principles of greedy algorithms - examples, and when greedy solutions do not work optimally. • Matroids and abstract greedy optimization. V Petr Hliněný, FI MU Brno FI:MA010: MST and Greedy Algorithms 5.1 Finding minimum spanning trees Problem 5.1. Minimum spanning tree (MST) problem Given a weighted graph G, w with nonnegative edge weights w; the problem is to find a spanning tree T in G that minimizes the total weight. Formally MST = min \ Yl w(e) sp. tree TCO \e£E(T) In this problem, the spanning tree T is a minimal interconnection with respect to the cost function w. Such as.. . 3 4 3 3 4 3 *.............44 2 Petr Hliněný, FI MU Brno FI: MA010: MST and Greedy Algorithms _ 1 2 1 2 1 4 2 1 An obvious approach 3 4 3 *.............*- 14 2 Algorithm 5.2. "Greedy" algorithm for the MST problem. Given is a weighted graph G, w with nonnegative edge weights w. - Sort the edges of G according to nondecreasing weights, i.e. w(ei) < w(e2) < • • • < w(em). c - Start with an empty edge set E(T) = 0 for the spanning tree. - For i = 1, 2,..., m, consider the edge e^: If E(T) U (ejj does not make a cycle in G, then add ej to E(T). Throw ej "away" otherwise. c - At the end, E(T) is the edge set of a min. spanning tree T in weighted G, w. Petr Hliněný, FI MU Brno_3_FI: MA010: MST and Greedy Algorithms 3 4 3 1 2 1 2 t 1 4 2 We illustrate the algorithm on our example graph. (With edge weights sorted as 1,1,1,1, 2, 2, 2, 2, 3, 3, 3,4,4 .. .) % \ 3 % 1 \ % % 1 4 2 3 4 3 » & & & « % % 3 % i \ % ,-h 343 ♦-tv- 21 34 t- 42 3 21 4- 42 The resultion MST has total weight 1 + 2 + 2 + 3+1 + 1 + 2= 12. Notice that the solution (a spanning tree) is not unique, there could be several spanning trees of the same minimum total weight! Petr Hliněný, FI MU Brno FI:MA010: MST and Greedy Algorithms 3 4 3 1 2 1 1 2 1 4 2 1 Proof of Algorithm 5.2: Let E(T) be the edge set computed in Algorithm 5.2 and let the edges be sorted as w(ei) < w(e2) < • • • < w(em). Among all minimum-weight spanning trees in G, w, we select one T0 which is identical to T on the longest possible prefix of the edge ordering ei, e2,..., em. If T0 = T, then we are done. c So assume, for a contradiction, that T0 = T. Let j > 0 be such that T0 and T agree on the first j — 1 edges e1,..., ej-1, but they disagree on ej. This means ej G T but ej G To, since the converse cannot obviously happen. By Corollary 4.5, the subgraph induced on the edges E(To) U {ej} contains exactly one cycle C. Since C % T, there exists an edge ek of C such that ek G E(T), and k > j by our choice of j. Then w(ek) > w(ej), and the spanning tree on the edges (E(T0) \ {ek}) U {ej} ("replace" ek with ej) costs no more than T0, and hence it should have been chosen in place of T0, a contradiction. □ Petr Hliněný, Fl MU Brno_5_FhMAOlO: MST and Greedy Algorithms Other possible algorithms Though the above basic greedy algorithm have been firstly explicitly formulated by Kruskal, other greedy approaches to MST have been found before, noticeably, first by Czech mathematicians. We briefly mention them here: Algorithm 5.3. Jarnik's greedy algorithm for MST. The edges are not sorted globally in advance, but the spanning tree "grows" from one vertex, at every step choosing the least edge leaving the current fragment of the spanning tree. c Remark: This algorithm is very practical, and perhaps mostly used in practice. Not many users, however, know its origin and attribute this algorithm to Prim who rediscovered it 30 years after Jarník. Algorithm 5.4. Boruvka's algorithm for MST (a sketch). This is a more complicated algorithm which applies the greedy approach "in parallel" from all vertices of the graph at once. . . Petr Hliněný, FI MU Brno FI:MA010: MST and Greedy Algorithms 5.2 General greedy algorithms Perhaps the simplest method of "solving" discrete optimization problems can be summarized as follows: Always choose what is currently (i.e. locally) the best available local solution. (Hoping that this will eventually lead to a global optimum.) c This approach is generally called a greedy algorithm. Its core attributes are • In succesive steps, choose the locally best element as the next one for the solution. • This approach requires a suitable ordering on the elements (not necessarily determined in advance). c • The run and the success of such a greedy algorithm strongly depend on this chosen ordering. Although greediness is not always polite and successful in real life, surprisingly this approach works great for many problems of discrete optimization, like for the MST problem or others. . . Petr Hliněný, Fl MU Brno_7_FhMAOlO: MST and Greedy Algorithms Problem 5.5. A job assignment problem Assume a given list of jobs, each one having specified release and end time (i.e. represented by intervals on the time axis). The task is to assign workers to these jobs such that the total number of workers is minimized. All the workers and jobs are uniform. An example of the input for this problem, see the following: 4 •-• 1 •-• 3 •-• 2 •-• •-• 2 1 •-• How many workers are needed to satisfy all these jobs? nQuite easily, 4 are enough (see the labeling), but why all 4 are required? Petr Hliněný, Fl MU Brno_8_FhMAOlO: MST and Greedy Algorithms Algorithm 5.6. Greedy algorithm for the job assignment problem. Problem 5.5 can be solved by the following greedy assignment: 1. We sort the jobs by their release times. 2. For every subsequent job, we assign the least available worker. □ Proof: Let this algorithm use k workers in total. We easily prove that no less than k workers suffice. So, at what moment the worker of number k started to work? At that moment when all workers 1, 2,..., k — 1 have been busy with other jobs. Hence k jobs overlap at that moment, proving the required. □ □ Is it that we can always assign greedily, regardless of what job ordering we choose? What if we sort the jobs by their lengths (from the longest one)? 5 * • • • 3 2 * • 2 • • 3 • • • • 4 1 • • As we can see, we need 5 workers in such a case! V Petr Hliněný, FI MU Brno FI:MA010: MST and Greedy Algorithms 5.3 The notion of a matroid Definition 5.7. A matroid on a ground set X, denoted by M = (X,N), is a set system N of subsets of X, satisfying the following three points: 1. 0GN 2. A G N and B C A == B gN 3. A, B G N and |A| < |B| = By G B \ A : A U {y} G N The sets from N are independent sets. The others are dependent. The inclusion-wise maximal independent sets are called bases of the matroid. c The most important in the matroid definition is point three. A natural example of a matroid is given by the linearly independent sets of vectors. c Lemma 5.8. All matroid bases have the same cardinality. V Petr Hlineny, FI MU Brr FI: MA010: MST and Greedy Algorithms Cycle matroid of a graph Another natural example of matroids comes from graphs and their spanning trees: Definition: An edge subset F C E(G) is acyclic if the graph formed by V(G) and the edges from F contains no cycle. Lemma 5.9. A forest on n vertices with c components has exactly n — c edges. c Lemma 5.10. Let F1,F2 be acyclic subsets of edges in a graph G such that |Fi| < |F21. Then there exists an edge / G F2 \ F\ such that F\ U {/} is also acyclic. Definition: By Lemma 5.10, the system of all acyclic edge subsets in an arbitrary graph G forms a matroid. This is called the cycle matroid of the graph G. The cycles of G form the minimal dependent sets of this matroid. V Petr Hiineny, FI MU Brr FI: MA010: MST and Greedy Algorithms How the cycle matroid of a graph corresponds to a vector matroid. .. [? d [ c b a o Petr Hiineny, FI MU Brr FI: MA010: MST and Greedy Algorithms Abstract greedy algorithm Notice that, concerning algorithmic use of matroids, giving the whole matroid (i.e. listing all its expon. many independent sets) on the input is infeasible. Hence a matroid is commonly described via an external function testing, say, the independent sets. c Algorithm 5.11. Finding the least independent set - greedily. input < a set X with a weight function w : X — R, a matroid on X determined via an external function independent (Y); c sort X=(x[1],x[2],...,x[n]) such that w[x[1]]<=...<=w[x[n]]; B = 0; for (i=1; i<=n; i++) if (independent(BU{x[i]|)) output > a basis B with the least sum of w-weights. c Theorem 5.12. Algorithm 5.11 (the greedy algorithm) on a given ground set X and weight function w : X — R, and for a given matroid N on X, correctly finds a basis B of N of the least weight. B = B U{x[i]}; V Petr Hlineny, FI MU Brr FI: MA010: MST and Greedy Algorithms 5.4 On (in)correctness of greedy algorithms On the most abstract level, correctness of the greedy algorithm is tied with matroids as follows. Theorem 5.13. Let X be a ground set and N a system of its subsets such that the items (1),(2) of Definition 5.7 hold. If Algorithm 5.11 correctly finds the optimal independent set from N for any weight function w : X — R, then also (3) holds for N, and so N is a matroid on X. c Proof by means of a contradiction: If (3) fails for a pair of independent sets A, B, i.e. |A| < |B| but A U {y} is dependent for any y G B \ A, Then we choose a weight function as follows. Let |A| = a, |B| = b where 2b > 2a + 1, and • w(x) = —2b for x G A, • w(x) = —2a — 1 for x G B \ A, • w(x) = 0 otherwise.c The greedy algorithm finds a basis Bi D A, which must be disjoint from B \ A, and hence of cost w(Bi) = —2ab. On the other hand, a basis B2 D B has total cost w(B2) < (—2a — 1)b = —2ab — b < w(Bi). This contradicts minimality of Bi by the greedy algorithm. □ V Petr Hlinený, Fl MU Brr FI: MA010: MST and Greedy Algorithms Example 5.14. Finally, we provide two easy examples of problems for which the greedy approach seriously fails. Graph colouring. This problem asks for an assignment of the least number of distinct "colours" to the graph vertices such that adjacent pairs receive distinct colours. In a given vertex order, we greedily assign the first available colour to each vertex; see an example: ®-•-•-® 12 3 1 Are three colours really necessary here? cOf course not.. . Vertex cover. Now the task is to find a smallest possible subset C of the vertex set such that every edge has some end in C. □ A natural greedy approach might be to select vertices from the highest degrees, right? □ Unfortunately, this procedure again seriously fails in some cases. V/ x □ Petr Hliněný, FI MU Brno 15 FI: MA010: MST and Greedy Algorithms