Part VII Probabilistic method I: Chapter 7. PROBABILISTIC METHOD The probabilistic method is a powerful tool to demonstrate the existence of combinatorial objects. In some cases this method can be used also to derive an algorithm for an algorithmic problem. two Basic arguments of the probabilistic method: 1 The expectation argument: Any random variable V assumes at least one value that is not smaller than its expectation EV , and at least one value that is not greater than its expectation EV . 2 The sampling argument: If an object chosen randomly from a universe/set U satisfies a property P with a positive probability, then there must be an object in U that satisfies the property P. The above two simple ideas have a surprising power. Their power comes from our ability to reformulate, in various ways, counting arguments in the language of probability and then to apply various tools of probability theory. prof. Jozef Gruska IV054 7. Probabilistic method I: 2/61 EXAMPLE Example: One can show that for every n × n 0-1-matrix A, and for any randomly chosen vector b ∈ {−1, +1}n , it holds Ab ≤ 4 √ n ln n with probability at least 1 − 2 n . From that we may conclude that for every such matrix A, there always exists a vector b ∈ {−1, +1}n such that Ab ≤ 4 √ n ln n. prof. Jozef Gruska IV054 7. Probabilistic method I: 3/61 LIMITS of PROBABILISTIC METHODS Probabilistic method is especially useful when we can show that the probability is quite large that the object we look for exists and we can verify whether our random process found such an object. If such a probability is indeed large then we can find such an object quite efficiently just by applying a random searching process - a sampling experiment. In some cases, however, no explicit construction of a combinatorial object is known in spite of the fact that we can show that such object exists. prof. Jozef Gruska IV054 7. Probabilistic method I: 4/61 EXAMPLE Example Using probabilistic method it can be shown that for any n there exists a sorting network that sorts n integers in parallel O(log n) time. A method is known to construct for any n a sorting network that sorts in parallel O(log2 n) time. No method is known to construct for any n a sorting network that could sort in parallel O(log n) time, in spite of many efforts to do that during the last 25 years. prof. Jozef Gruska IV054 7. Probabilistic method I: 5/61 BASIC IDEA and an EXAMPLE Probabilistic method consists of two stages. 1 A ”thought experiment” E is designed in which a carefully chosen random process (called usually as an experiment - for example, a dice tossing) P plays a key role. 2 The random process P is then analyzed and some conclusions are made that are, or at least look as, independent of the experiment E. prof. Jozef Gruska IV054 7. Probabilistic method I: 6/61 EXAMPLE - MAX-CUT PROBLEM Problem: Given is an undirected graph G = (V , E) with n = |V | vertices and m = |E| edges. The task is to partition vertices in V into two sets A and B in such a way that maximizes the number of such edges (u, v), where u ∈ A, v ∈ B. Theorem: For any undirected graph G = (V , E) with n vertices and m edges, there is a partition of the vertex set V into two sets A and B such that |{(u, v) ∈ E | u ∈ A, v ∈ B}| ≥ m 2 . Proof: Let us consider the following experiment: Each vertex of G is independently and equiprobably assigned to either A or B. For any edge (u, v), the probability that its end-vertices are in different sets is 1 2 . By linearity of expectations, the expected number of edges with end-vertices in different sets is thus E[cut-size] = m 2 . That implies that there must be a partition satisfying the theorem. prof. Jozef Gruska IV054 7. Probabilistic method I: 7/61 DERANDOMIZATION In some cases the proof of existence obtained by the probabilistic method can be converted into an efficient randomized algorithm. In some other cases the existence proof obtained by the probabilistic method can be converted even to an efficient deterministic algorithm - such a process is called derandomization. prof. Jozef Gruska IV054 7. Probabilistic method I: 8/61 FROM THE PROOF of EXISTENCE to LAS VEGAS ALGORITHM for MAX-CUT PROBLEM We show how to transform the argument from slide #4 about the existence of a partition with at least m 2 of edges to a Las Vegas algorithm. Let us design a partition C(A, B) using randomization described above and denote p = Pr(C(A, B) ≥ m 2 ) Then m 2 = E[C(A, B)] = i≤m/2−1 iPr(C(A, B) = i) + m/2≤i≤m iPr(C(A, B) = i) ≥ (1 − p) m 2 + 1 which implies that p ≥ 1 m/2 + 1 Therefore, the expected number of samples before finding a cut with value at least m 2 is therefore just m 2 + 1. Since we can test in polynomial time whether the value of the cut determined by a sample is at least m/2, by counting edges crossing the cut, we have a Las Vegas algorithm to find a cut. prof. Jozef Gruska IV054 7. Probabilistic method I: 9/61 ASSIGNMENT of HATS - I. There are n robots in a field and each of them can see only k other robots for fixed n, k. Each robot wants to have a hat and a robot can be happy only if she has a hat and none of robots she sees has a hat. How many robots can be happy? prof. Jozef Gruska IV054 7. Probabilistic method I: 10/61 ASSIGNMENT of HATS - II. There are n robots in a field and each of them can see only k other robots for fixed n, k. Each robot wants to have a hat and a robot can be happy only if she has a hat and none of robots she sees has a hat. How many robots can be happy? Let us give to each robot a hat with a probability p. Then the probability that any particular robot is happy is p(1 − p)k . If Xr is the indicator variable for the event that robot r is happy, then E[Xr ] = p(1 − p)k and the expected number of happy robots is np(1 − p)k and so there has to be a specific assignment of hats that so many robots are happy. Next task is to find such a p that maximizes the value np(1 − p)k . From the equation d dp (np(1 − p)k ) = 0 we get p = 1/(k + 1) and for this p and large k, the expected number of hats is n 1 k + 1 1 − 1 k + 1 k ≤ n e(k + 1) for large k prof. Jozef Gruska IV054 7. Probabilistic method I: 11/61 HAMILTONIAN PATHS in TOURNAMENTS A tournament is a complete directed graph. A Hamiltonian path in a graph G = (V , E) is a path that visits each vertex (representing a player) of V exactly once. Theorem: Every tournament has a Hamiltonian path. Proof will be by the induction on the number n of vertices in a tournament. Induction step. Suppose that every tournament with at most n vertices has a Hamiltonian path and let a tournament T = (V , E) with n + 1 vertices be given. Choose any vertex v and define two sets of vertices A = {u | (u, v) ∈ E} B = {u | (v, u) ∈ E}. Two subgraphs induced by these two sets of vertices form tournaments. By induction both of them have Hamiltonian paths. By connecting these two paths through the node v we get a Hamiltonian path for T. prof. Jozef Gruska IV054 7. Probabilistic method I: 12/61 NUMBER of PATHS in TOURNAMENTS Theorem For any n there exists a tournament of size n with n!/2n−1 Hamiltonian paths. Proof Generate a random tournament T = (V , E), V = {1, . . . , n}, by randomly choosing direction for all edges of Kn - of a complete graph of n vertices. For each permutation σ on V let Xσ be a random variable defined as follows: Xσ = 1, σ defines a path on T 0 otherwise For all σ, Pr(Xσ = 1) = (1 2 )n−1 = E[Xσ]. Let X be a random variable counting the number of Hamiltonian paths in T X = σ∈Perm(n) Xσ. Theorem now follows from the following calculations: E[X] = E[ σ Xσ] = σ E[Xσ] = n!( 1 2 )n−1 . prof. Jozef Gruska IV054 7. Probabilistic method I: 13/61 TOURNAMENTS WITH PROPERTY Sk In a tournament, if there is an edge from a node A to a node B, then we say that the player A beats the player B. A tournament T is said to have property Sk , for an integer k, if for any set of k players there is one player that beats all of them. Theorem If ( n k )(1 − 2−k )n−k < 1, then there is a tournament on n vertices with property Sk . Proof Consider a random tournament with a set of n nodes. For any subset K of k vertices let AK be the event that there is no player/node that beats all players/nodes in K. Clearly, Pr[AK ] = (1 − 2−k )n−k . Therefore Pr[ K⊂V ,|K|=k AK ] ≤ K⊂V ,|K|=k Pr[AK ] = ( n k )(1 − 2−k )n−k < 1 Therefore, with a positive probability no event AK occurs. That is, there is a tournament on n vertices that has the property Sk . prof. Jozef Gruska IV054 7. Probabilistic method I: 14/61 EXPLANATION If K is a set on k players, then the probability that a player P not in K beats all of them is 2−k and the probability is 1 − 2−k that he does not beat all of them. Since there are n − k players outside the group K, the probability that none of them beats all players in K is (1 − 2−1 )n−k . prof. Jozef Gruska IV054 7. Probabilistic method I: 15/61 MAX-SAT - PROBLEM Given are m clauses in conjunctive normal form over n variables. Find assignment (of truth values to variables) that maximizes the number of satisfied clauses. Theorem: For any set of m clauses there is a truth assignment that satisfies at least m 2 clauses. Proof: Suppose each variable is set to 0 or 1 independently and equiprobably and let, for 1 ≤ i ≤ m, the random variable Zi = 1 if the i-th clause is satisfied. The probability that a clause with k literals is not satisfied by this random assignment is 2−k . The probability that a clause with k literals is satisfied is 1 − 1 2k ≥ 1 2 what implies that E[Zi ] ≥ 1 2 for all i. The expected number of clauses satisfied by a random assignment is E[ m i=1 Zi ] = m i=1 E[Zi ] ≥ m 2 Therefore, there exists at least one assignment for which m i=1 Zi ≥ m 2 . prof. Jozef Gruska IV054 7. Probabilistic method I: 16/61 FROM THE PROOF of EXISTENCE to an ALGORITHM 1/2 We show now that a variant of the probabilistic proof of existence in the last theorem can be turned into an approximation algorithm. Notation for approximation algorithms: I - an input instance - a set of clauses. m∗(I) - the maximum number of clauses of I that can be satisfied. mA(I) - the number of clauses of I satisfied by the algorithm A. performance ratio of an algorithm A : infI mA(I) m∗(I) . If A achieves a performance ratio α, we say that A is an α-approximation algorithm. If A is a randomized algorithm, then mA(I) is a random variable and in such a case mA(I) is replaced by E[mA(I)] in the definition of the performance ratio. prof. Jozef Gruska IV054 7. Probabilistic method I: 17/61 FROM a PROOF of EXISTENCE to an ALGORITHM 2/2 We now show the existence of a randomized algorithm for MAXSAT with performance ratio 3 4 . The procedure in the proof of the last theorem actually yields a randomized algorithm whose guaranteed performance is 1 − 2−k , provided every clause contains at least k literals. As a consequence, we have a randomized 3 4 -approximation algorithm for instances of MAX-SAT in which every clause contains at least 2 literals. We now show another algorithm that performs especially well when there are (many) clauses consisting of a single literal. Finally, we show that on any input instance, one of the two designed algorithms yields a randomized 3 4 -approximation algorithm. prof. Jozef Gruska IV054 7. Probabilistic method I: 18/61 BASIC IDEA is similar as in the case of the global wiring problem. 1 Reformulate the problem as a 0-1 linear programming problem. 2 Solve the corresponding rational linear programming problem. 3 Use the randomized rounding technique. Notation: With each clause Cj , in the given input formula, we associate an indicator variable cj ∈ {0, 1}, that indicates whether or not the clause Cj is satisfied. prof. Jozef Gruska IV054 7. Probabilistic method I: 19/61 INDICATOR VARIABLES Moreover, to each variable xi we assign an indicator variable vi defined by xi = true ⇐⇒ vi = 1 . C+ j - set of indices of variables that appear uncomplemented in Cj C− j - set of indices of variables that appear complemented in Cj prof. Jozef Gruska IV054 7. Probabilistic method I: 20/61 0-1 linear programming reformulation of the MAX-SAT problem. Find vi , cj ∈ {0, 1} (∀i, j) ( ) such that the sum m j=1 cj is maximized and i∈C+ j vi + i∈C− j (1 − vi ) ≥ cj (∀j). (1) Rational linear programming problem is then obtained by replacing the condition ( ) by the condition vi , cj ∈ [0, 1] (∀i, j). Let ˆvi (ˆci ) be the value of variable vi (ci ) obtained by solving the rational linear programming problem. Clearly, n i=1 cj ≤ n j=1 ˆcj . prof. Jozef Gruska IV054 7. Probabilistic method I: 21/61 CONTINUATION of THE PROOF 1/2 We first show that using the randomized rounding method we obtain a truth assignment for which the expected number of satisfied clauses is at least (1 − 1 e ) j ˆcj . This will follow from the Lemma shown on the next slide for the case we use the following randomized rounding: each vi is set, independently, to 1 with the probability ˆvi . Notation: For an integer k denote βk = 1 − (1 − 1 k ) k > 1 − 1 e . prof. Jozef Gruska IV054 7. Probabilistic method I: 22/61 CONTINUATION of THE PROOF 2/2 Lemma: Let Cj be a clause with k literals. The probability that it is satisfied by the randomized rounding is at least βk ˆcj > (1 − 1 e )ˆcj . Proof: Without loss of generality we can assume that Cj : x1 ∨ · · · ∨ xk By constrain (1) ˆv1 + · · · + ˆvk ≥ ˆcj . Observe that the clause Cj remains unsatisfied by randomized rounding method only if every variable vi is rounded to 0. Since each variable is rounded fully independently, this occurs with probability k i=1 (1 − ˆvi ). It remains to show that 1 − k i=1 (1 − ˆvi ) left side is minimized if ˆvi = ˆcj k ≥ βk ˆcj prof. Jozef Gruska IV054 7. Probabilistic method I: 23/61 Again: It remains to show that 1 − k i=1 (1 − ˆvi ) andtheleftsideisminimizedif ˆvi = ˆcj k ≥ βk ˆcj This can be shown if one can show that 1 − (1 − c k )k ≥ βk c for all 0 < z < 1. Since function f (x) = 1 − (1 − x k )k is concave, it suffices to verify the above inequality for x = 0 and x = 1 what is easy. prof. Jozef Gruska IV054 7. Probabilistic method I: 24/61 FINAL RESULT I From the last Lemma, and from the linearity of expectations, it follows: Theorem: Given an instance of MAX-SAT, the expected number of clauses satisfied by linear programming and randomized rounding is at least (1 − 1 e ) time the maximum number of clauses that can be satisfied on that instance. Comparison of performances of our two algorithms for MAX-SAT k 1 − 2−k βk 1 0.5 1.0 2 0.75 0.75 3 0.875 0.704 4 0.938 0.684 5 0.969 0.672 We now show that on any instance one of the algorithms is a 3 4 - approximation algorithm: prof. Jozef Gruska IV054 7. Probabilistic method I: 25/61 Let n1 denote the expected number of clauses that are satisfied when each variable is independently set to 1 with probability 1 2 (what corresponds to the first algorithm). Let n2 denote the expected number of clauses that are satisfied when we use the linear programming followed by the randomized rounding (what corresponds to the second algorithm). prof. Jozef Gruska IV054 7. Probabilistic method I: 26/61 FINAL RESULT II Theorem: max {n1, n2} ≥ 3 4 j ˆcj . Proof: It suffices to show that (n1+n2 2 ) ≥ 3 4 j ˆcj . Let Sk denote the set of clauses that contain k literals. We know that n1 = k Cj ∈Sk (1 − 2−k )cj ≥ k Cj ∈Sk (1 − 2−k )ˆcj , n2 ≥ k Cj ∈Sk βk ˆcj . Thus n1 + n2 2 ≥ k Cj ∈Sk (1 − 2−k ) + βk 2 ˆcj . Since (1 − 2−k ) + βk ≥ 3 2 for all k, we get n1 + n2 2 ≥ 3 4 k Cj ∈Sk ˆcj = 3 4 j ˆcj . prof. Jozef Gruska IV054 7. Probabilistic method I: 27/61 RAMSEY NUMBER PROBLEM The Ramsey number R(k, l) is the smallest integer n such that in any 2-coloring of the edges of a complete graph Kn, on n nodes, by red and blue, there either is a red Kk (i.e. a complete subgraph on k vertices with edges coloured red), or there is a blue Kl . Ramsey (1930) showed that R(k, l) is finite for any two integers k and l. We use the probabilistic method to show a lower bound on R(k, k). prof. Jozef Gruska IV054 7. Probabilistic method I: 28/61 LOWER BOUND on the RAMSEY NUMBER Theorem If n k · 21−(k 2) < 1, then R(k, k) > n. Corollary Since n k · 21−(k 2) < 1 for n = 2k/2 (that is if k = 2 lg n), the above theorem implies that R(k, k) > 2k/2 for all k ≥ 3. Proof of Theorem Let n, k satisfy the assumption of the theorem. Consider a random 2-coloring of the edges of Kn by red or blue. For any fixed set R of k vertices, let AR be the event that the induced subgraph of Kn on R is monochromatic (i.e. that either all its edges are red or they are all blue). Clearly, Pr(AR ) = 2 1 2(k 2) = 21−(k 2). Since there are n k possible choices for R, the probability that at least one of the events AR occurs is at most n k 21−(k 2) < 1. Thus, with positive probability, no event AR occurs and therefore there is a 2-coloring of Kn without a monochromatic Kk , that is, R(k, k) > n. Note that if k ≥ 3 and n = 2k/2 , then n k 21−(k 2) < 21+k/2 k! nk 2k2/2 < 1 and hence R(k, k) > 2k/2 for all k ≥ 3. prof. Jozef Gruska IV054 7. Probabilistic method I: 29/61 ALGORITHMIC CONSEQUENCES Last theorem implies that there is an edge two-coloring of Kn without a monochromatic K2 lg2 n. It is therefore natural to ask whether we can find efficiently such a coloring. Since there are 2 ( n 2 ) possible colorings, an exhaustive search cannot be efficient. However, a closer look at the proof of the last theorem shows that the proof can be used to produce effectively a coloring that is very likely to be good. This is due to the fact for large k if n = 2k/2 , then ( n k )2 1−( k 2 ) < 21+ k 2 k! ( n 2k/2 )k ≤ 21+ k 2 k! << 1 because ( n k ) ≤ nk k! . Hence, a random coloring of Kn is very likely not to contain a monochromatic K2 lg2 n. prof. Jozef Gruska IV054 7. Probabilistic method I: 30/61 A CONSEQUENCE As a consequence of previous results, if we need to find a two-coloring of edges of K1024 without a monochromatic K20 we can simply produce a random two-coloring and then the probability that it contains a monochromatic K20 is less than 211 20! what is much, much less than probability of error in any proof that a certain coloring is good. prof. Jozef Gruska IV054 7. Probabilistic method I: 31/61 SOME RAMSEY’s NUMBERS For some Ramsey’s numbers see http://en.wikipedia.org/wiki/Ramsey’s theorem For example, R(3, 3) = 6, R(4, 4) = 18 43 ≤ R(5, 5) ≤ 49. prof. Jozef Gruska IV054 7. Probabilistic method I: 32/61 THE PARTY PROBLEM Ramsey problem is also called Party problem. Nodes of a Kn graph are seen as a party participants and two of them are connected by a red (blue) edge if they are friends (strangers). Ramsey number R(k, l) is the smallest number n such that at any party of n people there are at least k that are mutually friends and l that are mutually strangers. In 1993 S. P. Radziszowski and B. D. McKay showed that R(4, 5) = 25. They estimate that their computer proof consumed an equivalent of 11 years of computation by a standard desktop computer. prof. Jozef Gruska IV054 7. Probabilistic method I: 33/61 DELETION METHOD So called deletion method can be useful in some cases when it seems to be difficult to apply the probabilistic method directly. The proof, by the deletion method, that a certain combinatorial object O exists, consists, conceptually, of two stages: It is first shown that with a positive probability an object O exists that is very close, in some sense, to O. Secondly, O is changed, to obtain O, and it is shown that the probability of the existence of O remains positive. prof. Jozef Gruska IV054 7. Probabilistic method I: 34/61 MIN-MAX TRIANGLE PROBLEM - I. Let S = {p1, . . . , pn} be a set of points located in a unit square of the plane. Consider a set TS of all triangles whose vertices are points of S and let T(S) be the area of that triangle from TS the area of which is minimal. The following theorem asserts that for any n there is a set S of n points such that T(S) is not too small. Namely, it holds: Theorem For any n there is a set S of n points in a unit square such that T(S) ≥ 1 100n2 . PROOF. BASIC IDEA: 2n points are chosen randomly in the unit square. All triangles created by these points are tested and those that are “too small” are eliminated - by deleting one vertex from each of them until only n of nodes are left. The idea is that this way we will be left with enough points, and with no too-small-area triangle. prof. Jozef Gruska IV054 7. Probabilistic method I: 35/61 MIN-MAX TRIANGLE PROBLEM - II. PROOF Choose uniformly 2n points in the unit square. For points p, q, r let A(p, q, r) denote the area of the triangle these points create. For any real numbers 0 ≤ b and ∆b ≤ 1 it holds Pr[b ≤ ||p − q|| ≤ b + ∆b] ≤ π(b + ∆b)2 − πb2 = 2πb∆b + π(∆b)2 , where ||p − q|| denote the Euclidean distance between p and q. (Observe that inequality follows from the fact that rings with radiuses b and b + ∆b and centre in p may not be completely contained in the unit square.) Let (p, q) be the base of the triangle (p, q, r) and let ||p − q|| = b. prof. Jozef Gruska IV054 7. Probabilistic method I: 36/61 MIN-MAX TRIANGLE PROBLEM - III. We show now how to estimate Pr[A(p, q, r) ≤ ε] for any ε > 0 and p.q.r. Such an event happens when the height h of the triangle is ≤ 2ε b and therefore r is not farther than 2ε b from the line of points p and q. The probability that this happens is less than 2 √ 2ε b , because r has to be in a strip of width 2ε b and length less than √ 2. Hence, Pr[A(p, q, r) ≤ ε] = √ 2 b=0 Pr[b ≤ ||p − q|| ≤ b + ∆b] × Pr[triangle. h. ≤ 2ε b ] ≤ √ 2 b=0 (4πb) 2 √ 2ε b ∆b = 16πε. prof. Jozef Gruska IV054 7. Probabilistic method I: 37/61 MIN-MAX TRIANGLE PROBLEM - IV Let us now compute the expected number of triangles with the area ≤ ε = 1 100n2 . Let S be a set of 2n points uniformly distributed in the unit square. For each triple (pi , qi , ri ) in S let Xpi ,qi ,ri be the indicator variable having value 1 if the area of the triangle determined by (pi , qi , ri ) is less than ε = 1 100n2 . The probability that the area of some specific triangle is less than 1 100n2 is less than 16π 100n2 ≤ 0.6 n2 – This is also the expected value of Xpi ,qi ,ri . prof. Jozef Gruska IV054 7. Probabilistic method I: 38/61 MIN-MAX TRIANGLE PROBLEM - V. If X denotes the number of triangles with area less than 1 100n2 , then E[X] = p,q,r∈S E[Xpi ,qi ,ri ] ≤ 2n 3 0.6n−2 ≤ n. Finally, by throwing away an arbitrary vertex from each of such “small area triangles”, we are left with a new set S of points the expected size of which (of S ), is not less than n, in which no small-area-triangles exist. Therefore, there exists a set S , of size n, such that T(S ) ≥ 1 100n2 . prof. Jozef Gruska IV054 7. Probabilistic method I: 39/61 EXAMPLE - INDEPENDENT SETS OF VERTICES We show a lower bound on the size of the largest independent set of vertices in certain graphs. Definition An independent set of a graph G = (V , E) is a subset of vertices of V such that no two vertices in the set are adjacent. Denote by α(G) the size of the largest independent set of vertices of the graph G. Our aim is to prove the following lower bound: Theorem If |V | = n and |E| = nk/2 for a graph G = (V , E), then α(G) ≥ n 2k . Proof ideas: To choose randomly a subset of vertices that would be a candidate for an independent set. To show, using the probabilistic argument, that there is a subset of the chosen set of vertices that has many more vertices than edges. By deleting one vertex from each of such edges, an independent set is produced. prof. Jozef Gruska IV054 7. Probabilistic method I: 40/61 PROOF Create a set S ⊂ V by putting into S each vertex independently with probability p (to be specified later). It therefore holds for the average size of S that: E[|S|] = np. Let GS be the subgraph of G = (V , E), induced by S. For any e ∈ E let Ye be the indicator variable that has value 1 if e ∈ E(GS ) - the set of edges of GS - and 0 otherwise. E[Ye ] = p2 because an edge belongs to E(GS ) iff both if its endpoints are in S, what happens with probability p2 . Let Y = |E(GS )|. It holds E[Y ] = E[ e∈E Ye ] = e∈E E[Ye ] = nk 2 p2 . After deleting all edges from GS , by dropping a vertex from each of such edges, it remains a set S∗ of the expected size E[|S∗ |] = E[|S| − Y ], and therefore E[|S| − Y ] = E[|S|] − E[Y ] = np − nk 2 p2 . The last expression has the largest value for p = 1 k and in such a case E[|S∗ |] = E[|S| − Y ] = n 2k . prof. Jozef Gruska IV054 7. Probabilistic method I: 41/61 EXPANDING GRAPHS Informally, an expanding graph is a graph in which the number of neighbors of any sufficiently small set of vertices S is larger than c|S| for some positive constant c > 1. OR-concentrators are a special type of expanding graphs. Definition:An (n, d, α, c) OR-concentrator is a bipartite multigraph G = (L, R, E) with independent sets of vertices L and R, each of cardinality n, such that 1 Every vertex in L has degree at most d; 2 For any subset S of vertices from L such that |S| ≤ αn there are more than c|S| neighbors in R. For applications, it is usually desirable to have d as small and c as large as possible. Of particular interest is to study OR-concentrators in which α, c and d are constants fixed independently of n, with c > 1. Finding an explicit construction of OR-concentrators is a non-trivial task. However, the probabilistic method can be used to show the existence of such concentrators. prof. Jozef Gruska IV054 7. Probabilistic method I: 42/61 Theorem: There is an integer n0 such that for all n > n0 there is an (n, 18, 1 3 , 2) OR-concentrator. Proof: The first part of the proof will be for all (n, d, c, α) OR-concentrators. Consider a random bipartite graph with two disjoint sets of vertices, L and R, each of n vertices, in which each vertex of L chooses randomly and independently d vertices from R as neighbours. For any integer s let ξs denote the event that a (bad) subset of s vertices of L has fewer than cs neighbours in R. We first derive an upper bound on Pr[ξs ] and then an upper bound on the sum of Pr[ξs ] over all s ≤ αn. This way we obtain a non-trivial upper bound on the probability that a random graph fails to be an OR-concentrator with parameters we seek. Fix any subset S ⊆ L of size s, and any subset T ⊆ R of size cs. (There are n s ways of choosing S, and n cs ways of choosing T.) The probability that T contains all of at most ds neighbours of the vertices in S is (cs n )ds . prof. Jozef Gruska IV054 7. Probabilistic method I: 43/61 The probability of the event that all ds edges going out from some s vertices of L fall within any cs vertices of R is bounded by Pr[ξs ] ≤ n s n cs cs n ds . Using the inequality n s ≤ ne s s we obtain Pr[ξs ] ≤ n s n cs cs n ds ≤ ne s s ne cs cs cs n ds = s n d−c−1 e1+c cd−c s . prof. Jozef Gruska IV054 7. Probabilistic method I: 44/61 In a special case, for α = 1 3 and s ≤ αn we have Pr[ξs ] = 1 3 d−c−1 e1+c cd−c s ≤ c 3 d (3e)c+1 s , and, in addition, for c = 2, d = 18 Pr[ξs ] = 2 3 18 (3e)3 s . Since: r = 2 3 18 (3e)3 ≤ 1 2 we have s≥1 Pr[ξs ] ≤ s≥1 rs = r 1 − r < 1. prof. Jozef Gruska IV054 7. Probabilistic method I: 45/61 OBLIVIOUS ROUTING REVISITED - I. As another example, we will show that probabilistic method can be used to prove the existence of an algorithm. We focus on the minimal number of random bits needed by randomized oblivious routing algorithms for hypercubes. and we derive the following results. 1 A proof of the existence (by the probabilistic method) of a randomized routing algorithm that uses (within a constant factor) only 3d random bits to route d-dimensional hypercube. As a consequence we get that our randomized oblivious routing algorithm, from previous chapter, that used d2d random bits to route a d-dimensional hypercube, uses much too much random bits. prof. Jozef Gruska IV054 7. Probabilistic method I: 46/61 OBLIVIOUS ROUTING REVISITED - II. To remember: 2d d is a lower bound on oblivious deterministic routings on a d-dimensional hypercube Hd . Question: How much randomness (how many random bits) is (are) needed to have a routing algorithm with the expected running time O(d)? Observation I: A randomized oblivious algorithm for permutation routing is a probability distribution on a set of deterministic oblivious routing algorithms. Observation II. Each deterministic oblivious algorithm for a 2d -node network is a set of 22d = 2d × 2d routes, one for each source-target pair. Note: Every randomized oblivious algorithm can be expressed by sequences (A1, . . . , Ar ), (p1, . . . , pr ), where each Aj is a deterministic oblivious routing algorithm and each pj is the probability that we use Aj on a run of the randomized routing algorithm. prof. Jozef Gruska IV054 7. Probabilistic method I: 47/61 Note: We know: 1 With 0 random bits the expected running time of any oblivious routing algorithm on Hd is Ω 2d d ; 2 For the randomized oblivious routing the expected running time is O(d) and d 2d random bits are used (each of 2d nodes chooses a random d-bit auxiliary goal). Are so many random bits indeed necessary for efficient randomized routing? prof. Jozef Gruska IV054 7. Probabilistic method I: 48/61 Theorem: For every d there exists a randomized oblivious scheme (algorithm) for a permutation routing on the hypercube with n = 2d nodes that uses only 3d random bits and still runs in the expected time 15d at most. Proof: Notation We say that a set B = {B1, . . . , Bt } of deterministic oblivious permutation routing algorithms on Hd is an efficient routing scheme, if for any input instance, the expected number of steps using a randomly chosen algorithm from B is at most 15d. To prove the theorem, we show that for every n = 2d there is an efficient routing scheme for Hd with t = 23d = n3 . Our resulting randomized routing scheme will randomly choose n3 of nn possible deterministic oblivious routing algorithms. (nn is due to the fact that there are n sources and for each one we can choose from n possible intermediate destinations.) Let us denote such deterministic algorithms by Aj , 1 ≤ j ≤ nn . On an n-node network there are n! distinct possible instances of the permutation routing problem, one for each permutation on {1, 2, . . . , n}. prof. Jozef Gruska IV054 7. Probabilistic method I: 49/61 For a permutation πi , 1 ≤ i ≤ n!, let us call a deterministic oblivious routing algorithm Aj good if Aj routes πi in 14d or fewer steps, and bad otherwise. By our randomized routing result: (with probability at least 1 − 1 n every packet reaches its destination in 14n or fewer steps) for any particular πi a fraction of at most 1 n of the algorithms Aj are bad - which of them are bad may differ from instance to instance. Experiment:Choose n3 indices i1, . . . , in3 , randomly, independently and uniformly from the set {1, . . . , nn }. We show that the set of deterministic algorithms A = {Ai1 , . . . , Ain3 } is an efficient routing scheme with a positive probability. This will imply that an efficient routing scheme exists for any n = 2d . For any πi , a fraction of at most 1 n of the algorithms A1, . . . , Ann is bad. Therefore, the expected number of algorithms in A that are bad for πi is at most n3 · 1 n = n2 . prof. Jozef Gruska IV054 7. Probabilistic method I: 50/61 Let the indicator variable Xj be 1 if Aij is bad, 1 ≤ j ≤ n3 , and 0 otherwise. Thus E[ j Xj ] ≤ n2 . Since Xj are independent, we may apply Chernoff bound on X = Ai to get Pr[X ≥ (1 + δ)µ] ≤ F+ (µ, δ) < e− µδ2 4 (Chapter 6, page 10), to obtain (for µ = n2 , δ = 1) that the probability that more than 2n2 of the algorithms in A are bad for πi is ≤ e −n2 4 . Let Bi denote the bad event that more than 2n2 algorithms in A are bad for πi . Then, for n ≥ 4 Pr[ n! i=1 Bi ] ≤ n! i=1 Pr(Bi ) ≤ n! e− n2 4 < 1 (by Stirling’s formula for n!) Therefore, with positive probability, not more than 2n2 algorithms in A are bad for any πi . Hence, there exists a subset of n3 algorithms from {A1, . . . , Ann } with the property that at most 2n2 in this subset are bad for any πi . Let us denote this subset B = {Bi1 , . . . , Bin3 }. B is an efficient routing scheme: for any πi a randomly chosen algorithm form B fails to route πi within 14n steps with probability at most 2n2 n3 = 2 n . From that one can deduce that the expected number of steps using an algorithm randomly chosen from B is less than 15n. prof. Jozef Gruska IV054 7. Probabilistic method I: 51/61 THE LOV´ASZ LOCAL LEMMA prof. Jozef Gruska IV054 7. Probabilistic method I: 52/61 THE LOV´ASZ LOCAL LEMMA - MOTIVATION I This is one of the most elegant and useful tools to apply the probabilistic method. Suppose that we have a finite set A of bad events such that each of them may not happen with a non-zero probability. We want to show that under certain circumstances none of these events happen. This is easy to show for the case events are independent. Indeed, in such a case Pr   A∈A ¯A   = A∈A Pr[ ¯A] > 0. Lov´asz Local lemma handle the situation for the case where events are generally not independent of each other, but each collection of events that are not independent of some particular event A has low total probability. The proof of the original version of Lov´asz Local Lemma was non-constructive - it gave no guidance how to find a particular outcome that makes all the events false. Later, it has been shown that when events are determined by some underlying set of independent variables and independence between two events is detected by having non-overlapping sets of underlying variables, an actual solution can be found in polynomial expected time. prof. Jozef Gruska IV054 7. Probabilistic method I: 53/61 THE LOVASZ LOCAL LEMMA - SYMMETRIC VERSION In order to formulate so called symmetric version of the lemma we need the following definition of mutual independence. An event E is mutually independent of events E1, . . . , En, if for any I ⊆ [1, n], Pr E | i∈I Ei = Pr(E). Dependency between events can be represented in terms of a dependency graph. Definition: Dependency graph for a set of events E1, . . . , En is a graph G = (V , E) such that V = {1, . . . , n} and for i = 1, . . . , n, the event Ei is mutually independent of the events {Ej | (i, j) ∈ E}. Lovasz Local Lemma Let E1, . . . , En be events and p, d be fixed numbers such that for all i, Pr(Ei ) ≤ p; the degree of the dependency graph given by E1, . . . , En is bounded by a d; 4dp ≤ 1. Then Pr n i=1 ¯Ei > 0. prof. Jozef Gruska IV054 7. Probabilistic method I: 54/61 APPLICATION - EDGE-DISJOINT PATHS Assume that n pairs of users need to communicate using edge-disjoint paths, from a fixed set of paths, on a given network . Assume that each ith pair of users can choose a path from a collection Fi of m paths for a fixed m. We show, using simple version of Lovasz local lemma that, if the possible paths do not share too many edges, then there is a way to choose n edge-disjoint paths connecting the n given pairs. Theorem: If, for a fixed k, any path in any Fi shares edges with no more than k paths in any Fj , j = i, and 8nk/m ≤ 1, then there is a way to choose n edge-disjoint paths connecting the n given pairs. prof. Jozef Gruska IV054 7. Probabilistic method I: 55/61 PROOF of THEOREM Let each ith pair chooses a path from Fi randomly, with probability 1 m . Let Ei,j be the event that paths chosen by pairs i and j share at least one edge. Since for any i and j = i the path chosen from Fi shares edges with at most k paths in any Fj , j = i, we have p = Pr(Ei,j ) ≤ k m . Let d be the degree of the dependency graph of all events Ei,j . Since the event Ei,j is independent of all events Ei ,j , when i ∈ {i, j} and j ∈ {i, j}, we have d < 2n. Since 4dp < 8nk m ≤ 1 all conditions of the Lovacz local lemma are satisfied and therefore Pr   i=j ¯Ei,j   > 0 and therefore there is such a choice of n paths that are edge-disjoint. prof. Jozef Gruska IV054 7. Probabilistic method I: 56/61 THE LOV´ASZ LOCAL LEMMA - GENERAL CASE Suppose we have n events each of which occur with probability at most 1 2 . Let each of these events corresponds to one of n ways in which a probabilistic experiment could fail. If the event were independent, we could then assert that at such an experiment with probability at least 2−n , none of the events occurs. The Lov´asz Local Lemma generalizes the above setting and result to the case where each of the events is independent of all but small number of other events. Lov´asz Local Lemma: Let G = (V , E) be the dependency graph for events ξ1, . . . , ξn in a probability space. Suppose that there exist xi ∈ [0, 1], for 1 ≤ i ≤ n, such that Pr[ξi ] ≤ (i,j)∈E (1 − xj ). Then Pr[ n i=1 ¯ξi ] ≥ n i=1 (1 − xi ). prof. Jozef Gruska IV054 7. Probabilistic method I: 57/61 PAUL ERD¨OS Paul Erd¨os, a Hungarian mathematician, died in September 1996 at the age of 83. He was the most prolific mathematicians of the twentieth century with over 1500 papers written and more than 490 collaborators. He can be seen as the founder (in 1947) of the probabilistic method. prof. Jozef Gruska IV054 7. Probabilistic method I: 58/61 THOUGHTS ABOUT ERD¨OS In our century, in which mathematics is so strongly dominating by ”theory constructors” Erd¨os remained the prince of problem solvers and the absolute monarch of problem posers. (E. Strauss) Erd¨os had no job; he worked constantly. He had no home, the world was his home. Possessions were a nuisance, money a bore. He lived on a web of trust, traveling ceaselessly from Center to Center, spreading his mathematical pollen. His enormous talents and energies were given entirely to the Temple of Mathematics. prof. Jozef Gruska IV054 7. Probabilistic method I: 59/61 STORIES about ERD¨OS It is six in the morning. The house is sleeping. I prove and conjecture. Paul Erd¨os in a letter to Vera S´os prof. Jozef Gruska IV054 7. Probabilistic method I: 60/61 STORY about the BOOK Story about The Book. Erd¨os liked to talk about The Book, that contains all theorems of mathematics and for each of them one proof beautiful, aesthetic and insightful - so called The Book proof. Each time one of his conjectures was resolved in ”an ugly way” Erd¨os congratulated the prover, but added ”let us now look for a Book proof”. In 1985 he started his lecture in a math camp by saying: ”You do not have to believe in the God, but you should believe in The Book”. prof. Jozef Gruska IV054 7. Probabilistic method I: 61/61