Part VII
Probabilistic method I:
Chapter 7. PROBABILISTIC METHOD
The probabilistic method is a powerful tool to demonstrate the existence of
combinatorial objects.
In some cases this method can be used also to derive an algorithm for an algorithmic
problem.
two Basic arguments of the probabilistic method:
1 The expectation argument: Any random variable V assumes at least one value that
is not smaller than its expectation EV , and at least one value that is not greater
than its expectation EV .
2 The sampling argument: If an object chosen randomly from a universe/set U
satisﬁes a property P with a positive probability, then there must be an object in U
that satisﬁes the property P.
The above two simple ideas have a surprising power.
Their power comes from our ability to reformulate, in various ways, counting arguments
in the language of probability and then to apply various tools of probability theory.
prof. Jozef Gruska IV054 7. Probabilistic method I: 2/61
EXAMPLE
Example: One can show that for every n × n 0-1-matrix A, and for any randomly
chosen vector b ∈ {−1, +1}n
, it holds
Ab ≤ 4
√
n ln n
with probability at least 1 − 2
n
.
From that we may conclude that for every such matrix A, there always exists a vector
b ∈ {−1, +1}n
such that
Ab ≤ 4
√
n ln n.
prof. Jozef Gruska IV054 7. Probabilistic method I: 3/61
LIMITS of PROBABILISTIC METHODS
Probabilistic method is especially useful when we can show that the
probability is quite large that the object we look for exists and we can
verify whether our random process found such an object.
If such a probability is indeed large then we can ﬁnd such an object
quite eﬃciently just by applying a random searching process - a
sampling experiment.
In some cases, however, no explicit construction of a combinatorial
object is known in spite of the fact that we can show that such object
exists.
prof. Jozef Gruska IV054 7. Probabilistic method I: 4/61
EXAMPLE
Example Using probabilistic method it can be shown that for any n there exists a
sorting network that sorts n integers in parallel O(log n) time.
A method is known to construct for any n a sorting network that sorts in parallel
O(log2
n) time.
No method is known to construct for any n a sorting network that could sort in
parallel O(log n) time, in spite of many eﬀorts to do that during the last 25 years.
prof. Jozef Gruska IV054 7. Probabilistic method I: 5/61
BASIC IDEA and an EXAMPLE
Probabilistic method consists of two stages.
1 A ”thought experiment” E is designed in which a carefully chosen
random process (called usually as an experiment - for example, a dice
tossing) P plays a key role.
2 The random process P is then analyzed and some conclusions are made
that are, or at least look as, independent of the experiment E.
prof. Jozef Gruska IV054 7. Probabilistic method I: 6/61
EXAMPLE - MAX-CUT PROBLEM
Problem: Given is an undirected graph G = (V , E) with n = |V | vertices and m = |E|
edges. The task is to partition vertices in V into two sets A and B in such a way that
maximizes the number of such edges (u, v), where u ∈ A, v ∈ B.
Theorem: For any undirected graph G = (V , E) with n vertices and m edges, there is a
partition of the vertex set V into two sets A and B such that
|{(u, v) ∈ E | u ∈ A, v ∈ B}| ≥
m
2
.
Proof: Let us consider the following experiment: Each vertex of G is independently and
equiprobably assigned to either A or B.
For any edge (u, v), the probability that its end-vertices are in diﬀerent sets is 1
2
.
By linearity of expectations, the expected number of edges with end-vertices in diﬀerent
sets is thus
E[cut-size] =
m
2
.
That implies that there must be a partition satisfying the theorem.
prof. Jozef Gruska IV054 7. Probabilistic method I: 7/61
DERANDOMIZATION
In some cases the proof of existence obtained by the probabilistic
method can be converted into an eﬃcient randomized algorithm.
In some other cases the existence proof obtained by the
probabilistic method can be converted even to an eﬃcient
deterministic algorithm - such a process is called derandomization.
prof. Jozef Gruska IV054 7. Probabilistic method I: 8/61
FROM THE PROOF of EXISTENCE to LAS VEGAS ALGORITHM
for MAX-CUT PROBLEM
We show how to transform the argument from slide #4 about the existence of a partition
with at least m
2
of edges to a Las Vegas algorithm.
Let us design a partition C(A, B) using randomization described above and denote
p = Pr(C(A, B) ≥
m
2
)
Then
m
2
= E[C(A, B)] =
i≤m/2−1
iPr(C(A, B) = i) +
m/2≤i≤m
iPr(C(A, B) = i) ≥ (1 − p)
m
2
+ 1
which implies that
p ≥
1
m/2 + 1
Therefore, the expected number of samples before ﬁnding a cut with value at least m
2
is
therefore just m
2
+ 1.
Since we can test in polynomial time whether the value of the cut determined by a
sample is at least m/2, by counting edges crossing the cut, we have a Las Vegas
algorithm to ﬁnd a cut.
prof. Jozef Gruska IV054 7. Probabilistic method I: 9/61
ASSIGNMENT of HATS - I.
There are n robots in a ﬁeld and each of them can see only k other
robots for ﬁxed n, k.
Each robot wants to have a hat and a robot can be happy only if she
has a hat and none of robots she sees has a hat. How many robots
can be happy?
prof. Jozef Gruska IV054 7. Probabilistic method I: 10/61
ASSIGNMENT of HATS - II.
There are n robots in a ﬁeld and each of them can see only k other robots for ﬁxed n, k.
Each robot wants to have a hat and a robot can be happy only if she has a hat and
none of robots she sees has a hat. How many robots can be happy?
Let us give to each robot a hat with a probability p. Then the probability that any
particular robot is happy is p(1 − p)k
.
If Xr is the indicator variable for the event that robot r is happy, then E[Xr ] = p(1 − p)k
and the expected number of happy robots is np(1 − p)k
and so there has to be a speciﬁc
assignment of hats that so many robots are happy.
Next task is to ﬁnd such a p that maximizes the value np(1 − p)k
. From the equation
d
dp
(np(1 − p)k
) = 0
we get p = 1/(k + 1) and for this p and large k, the expected number of hats is
n
1
k + 1
1 −
1
k + 1
k
≤
n
e(k + 1)
for large k
prof. Jozef Gruska IV054 7. Probabilistic method I: 11/61
HAMILTONIAN PATHS in TOURNAMENTS
A tournament is a complete directed graph.
A Hamiltonian path in a graph G = (V , E) is a path that visits each vertex
(representing a player) of V exactly once.
Theorem: Every tournament has a Hamiltonian path.
Proof will be by the induction on the number n of vertices in a tournament.
Induction step. Suppose that every tournament with at most n vertices has a
Hamiltonian path and let a tournament T = (V , E) with n + 1 vertices be given.
Choose any vertex v and deﬁne two sets of vertices
A = {u | (u, v) ∈ E} B = {u | (v, u) ∈ E}.
Two subgraphs induced by these two sets of vertices form tournaments. By induction
both of them have Hamiltonian paths.
By connecting these two paths through the node v we get a Hamiltonian path for T.
prof. Jozef Gruska IV054 7. Probabilistic method I: 12/61
NUMBER of PATHS in TOURNAMENTS
Theorem For any n there exists a tournament of size n with n!/2n−1
Hamiltonian paths.
Proof Generate a random tournament T = (V , E), V = {1, . . . , n}, by randomly
choosing direction for all edges of Kn - of a complete graph of n vertices.
For each permutation σ on V let Xσ be a random variable deﬁned as follows:
Xσ =
1, σ deﬁnes a path on T
0 otherwise
For all σ, Pr(Xσ = 1) = (1
2
)n−1
= E[Xσ].
Let X be a random variable counting the number of Hamiltonian paths in T
X =
σ∈Perm(n)
Xσ.
Theorem now follows from the following calculations:
E[X] = E[
σ
Xσ] =
σ
E[Xσ] = n!(
1
2
)n−1
.
prof. Jozef Gruska IV054 7. Probabilistic method I: 13/61
TOURNAMENTS WITH PROPERTY Sk
In a tournament, if there is an edge from a node A to a node B, then we say that the
player A beats the player B.
A tournament T is said to have property Sk , for an integer k, if for any set of k
players there is one player that beats all of them.
Theorem If (
n
k
)(1 − 2−k
)n−k
< 1, then there is a tournament on n vertices with property
Sk . Proof Consider a random tournament with a set of n nodes. For any subset K of k
vertices let AK be the event that there is no player/node that beats all players/nodes in
K. Clearly,
Pr[AK ] = (1 − 2−k
)n−k
.
Therefore
Pr[
K⊂V ,|K|=k
AK ] ≤
K⊂V ,|K|=k
Pr[AK ] = (
n
k
)(1 − 2−k
)n−k
< 1
Therefore, with a positive probability no event AK occurs. That is, there is a tournament
on n vertices that has the property Sk .
prof. Jozef Gruska IV054 7. Probabilistic method I: 14/61
EXPLANATION
If K is a set on k players, then the probability that a player P not in K
beats all of them is 2−k
and the probability is 1 − 2−k
that he does
not beat all of them.
Since there are n − k players outside the group K, the probability that
none of them beats all players in K is (1 − 2−1
)n−k
.
prof. Jozef Gruska IV054 7. Probabilistic method I: 15/61
MAX-SAT - PROBLEM
Given are m clauses in conjunctive normal form over n variables. Find assignment (of
truth values to variables) that maximizes the number of satisﬁed clauses.
Theorem: For any set of m clauses there is a truth assignment that satisﬁes at least m
2
clauses.
Proof: Suppose each variable is set to 0 or 1 independently and equiprobably and let, for
1 ≤ i ≤ m, the random variable Zi = 1 if the i-th clause is satisﬁed.
The probability that a clause with k literals is not satisﬁed by this random assignment is
2−k
.
The probability that a clause with k literals is satisﬁed is 1 − 1
2k ≥ 1
2
what implies that
E[Zi ] ≥ 1
2
for all i.
The expected number of clauses satisﬁed by a random assignment is
E[
m
i=1
Zi ] =
m
i=1
E[Zi ] ≥
m
2
Therefore, there exists at least one assignment for which m
i=1 Zi ≥ m
2
.
prof. Jozef Gruska IV054 7. Probabilistic method I: 16/61
FROM THE PROOF of EXISTENCE to an ALGORITHM 1/2
We show now that a variant of the probabilistic proof of existence in the last theorem can
be turned into an approximation algorithm.
Notation for approximation algorithms:
I - an input instance - a set of clauses.
m∗(I) - the maximum number of clauses of I that can be satisﬁed.
mA(I) - the number of clauses of I satisﬁed by the algorithm A.
performance ratio of an algorithm A : infI
mA(I)
m∗(I)
.
If A achieves a performance ratio α, we say that A is an α-approximation algorithm.
If A is a randomized algorithm, then mA(I) is a random variable and in such a case mA(I)
is replaced by E[mA(I)] in the deﬁnition of the performance ratio.
prof. Jozef Gruska IV054 7. Probabilistic method I: 17/61
FROM a PROOF of EXISTENCE to an ALGORITHM 2/2
We now show the existence of a randomized algorithm for MAXSAT with
performance ratio 3
4
.
The procedure in the proof of the last theorem actually yields a randomized algorithm
whose guaranteed performance is 1 − 2−k
, provided every clause contains at least k
literals.
As a consequence, we have a randomized 3
4
-approximation algorithm for instances of
MAX-SAT in which every clause contains at least 2 literals.
We now show another algorithm that performs especially well when there are (many)
clauses consisting of a single literal.
Finally, we show that on any input instance, one of the two designed algorithms yields a
randomized 3
4
-approximation algorithm.
prof. Jozef Gruska IV054 7. Probabilistic method I: 18/61
BASIC IDEA
is similar as in the case of the global wiring problem.
1 Reformulate the problem as a 0-1 linear programming problem.
2 Solve the corresponding rational linear programming problem.
3 Use the randomized rounding technique.
Notation: With each clause Cj , in the given input formula, we associate
an indicator variable cj ∈ {0, 1}, that indicates whether or not the clause
Cj is satisﬁed.
prof. Jozef Gruska IV054 7. Probabilistic method I: 19/61
INDICATOR VARIABLES
Moreover, to each variable xi we assign an indicator
variable vi deﬁned by
xi = true ⇐⇒ vi = 1 .
C+
j - set of indices of variables that appear
uncomplemented in Cj
C−
j - set of indices of variables that appear complemented
in Cj
prof. Jozef Gruska IV054 7. Probabilistic method I: 20/61
0-1 linear programming reformulation of the MAX-SAT problem.
Find
vi , cj ∈ {0, 1} (∀i, j) ( )
such that the sum
m
j=1
cj
is maximized and
i∈C+
j
vi +
i∈C−
j
(1 − vi ) ≥ cj (∀j). (1)
Rational linear programming problem is then obtained by replacing the condition ( )
by the condition
vi , cj ∈ [0, 1] (∀i, j).
Let ˆvi (ˆci ) be the value of variable vi (ci ) obtained by solving the rational linear
programming problem. Clearly, n
i=1 cj ≤ n
j=1 ˆcj .
prof. Jozef Gruska IV054 7. Probabilistic method I: 21/61
CONTINUATION of THE PROOF 1/2
We ﬁrst show that using the randomized rounding method we obtain a truth
assignment for which the expected number of satisﬁed clauses is at least
(1 − 1
e
)
j
ˆcj .
This will follow from the Lemma shown on the next slide for the case we use the following
randomized rounding: each vi is set, independently, to 1 with the probability ˆvi .
Notation: For an integer k denote βk = 1 − (1 − 1
k
)
k
> 1 − 1
e
.
prof. Jozef Gruska IV054 7. Probabilistic method I: 22/61
CONTINUATION of THE PROOF 2/2
Lemma: Let Cj be a clause with k literals. The probability that it is satisﬁed by the
randomized rounding is at least βk ˆcj > (1 − 1
e
)ˆcj .
Proof: Without loss of generality we can assume that
Cj : x1 ∨ · · · ∨ xk
By constrain (1) ˆv1 + · · · + ˆvk ≥ ˆcj .
Observe that the clause Cj remains unsatisﬁed by randomized rounding method only if
every variable vi is rounded to 0.
Since each variable is rounded fully independently, this occurs with probability
k
i=1
(1 − ˆvi ).
It remains to show that
1 −
k
i=1
(1 − ˆvi )
left side is minimized if ˆvi =
ˆcj
k
≥ βk ˆcj
prof. Jozef Gruska IV054 7. Probabilistic method I: 23/61
Again: It remains to show that
1 −
k
i=1
(1 − ˆvi )
andtheleftsideisminimizedif ˆvi =
ˆcj
k
≥ βk ˆcj
This can be shown if one can show that 1 − (1 − c
k
)k
≥ βk c for all 0 < z < 1.
Since function f (x) = 1 − (1 − x
k
)k
is concave, it suﬃces to verify the above inequality
for x = 0 and x = 1 what is easy.
prof. Jozef Gruska IV054 7. Probabilistic method I: 24/61
FINAL RESULT I
From the last Lemma, and from the linearity of expectations, it follows:
Theorem: Given an instance of MAX-SAT, the expected number of clauses satisﬁed by
linear programming and randomized rounding is at least (1 − 1
e
) time the maximum
number of clauses that can be satisﬁed on that instance.
Comparison of performances of our two algorithms for MAX-SAT
k 1 − 2−k
βk
1 0.5 1.0
2 0.75 0.75
3 0.875 0.704
4 0.938 0.684
5 0.969 0.672
We now show that on any instance one of the algorithms is a 3
4
- approximation
algorithm:
prof. Jozef Gruska IV054 7. Probabilistic method I: 25/61
Let n1 denote the expected number of clauses that are
satisﬁed when each variable is independently set to 1 with
probability 1
2 (what corresponds to the ﬁrst algorithm).
Let n2 denote the expected number of clauses that are
satisﬁed when we use the linear programming followed by
the randomized rounding (what corresponds to the second
algorithm).
prof. Jozef Gruska IV054 7. Probabilistic method I: 26/61
FINAL RESULT II
Theorem:
max {n1, n2} ≥
3
4 j
ˆcj .
Proof: It suﬃces to show that (n1+n2
2
) ≥ 3
4 j ˆcj .
Let Sk denote the set of clauses that contain k literals. We know that
n1 =
k Cj ∈Sk
(1 − 2−k
)cj ≥
k Cj ∈Sk
(1 − 2−k
)ˆcj ,
n2 ≥
k Cj ∈Sk
βk ˆcj .
Thus
n1 + n2
2
≥
k Cj ∈Sk
(1 − 2−k
) + βk
2
ˆcj .
Since (1 − 2−k
) + βk ≥ 3
2
for all k, we get
n1 + n2
2
≥
3
4
k Cj ∈Sk
ˆcj =
3
4 j
ˆcj .
prof. Jozef Gruska IV054 7. Probabilistic method I: 27/61
RAMSEY NUMBER PROBLEM
The Ramsey number R(k, l) is the smallest integer n such that in any
2-coloring of the edges of a complete graph Kn, on n nodes, by red and
blue, there either is a red Kk (i.e. a complete subgraph on k vertices with
edges coloured red), or there is a blue Kl .
Ramsey (1930) showed that R(k, l) is ﬁnite for any two integers k and l.
We use the probabilistic method to show a lower bound on R(k, k).
prof. Jozef Gruska IV054 7. Probabilistic method I: 28/61
LOWER BOUND on the RAMSEY NUMBER
Theorem If n
k
· 21−(k
2) < 1, then R(k, k) > n.
Corollary Since n
k
· 21−(k
2) < 1 for n = 2k/2
(that is if k = 2 lg n), the above theorem
implies that R(k, k) > 2k/2
for all k ≥ 3.
Proof of Theorem Let n, k satisfy the assumption of the theorem. Consider a random
2-coloring of the edges of Kn by red or blue.
For any ﬁxed set R of k vertices, let AR be the event that the induced subgraph of Kn on
R is monochromatic (i.e. that either all its edges are red or they are all blue).
Clearly, Pr(AR ) = 2 1
2(k
2)
= 21−(k
2). Since there are n
k
possible choices for R, the
probability that at least one of the events AR occurs is at most n
k
21−(k
2) < 1. Thus, with
positive probability, no event AR occurs and therefore there is a 2-coloring of Kn without
a monochromatic Kk , that is, R(k, k) > n. Note that if k ≥ 3 and n = 2k/2
, then
n
k
21−(k
2) <
21+k/2
k!
nk
2k2/2
< 1
and hence R(k, k) > 2k/2
for all k ≥ 3.
prof. Jozef Gruska IV054 7. Probabilistic method I: 29/61
ALGORITHMIC CONSEQUENCES
Last theorem implies that there is an edge two-coloring of Kn without a monochromatic
K2 lg2 n. It is therefore natural to ask whether we can ﬁnd eﬃciently such a coloring.
Since there are 2
(
n
2
)
possible colorings, an exhaustive search cannot be eﬃcient.
However, a closer look at the proof of the last theorem shows that the proof can be used
to produce eﬀectively a coloring that is very likely to be good. This is due to the fact for
large k if n = 2k/2
, then
(
n
k
)2
1−(
k
2
)
<
21+ k
2
k!
(
n
2k/2
)k
≤
21+ k
2
k!
<< 1
because (
n
k
) ≤ nk
k!
. Hence, a random coloring of Kn is very likely not to contain a
monochromatic K2 lg2 n.
prof. Jozef Gruska IV054 7. Probabilistic method I: 30/61
A CONSEQUENCE
As a consequence of previous results, if we need to ﬁnd a
two-coloring of edges of K1024 without a monochromatic
K20 we can simply produce a random two-coloring and
then the probability that it contains a monochromatic K20
is less than 211
20! what is much, much less than probability of
error in any proof that a certain coloring is good.
prof. Jozef Gruska IV054 7. Probabilistic method I: 31/61
SOME RAMSEY’s NUMBERS
For some Ramsey’s numbers see
http://en.wikipedia.org/wiki/Ramsey’s theorem
For example,
R(3, 3) = 6, R(4, 4) = 18
43 ≤ R(5, 5) ≤ 49.
prof. Jozef Gruska IV054 7. Probabilistic method I: 32/61
THE PARTY PROBLEM
Ramsey problem is also called Party problem.
Nodes of a Kn graph are seen as a party participants and two of them are
connected by a red (blue) edge if they are friends (strangers).
Ramsey number R(k, l) is the smallest number n such that at any party of
n people there are at least k that are mutually friends and l that are
mutually strangers.
In 1993 S. P. Radziszowski and B. D. McKay showed that R(4, 5) = 25.
They estimate that their computer proof consumed an equivalent of 11
years of computation by a standard desktop computer.
prof. Jozef Gruska IV054 7. Probabilistic method I: 33/61
DELETION METHOD
So called deletion method can be useful in some cases when it seems to
be diﬃcult to apply the probabilistic method directly.
The proof, by the deletion method, that a certain combinatorial object O
exists, consists, conceptually, of two stages:
It is ﬁrst shown that with a positive probability an object O exists that
is very close, in some sense, to O.
Secondly, O is changed, to obtain O, and it is shown that the
probability of the existence of O remains positive.
prof. Jozef Gruska IV054 7. Probabilistic method I: 34/61
MIN-MAX TRIANGLE PROBLEM - I.
Let S = {p1, . . . , pn} be a set of points located in a unit square of the plane.
Consider a set TS of all triangles whose vertices are points of S and let T(S) be the area
of that triangle from TS the area of which is minimal.
The following theorem asserts that for any n there is a set S of n points such that T(S)
is not too small. Namely, it holds:
Theorem For any n there is a set S of n points in a unit square such that
T(S) ≥ 1
100n2 .
PROOF. BASIC IDEA: 2n points are chosen randomly in the unit square. All triangles
created by these points are tested and those that are “too small” are eliminated - by
deleting one vertex from each of them until only n of nodes are left. The idea is that this
way we will be left with enough points, and with no too-small-area triangle.
prof. Jozef Gruska IV054 7. Probabilistic method I: 35/61
MIN-MAX TRIANGLE PROBLEM - II.
PROOF Choose uniformly 2n points in the unit square.
For points p, q, r let A(p, q, r) denote the area of the triangle these points create.
For any real numbers 0 ≤ b and ∆b ≤ 1 it holds
Pr[b ≤ ||p − q|| ≤ b + ∆b] ≤ π(b + ∆b)2
− πb2
= 2πb∆b + π(∆b)2
,
where ||p − q|| denote the Euclidean distance between p and q.
(Observe that inequality follows from the fact that rings with radiuses b and b + ∆b and
centre in p may not be completely contained in the unit square.)
Let (p, q) be the base of the triangle (p, q, r) and let ||p − q|| = b.
prof. Jozef Gruska IV054 7. Probabilistic method I: 36/61
MIN-MAX TRIANGLE PROBLEM - III.
We show now how to estimate Pr[A(p, q, r) ≤ ε] for any ε > 0 and p.q.r.
Such an event happens when the height h of the triangle is ≤ 2ε
b
and therefore r is not
farther than 2ε
b
from the line of points p and q.
The probability that this happens is less than 2
√
2ε
b
, because r has to be in a strip of
width 2ε
b
and length less than
√
2. Hence,
Pr[A(p, q, r) ≤ ε] =
√
2
b=0
Pr[b ≤ ||p − q|| ≤ b + ∆b] × Pr[triangle. h. ≤
2ε
b
]
≤
√
2
b=0
(4πb)
2
√
2ε
b
∆b = 16πε.
prof. Jozef Gruska IV054 7. Probabilistic method I: 37/61
MIN-MAX TRIANGLE PROBLEM - IV
Let us now compute the expected number of triangles with the area ≤ ε = 1
100n2 .
Let S be a set of 2n points uniformly distributed in the unit square. For each triple
(pi , qi , ri ) in S let Xpi ,qi ,ri be the indicator variable having value 1 if the area of the
triangle determined by (pi , qi , ri ) is less than ε = 1
100n2 .
The probability that the area of some speciﬁc triangle is less than 1
100n2 is less than
16π
100n2 ≤ 0.6
n2 – This is also the expected value of Xpi ,qi ,ri .
prof. Jozef Gruska IV054 7. Probabilistic method I: 38/61
MIN-MAX TRIANGLE PROBLEM - V.
If X denotes the number of triangles with area less than 1
100n2 , then
E[X] =
p,q,r∈S
E[Xpi ,qi ,ri ] ≤
2n
3
0.6n−2
≤ n.
Finally, by throwing away an arbitrary vertex from each of such “small area
triangles”, we are left with a new set S of points the expected size of which (of
S ), is not less than n, in which no small-area-triangles exist.
Therefore, there exists a set S , of size n, such that T(S ) ≥ 1
100n2 .
prof. Jozef Gruska IV054 7. Probabilistic method I: 39/61
EXAMPLE - INDEPENDENT SETS OF VERTICES
We show a lower bound on the size of the largest independent set of vertices in certain
graphs.
Deﬁnition An independent set of a graph G = (V , E) is a subset of vertices of V such
that no two vertices in the set are adjacent. Denote by α(G) the size of the largest
independent set of vertices of the graph G.
Our aim is to prove the following lower bound:
Theorem If |V | = n and |E| = nk/2 for a graph G = (V , E), then α(G) ≥ n
2k
.
Proof ideas:
To choose randomly a subset of vertices that would be a candidate for an
independent set.
To show, using the probabilistic argument, that there is a subset of the chosen set of
vertices that has many more vertices than edges.
By deleting one vertex from each of such edges, an independent set is produced.
prof. Jozef Gruska IV054 7. Probabilistic method I: 40/61
PROOF
Create a set S ⊂ V by putting into S each vertex independently with probability p (to be
speciﬁed later). It therefore holds for the average size of S that: E[|S|] = np. Let GS be
the subgraph of G = (V , E), induced by S.
For any e ∈ E let Ye be the indicator variable that has value 1 if e ∈ E(GS ) - the set of
edges of GS - and 0 otherwise.
E[Ye ] = p2
because an edge belongs to E(GS ) iﬀ both if its endpoints are in S, what
happens with probability p2
. Let Y = |E(GS )|. It holds
E[Y ] = E[
e∈E
Ye ] =
e∈E
E[Ye ] =
nk
2
p2
.
After deleting all edges from GS , by dropping a vertex from each of such edges, it
remains a set S∗
of the expected size E[|S∗
|] = E[|S| − Y ], and therefore
E[|S| − Y ] = E[|S|] − E[Y ] = np −
nk
2
p2
.
The last expression has the largest value for p = 1
k
and in such a case
E[|S∗
|] = E[|S| − Y ] =
n
2k
.
prof. Jozef Gruska IV054 7. Probabilistic method I: 41/61
EXPANDING GRAPHS
Informally, an expanding graph is a graph in which the number of neighbors of any
suﬃciently small set of vertices S is larger than c|S| for some positive constant c > 1.
OR-concentrators are a special type of expanding graphs.
Deﬁnition:An (n, d, α, c) OR-concentrator is a bipartite multigraph G = (L, R, E) with
independent sets of vertices L and R, each of cardinality n, such that
1 Every vertex in L has degree at most d;
2 For any subset S of vertices from L such that |S| ≤ αn there are more than c|S|
neighbors in R.
For applications, it is usually desirable to have d as small and c as large as possible.
Of particular interest is to study OR-concentrators in which α, c and d are constants
ﬁxed independently of n, with c > 1.
Finding an explicit construction of OR-concentrators is a non-trivial task. However, the
probabilistic method can be used to show the existence of such concentrators.
prof. Jozef Gruska IV054 7. Probabilistic method I: 42/61
Theorem: There is an integer n0 such that for all n > n0 there is an (n, 18, 1
3
, 2)
OR-concentrator.
Proof: The ﬁrst part of the proof will be for all (n, d, c, α) OR-concentrators.
Consider a random bipartite graph with two disjoint sets of vertices, L and R, each
of n vertices, in which each vertex of L chooses randomly and independently d
vertices from R as neighbours.
For any integer s let ξs denote the event that a (bad) subset of s vertices of L has
fewer than cs neighbours in R.
We ﬁrst derive an upper bound on Pr[ξs ] and then an upper bound on the sum of Pr[ξs ]
over all s ≤ αn. This way we obtain a non-trivial upper bound on the probability
that a random graph fails to be an OR-concentrator with parameters we seek.
Fix any subset S ⊆ L of size s, and any subset T ⊆ R of size cs.
(There are
n
s
ways of choosing S, and
n
cs
ways of choosing T.)
The probability that T contains all of at most ds neighbours of the vertices in S is (cs
n
)ds
.
prof. Jozef Gruska IV054 7. Probabilistic method I: 43/61
The probability of the event that all ds edges going out from some s vertices of L fall
within any cs vertices of R is bounded by
Pr[ξs ] ≤
n
s
n
cs
cs
n
ds
.
Using the inequality
n
s
≤ ne
s
s
we obtain
Pr[ξs ] ≤
n
s
n
cs
cs
n
ds
≤
ne
s
s ne
cs
cs cs
n
ds
=
s
n
d−c−1
e1+c
cd−c
s
.
prof. Jozef Gruska IV054 7. Probabilistic method I: 44/61
In a special case, for α = 1
3
and s ≤ αn we have
Pr[ξs ] =
1
3
d−c−1
e1+c
cd−c
s
≤
c
3
d
(3e)c+1
s
,
and, in addition, for c = 2, d = 18
Pr[ξs ] =
2
3
18
(3e)3
s
.
Since:
r =
2
3
18
(3e)3
≤
1
2
we have
s≥1
Pr[ξs ] ≤
s≥1
rs
=
r
1 − r
< 1.
prof. Jozef Gruska IV054 7. Probabilistic method I: 45/61
OBLIVIOUS ROUTING REVISITED - I.
As another example, we will show that probabilistic method can be used to prove the
existence of an algorithm.
We focus on the minimal number of random bits needed by randomized oblivious
routing algorithms for hypercubes. and we derive the following results.
1 A proof of the existence (by the probabilistic method) of a randomized routing
algorithm that uses (within a constant factor) only 3d random bits to route
d-dimensional hypercube.
As a consequence we get that our randomized oblivious routing algorithm, from
previous chapter, that used d2d
random bits to route a d-dimensional hypercube,
uses much too much random bits.
prof. Jozef Gruska IV054 7. Probabilistic method I: 46/61
OBLIVIOUS ROUTING REVISITED - II.
To remember: 2d
d
is a lower bound on oblivious deterministic routings on a
d-dimensional hypercube Hd .
Question: How much randomness (how many random bits) is (are) needed to have a
routing algorithm with the expected running time O(d)?
Observation I: A randomized oblivious algorithm for permutation routing is a probability
distribution on a set of deterministic oblivious routing algorithms.
Observation II. Each deterministic oblivious algorithm for a 2d
-node network is a set of
22d
= 2d
× 2d
routes, one for each source-target pair.
Note: Every randomized oblivious algorithm can be expressed by sequences
(A1, . . . , Ar ), (p1, . . . , pr ),
where each Aj is a deterministic oblivious routing algorithm and each pj is the probability
that we use Aj on a run of the randomized routing algorithm.
prof. Jozef Gruska IV054 7. Probabilistic method I: 47/61
Note: We know:
1 With 0 random bits the expected running time of any oblivious routing
algorithm on Hd is Ω 2d
d ;
2 For the randomized oblivious routing the expected running time is
O(d) and d 2d random bits are used (each of 2d nodes chooses a
random d-bit auxiliary goal).
Are so many random bits indeed necessary for eﬃcient randomized routing?
prof. Jozef Gruska IV054 7. Probabilistic method I: 48/61
Theorem: For every d there exists a randomized oblivious scheme (algorithm) for a
permutation routing on the hypercube with n = 2d
nodes that uses only 3d random bits
and still runs in the expected time 15d at most.
Proof: Notation We say that a set B = {B1, . . . , Bt } of deterministic oblivious
permutation routing algorithms on Hd is an eﬃcient routing scheme, if for any
input instance, the expected number of steps using a randomly chosen algorithm
from B is at most 15d.
To prove the theorem, we show that for every n = 2d
there is an eﬃcient routing
scheme for Hd with t = 23d
= n3
.
Our resulting randomized routing scheme will randomly choose n3
of nn
possible
deterministic oblivious routing algorithms. (nn
is due to the fact that there are n
sources and for each one we can choose from n possible intermediate destinations.)
Let us denote such deterministic algorithms by Aj , 1 ≤ j ≤ nn
. On an n-node network
there are n! distinct possible instances of the permutation routing problem, one for each
permutation on {1, 2, . . . , n}.
prof. Jozef Gruska IV054 7. Probabilistic method I: 49/61
For a permutation πi , 1 ≤ i ≤ n!, let us call a deterministic oblivious routing algorithm
Aj good if Aj routes πi in 14d or fewer steps, and bad otherwise.
By our randomized routing result: (with probability at least 1 − 1
n
every packet reaches
its destination in 14n or fewer steps) for any particular πi a fraction of at most 1
n
of the
algorithms Aj are bad - which of them are bad may diﬀer from instance to instance.
Experiment:Choose n3
indices i1, . . . , in3 , randomly, independently and uniformly from
the set {1, . . . , nn
}. We show that the set of deterministic algorithms
A = {Ai1 , . . . , Ain3 }
is an eﬃcient routing scheme with a positive probability. This will imply that an eﬃcient
routing scheme exists for any n = 2d
.
For any πi , a fraction of at most 1
n
of the algorithms A1, . . . , Ann is bad. Therefore, the
expected number of algorithms in A that are bad for πi is at most n3
· 1
n
= n2
.
prof. Jozef Gruska IV054 7. Probabilistic method I: 50/61
Let the indicator variable Xj be 1 if Aij is bad, 1 ≤ j ≤ n3
, and 0 otherwise. Thus
E[ j Xj ] ≤ n2
. Since Xj are independent, we may apply Chernoﬀ bound on X = Ai to
get
Pr[X ≥ (1 + δ)µ] ≤ F+
(µ, δ) < e− µδ2
4
(Chapter 6, page 10), to obtain (for µ = n2
, δ = 1) that the probability that more than
2n2
of the algorithms in A are bad for πi is ≤ e
−n2
4 . Let Bi denote the bad event that
more than 2n2
algorithms in A are bad for πi . Then, for n ≥ 4
Pr[
n!
i=1
Bi ] ≤
n!
i=1
Pr(Bi ) ≤ n! e− n2
4 < 1 (by Stirling’s formula for n!)
Therefore, with positive probability, not more than 2n2
algorithms in A are bad for any
πi . Hence, there exists a subset of n3
algorithms from {A1, . . . , Ann } with the property
that at most 2n2
in this subset are bad for any πi .
Let us denote this subset B = {Bi1 , . . . , Bin3 }. B is an eﬃcient routing scheme: for any
πi a randomly chosen algorithm form B fails to route πi within 14n steps with probability
at most 2n2
n3 = 2
n
.
From that one can deduce that the expected number of steps using an algorithm
randomly chosen from B is less than 15n.
prof. Jozef Gruska IV054 7. Probabilistic method I: 51/61
THE LOV´ASZ LOCAL LEMMA
prof. Jozef Gruska IV054 7. Probabilistic method I: 52/61
THE LOV´ASZ LOCAL LEMMA - MOTIVATION I
This is one of the most elegant and useful tools to apply the probabilistic method.
Suppose that we have a ﬁnite set A of bad events such that each of them may not
happen with a non-zero probability. We want to show that under certain circumstances
none of these events happen.
This is easy to show for the case events are independent. Indeed, in such a case
Pr


A∈A
¯A

 =
A∈A
Pr[ ¯A] > 0.
Lov´asz Local lemma handle the situation for the case where events are generally not
independent of each other, but each collection of events that are not independent of
some particular event A has low total probability.
The proof of the original version of Lov´asz Local Lemma was non-constructive - it gave
no guidance how to ﬁnd a particular outcome that makes all the events false.
Later, it has been shown that when events are determined by some underlying set of
independent variables and independence between two events is detected by having
non-overlapping sets of underlying variables, an actual solution can be found in
polynomial expected time.
prof. Jozef Gruska IV054 7. Probabilistic method I: 53/61
THE LOVASZ LOCAL LEMMA - SYMMETRIC VERSION
In order to formulate so called symmetric version of the lemma we need the following
deﬁnition of mutual independence. An event E is mutually independent of events
E1, . . . , En, if for any I ⊆ [1, n],
Pr E |
i∈I
Ei = Pr(E).
Dependency between events can be represented in terms of a dependency graph.
Deﬁnition: Dependency graph for a set of events E1, . . . , En is a graph G = (V , E) such
that V = {1, . . . , n} and for i = 1, . . . , n, the event Ei is mutually independent of the
events {Ej | (i, j) ∈ E}.
Lovasz Local Lemma Let E1, . . . , En be events and p, d be ﬁxed numbers such that
for all i, Pr(Ei ) ≤ p;
the degree of the dependency graph given by E1, . . . , En is bounded by a d;
4dp ≤ 1.
Then Pr n
i=1
¯Ei > 0.
prof. Jozef Gruska IV054 7. Probabilistic method I: 54/61
APPLICATION - EDGE-DISJOINT PATHS
Assume that n pairs of users need to communicate using edge-disjoint
paths, from a ﬁxed set of paths, on a given network .
Assume that each ith pair of users can choose a path from a collection
Fi of m paths for a ﬁxed m.
We show, using simple version of Lovasz local lemma that, if the
possible paths do not share too many edges, then there is a way to
choose n edge-disjoint paths connecting the n given pairs.
Theorem: If, for a ﬁxed k, any path in any Fi shares edges with no more
than k paths in any Fj , j = i, and 8nk/m ≤ 1, then there is a way to
choose n edge-disjoint paths connecting the n given pairs.
prof. Jozef Gruska IV054 7. Probabilistic method I: 55/61
PROOF of THEOREM
Let each ith pair chooses a path from Fi randomly, with probability 1
m
.
Let Ei,j be the event that paths chosen by pairs i and j share at least one edge.
Since for any i and j = i the path chosen from Fi shares edges with at most k paths
in any Fj , j = i, we have
p = Pr(Ei,j ) ≤
k
m
.
Let d be the degree of the dependency graph of all events Ei,j . Since the event Ei,j
is independent of all events Ei ,j , when i ∈ {i, j} and j ∈ {i, j}, we have d < 2n.
Since
4dp <
8nk
m
≤ 1
all conditions of the Lovacz local lemma are satisﬁed and therefore
Pr


i=j
¯Ei,j

 > 0
and therefore there is such a choice of n paths that are edge-disjoint.
prof. Jozef Gruska IV054 7. Probabilistic method I: 56/61
THE LOV´ASZ LOCAL LEMMA - GENERAL CASE
Suppose we have n events each of which occur with probability at most 1
2
. Let each of
these events corresponds to one of n ways in which a probabilistic experiment could fail.
If the event were independent, we could then assert that at such an experiment with
probability at least 2−n
, none of the events occurs.
The Lov´asz Local Lemma generalizes the above setting and result to the case where each
of the events is independent of all but small number of other events.
Lov´asz Local Lemma: Let G = (V , E) be the dependency graph for events ξ1, . . . , ξn in
a probability space. Suppose that there exist xi ∈ [0, 1], for 1 ≤ i ≤ n, such that
Pr[ξi ] ≤
(i,j)∈E
(1 − xj ).
Then
Pr[
n
i=1
¯ξi ] ≥
n
i=1
(1 − xi ).
prof. Jozef Gruska IV054 7. Probabilistic method I: 57/61
PAUL ERD¨OS
Paul Erd¨os, a Hungarian mathematician, died in September 1996 at the
age of 83.
He was the most proliﬁc mathematicians of the twentieth century with
over 1500 papers written and more than 490 collaborators.
He can be seen as the founder (in 1947) of the probabilistic method.
prof. Jozef Gruska IV054 7. Probabilistic method I: 58/61
THOUGHTS ABOUT ERD¨OS
In our century, in which mathematics is so strongly dominating by
”theory constructors” Erd¨os remained the prince of problem solvers and
the absolute monarch of problem posers. (E. Strauss)
Erd¨os had no job; he worked constantly. He had no home, the world
was his home. Possessions were a nuisance, money a bore. He lived on
a web of trust, traveling ceaselessly from Center to Center, spreading
his mathematical pollen. His enormous talents and energies were given
entirely to the Temple of Mathematics.
prof. Jozef Gruska IV054 7. Probabilistic method I: 59/61
STORIES about ERD¨OS
It is six in the morning. The house is sleeping. I prove and conjecture.
Paul Erd¨os in a letter to Vera S´os
prof. Jozef Gruska IV054 7. Probabilistic method I: 60/61
STORY about the BOOK
Story about The Book. Erd¨os liked to talk about The Book, that
contains all theorems of mathematics and for each of them one proof beautiful,
aesthetic and insightful - so called The Book proof.
Each time one of his conjectures was resolved in ”an ugly way” Erd¨os
congratulated the prover, but added ”let us now look for a Book proof”.
In 1985 he started his lecture in a math camp by saying: ”You do not
have to believe in the God, but you should believe in The Book”.
prof. Jozef Gruska IV054 7. Probabilistic method I: 61/61