Part I
Chapter 2. DESIGN METHODS for RANDOMIZED ALGORITHMS

In this chapter:
1. Main types of randomized algorithms are introduced and illustrated.
2. Main design methods for randomized algorithms are introduced and illustrated.

BASIC PROBABILITIES FOR RANDOMIZED ALGORITHMS

Let us consider any decision problem P. For any randomized algorithm A for P and any input x of A, let the set S_{A,x} of all runs of A on x be the main sample space for the analysis of A on the input x.

If one chooses the random variable V_{A,x} that assigns 1 (0) to any run of A on x with the correct (wrong) output, then the expected value of V_{A,x} is exactly the success probability of A on x. The probability of the complementary event is called the error probability of A on x.

If one takes instead a random variable that assigns to every run its complexity (number of steps), then the expectation of this random variable equals the expected time complexity of A on x.

CLASSIFICATION of RANDOMIZED ALGORITHMS

Las Vegas algorithms are those algorithms that never produce a wrong outcome, though sometimes - with bounded probability only - they may produce the outcome "I don't know" (usually denoted ??).

A one-sided-error Monte Carlo algorithm (1MC) for a language L is an algorithm that accepts any input x ∈ L with probability at least 1/2 and rejects, with certainty, any input not in L. (The error probability of such algorithms converges to 0 with exponential speed when a number of independent runs are executed.)

A bounded-error Monte Carlo algorithm (2MC) A for a function F is an algorithm for which there exists a constant ε > 0 such that, for any input x, the algorithm computes the correct output A(x) = F(x) with probability at least 1/2 + ε. (The error probability of such an algorithm can be reduced to an arbitrarily small given constant δ using only constantly many - depending on δ - runs.)

An (unbounded-error) Monte Carlo algorithm (UMC) is an algorithm that, for any input x, computes the correct output with probability greater than 1/2. (To reduce the error probability of such an algorithm below a given constant, exponentially many runs may be needed.)

RANDOMIZED ALGORITHMS DESIGN PARADIGMS - I

Fooling the adversary: One finds, for a given problem P, a set of deterministic algorithms for P such that, for each input, most of these algorithms compute the correct result of P efficiently; one then takes as the randomized algorithm for P a probability distribution over this set of deterministic algorithms.

The idea is to overcome the following situation: a set S of deterministic algorithms for P is given such that for each of them there exist bad inputs (on which the algorithm gives a wrong result or computes inefficiently), yet for each input there are many algorithms in S that give a correct answer efficiently. An example is sketched below.
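A classic illustration of this paradigm (an example of mine, not taken from the slides) is randomized quicksort: every fixed pivot rule is a deterministic algorithm with adversarial inputs that force Θ(n²) comparisons, but choosing pivots at random amounts to running a randomly chosen member of this family, so every input is sorted in O(n log n) expected time. A minimal Python sketch:

```python
import random

def randomized_quicksort(a, rng=random):
    """Quicksort where the pivot choice is randomized.

    Each fixed pivot rule is a deterministic algorithm with bad inputs;
    sampling the pivot uniformly means no adversary can prepare an input
    that is bad for most of the random choices.
    """
    if len(a) <= 1:
        return list(a)
    pivot = rng.choice(a)
    smaller = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    larger = [x for x in a if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)
```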
RANDOMIZED ALGORITHMS DESIGN PARADIGMS - II

Abundance of witnesses: A witness y for an input x and a problem P is a piece of information with which one can solve the problem P for the input x more efficiently than without it. If one finds a set S such that at least half of its elements are witnesses for P and x, then a random choice of an element of S yields a witness for P and x with probability at least 1/2.

RANDOMIZED ALGORITHMS DESIGN PARADIGMS - III

Fingerprinting: For solving various problems it can be much more efficient to work with very small fingerprints (hashes) of large inputs than with the large inputs directly.

RANDOMIZED ALGORITHMS DESIGN PARADIGMS - IV

Random sampling: If a set contains sufficiently many objects of the kind we are looking for, then random sampling from that set provides such an object with sufficiently high probability. This method is also called the probabilistic method.

Random rounding: A hard-to-solve optimization problem P is transformed into an easy-to-solve optimization problem P', by enlarging the solution space, in such a way that the outcome of any solution of the new problem can be used to create an efficient randomized algorithm for the original problem P.

Amplification of the success probability of a randomized algorithm can be achieved by repeating independent computations on the same input (but, of course, with different random auxiliary inputs).

Example for random rounding: A {0,1}-problem P, in which the task is to find a proper assignment of values from the set {0,1} to the variables, is transformed into a [0,1]-problem in which the task is to assign to each variable x a number n_x ∈ [0,1]; to solve the problem P with high probability, the value 1 is then assigned to x with probability n_x.

STRINGS EQUALITY PROBLEM

Notation: For a binary vector/string x = x_1 x_2 ... x_n let

  Number(x) = Σ_{i=1}^{n} x_i 2^{n−i}.

Problem: Each of two parties, say A and B, has one n-bit string. The two parties have to decide, by communication only, whether their strings are equal. How to do that efficiently?

Every deterministic protocol clearly requires sending at least n bits in the worst case.

STRINGS EQUALITY PROBLEM - RANDOMIZED PROTOCOL

Initial situation: Party A (Alice) has a binary string x = x_1 x_2 ... x_n; party B (Bob) has a binary string y = y_1 y_2 ... y_n.

Protocol:
1. Alice chooses, randomly, a prime p ≤ n² and sends to Bob p and the binary representation of the number s = Number(x) mod p;
2. Bob computes t = Number(y) mod p and declares that x = y iff s = t.

Analysis: The protocol requires sending at most 2 lg n² ≤ 4 lg n bits.

Example: For n = 2^64 (≈ 1.8 · 10^19) the protocol requires sending at most 256 bits.

ERROR ANALYSIS

Let us say that a prime 2 < p < n² is bad for a pair (x, y), x ≠ y, if the above protocol, for such an input pair (x, y) and such a choice of prime, yields a wrong answer.

Notation: Prim(m) = the number of primes smaller than m.

The error probability for an input (x, y) is

  (number of bad primes for (x, y)) / Prim(n²).

The so-called Prime Number Theorem says that there are approximately m / ln m primes among the first m integers. Since Prim(m) > m / ln m for m > 69, we have, for n ≥ 9,

  Prim(n²) > n² / ln n² = n² / (2 ln n).
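A minimal Python sketch of this fingerprinting protocol; the helper names and the sieve-based prime generation are illustrative choices, not part of the slides (a sieve is fine for modest n; for n near 2^64 one would instead generate random primes by primality testing). The helpers are reused in a later sketch.

```python
import random

def primes_up_to(m):
    """All primes <= m (sieve of Eratosthenes); adequate for a demo."""
    sieve = bytearray([1]) * (m + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(m ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, m + 1, i)))
    return [i for i in range(2, m + 1) if sieve[i]]

def number(bits):
    """Number(x) = sum_i x_i 2^(n-i), i.e. the usual value of a binary string."""
    return int(bits, 2)

def equality_protocol(x, y, rng=random):
    """Alice holds x, Bob holds y (n-bit strings, n >= 2 assumed).
    Alice sends a random prime p <= n^2 and s = Number(x) mod p
    (O(log n) bits in total); Bob answers 'equal' iff his fingerprint matches."""
    n = len(x)
    p = rng.choice(primes_up_to(n * n))
    s = number(x) % p              # Alice's message
    t = number(y) % p              # Bob's fingerprint
    return s == t                  # wrong only if x != y and p is a bad prime
```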
NUMBER OF BAD PRIMES

Lemma: The number of bad primes for a pair (x, y) with x ≠ y is at most n − 1.

Proof: A prime p is clearly bad for the pair (x, y), x ≠ y, if and only if p divides the number

  w = |Number(x) − Number(y)| < 2^n.

Observe that w can be uniquely factorized as w = p_1^{e_1} p_2^{e_2} ... p_k^{e_k}, where p_1 < p_2 < ... < p_k are primes. We can show that k ≤ n − 1. Indeed, if k ≥ n, then w ≥ 1 · 2 · 3 · ... · n = n! > 2^n, which is a contradiction.

The probability of error of the above equality protocol is therefore at most

  (n − 1) / Prim(n²) ≤ (n − 1) / (n² / ln n²) ≤ (ln n²) / n = (2 ln n) / n,

which is at most 0.74 · 10^{−14} in case n = 10^{16}.

PROBABILITY AMPLIFICATION

If the above protocol is repeated 10 times, each time with a freshly chosen random prime, and the answer is "x = y" each time, then the probability of an error is at most

  ((2 ln n) / n)^{10},

and therefore, for n = 10^{16}, the error probability is at most 0.47 · 10^{−141}.

Reminder: The probability of the correct output in a single run is at least 1 − (2 ln n)/n.

A LAS VEGAS ALGORITHM WITH ?? - EXAMPLE

Let Alice have 10 n-bit strings x_i ∈ {0,1}^n, 1 ≤ i ≤ 10, n > 200, and let Bob have 10 n-bit strings y_i ∈ {0,1}^n, 1 ≤ i ≤ 10. Their task is to determine, by communication, whether there exists an i_0 such that x_{i_0} = y_{i_0}.

One can show that every deterministic protocol for this problem has to exchange 10n bits in the worst case. Therefore no deterministic algorithm is essentially more efficient than sending all bits from Alice to Bob and then letting Bob compare the pairs x_i and y_i until (if at all) such an i_0 is found.

We show the existence of a Las Vegas protocol that solves the above problem (that is, decides whether such an i_0 exists) with communication complexity n + O(lg n).

PROTOCOL

Alice first randomly chooses 10 primes p_1, ..., p_10, each smaller than n². Afterwards, Alice computes all s_i = Number(x_i) mod p_i, i ∈ {1, 2, ..., 10}, and, finally, she sends to Bob all 20 numbers p_1, ..., p_10, s_1, ..., s_10.

Bob computes all numbers r_i = Number(y_i) mod p_i and compares the elements in all pairs (s_i, r_i). If s_i ≠ r_i for all i, then Bob outputs NO (or 0). Otherwise, if j is the smallest integer such that s_j = r_j, then Bob sends to Alice the pair (j, y_j). If x_j = y_j, Alice outputs YES (or 1); otherwise she outputs ?? (because there may exist another k with x_k = y_k). A sketch of this protocol in code follows.
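A Python sketch of this protocol, reusing primes_up_to and number from the equality-protocol sketch above; the return values 1, 0 and "??" mirror YES, NO and ?? on the slide.

```python
import random

def las_vegas_common_index(xs, ys, rng=random):
    """Alice holds xs, Bob holds ys (ten n-bit strings each).
    Decides whether x_i == y_i for some i; never lies, may answer '??'."""
    n = len(xs[0])
    primes = primes_up_to(n * n)                  # from the previous sketch
    ps = [rng.choice(primes) for _ in range(10)]
    ss = [number(x) % p for x, p in zip(xs, ps)]  # Alice -> Bob: all p_i, s_i
    rs = [number(y) % p for y, p in zip(ys, ps)]  # Bob's own fingerprints
    matching = [j for j in range(10) if ss[j] == rs[j]]
    if not matching:
        return 0          # NO: differing fingerprints prove x_i != y_i for all i
    j = matching[0]       # Bob -> Alice: the pair (j, y_j), n + 4 more bits
    if xs[j] == ys[j]:
        return 1          # YES: an equal pair was exhibited
    return "??"           # fingerprint collision before a real match; give up
```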
COMPLEXITY ANALYSIS of the PROTOCOL 1/2

The above protocol exchanges at most 20 lg n² + n + 4 bits (the 4 bits are needed to send j).

Proof that the protocol is a Las Vegas protocol: If x_i ≠ y_i for all i, then, according to the analysis of the strings equality protocol, for each fixed i the probability that Number(x_i) mod p_i ≠ Number(y_i) mod p_i is at least 1 − (2 ln n)/n. Hence the probability that the fingerprints differ for all i - and therefore that the protocol produces the correct output NO - is at least

  (1 − (2 ln n)/n)^{10}.

Moreover, for sufficiently large n,

  (1 − (2 ln n)/n)^{10} ≥ 1 − (20 ln n)/n ≥ 1/2.

COMPLEXITY ANALYSIS of the PROTOCOL 2/2

Let us now consider the complementary case - namely, that there is a j with x_j = y_j - and let j_0 be the smallest such j. The protocol then outputs YES iff Number(x_i) mod p_i ≠ Number(y_i) mod p_i for all i < j_0; let us denote this event by E_{j_0}.

If j_0 = 1, then the protocol accepts the input with certainty. If j_0 > 1, then, as discussed before, the probability of the event E_{j_0} is at least

  (1 − (2 ln n)/n)^{j_0 − 1} ≥ 1 − (2(j_0 − 1) ln n)/n

for sufficiently large n, and therefore the protocol outputs YES (1) with probability at least 1 − (18 ln n)/n, which is larger than 1/2 for all n ≥ 189.

In the remaining case, when there is an l < j_0 such that Number(x_l) mod p_l = Number(y_l) mod p_l, the protocol produces the output ??; it thus never lies and is indeed a Las Vegas protocol.

TWO TYPES of LAS VEGAS ALGORITHMS

As already defined, there are two types of Las Vegas algorithms: algorithms that never produce ??, and algorithms that may produce ??.

Note: Las Vegas algorithms may not terminate for some inputs!

Claim: Any Las Vegas algorithm A1 can be converted into a Las Vegas algorithm A2 that solves the same problem and never produces ??. The construction of A2 is simple: each time A1 is about to produce the output ?? - which can happen only with bounded probability - a new run of A1 is started on the same input.

AMPLIFICATION of 1MC ALGORITHMS

The error probability of 1MC algorithms decreases exponentially with the number of repetitions of the computation. Indeed, consider k independent runs of the algorithm. If at least one output is 1 (accept), then the input is accepted with certainty. If all k outputs are 0 (reject), then the overall output is NO (reject) and the error probability is at most (1/2)^k.

Because of this exponential decrease of the error probability under repeated runs, 1MC algorithms are very popular.

AMPLIFICATION of 2MC ALGORITHMS

Let A be a 2MC algorithm for a function F and let ε > 0 be such that Prob(A(x) = F(x)) ≥ 1/2 + ε. For any integer k, let A_k be the algorithm that performs k independent runs of A; if there is an α that appears more than k/2 times among the outputs, then A_k produces α as the output; if there is no such α, A_k produces ?? as the output.

One can show that if a δ > 0 is fixed, then Prob(A_k(x) = F(x)) ≥ 1 − δ whenever

  k ≥ 2 ln δ / ln(1 − 4ε²).

If ε and δ are considered constants, then so is k, and therefore Time_{A_k}(n) = O(Time_A(n)). A code sketch follows below.
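A hedged sketch of this majority-vote amplification; the function names and interface are mine, and algorithm stands for any randomized function of (x, rng) implementing a 2MC computation.

```python
import math
import random
from collections import Counter

def runs_needed(eps, delta):
    """Smallest k satisfying the slide's bound k >= 2 ln(delta) / ln(1 - 4 eps^2)."""
    return math.ceil(2 * math.log(delta) / math.log(1 - 4 * eps * eps))

def amplify_2mc(algorithm, x, k, rng=random):
    """Run a 2MC algorithm k times independently and output the strict
    majority value; output '??' if no value occurs more than k/2 times."""
    counts = Counter(algorithm(x, rng) for _ in range(k))
    value, count = counts.most_common(1)[0]
    return value if count > k / 2 else "??"

# Example: eps = 0.1 and delta = 0.01 give runs_needed(0.1, 0.01) == 226 runs.
```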
2MC versus UMC ALGORITHMS

Basic question: What is the difference between 2MC and UMC algorithms?

Answer: For a UMC algorithm A it may happen that the distance between the error probability and 1/2 tends to 0 with growing input size. As a consequence, if we design, for a given δ > 0, an algorithm A_k that performs k independent runs of A and satisfies Prob(A_k(x) = F(x)) > 1 − δ, then the running time of A_k may be exponential in the input length.

SECRET SHARING BETWEEN TWO PARTIES

Problem: The task is to "partition", by a moderator, a given binary-string secret S between two parties P1 and P2 in such a way that neither party alone has the slightest idea what S is, but together they can easily determine S.

Method: The moderator distributes the binary-string secret S between the two parties P1 and P2 by choosing a random binary string b of the same length as S and sending b to P1 and S ⊕ b to P2.

This way, neither P1 nor P2 alone has the slightest idea about S, but together they easily recover S by computing b ⊕ (S ⊕ b) = S.

PERFECT MATCHING ALGORITHM - I

Let G = (V, E) be an undirected graph. A subset X ⊆ E is said to be a matching of G if no two edges in X have a common node. A matching is said to be a perfect matching if its edges cover all nodes of G.

Example: Which of the graphs in the next figure has a perfect matching?

There are polynomial-time algorithms to decide whether a given graph has a perfect matching, but none is as simple as the randomized algorithm based on the so-called Tutte theorem presented below.

TUTTE MATRIX

Basic concept - the Tutte matrix of a graph: Let G = (V, E) be a graph with nodes V = {1, 2, ..., n}, |V| even. The Tutte matrix A_G = {a_ij}_{i,j=1}^{n} of G is defined by

  a_ij = x_ij    if (i, j) ∈ E and i < j;
  a_ij = −x_ij   if (i, j) ∈ E and i > j;
  a_ij = 0       if (i, j) ∉ E,

where the x_ij are variables, all different for different pairs i, j.

TUTTE THEOREM - I

Example: For the graph on six nodes in which node 1 is connected with nodes 4 and 5, node 2 with nodes 5 and 6, and node 3 with nodes 5 and 6, the Tutte matrix has the form

  (   0      0      0     x14    x15     0   )
  (   0      0      0      0     x25    x26  )
  (   0      0      0      0     x35    x36  )
  ( −x14     0      0      0      0      0   )
  ( −x15   −x25   −x35     0      0      0   )
  (   0    −x26   −x36     0      0      0   )

Tutte theorem: A graph G = (V, E) with |V| even has a perfect matching iff the determinant of the corresponding Tutte matrix is not identically zero.

Proof: The determinant of A_G equals

  det(A_G) = Σ_π σ_π Π_{i=1}^{n} a_{iπ(i)},

where π ranges over the permutations of {1, 2, ..., n} and σ_π = 1 (σ_π = −1) if π is a product of an even (odd) number of transpositions.

TUTTE THEOREM - OBSERVATIONS

Observation 1: For a permutation π, Π_{i=1}^{n} a_{iπ(i)} ≠ 0 iff G_π = {(i, π(i)) | 1 ≤ i ≤ n} is a subgraph of G.

Observation 2: Permutations π with at least one odd cycle contribute nothing to the determinant of A_G, because to each such permutation π there is a permutation π' such that Π_{i=1}^{n} a_{iπ(i)} = −Π_{i=1}^{n} a_{iπ'(i)}.

Observation 3: It is therefore sufficient to consider permutations π such that G_π consists only of even cycles.

Notation: Let the permutation π^r be obtained from π by reversing all its cycles.

Notation: For a perfect matching E', let t_{E'} denote the product of the a_ij corresponding to the edges of E'.

TUTTE THEOREM - CASE ANALYSIS

Case I: π = π^r. Then G_π consists of cycles of length 2, and π corresponds to a perfect matching E' whose contribution to det(A_G) is (t_{E'})².

Case II: π ≠ π^r. In this case π and π^r correspond to the union of two perfect matchings E' and E'', obtained by alternately selecting edges within the cycles, and the joint contribution of π and π^r to det(A_G) is 2 t_{E'} t_{E''}.

Conclusion:

  det(A_G) = (t_{E_1} + ... + t_{E_k})²,

where E_i denotes the i-th perfect matching.

EXPLANATION

Let us illustrate the claims from the previous slide on a graph with 4 nodes 1, 2, 3, 4.

Case 1: Let the graph have the edges 1-2 and 3-4 only, and take the permutation π = π^r: 1 → 2, 2 → 1, 3 → 4, 4 → 3. The perfect matching E' is {1-2, 3-4}, t_{E'} = x12 x34, and

  Π_{i=1}^{4} a_{iπ(i)} = a12 a21 a34 a43 = (x12)(−x12)(x34)(−x34) = x12² x34² = (t_{E'})².

Case 2: Let the graph have the edges 1-2, 2-3, 3-4, 4-1 and take the permutations

  π: 1 → 2 → 3 → 4 → 1,   π^r: 1 → 4 → 3 → 2 → 1.

Here t_{E'} = x12 x34 and t_{E''} = x23 x14, and, taking the sign σ_π into account, the joint contribution of π and π^r to det(A_G) is

  2 x12 x23 x34 x14 = 2 t_{E'} t_{E''}.

A randomized test based directly on Tutte's theorem is sketched below.
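A sketch of the resulting randomized test (the parameter choices are mine): substitute independent random field elements for the variables x_ij and test whether the determinant is zero modulo a prime. If G has no perfect matching, det(A_G) ≡ 0 and the test always answers False; if G has one, a random evaluation is zero with probability at most n/p per trial (Schwartz-Zippel), so this is a one-sided-error Monte Carlo algorithm.

```python
import random

def det_mod(a, p):
    """Determinant of an integer matrix modulo a prime p, by Gaussian
    elimination (p prime, so pow(., p - 2, p) inverts nonzero pivots)."""
    a = [row[:] for row in a]
    n = len(a)
    det = 1
    for col in range(n):
        piv = next((r for r in range(col, n) if a[r][col] % p != 0), None)
        if piv is None:
            return 0
        if piv != col:
            a[col], a[piv] = a[piv], a[col]
            det = -det
        det = det * a[col][col] % p
        inv = pow(a[col][col], p - 2, p)
        for r in range(col + 1, n):
            f = a[r][col] * inv % p
            for c in range(col, n):
                a[r][c] = (a[r][c] - f * a[col][c]) % p
    return det % p

def has_perfect_matching(n, edges, trials=5, p=10**9 + 7):
    """Randomized Tutte-theorem test; nodes are 0..n-1, edges are pairs."""
    if n % 2 == 1:
        return False
    for _ in range(trials):
        a = [[0] * n for _ in range(n)]
        for u, v in edges:
            i, j = min(u, v), max(u, v)
            r = random.randrange(1, p)       # random value for variable x_ij
            a[i][j], a[j][i] = r, p - r      # a_ij = x_ij, a_ji = -x_ij (mod p)
        if det_mod(a, p) != 0:
            return True      # a nonzero evaluation proves det(A_G) is not identically 0
    return False             # no nonzero evaluation found; very likely no perfect matching
```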
RANDOM SELECT

Problem: Given a set S = {a_1, ..., a_n} of n > 0 distinct numbers and a k with 1 ≤ k ≤ n, find the k-th smallest number in S.

A naive way to solve the problem is to sort S first; this requires O(n lg n) comparisons. The following randomized algorithm RSELECT does it in O(n) expected time:

Algorithm RSELECT(S, k):
1. If n = 1, output a_1.
2. Otherwise choose i ∈_R {1, 2, ..., n} and
   (a) compute S_< = {b ∈ S | b < a_i} and S_> = {b ∈ S | b > a_i};
   (b) if |S_<| ≥ k then RSELECT(S_<, k);
       else if |S_<| = k − 1 then output a_i;
       else RSELECT(S_>, k − |S_<| − 1).

This is clearly a Las Vegas algorithm. A Python version follows.
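A direct Python transcription of RSELECT (assuming, as the slide does, that the numbers are distinct):

```python
import random

def rselect(S, k, rng=random):
    """Return the k-th smallest element of S (1-indexed, distinct numbers).
    Expected O(n) time: a random pivot splits S evenly on average."""
    if len(S) == 1:
        return S[0]
    pivot = rng.choice(S)                    # a_i for a random i
    smaller = [b for b in S if b < pivot]    # S_<
    larger = [b for b in S if b > pivot]     # S_>
    if len(smaller) >= k:
        return rselect(smaller, k, rng)
    if len(smaller) == k - 1:
        return pivot
    return rselect(larger, k - len(smaller) - 1, rng)
```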
CLASSIFICATION of RANDOMIZED OPTIMIZATION ALGORITHMS

The classification of randomized algorithms we have used so far was based on the frequency of correct outputs; it is suited only to algorithms for decision problems and for the computation of functions, not to optimization problems. In the case of optimization problems we do not take as the output the most frequent output of several runs, but the best output according to some optimization criterion.

Moreover, in the case of optimization problems our goal is not always to find an optimal solution. We are usually quite happy to find (in a feasible way) an almost optimal solution - one whose cost (quality) does not differ much from the cost (quality) of an optimal solution. All that means that a different approach to the classification of randomized approximation algorithms is needed.

ILLUSTRATION

If a randomized algorithm A computes an optimal solution for an input x only with probability at least 1/|x|, this does not mean that A is useless. Indeed, one can execute |x| independent runs of A on a given input x - call this the algorithm A_x - and take the best of all their outputs.

Let us explore the success probability of such an algorithm A_x. The probability of not computing an optimal solution in one run is at most 1 − 1/|x|, and therefore the probability that A_x does not find an optimal solution in |x| independent runs is at most

  (1 − 1/|x|)^{|x|} < 1/e,

so A_x computes an optimal solution with the constant probability at least 1 − 1/e.

DEFINITION of OPTIMIZATION PROBLEMS

Definition: An optimization problem is a 6-tuple P = (Σ_I, Σ_O, L, M, cost, goal), where
1. Σ_I is an input alphabet;
2. Σ_O is an output alphabet;
3. L ⊆ Σ_I* is the language of feasible inputs, and any x ∈ L is called a problem instance of P;
4. M is a function from L to 2^{Σ_O*}, and for each x ∈ L, M(x) is the set of feasible solutions for x;
5. cost is a function ∪_{x∈L}(M(x) × {x}) → R⁺, called the cost function;
6. goal ∈ {minimum, maximum} is the objective.

SOLVING an OPTIMIZATION PROBLEM

Observe that solving an optimization problem is more like computing a relation than computing a function, because we are usually happy with finding any one of the optimal solutions, or even a solution quite close to an optimal one.

A feasible solution α ∈ M(x) is called optimal for the problem instance x of P if

  cost(α, x) = goal{cost(β, x) | β ∈ M(x)}.

An algorithm A is said to solve P if, for any x ∈ L,

  A(x) ∈ M(x)  and  cost(A(x), x) = goal{cost(β, x) | β ∈ M(x)}.

If the goal is minimum (maximum), we talk about a minimization (maximization) problem.

EXAMPLE - MINIMUM COST HAMILTONIAN CYCLE PROBLEM

Input: A weighted complete graph (G, c), where G = (V, E), V = {v_1, ..., v_n}, E ⊆ V × V, and c : E → N is a cost function.

Feasible solutions: For any problem instance (G, c), let M(G, c) be the set of Hamiltonian cycles of G. Each such cycle is represented by a sequence of vertices v_{i_1}, v_{i_2}, ..., v_{i_n}, v_{i_1}, where i_1, i_2, ..., i_n is a permutation of (1, 2, ..., n) and, for each j, (v_{i_j}, v_{i_{j+1}}) is an edge of G.

Cost of a Hamiltonian cycle: For every Hamiltonian cycle H = (v_{i_1}, v_{i_2}, ..., v_{i_n}, v_{i_1}) ∈ M(G, c),

  cost(H, (G, c)) = Σ_{j=1}^{n} c((v_{i_j}, v_{i_{j+1}})),  where i_{n+1} = i_1.

Goal: minimum - to find a Hamiltonian cycle with minimum cost.

EXAMPLE

(Figure: a weighted complete graph on the vertices v1, ..., v5; the edge weights are not reproduced here.) For this graph,

  cost((v1, v2, v3, v4, v5, v1), (G, c)) = 19,
  cost((v1, v5, v3, v2, v4, v1), (G, c)) = 5.

INTEGER LINEAR PROGRAMMING

Given is a system of linear equations and a linear function over the variables of these equations. The task is to find a solution of the system of equations such that the value of the linear function is minimized. More formally:

Input: An m × n matrix A = {a_ij} with integer entries and two integer vectors b = (b_1, ..., b_m)^T and c = (c_1, ..., c_n).

Set of feasible solutions: M(A, b) = {X = (x_1, ..., x_n)^T | AX = b}.

Cost of a solution: For X = (x_1, ..., x_n) ∈ M(A, b),

  cost(X, c) = c · X = Σ_{i=1}^{n} c_i x_i.

Goal: minimum.

QUANTIFICATION of "ALMOST OPTIMAL SOLUTIONS" 1/2

Many very important optimization problems are NP-hard, and so we know only exponential-time algorithms for finding their optimal solutions. The idea is to jump from exponential to polynomial time by weakening the requirements - to be satisfied with almost optimal solutions. The next definition tries to quantify this.

Definition: Let P = (Σ_I, Σ_O, L, M, cost, goal) be an optimization problem. We say that A is a consistent algorithm for P if, for every x ∈ L, the output A(x) is a feasible solution for x, that is, A(x) ∈ M(x).

We say that an approximation algorithm A, mapping each instance x of an optimization problem P to one of its feasible solutions, has the ratio bound ρ_A(n) and the relative error bound ε_A(n) if

  max_{|x|=n} max{ cost(A(x)) / cost(Opt(x)), cost(Opt(x)) / cost(A(x)) } ≤ ρ_A(n)

and

  max_{|x|=n} |cost(A(x)) − cost(Opt(x))| / max{cost(Opt(x)), cost(A(x))} ≤ ε_A(n).

Both definitions are chosen to correspond to our intuition and to apply simultaneously to minimization and maximization problems. Both bounds compare an approximate solution with an optimal one, but in two different ways; the small helper functions below make the two measures concrete.
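The two measures written as code (the helper names are hypothetical; cost_opt stands for the cost of an optimal solution):

```python
def ratio_bound(cost_a, cost_opt):
    """Performance ratio of one solution: the max of the two quotients, so
    one formula covers both minimization and maximization; always >= 1."""
    return max(cost_a / cost_opt, cost_opt / cost_a)

def relative_error(cost_a, cost_opt):
    """Relative error of one solution; lies in [0, 1), equals 0 iff optimal."""
    return abs(cost_a - cost_opt) / max(cost_a, cost_opt)

# For the two Hamiltonian cycles above (assuming the cost-5 cycle is optimal):
# ratio_bound(19, 5) == 3.8 and relative_error(19, 5) == 14/19, about 0.74.
```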
QUANTIFICATION of "ALMOST OPTIMAL SOLUTIONS" 2/2

For any δ > 1 we say that A is a δ-approximation algorithm for P if, for every integer n, ρ_A(n) ≤ δ.

The ratio bound is never less than one; an optimal algorithm has ratio bound 1. The larger the best possible ratio bound of an approximation algorithm, the worse the algorithm.

APPROXIMATION ALGORITHMS for NP PROBLEMS

Two general problems concerning the approximation of NP-complete problems are of special interest and importance.

The constant relative error bound problem: Given an NP-complete optimization problem P, with a cost-of-solution function c, and an ε > 0, does there exist a polynomial-time approximation algorithm for P with the relative error bound ε?

The approximation scheme problem: Given an NP-complete optimization problem P with a cost-of-solution function c, does there exist a polynomial-time algorithm that, given an ε > 0 and an input instance x, produces an approximation for P and x with the relative error bound ε?

APPROXIMATION THRESHOLDS

An algorithm A is said to be an ε-approximation algorithm for an optimization problem P if ε is its relative error bound. The approximation threshold for P is the greatest lower bound of all ε > 0 such that there is a polynomial-time ε-approximation algorithm for P.

It can be shown that NP-complete problems differ very much with respect to their approximation thresholds. Note that if an optimization problem P has approximation threshold 0, then a (polynomial-time) approximation arbitrarily close to the optimum is possible. Note also that if P has approximation threshold 1, then no universal (polynomial-time) approximation method is possible.

EXAMPLES

Example 1: The approximation threshold for the optimization version of the KNAPSACK PROBLEM is 0.
Example 2: The approximation threshold for the VERTEX COVER PROBLEM is ≤ 1/2.
Example 3: Unless P = NP, the approximation threshold for the TRAVELING SALESMAN PROBLEM is 1.

WHY APPLY RANDOMIZATION in DISCRETE OPTIMIZATION

One of the main goals in the area of discrete optimization is to improve the approximation ratio: one tries to design randomized approximation algorithms that, with high probability, produce feasible solutions whose cost (quality) is not far from the optimal cost.

In the analysis of randomized algorithms we therefore consider the approximation ratio as a random variable, and the aim is then either
1. to estimate its expected value E[Ratio], or
2. to guarantee that a certain approximation ratio is achieved with probability at least 1/2.
These two different aims lead to two ways of defining randomized approximation algorithms.

Definition 1: Let P = (Σ_I, Σ_O, L, M, cost, goal) be an optimization problem. For any δ > 1, a randomized algorithm A is called a randomized E[δ]-approximation algorithm for P if, for every x ∈ L,
1. Prob(A(x) ∈ M(x)) = 1, and
2. E[Ratio_A(x)] ≤ δ.

Definition 2: Let P = (Σ_I, Σ_O, L, M, cost, goal) be an optimization problem. For any δ > 1, a randomized algorithm A is called a randomized δ-approximation algorithm for P if, for every x ∈ L,
1. Prob(A(x) ∈ M(x)) = 1, and
2. Prob(Ratio_A(x) ≤ δ) ≥ 1/2.
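These definitions are commonly illustrated (this example is mine, not from the slides) by random sampling for MAX-SAT: a uniformly random assignment satisfies each clause with k literals with probability 1 − 2^{−k} ≥ 1/2, so the expected number of satisfied clauses is at least half of the optimum. A sketch, with clauses encoded as lists of signed variable indices:

```python
import random

def random_maxsat_assignment(num_vars, clauses, rng=random):
    """Random sampling for MAX-SAT. A clause is a list of nonzero ints:
    +v stands for the literal x_v, -v for its negation. Always returns a
    feasible solution; E[satisfied] >= m/2 >= Opt/2 for m clauses."""
    assignment = [None] + [rng.random() < 0.5 for _ in range(num_vars)]
    satisfied = sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
    return assignment[1:], satisfied
```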
ONLINE ALGORITHMS

Another area in which randomization plays an important role is online algorithms. In practice, problems of the following kind are often important: one obtains only a part of the input, which has to be processed immediately; once this is done one obtains another part of the input and has to process it immediately again, and so on - the input can even be infinitely long. Such problems are called online problems, and algorithms that solve them are called online algorithms.

Example: Scheduling - the immediate assignment of resources to requests arriving one after another.

Key question: How good can an online algorithm (which does not know the future) be in comparison with an algorithm that knows the whole input (the future) from the beginning?

EVALUATION of ONLINE ALGORITHMS

Let P = (Σ_I, Σ_O, L, M, cost, goal) be an optimization problem that can be viewed as an online problem. (An optimization problem can be viewed as an online problem when each prefix y of every input x can itself be viewed as a problem instance, and one is required to provide a solution for y that has to remain unchanged as a part of the solution for the whole input x.)

An algorithm A is an online algorithm for P if, for every input x = x_1 x_2 ... x_n ∈ L, the following conditions are satisfied:
1. For all i ∈ {1, ..., n}, x_1 x_2 ... x_i is a feasible input.
2. A(x) ∈ M(x), i.e. A always computes a feasible solution.
3. For all i ∈ {1, ..., n}, A(x_1 x_2 ... x_i) is a part of A(x), i.e. the decisions made for the prefix x_1 x_2 ... x_i of x cannot be changed any more.

For every input x ∈ L, the competitive ratio comp_A(x) of A on x is the number

  comp_A(x) = max{ Opt_P(x) / cost_A(x), cost_A(x) / Opt_P(x) },

where Opt_P(x) denotes the cost of an optimal solution for the instance x of the problem P.

Let δ ≥ 1. We say that A is a δ-competitive algorithm for P if comp_A(x) ≤ δ for all x ∈ L. Let δ ≥ 1 be a real. We say that an online problem P is δ-hard if there does not exist any d-competitive online algorithm for P with d < δ.

RANDOMNESS EXTRACTORS

Extractors are algorithms that produce, from a long and only weakly random bit-string, a shorter but more random bit-string. In other words, an extractor is a mapping which, when applied to a high-entropy source, generates a shorter yet (almost) uniformly distributed output.

In a more general approach, an extractor is an algorithm that converts a long weakly random source and a truly random short seed into an almost uniformly distributed random output (that is longer than the seed and shorter than the source). An extractor is thus a certain kind of pseudorandom generator. No extractor is currently known that has been proven to work when applied to any type of high-entropy source.

von NEUMANN EXTRACTOR

The von Neumann extractor keeps taking successive non-overlapping pairs of consecutive bits from the input stream; if the two bits are the same, no output is generated, and if they are different, the first of them is output. For example, the input sequence

  10111100001111000011100110010101

is transformed into the sequence 1101000.
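A few lines of Python implementing the extractor and checking it against the example from the text:

```python
def von_neumann_extract(bits):
    """Scan non-overlapping pairs: '01' -> '0', '10' -> '1', '00'/'11' -> nothing."""
    return "".join(bits[i]
                   for i in range(0, len(bits) - 1, 2)
                   if bits[i] != bits[i + 1])

# Reproduces the example from the text:
assert von_neumann_extract("10111100001111000011100110010101") == "1101000"
```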
The von Neumann extractor can be shown to produce a uniform output even if the distribution of the input bits is not uniform, as long as each bit has the same probability of being one and there is no correlation between successive bits.

EXTRACTORS - FORMAL DEFINITION

Definition: A (k, ε)-extractor is a mapping

  Ext : {0,1}^n × {0,1}^d → {0,1}^m

such that for every distribution X on {0,1}^n with H_∞(X) ≥ k and a uniformly random seed s, the distribution Ext(X, s) is ε-close to the uniform distribution on {0,1}^m. The aim is to have n > m and d ≪ m.

By a probabilistic method, to be discussed later, one can show that (k, ε)-extractors exist for many values of k and ε.

Note: H_∞ stands for the so-called min-entropy, H_∞(X) = min_x (−lg Prob(X = x)), a measure of the amount of randomness in the worst case.
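A small empirical check (a demo of mine, not from the slides) of what the definition asks for: feed the von Neumann extractor a heavily biased but independent source and estimate how close m-bit output blocks are to uniform (here m = 2).

```python
import random
from itertools import product

def von_neumann_extract(bits):  # as in the previous sketch
    return "".join(bits[i] for i in range(0, len(bits) - 1, 2)
                   if bits[i] != bits[i + 1])

# Biased but independent source: Pr[bit = 1] = 0.8.
rng = random.Random(0)
source = "".join("1" if rng.random() < 0.8 else "0" for _ in range(200_000))
out = von_neumann_extract(source)

# Empirical statistical (total variation) distance of 2-bit output blocks
# from the uniform distribution on {0,1}^2 - the epsilon of the definition.
blocks = [out[i:i + 2] for i in range(0, len(out) - 1, 2)]
counts = {"".join(t): 0 for t in product("01", repeat=2)}
for b in blocks:
    counts[b] += 1
tv = 0.5 * sum(abs(c / len(blocks) - 1 / 4) for c in counts.values())
print(f"{len(out)} output bits, empirical distance from uniform: {tv:.4f}")
```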