Dynamic Games of Complete Information
Repeated Games
Inﬁnitely Repeated Games
Let $G = (\{1, 2\}, (S_1, S_2), (u_1, u_2))$ be a two-player strategic-form game.
An infinitely repeated game $G_{\mathrm{irep}}$ based on $G$ proceeds in stages so that in each stage $t \geq 1$ the players choose a strategy profile $s^t = (s^t_1, s^t_2)$.
Recall that a history of length $t \geq 0$ is a sequence $h = s^1 \cdots s^t \in S^t$ of $t$ strategy profiles. Denote by $H(t)$ the set of all histories of length $t$.
A pure strategy for player $i$ in the infinitely repeated game $G_{\mathrm{irep}}$ is a function
$$\tau_i : \bigcup_{t=0}^{\infty} H(t) \to S_i$$
which for every possible history chooses the next step of player $i$.
Every pure strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ induces a sequence of pure strategy profiles $w_\tau = s^1 s^2 \cdots$ in $G$ so that $s^t_i = \tau_i(s^1 \cdots s^{t-1})$.
(Here for $t = 1$ the history $s^1 \cdots s^{t-1}$ is the empty history $\varepsilon$.)
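The definitions above can be mirrored in a short Python sketch. It is only an illustration under assumptions that are not part of the text: actions are plain strings, a history is a tuple of profiles, and the helpers `constant`, `copy_opponent` and `induced_play` are hypothetical names introduced here.

```python
from typing import Callable, Tuple

Profile = Tuple[str, str]            # one stage profile s^t = (s^t_1, s^t_2)
History = Tuple[Profile, ...]        # h = s^1 ... s^t; the empty tuple is the empty history
Strategy = Callable[[History], str]  # tau_i maps every history to an action of player i


def constant(action: str) -> Strategy:
    """Strategy that plays the same action after every history."""
    return lambda history: action


def copy_opponent(player: int, first: str) -> Strategy:
    """Strategy of `player` (0 or 1): start with `first`, then repeat the opponent's last action."""
    def tau(history: History) -> str:
        if not history:
            return first
        return history[-1][1 - player]
    return tau


def induced_play(tau1: Strategy, tau2: Strategy, stages: int) -> History:
    """First `stages` profiles of w_tau, using s^t_i = tau_i(s^1 ... s^{t-1})."""
    history: History = ()
    for _ in range(stages):
        history += ((tau1(history), tau2(history)),)
    return history


print(induced_play(copy_opponent(0, "a"), constant("b"), 3))
# (('a', 'b'), ('b', 'b'), ('b', 'b'))
```

Representing a strategy as a function on histories keeps the sketch close to the definition of $\tau_i$; only a finite prefix of $w_\tau$ can be generated, of course.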
Infinitely Repeated Games & Discounted Payoff
Let $\tau = (\tau_1, \tau_2)$ be a pure strategy profile in $G_{\mathrm{irep}}$ such that $w_\tau = s^1 s^2 \cdots$
Given $0 < \delta < 1$, we define the $\delta$-discounted payoff of player $i$ by
$$u^\delta_i(\tau) = (1 - \delta) \sum_{t=0}^{\infty} \delta^t \cdot u_i(s^{t+1})$$
Given a strategic-form game $G$ and $0 < \delta < 1$, we denote by $G^\delta_{\mathrm{irep}}$ the infinitely repeated game based on $G$ together with the $\delta$-discounted payoffs.
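As a rough numerical illustration of the discounted payoff, the following Python sketch evaluates the sum over a finite prefix of the play. The function name `discounted_payoff` and the truncation are assumptions made here for illustration; if the stage payoffs are bounded by $B$ in absolute value, the neglected tail after $T$ stages is at most $\delta^T \cdot B$.

```python
def discounted_payoff(stage_payoffs, delta):
    """(1 - delta) * sum_t delta^t * u_i(s^{t+1}) over the given finite prefix."""
    return (1 - delta) * sum(delta ** t * u for t, u in enumerate(stage_payoffs))


# A play with constant stage payoff -1: the factor (1 - delta) normalizes
# the discounted payoff so that it is (up to truncation) again -1.
approx = discounted_payoff([-1] * 2000, 0.9)
print(round(approx, 6))
```

The factor $(1 - \delta)$ makes discounted payoffs comparable with stage payoffs: a play that yields the same stage payoff $c$ forever has discounted payoff exactly $c$.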
Definition 78
A strategy profile $\tau = (\tau_1, \tau_2)$ is a Nash equilibrium in $G^\delta_{\mathrm{irep}}$ if for both $i \in \{1, 2\}$ and for every $\tau'_i$ we have that
$$u^\delta_i(\tau_i, \tau_{-i}) \geq u^\delta_i(\tau'_i, \tau_{-i})$$
Given a history $h = s^1 \cdots s^t$ and a strategy $\tau_i$ of player $i$, we define a strategy $\tau^h_i$ in the infinitely repeated game $G_{\mathrm{irep}}$ by
$$\tau^h_i(\bar{s}^1 \cdots \bar{s}^{\bar{t}}) = \tau_i(s^1 \cdots s^t \, \bar{s}^1 \cdots \bar{s}^{\bar{t}}) \quad \text{for every sequence } \bar{s}^1 \cdots \bar{s}^{\bar{t}}$$
(i.e., $\tau^h_i$ behaves as $\tau_i$ after $h$).
Now $\tau = (\tau_1, \tau_2)$ is a SPE in $G^\delta_{\mathrm{irep}}$ if for every history $h$ we have that $(\tau^h_1, \tau^h_2)$ is a Nash equilibrium.
Note that $(\tau^h_1, \tau^h_2)$ must be a NE also for all histories $h$ that are not visited when the profile $(\tau_1, \tau_2)$ is used.
Example
Consider the infinitely repeated game $G_{\mathrm{irep}}$ based on the Prisoner's dilemma:
         C           S
C     −5, −5      0, −20
S     −20, 0     −1, −1
What are the Nash equilibria and SPE in $G^\delta_{\mathrm{irep}}$ for a given $\delta$?
Consider a pure strategy profile $(\tau_1, \tau_2)$ where $\tau_i(s^1 \cdots s^T) = C$ for all $T \geq 0$ and $i \in \{1, 2\}$. Is it a NE? Is it a SPE?
Consider a "grim trigger" profile $(\tau_1, \tau_2)$ where
$$\tau_i(s^1 \cdots s^T) = \begin{cases} S & T = 0 \\ S & s^\ell = (S, S) \text{ for all } 1 \leq \ell \leq T \\ C & \text{otherwise} \end{cases}$$
Is it a NE? Is it a SPE?
One-Shot Deviation Principle
A pure strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ satisfies the one-shot deviation property in $G^\delta_{\mathrm{irep}}$ if for every $i \in \{1, 2\}$ and every $\bar{\tau}_i$ differing from $\tau_i$ on just a single history $h$, we have $u^\delta_i(\bar{\tau}^h_i, \tau^h_{-i}) \leq u^\delta_i(\tau^h_i, \tau^h_{-i})$.
Theorem 79
Let $G = (\{1, 2\}, (S_1, S_2), (u_1, u_2))$ be a two-player strategic-form game such that both $u_1$ and $u_2$ are bounded on $S = S_1 \times S_2$. Let $0 < \delta < 1$.
A pure strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ is a SPE in $G^\delta_{\mathrm{irep}}$ iff it satisfies the one-shot deviation property in $G^\delta_{\mathrm{irep}}$.
Before proving Theorem 79, let us note the following:
• The one-shot deviation property is concerned with all strategies $\bar{\tau}_i$ that differ from $\tau_i$ on a single history. This means that we have to consider all histories $h$, even those that cannot be visited using $\tau_i$ with any opponent.
• The one-shot deviation property immediately implies the following: If $\bar{\tau}_i$ does not differ from $\tau_i$ on any history of the form $h' = h h''$ where $h'' \neq \varepsilon$ (i.e., on any history obtained by prolonging $h$), then $u^\delta_i(\bar{\tau}^h_i, \tau^h_{-i}) \leq u^\delta_i(\tau^h_i, \tau^h_{-i})$.
Indeed, note that $\tau^h_i$ differs from $\bar{\tau}^h_i$ at most on the empty history, i.e., on the move chosen right after $h$.
Proof. ⇒: Trivial.
⇐: Assume that $\tau$ satisfies the one-shot deviation property but is not a SPE. That is, a deviation may increase the payoff of one of the players in a subgame. Assume, w.l.o.g., that player 1 gains by deviating to a strategy $\bar{\tau}_1$ in a subgame starting with a history $h$, i.e.,
$$u^\delta_1(\bar{\tau}^h_1, \tau^h_2) > u^\delta_1(\tau^h_1, \tau^h_2) \qquad (29)$$
Since $\delta < 1$ and $u_i$ are bounded on $S$, we may safely choose $\bar{\tau}_1$ so that $\bar{\tau}_1(h') = \tau_1(h')$ for all sufficiently long histories $h'$.
Indeed, since $u_i$ is bounded on pure strategy profiles of $G$, the sum $\sum_{t=\ell}^{\infty} \delta^t \cdot u_i(s^{t+1})$ goes to 0 as $\ell$ goes to $\infty$; hence the strict inequality (29) remains valid even if $\bar{\tau}_1$ is arbitrarily modified in a very distant future.
Let $h'$ be a history of maximum length such that $h$ is a prefix of $h'$ and $\bar{\tau}_1(h') \neq \tau_1(h')$. (Note that then $\bar{\tau}_1(h' h'') = \tau_1(h' h'')$ for all $h'' \neq \varepsilon$.)
Let $\bar{\tau}_{11}$ be a strategy of player 1 obtained from $\bar{\tau}_1$ by changing $\bar{\tau}_1(h')$ to $\tau_1(h')$. Now note that the one-shot deviation property implies that
$$u^\delta_1(\bar{\tau}^{h'}_{11}, \tau^{h'}_2) = u^\delta_1(\tau^{h'}_1, \tau^{h'}_2) \geq u^\delta_1(\bar{\tau}^{h'}_1, \tau^{h'}_2)$$
and thus $u^\delta_1(\bar{\tau}^h_{11}, \tau^h_2) \geq u^\delta_1(\bar{\tau}^h_1, \tau^h_2) > u^\delta_1(\tau^h_1, \tau^h_2)$. Note that $\bar{\tau}^h_{11}$ has a strictly smaller number of deviations from $\tau^h_1$ than $\bar{\tau}^h_1$.
Repeating the same argument with $\bar{\tau}_{11}$ in place of $\bar{\tau}_1$ we obtain $\bar{\tau}_{12}$ such that $u^\delta_1(\bar{\tau}^h_{12}, \tau^h_2) \geq u^\delta_1(\bar{\tau}^h_{11}, \tau^h_2) > u^\delta_1(\tau^h_1, \tau^h_2)$. Here $\bar{\tau}^h_{12}$ has even fewer deviations from $\tau^h_1$ than $\bar{\tau}^h_{11}$.
Then repeating with $\bar{\tau}_{12}$ in place of $\bar{\tau}_1$ we obtain $\bar{\tau}_{13}$ such that $u^\delta_1(\bar{\tau}^h_{13}, \tau^h_2) \geq u^\delta_1(\bar{\tau}^h_{12}, \tau^h_2) > u^\delta_1(\tau^h_1, \tau^h_2)$, etc., still decreasing the number of deviations from $\tau^h_1$.
Eventually, as $\bar{\tau}^h_1$ has only finitely many deviations from $\tau^h_1$, we get $\bar{\tau}^h_{1k} = \tau^h_1$ for some $k$ and thus $u^\delta_1(\tau^h_1, \tau^h_2) = u^\delta_1(\bar{\tau}^h_{1k}, \tau^h_2) > u^\delta_1(\tau^h_1, \tau^h_2)$, a contradiction. □
Example
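Consider the infinitely repeated game based on the Prisoner's dilemma:
         C           S
C     −5, −5      0, −20
S     −20, 0     −1, −1
The grim trigger profile $(\tau_1, \tau_2)$ where
$$\tau_i(s^1 \cdots s^T) = \begin{cases} S & T = 0 \\ S & s^\ell = (S, S) \text{ for all } 1 \leq \ell \leq T \\ C & \text{otherwise} \end{cases}$$
is a SPE in $G^\delta_{\mathrm{irep}}$ for every $\delta \geq 1/5$.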
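By Theorem 79 it suffices to check one-shot deviations, and for the grim trigger there are only two kinds of histories: those consisting solely of (S, S) and all others. The following Python sketch checks both cases for player 1 with exact arithmetic (the game is symmetric, so player 2 is analogous). The function name `one_shot_deviation_ok` is a hypothetical helper; the sketch uses the identity $(1-\delta)\bigl[x + \sum_{k \geq 1} \delta^k y\bigr] = (1-\delta)x + \delta y$.

```python
from fractions import Fraction

# Stage payoffs of player 1 (row player) in the Prisoner's dilemma above.
U1 = {("C", "C"): -5, ("C", "S"): 0, ("S", "C"): -20, ("S", "S"): -1}


def one_shot_deviation_ok(delta):
    """Check the one-shot deviation property of the grim trigger for player 1."""
    # History with only (S, S): following yields -1 forever; deviating to C
    # yields u1(C, S) once and then (C, C) forever.
    follow_coop = Fraction(U1[("S", "S")])
    deviate_coop = (1 - delta) * U1[("C", "S")] + delta * U1[("C", "C")]
    # Any other history: following yields -5 forever; deviating to S
    # yields u1(S, C) once and then (C, C) forever.
    follow_punish = Fraction(U1[("C", "C")])
    deviate_punish = (1 - delta) * U1[("S", "C")] + delta * U1[("C", "C")]
    return deviate_coop <= follow_coop and deviate_punish <= follow_punish


print(one_shot_deviation_ok(Fraction(1, 5)))     # True
print(one_shot_deviation_ok(Fraction(19, 100)))  # False
```

In the first case the condition reads $-5\delta \leq -1$, i.e., $\delta \geq 1/5$; in the second case the deviation is never profitable, whatever $\delta$ is.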
A Simple Version of Folk Theorem
Let $G = (\{1, 2\}, (S_1, S_2), (u_1, u_2))$ be a two-player strategic-form game where $u_1, u_2$ are bounded on $S = S_1 \times S_2$ (but $S$ may be infinite) and let $s^*$ be a Nash equilibrium in $G$.
Let $s$ be a strategy profile in $G$ satisfying $u_i(s) > u_i(s^*)$ for all $i \in \{1, 2\}$.
Consider the following strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$, called the grim trigger for $s$ using $s^*$:
$$\tau_i(s^1 \cdots s^T) = \begin{cases} s_i & T = 0 \\ s_i & s^\ell = s \text{ for all } 1 \leq \ell \leq T \\ s^*_i & \text{otherwise} \end{cases}$$
Then for
$$\delta \geq \max_{i \in \{1,2\}} \frac{\max_{s'_i \in S_i} u_i(s'_i, s_{-i}) - u_i(s)}{\max_{s'_i \in S_i} u_i(s'_i, s_{-i}) - u_i(s^*)}$$
we have that $(\tau_1, \tau_2)$ is a SPE in $G^\delta_{\mathrm{irep}}$ and $u^\delta_i(\tau) = u_i(s)$.
Proof: Consider a possible one-shot deviation $\bar{\tau}_1$ of player 1, i.e., there is exactly one $h$ such that $\bar{\tau}_1(h) \neq \tau_1(h)$. We distinguish two cases depending on $h$.
Proof of Simple Folk Theorem (Cont.)
Case 1: $h \neq s \cdots s$. Then there is a deviation from $s$ in $h$ and thus according to $(\tau^h_1, \tau^h_2)$ both players play $s^*$ forever:
$$u^\delta_1(\tau^h_1, \tau^h_2) = (1 - \delta) \sum_{k=0}^{\infty} \delta^k u_1(s^*) = u_1(s^*)(1 - \delta) \sum_{k=0}^{\infty} \delta^k = u_1(s^*)$$
Now $(\bar{\tau}^h_1, \tau^h_2)$ gives a sequence $w_{(\bar{\tau}^h_1, \tau^h_2)} = (s'_1, s^*_2) s^* s^* \cdots$ where $s'_1$ is the strategy to which player 1 deviates after $h$.
Here player 2 plays $s^*_2$ all the time after $h$ because one of the players has already deviated in $h$.
Since $s^*$ is a Nash equilibrium in $G$, we have $u_1(s'_1, s^*_2) \leq u_1(s^*_1, s^*_2)$ and we obtain
$$u^\delta_1(\bar{\tau}^h_1, \tau^h_2) = (1 - \delta) \left[ u_1(s'_1, s^*_2) + \sum_{k=1}^{\infty} \delta^k u_1(s^*) \right] \leq (1 - \delta) \left[ u_1(s^*_1, s^*_2) + \sum_{k=1}^{\infty} \delta^k u_1(s^*) \right] = u_1(s^*)$$
So this deviation cannot be beneficial no matter what $\delta$ is.
Case 2: $h = s \cdots s$. Clearly, $u^\delta_1(\tau^h_1, \tau^h_2) = u_1(s)$.
Now $(\bar{\tau}^h_1, \tau^h_2)$ gives a sequence $w_{(\bar{\tau}^h_1, \tau^h_2)} = (s'_1, s_2) s^* s^* \cdots$ where $s'_1$ is the strategy to which player 1 deviates after $h$.
As opposed to the previous case, here player 2 first plays $s_2$ (since the deviation of player 1 to $s'_1$ is the first deviation in the history) and then both players react by playing $s^*$ forever.
If $u_1(s'_1, s_2) < u_1(s)$, then
$$u^\delta_1(\bar{\tau}^h_1, \tau^h_2) = (1 - \delta) \left[ u_1(s'_1, s_2) + \sum_{k=1}^{\infty} \delta^k u_1(s^*) \right] < (1 - \delta) \left[ u_1(s_1, s_2) + \sum_{k=1}^{\infty} \delta^k u_1(s^*) \right] < (1 - \delta) \left[ u_1(s) + \sum_{k=1}^{\infty} \delta^k u_1(s) \right] = u_1(s) = u^\delta_1(\tau^h_1, \tau^h_2)$$
and thus this deviation is also not beneficial no matter what $\delta$ is.
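Finally, if $u_1(s'_1, s_2) \geq u_1(s)$, then
$$u^\delta_1(\bar{\tau}^h_1, \tau^h_2) = (1 - \delta) \left[ u_1(s'_1, s_2) + \sum_{k=1}^{\infty} \delta^k u_1(s^*) \right] = (1 - \delta) u_1(s'_1, s_2) + (1 - \delta) u_1(s^*) \cdot \delta \sum_{k=0}^{\infty} \delta^k = u_1(s'_1, s_2) - \delta \cdot u_1(s'_1, s_2) + \delta \cdot u_1(s^*)$$
Thus
$u^\delta_1(\bar{\tau}^h_1, \tau^h_2) \leq u^\delta_1(\tau^h_1, \tau^h_2) = u_1(s)$ iff
$u_1(s'_1, s_2) - \delta \cdot u_1(s'_1, s_2) + \delta \cdot u_1(s^*) \leq u_1(s)$ iff
$u_1(s'_1, s_2) - u_1(s) \leq \delta \cdot (u_1(s'_1, s_2) - u_1(s^*))$ iff
$$\delta \geq \frac{u_1(s'_1, s_2) - u_1(s)}{u_1(s'_1, s_2) - u_1(s^*)}$$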
Thus $(\tau_1, \tau_2)$ satisfies the one-shot deviation property in $G^\delta_{\mathrm{irep}}$ w.r.t. player 1 if
$$\delta \geq \frac{u_1(s'_1, s_2) - u_1(s)}{u_1(s'_1, s_2) - u_1(s^*)} \quad \text{for all } s'_1 \in S_1 \text{ satisfying } u_1(s'_1, s_2) \geq u_1(s)$$
Note that the right-hand-side expression is maximized when $u_1(s'_1, s_2)$ is maximized and thus we get
$$\delta \geq \frac{\max_{s'_1 \in S_1} u_1(s'_1, s_2) - u_1(s)}{\max_{s'_1 \in S_1} u_1(s'_1, s_2) - u_1(s^*)}$$
Proving the same for player 2 and putting the results together, we obtain that $(\tau_1, \tau_2)$ satisfies the one-shot deviation property in $G^\delta_{\mathrm{irep}}$ if
$$\delta \geq \max_{i \in \{1,2\}} \frac{\max_{s'_i \in S_i} u_i(s'_i, s_{-i}) - u_i(s)}{\max_{s'_i \in S_i} u_i(s'_i, s_{-i}) - u_i(s^*)} \qquad (30)$$
Thus by Theorem 79, $(\tau_1, \tau_2)$ is a SPE in $G^\delta_{\mathrm{irep}}$ if $\delta$ satisfies inequality (30).
Simple Folk Theorem – Example
Consider the infinitely repeated game $G_{\mathrm{irep}}$ based on the following game $G$:
         m         f         r
M      4, 4     −1, 5      3, 0
F     5, −1      1, 1      0, 0
R      0, 3      0, 0      2, 2
NE in $G$: $(F, f)$
Consider the grim trigger for $(M, m)$ using $(F, f)$, i.e., the profile $(\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ where
• $\tau_1$: Plays $M$ in a given stage if $(M, m)$ was played in all previous stages, and plays $F$ otherwise.
• $\tau_2$: Plays $m$ in a given stage if $(M, m)$ was played in all previous stages, and plays $f$ otherwise.
This is a SPE in $G^\delta_{\mathrm{irep}}$ for all $\delta \geq \frac{1}{4}$. Also, $u^\delta_i(\tau_1, \tau_2) = 4$ for $i \in \{1, 2\}$.
Are there other SPE? Yes, a grim trigger for $(R, r)$ using $(F, f)$. This is a SPE in $G^\delta_{\mathrm{irep}}$ for $\delta \geq \frac{1}{2}$.
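The two bounds can be recomputed from inequality (30). The Python sketch below does this for the 3×3 game of this example with exact fractions; the names `U` and `grim_trigger_bound` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

ROWS = ("M", "F", "R")  # pure strategies of player 1
COLS = ("m", "f", "r")  # pure strategies of player 2
# U[(row, col)] = (payoff of player 1, payoff of player 2)
U = {
    ("M", "m"): (4, 4), ("M", "f"): (-1, 5), ("M", "r"): (3, 0),
    ("F", "m"): (5, -1), ("F", "f"): (1, 1), ("F", "r"): (0, 0),
    ("R", "m"): (0, 3), ("R", "f"): (0, 0), ("R", "r"): (2, 2),
}


def grim_trigger_bound(target, punishment):
    """Right-hand side of inequality (30) for the grim trigger for `target` using `punishment`."""
    row, col = target
    # Best one-stage deviation payoff of each player against the target profile.
    best_deviation = (
        max(U[(r, col)][0] for r in ROWS),
        max(U[(row, c)][1] for c in COLS),
    )
    return max(
        Fraction(best_deviation[i] - U[target][i], best_deviation[i] - U[punishment][i])
        for i in range(2)
    )


print(grim_trigger_bound(("M", "m"), ("F", "f")))  # 1/4
print(grim_trigger_bound(("R", "r"), ("F", "f")))  # 1/2
```

For $(M, m)$ the best deviation yields 5, so the bound is $(5 - 4)/(5 - 1)$; for $(R, r)$ the best deviation yields 3, so the bound is $(3 - 2)/(3 - 1)$.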
Tacit Collusion
Consider the Cournot duopoly game model $G = (N, (S_i)_{i \in N}, (u_i)_{i \in N})$
• $N = \{1, 2\}$
• $S_i = [0, \kappa]$
• $u_1(q_1, q_2) = q_1(\kappa - q_1 - q_2) - q_1 c_1 = (\kappa - c_1) q_1 - q_1^2 - q_1 q_2$
  $u_2(q_1, q_2) = q_2(\kappa - q_2 - q_1) - q_2 c_2 = (\kappa - c_2) q_2 - q_2^2 - q_2 q_1$
Assume for simplicity that $c_1 = c_2 = c$ and denote $\theta = \kappa - c$.
If the firms sign a binding contract to produce only $\theta/4$ each, the profit of each firm would be $\theta^2/8$, which is higher than the profit $\theta^2/9$ for playing the NE $(\theta/3, \theta/3)$.
However, such contracts are forbidden in many countries (including the US).
Is it still possible that the firms behave selfishly (i.e., only maximize their profits) and still obtain such payoffs?
In other words, is there a SPE in the infinitely repeated game based on $G$ (with a discount factor $\delta$) which gives the payoffs $\theta^2/8$?
Recall the Cournot duopoly game $G$ introduced above, with $c_1 = c_2 = c$ and $\theta = \kappa - c$.
Consider the grim trigger profile for $(\theta/4, \theta/4)$ using $(\theta/3, \theta/3)$:
Player $i$ will
• produce $q_i = \theta/4$ whenever all profiles in the history are $(\theta/4, \theta/4)$,
• whenever one of the players deviates, produce $\theta/3$ from that moment on.
Assuming that $\kappa = 100$ and $c = 10$ (which gives $\theta = 90$), this is a SPE in $G^\delta_{\mathrm{irep}}$ for $\delta \geq 0.5294\cdots$. It results in the play $(\theta/4, \theta/4)(\theta/4, \theta/4) \cdots$ with the discounted payoffs $\theta^2/8$.
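The bound $0.5294\cdots$ can be reproduced from the grim trigger bound of the simple folk theorem. The Python sketch below is an illustration under one added step that is not spelled out in the text: the best one-stage deviation against $q_2$ is taken from the first-order condition $\theta - 2 q_1 - q_2 = 0$ of the profit $q_1(\theta - q_1 - q_2)$. The helper names `profit` and `best_response` are hypothetical.

```python
from fractions import Fraction


def profit(q1, q2, theta):
    """Profit of firm 1: q1 * (kappa - q1 - q2) - c * q1 = q1 * (theta - q1 - q2)."""
    return q1 * (theta - q1 - q2)


def best_response(q2, theta):
    """Maximizer of q1 * (theta - q1 - q2), from theta - 2*q1 - q2 = 0."""
    return (theta - q2) / 2


theta = Fraction(90)  # kappa = 100, c = 10
q_collude, q_nash = theta / 4, theta / 3

u_collude = profit(q_collude, q_collude, theta)  # theta^2 / 8
u_nash = profit(q_nash, q_nash, theta)           # theta^2 / 9
u_deviate = profit(best_response(q_collude, theta), q_collude, theta)

threshold = (u_deviate - u_collude) / (u_deviate - u_nash)
print(threshold, float(threshold))  # 9/17, approximately 0.5294
```

The best deviation is $3\theta/8$ with profit $9\theta^2/64$, so the bound equals $\frac{9/64 - 1/8}{9/64 - 1/9} = \frac{9}{17}$ and does not depend on $\theta$. By symmetry the same bound holds for firm 2.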
Long-Run Average Payoff and Folk Theorems
Inﬁnitely Repeated Games & Average Payoff
In what follows we assume that all payoffs in the game $G$ are positive and that $S$ is finite!
Let $\tau = (\tau_1, \tau_2)$ be a strategy profile in the infinitely repeated game $G_{\mathrm{irep}}$ such that $w_\tau = s^1 s^2 \cdots$.
Definition 80
We define a long-run average payoff for player $i$ by
$$u^{\mathrm{avg}}_i(\tau) = \limsup_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} u_i(s^t)$$
(Here lim sup is necessary because the limit need not exist for every profile $\tau$.)
The long-run average payoff $u^{\mathrm{avg}}_i(\tau)$ is well-defined if the limit $u^{\mathrm{avg}}_i(\tau) = \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} u_i(s^t)$ exists.
Given a strategic-form game $G$, we denote by $G^{\mathrm{avg}}_{\mathrm{irep}}$ the infinitely repeated game based on $G$ together with the long-run average payoff.
Definition 81
A strategy profile $\tau$ is a Nash equilibrium in $G^{\mathrm{avg}}_{\mathrm{irep}}$ if $u^{\mathrm{avg}}_i(\tau)$ is well-defined for all $i \in \{1, 2\}$, and for every $i$ and every $\tau'_i$ we have that
$$u^{\mathrm{avg}}_i(\tau_i, \tau_{-i}) \geq u^{\mathrm{avg}}_i(\tau'_i, \tau_{-i})$$
(Note that we demand existence of the defining limit of $u^{\mathrm{avg}}_i(\tau_i, \tau_{-i})$ but the limit does not have to exist for $u^{\mathrm{avg}}_i(\tau'_i, \tau_{-i})$.)
Moreover, $\tau = (\tau_1, \tau_2)$ is a SPE in $G^{\mathrm{avg}}_{\mathrm{irep}}$ if for every history $h$ we have that $(\tau^h_1, \tau^h_2)$ is a Nash equilibrium.
Example
Consider the infinitely repeated game based on the Prisoner's dilemma:
         C           S
C     −5, −5      0, −20
S     −20, 0     −1, −1
The grim trigger profile $(\tau_1, \tau_2)$ where
$$\tau_i(s^1 \cdots s^T) = \begin{cases} S & T = 0 \\ S & s^\ell = (S, S) \text{ for all } 1 \leq \ell \leq T \\ C & \text{otherwise} \end{cases}$$
is a SPE which gives the long-run average payoff $-1$ to each player.
The intuition behind the grim trigger works as for the discounted payoff: Whenever a player $i$ deviates, the player $-i$ starts playing $C$, for which the best response of player $i$ is also $C$. So we obtain the play $(S, S) \cdots (S, S)(X, Y)(C, C)(C, C) \cdots$ (here $(X, Y)$ is either $(C, S)$ or $(S, C)$ depending on who deviates). Clearly, the long-run average payoff is then $-5$ for both players, which is worse than $-1$.
However, other payoffs can be supported by NE. Consider e.g. a strategy profile $(\tau_1, \tau_2)$ such that
• Both players cyclically play as follows:
  • 9 times $(S, S)$
  • once $(S, C)$
• If one of the players deviates, then, from that moment on, both play $(C, C)$ forever.
Then $(\tau_1, \tau_2)$ is also a SPE.
Clearly, $u^{\mathrm{avg}}_1(\tau_1, \tau_2) = \frac{9}{10} \cdot (-1) + \frac{1}{10} \cdot (-20) = -29/10$ and $u^{\mathrm{avg}}_2(\tau_1, \tau_2) = \frac{9}{10} \cdot (-1) + \frac{1}{10} \cdot 0 = -9/10$.
Player 2 gets a better payoff than from the Pareto optimal profile $(S, S)$!
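For a play that repeats a finite cycle forever, the long-run average payoff is simply the average over one cycle. The following Python sketch recomputes the two values of this example; the names `PD` and `cycle_average` are hypothetical helpers introduced here.

```python
from fractions import Fraction

# PD[(action of player 1, action of player 2)] = (payoff of player 1, payoff of player 2)
PD = {
    ("C", "C"): (-5, -5), ("C", "S"): (0, -20),
    ("S", "C"): (-20, 0), ("S", "S"): (-1, -1),
}


def cycle_average(cycle, payoffs):
    """Long-run average payoffs of both players when `cycle` is repeated forever."""
    return tuple(
        Fraction(sum(payoffs[profile][i] for profile in cycle), len(cycle))
        for i in range(2)
    )


cycle = [("S", "S")] * 9 + [("S", "C")]
print(cycle_average(cycle, PD))  # (Fraction(-29, 10), Fraction(-9, 10))
```

Both averages exceed the punishment payoff $-5$, which is what keeps either player from leaving the cycle.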
Outline of the Folk Theorems
The previous examples suggest that other (possibly all?) convex combinations of payoffs may be obtained by means of Nash equilibria.
This observation forms a basis for a family of theorems, collectively called Folk Theorems.
No author is listed since these theorems had been known in the game theory community long before they were formalized.
In what follows we prove several versions of the Folk Theorem concerning achievable payoffs for repeated games.
Ordered by increasing technical and conceptual difficulty, we consider the following variants:
• Long-run average payoffs & SPE
• Discounted payoffs & SPE
• Long-run average payoffs & Nash equilibria
Folk Theorems – Feasible Payoffs
Definition 82
We say that a vector of payoffs $v = (v_1, v_2) \in \mathbb{R}^2$ is feasible if it is a convex combination of payoffs for pure strategy profiles in $G$ with rational coefficients, i.e., if there are rational numbers $\beta_s$, here $s \in S$, satisfying $\beta_s \geq 0$ and $\sum_{s \in S} \beta_s = 1$ such that for both $i \in \{1, 2\}$ holds
$$v_i = \sum_{s \in S} \beta_s \cdot u_i(s)$$
We assume that there is $m \in \mathbb{N}$ such that each $\beta_s$ can be written in the form $\beta_s = \gamma_s / m$, where $\gamma_s$ is a non-negative integer.
The following theorems can be extended to a notion of feasible payoffs using arbitrary, possibly irrational, coefficients $\beta_s$ in the convex combination. Roughly speaking, this follows from the fact that each real number can be approximated with rational numbers up to an arbitrary error. However, the proofs are technically more involved.
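The common denominator $m$ and the integers $\gamma_s$ can be computed directly from the rational coefficients. The Python sketch below is an illustration with made-up coefficients $\beta_s$ over three profiles of the 3×3 game used earlier; `cycle_lengths` is a hypothetical helper, and `math.lcm` with several arguments assumes Python 3.9 or newer.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm is available since Python 3.9


def cycle_lengths(beta):
    """Return (m, gamma) with beta[s] = gamma[s] / m for rational coefficients beta."""
    if any(b < 0 for b in beta.values()) or sum(beta.values()) != 1:
        raise ValueError("coefficients must be non-negative and sum to 1")
    m = lcm(*(b.denominator for b in beta.values()))
    gamma = {s: int(b * m) for s, b in beta.items()}
    return m, gamma


# Payoffs of three profiles of the 3x3 example game; the weights are illustrative only.
U = {("M", "m"): (4, 4), ("R", "r"): (2, 2), ("F", "f"): (1, 1)}
beta = {("M", "m"): Fraction(1, 2), ("R", "r"): Fraction(1, 3), ("F", "f"): Fraction(1, 6)}

m, gamma = cycle_lengths(beta)
v = tuple(sum(beta[s] * U[s][i] for s in beta) for i in range(2))
print(m, gamma, v)  # m = 6, cycle lengths 3, 2, 1, and v = (17/6, 17/6)
```

Playing each profile $s$ for $\gamma_s$ consecutive rounds gives a cycle of length $m$ whose average payoff is exactly $v$.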
Folk Theorems – Long-Run Average & SPE
Theorem 83
Let $s^*$ be a pure strategy Nash equilibrium in $G$ and let $v = (v_1, v_2)$ be a feasible vector of payoffs satisfying $v_i \geq u_i(s^*)$ for both $i \in \{1, 2\}$. Then there is a strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ such that
• $\tau$ is a SPE in $G^{\mathrm{avg}}_{\mathrm{irep}}$
• $u^{\mathrm{avg}}_i(\tau) = v_i$ for $i \in \{1, 2\}$
Proof: Consider a strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ which gives the following behavior:
1. Unless one of the players deviates, the players play cyclically all profiles $s \in S$ so that each $s$ is always played for $\gamma_s$ rounds.
2. Whenever one of the players deviates, then, from that moment on, each player $i$ plays $s^*_i$.
It is easy to see that $u^{\mathrm{avg}}_i(\tau) = v_i$.
We verify that $\tau$ is a SPE.
Fix a history $h$; we show that $\tau^h = (\tau^h_1, \tau^h_2)$ is a NE in $G^{\mathrm{avg}}_{\mathrm{irep}}$.
• If $h$ does not contain any deviation from the cyclic behavior 1., then $\tau^h$ continues according to 1., thus $u^{\mathrm{avg}}_i(\tau^h) = v_i$.
• If $h$ contains a deviation from 1., then $w_{\tau^h} = s^* s^* \cdots$ and thus $u^{\mathrm{avg}}_i(\tau^h) = u_i(s^*)$.
• Now if a player $i$ deviates to $\bar{\tau}^h_i$ from $\tau^h_i$ in $G^{\mathrm{avg}}_{\mathrm{irep}}$, then
$$w_{(\bar{\tau}^h_i, \tau^h_{-i})} = (s^1_i, s'_{-i})(s^2_i, s^*_{-i})(s^3_i, s^*_{-i}) \cdots$$
where $s^1_i, s^2_i, \ldots$ are strategies of $S_i$ and $s'_{-i}$ is a strategy of $S_{-i}$.
However, then $u^{\mathrm{avg}}_i(\bar{\tau}^h_i, \tau^h_{-i}) \leq u_i(s^*) \leq v_i$ since $s^*$ is a Nash equilibrium and thus $u_i(s^k_i, s^*_{-i}) \leq u_i(s^*)$ for all $k \geq 1$.
Intuitively, player $-i$ punishes player $i$ by playing $s^*_{-i}$. □
Folk Theorems – Discounted Payoffs & SPE
Theorem 84
Let $s^*$ be a pure strategy Nash equilibrium in $G$ and let $v = (v_1, v_2)$ be a feasible payoff satisfying $v_i > u_i(s^*)$ for both $i \in \{1, 2\}$. Then there is a strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ and $\delta < 1$ such that
• $\tau$ is a SPE in $G^{\delta'}_{\mathrm{irep}}$ for every $\delta' \in [\delta, 1)$ and
• $\lim_{\delta' \to 1} u^{\delta'}_i(\tau) = v_i$.
Proof: The following claim allows us to reduce the discounted payoff to the long-run average.
Claim 5
Let $\tau$ be a strategy profile such that $u^{\mathrm{avg}}_i(\tau)$ is well-defined. Then
$$\lim_{\delta \to 1^-} u^\delta_i(\tau) = u^{\mathrm{avg}}_i(\tau)$$
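Claim 5 can be observed numerically on a play that repeats a finite cycle forever. In that case the discounted payoff has the closed form $(1-\delta) \sum_{k<p} \delta^k u_k / (1 - \delta^p)$, where $u_0, \ldots, u_{p-1}$ are the stage payoffs in one cycle. The Python sketch below uses this formula; the function name `discounted_periodic` and the chosen cycle (player 1 in the Prisoner's dilemma cycle of nine $(S, S)$ followed by one $(S, C)$) are assumptions made for illustration.

```python
def discounted_periodic(cycle_payoffs, delta):
    """Discounted payoff (1 - delta) * sum_t delta^t * u(s^{t+1}) of a play repeating the cycle forever."""
    p = len(cycle_payoffs)
    one_cycle = sum(delta ** k * u for k, u in enumerate(cycle_payoffs))
    # Summing the geometric series over whole cycles gives the factor 1 / (1 - delta^p).
    return (1 - delta) * one_cycle / (1 - delta ** p)


cycle = [-1] * 9 + [-20]            # stage payoffs of player 1 in one cycle
average = sum(cycle) / len(cycle)   # long-run average payoff, -2.9

for delta in (0.5, 0.9, 0.99, 0.9999):
    print(delta, discounted_periodic(cycle, delta), average)
```

As $\delta$ approaches 1, the printed discounted payoffs approach the long-run average $-2.9$.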
Now to prove Theorem 84, consider the strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ from the proof of Theorem 83.
We check the one-shot deviation property in $G^\delta_{\mathrm{irep}}$ for $\delta$ close to 1.
Fix a history $h$ and consider $\tau^h = (\tau^h_1, \tau^h_2)$.
• If $h$ does not contain any deviation from 1., then both players follow 1., and $u^\delta_i(\tau^h)$ is close to $u^{\mathrm{avg}}_i(\tau^h) = v_i$ for $\delta$ close to 1.
• If $h$ contains any deviation from 1., then $w_{\tau^h} = s^* s^* \cdots$ and $u^\delta_i(\tau^h) = u_i(s^*)$.
• Now assume, w.l.o.g., that player 1 deviates exactly after $h$, which gives a strategy $\bar{\tau}^h_1$ differing from $\tau^h_1$ only on $h$. Thus $w_{(\bar{\tau}^h_1, \tau^h_2)} = (s'_1, s'_2) s^* s^* \cdots$ where $s'_1$ is a strategy of $S_1$ and $s'_2$ is either the next step in the cyclic behavior described by 1. (if $h$ follows 1.), or equal to $s^*_2$ (if $h$ does not follow 1.).
  Note that for $\delta$ close to 1, we have that $u^\delta_1(\bar{\tau}^h_1, \tau^h_2)$ is close to $u^{\mathrm{avg}}_1(\bar{\tau}^h_1, \tau^h_2) = u_1(s^*)$.
  • If $h$ follows 1., then $u^\delta_1(\tau^h)$ is close to $v_1$, which is greater than $u_1(s^*)$, to which $u^\delta_1(\bar{\tau}^h_1, \tau^h_2)$ is close.
  • If $h$ does not follow 1., then $s'_2 = s^*_2$ (players punish due to a deviation in $h$), and thus $u^\delta_1(\bar{\tau}^h_1, \tau^h_2) \leq u_1(s^*) = u^\delta_1(\tau^h)$. □
Folk Theorems – Individually Rational Payoffs
Definition 85
$v = (v_1, v_2) \in \mathbb{R}^2$ is individually rational if for both $i \in \{1, 2\}$ holds
$$v_i \geq \min_{s_{-i} \in S_{-i}} \max_{s_i \in S_i} u_i(s_i, s_{-i})$$
That is, $v_i$ is at least as large as the value that player $i$ may secure by playing best responses to the most hostile behavior of player $-i$.
Example:
         m         f         r
M      4, 4     −1, 5      3, 0
F     5, −1      1, 1      0, 0
R      0, 3      0, 0      2, 2
Here the most hostile behavior of either opponent is to play $f$ (resp. $F$), against which the best response yields 1. Hence any $(v_1, v_2)$ such that $v_1 \geq 1$ and $v_2 \geq 1$ is individually rational.
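The bounds in Definition 85 are easy to evaluate for a finite game. The following Python sketch computes $\min_{s_{-i}} \max_{s_i} u_i(s_i, s_{-i})$ for both players of the example game; the names `U` and `minmax_values` are hypothetical and used only for this illustration.

```python
ROWS = ("M", "F", "R")  # pure strategies of player 1
COLS = ("m", "f", "r")  # pure strategies of player 2
# U[(row, col)] = (payoff of player 1, payoff of player 2)
U = {
    ("M", "m"): (4, 4), ("M", "f"): (-1, 5), ("M", "r"): (3, 0),
    ("F", "m"): (5, -1), ("F", "f"): (1, 1), ("F", "r"): (0, 0),
    ("R", "m"): (0, 3), ("R", "f"): (0, 0), ("R", "r"): (2, 2),
}


def minmax_values(rows, cols, payoffs):
    """Return (min over columns of player 1's best-response payoff, min over rows of player 2's)."""
    v1 = min(max(payoffs[(r, c)][0] for r in rows) for c in cols)
    v2 = min(max(payoffs[(r, c)][1] for c in cols) for r in rows)
    return v1, v2


print(minmax_values(ROWS, COLS, U))  # (1, 1)
```

Player 1's best-response payoffs against $m$, $f$, $r$ are 5, 1, 3, and player 2's against $M$, $F$, $R$ are 5, 1, 3, so both minima equal 1.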
Folk Theorems – Long-Run Average & NE
Theorem 86
Let $v = (v_1, v_2)$ be a feasible and individually rational vector of payoffs. Then there is a strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ such that
• $\tau$ is a Nash equilibrium in $G^{\mathrm{avg}}_{\mathrm{irep}}$
• $u^{\mathrm{avg}}_i(\tau) = v_i$ for $i \in \{1, 2\}$
Proof: It suffices to use a slightly modified strategy profile $\tau = (\tau_1, \tau_2)$ in $G_{\mathrm{irep}}$ from Theorem 83:
• Unless one of the players deviates, the players play cyclically all profiles $s \in S$ so that each $s$ is always played for $\gamma_s$ rounds.
• Whenever a player $i$ deviates, the opponent $-i$ plays a strategy $s^{\min}_{-i} \in \operatorname{argmin}_{s_{-i} \in S_{-i}} \max_{s_i \in S_i} u_i(s_i, s_{-i})$ from that moment on.
It is easy to see that $u^{\mathrm{avg}}_i(\tau) = v_i$.
If a player $i$ deviates, then his long-run average payoff cannot be higher than $\min_{s_{-i} \in S_{-i}} \max_{s_i \in S_i} u_i(s_i, s_{-i}) \leq v_i$, so $\tau$ is a NE. □
Theorem 87
If a strategy profile $\tau = (\tau_1, \tau_2)$ is a NE in $G^{\mathrm{avg}}_{\mathrm{irep}}$, then $\left( u^{\mathrm{avg}}_1(\tau), u^{\mathrm{avg}}_2(\tau) \right)$ is individually rational.
Proof: Suppose that $\left( u^{\mathrm{avg}}_1(\tau), u^{\mathrm{avg}}_2(\tau) \right)$ is not individually rational.
W.l.o.g. assume that $u^{\mathrm{avg}}_1(\tau) < \min_{s_2 \in S_2} \max_{s_1 \in S_1} u_1(s_1, s_2)$.
Now let us consider a new strategy $\bar{\tau}_1$ such that for an arbitrary history $h$ the pure strategy $\bar{\tau}_1(h)$ is a best response to $\tau_2(h)$.
But then, for every history $h$, we have
$$u_1(\bar{\tau}_1(h), \tau_2(h)) \geq \min_{s_2 \in S_2} \max_{s_1 \in S_1} u_1(s_1, s_2) > u^{\mathrm{avg}}_1(\tau)$$
So clearly $u^{\mathrm{avg}}_1(\bar{\tau}_1, \tau_2) > u^{\mathrm{avg}}_1(\tau)$, which contradicts the fact that $(\tau_1, \tau_2)$ is a NE. □
Note that if irrational convex combinations are allowed in the definition of feasibility, then vectors of payoffs for Nash equilibria in $G^{\mathrm{avg}}_{\mathrm{irep}}$ are exactly feasible and individually rational vectors of payoffs. Indeed, the coefficients $\beta_s$ in the definition of feasibility are exactly frequencies with which the individual profiles of $S$ are played in the NE.
Folk Theorems – Summary
• We have proved that "any reasonable" (i.e., feasible and individually rational) vector of payoffs can be justified as payoffs for a Nash equilibrium in $G^{\mathrm{avg}}_{\mathrm{irep}}$ (where the future has "an infinite weight").
• Concerning SPE, we have proved that any feasible vector of payoffs dominating a Nash equilibrium in $G$ can be justified as payoffs for a SPE in $G^{\mathrm{avg}}_{\mathrm{irep}}$.
  This result can be generalized to arbitrary feasible and strictly individually rational payoffs by means of a more demanding construction.
• For discounted payoffs, we have proved that an arbitrary feasible vector of payoffs strictly dominating a Nash equilibrium in $G$ can be approximated using payoffs for SPE in $G^\delta_{\mathrm{irep}}$ as $\delta$ goes to 1.
  Even this result can be extended to feasible and strictly individually rational payoffs.
For a very detailed discussion of Folk Theorems see "A Course in Game Theory" by M. J. Osborne and A. Rubinstein.