Strategic-Form Games – Conclusion

We have considered static games of complete information, i.e., "one-shot" games where the players know exactly what game they are playing. We modeled such games using strategic-form games. We have considered both the pure strategy setting and the mixed strategy setting. In both cases, we considered four solution concepts:
• Strictly dominant strategies
• Iterated elimination of strictly dominated strategies (IESDS)
• Rationalizability (i.e., iterated elimination of strategies that are never best responses)
• Nash equilibria

Strategic-Form Games – Conclusion

In the pure strategy setting:
1. A strictly dominant strategy equilibrium (if it exists) survives IESDS and rationalizability and is the unique Nash equilibrium.
2. In finite games, strategies that survive rationalizability also survive IESDS, and IESDS preserves the set of Nash equilibria.
3. In finite games, rationalizability preserves Nash equilibria.

In the mixed setting:
1. In finite two-player games, IESDS and rationalizability coincide.
2. A strictly dominant strategy equilibrium (if it exists) survives IESDS (rationalizability) and is the unique Nash equilibrium.
3. In finite games, IESDS (rationalizability) preserves Nash equilibria.

The proofs of 2. and 3. in the mixed setting are similar to the corresponding proofs in the pure setting.

Algorithms

• Strictly dominant strategy equilibria coincide in the pure and mixed settings, and can be computed in polynomial time.
• IESDS and rationalizability can be implemented in polynomial time in the pure setting as well as in the mixed setting. In the mixed setting, linear programming is needed to implement one step of IESDS (rationalizability).
• Nash equilibria can be computed for two-player games
  - in polynomial time for zero-sum games (using von Neumann's theorem and linear programming; a short LP sketch is given after the FNP slide below),
  - in exponential time using support enumeration,
  - in PPAD using the Lemke-Howson algorithm.

Complexity of Nash Eq. – FNP (Roughly)

Let R be a binary relation on words (over some alphabet) that is polynomial-time computable and polynomially balanced. That is, membership in R is decidable in polynomial time, and (x, y) ∈ R implies |y| ≤ |x|^k where k is independent of x, y.

The search problem associated with R is: given an input x, return a y such that (x, y) ∈ R if such a y exists, and return "NO" otherwise.

Note that the problem of computing NE can be seen as a search problem R where (x, y) ∈ R means that x is a strategic-form game and y is a Nash equilibrium of x of polynomial size. (We already know from support enumeration that there is a NE of polynomial size.)

The class of all such search problems is called FNP. The class FP ⊆ FNP contains all search problems that can be solved in polynomial time.

A search problem determined by R is polynomially reducible to a search problem R' iff there exist polynomial-time computable functions f, g such that
• if (x, y) ∈ R for some y, then (f(x), y') ∈ R' for some y',
• if (f(x), y) ∈ R', then (x, g(y)) ∈ R,
• if (f(x), y) ∉ R' for all y, then (x, y) ∉ R for all y.
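To make the zero-sum case from the Algorithms slide concrete, here is a minimal sketch (not part of the lecture) of the maximin linear program for a two-player zero-sum game, solved with scipy.optimize.linprog; the function name solve_zero_sum and the matching-pennies example are illustrative choices of mine.

```python
# A minimal sketch of solving a finite two-player zero-sum game by linear
# programming (von Neumann's minimax theorem).  The matrix A lists the row
# player's payoffs; requires numpy and scipy.
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    """Return (value, row player's optimal mixed strategy)."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    # Variables: x_1..x_m (row player's mixed strategy) and v (game value).
    # Maximize v  subject to  sum_i x_i * A[i][j] >= v  for every column j,
    #                         sum_i x_i = 1,  x_i >= 0.
    c = np.concatenate([np.zeros(m), [-1.0]])      # linprog minimizes, so use -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])      # row j: -sum_i A[i][j]*x_i + v <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[-1], res.x[:m]

# Matching pennies: value 0, optimal row strategy (1/2, 1/2).
value, x = solve_zero_sum([[1, -1], [-1, 1]])
print(value, x)
```

The optimal mixed strategy and the value of the game come out of a single LP with m + 1 variables, which is the polynomial-time computation referred to above.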
Complexity of Nash Eq. – PPAD (Roughly)

The class PPAD is defined by specifying one of its complete problems (w.r.t. polynomial-time reductions), known as End-Of-The-Line:
• Input: Two Boolean circuits (over the basis ∧, ∨, ¬) S and P, each with m input bits and m output bits, such that P(0^m) = 0^m ≠ S(0^m).
• Problem: Find an input x ∈ {0, 1}^m such that P(S(x)) ≠ x, or S(P(x)) ≠ x ≠ 0^m.

Intuition: End-Of-The-Line defines a directed graph H_{S,P} with vertex set {0, 1}^m and an edge from x to y whenever both y = S(x) ("successor") and x = P(y) ("predecessor"). All vertices of H_{S,P} have indegree and outdegree at most one. There is at least one source (i.e., an x satisfying P(x) = x, namely 0^m), so there is at least one sink (i.e., an x satisfying S(x) = x). The goal is to find either a sink, or a source different from 0^m.

Theorem 53
The problem of computing Nash equilibria is complete for PPAD. That is, Nash belongs to PPAD, and End-Of-The-Line is polynomially reducible to Nash.

Loose Ends – Modes of Dominance

Let σi, σ'i ∈ Σi. Then σ'i is strictly dominated by σi if ui(σi, σ−i) > ui(σ'i, σ−i) for all σ−i ∈ Σ−i.

Let σi, σ'i ∈ Σi. Then σ'i is weakly dominated by σi if ui(σi, σ−i) ≥ ui(σ'i, σ−i) for all σ−i ∈ Σ−i, and there is σ'−i ∈ Σ−i such that ui(σi, σ'−i) > ui(σ'i, σ'−i).

Let σi, σ'i ∈ Σi. Then σ'i is very weakly dominated by σi if ui(σi, σ−i) ≥ ui(σ'i, σ−i) for all σ−i ∈ Σ−i.

A strategy is (strictly, weakly, very weakly) dominant in mixed strategies if it (strictly, weakly, very weakly) dominates every other mixed strategy.

Claim 4
Any mixed strategy profile σ ∈ Σ such that each σi is very weakly dominant in mixed strategies is a mixed Nash equilibrium.

The same claim can be proved in the pure strategy setting.

Dynamic Games of Complete Information

Extensive-Form Games
• Definition
• Sub-Game Perfect Equilibria

Dynamic Games of Perfect Information (Motivation)

Static games (modeled using strategic-form games) cannot capture games that unfold over time. In particular, as all players move simultaneously, there is no way to model situations in which the order of moves is important. Imagine, e.g., chess, where players take turns and in every round a player knows all moves of the opponent before making his own move.

There are many examples of dynamic games: markets that change over time, political negotiations, models of computer systems, etc.

We model dynamic games using extensive-form games, a tree-like model that allows us to express the sequential nature of games.

We start with perfect-information games, where each player always knows the results of all previous moves. Then we generalize to imperfect information, where players may have only partial knowledge of these results (e.g., most card games).

Perfect-Info. Extensive-Form Games (Example)

[Game tree: player 1 at h0 chooses L or R; after L, player 2 at h1 chooses K → (3, 1) or U → (1, 3); after R, player 2 at h2 chooses K → (2, 1) or U → (0, 0).]

Here h0, h1, h2 are non-terminal nodes; the leaves are terminal nodes. Each non-terminal node is owned by a player who chooses an action there, e.g., h1 is owned by player 2, who chooses either K or U. Every action results in a transition to a new node; choosing L in h0 results in a move to h1. When a play reaches a terminal node, the players collect their payoffs, e.g., the leftmost terminal node gives 3 to player 1 and 1 to player 2.
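Before giving the formal definition, here is one possible way (an illustrative sketch of mine, not lecture code) to encode the example game above: each choice node records its owner, each (node, action) pair records its successor, and each terminal node records a payoff vector.

```python
# A minimal sketch encoding the example game tree above; the dictionary keys
# and the helper name `play` are my own choices, not notation from the lecture.
owner = {"h0": 1, "h1": 2, "h2": 2}           # player who moves at each choice node
successor = {                                  # successor of each (node, action) pair
    ("h0", "L"): "h1", ("h0", "R"): "h2",
    ("h1", "K"): "z1", ("h1", "U"): "z2",
    ("h2", "K"): "z3", ("h2", "U"): "z4",
}
payoff = {"z1": (3, 1), "z2": (1, 3), "z3": (2, 1), "z4": (0, 0)}

def play(action_sequence):
    """Follow a sequence of actions from the root h0 and return the payoffs."""
    node = "h0"
    for a in action_sequence:
        node = successor[(node, a)]
    return payoff[node]

print(play(["L", "K"]))   # (3, 1): player 1 gets 3, player 2 gets 1
print(play(["R", "U"]))   # (0, 0)
```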
Perfect-Information Extensive-Form Games

A perfect-information extensive-form game is a tuple G = (N, A, H, Z, χ, ρ, π, h0, u) where
• N = {1, . . . , n} is a set of n players, A is a (single) set of actions,
• H is a set of non-terminal (choice) nodes and Z is a set of terminal nodes (assume Z ∩ H = ∅); denote H̄ = H ∪ Z,
• χ : H → 2^A \ {∅} is the action function, which assigns to each choice node a non-empty set of enabled actions,
• ρ : H → N is the player function, which assigns to each non-terminal node a player i ∈ N who chooses an action there; we define Hi := {h ∈ H | ρ(h) = i},
• π : H × A → H̄ is the successor function, which maps a non-terminal node and an action to a new node, such that
  - h0 is the only node that is not in the image of π (the root),
  - for all h1, h2 ∈ H and for all a1 ∈ χ(h1) and all a2 ∈ χ(h2), if π(h1, a1) = π(h2, a2), then h1 = h2 and a1 = a2,
• u = (u1, . . . , un), where each ui : Z → R is a payoff function for player i on the terminal nodes of Z.

Some Notation

A path from h ∈ H̄ to h' ∈ H̄ is a sequence h1 a2 h2 a3 h3 · · · hk−1 ak hk where h1 = h, hk = h' and π(hj−1, aj) = hj for every 1 < j ≤ k. Note that, in particular, h is a path from h to h.

Assumption: For every h ∈ H̄ there is a unique path from h0 to h, and there is no infinite path (i.e., no sequence h1 a2 h2 a3 h3 · · · such that π(hj−1, aj) = hj for every j > 1).

Note that the assumption is satisfied when H̄ is finite. Indeed, uniqueness follows immediately from the definition of π. Now let X be the set of all h' from which there is a path to h. If h0 ∈ X, we are done. Otherwise, let h' be a node of X with the longest path to h. As h' ≠ h0, there are h'' and a ∈ χ(h'') such that h' = π(h'', a). But then there is a path from h'' to h that is longer than the path from h', a contradiction.

The above claim implies that every perfect-information extensive-form game can be seen as a game on a rooted tree (H̄, E, h0) where
• H̄ = H ∪ Z is the set of nodes,
• E ⊆ H̄ × H̄ is the set of edges defined by (h, h') ∈ E iff h ∈ H and there is a ∈ χ(h) such that π(h, a) = h',
• h0 is the root.

Some More Notation

h' is a child of h, and h is the parent of h', if there is a ∈ χ(h) such that h' = π(h, a).

h' ∈ H̄ is reachable from h ∈ H̄ if there is a path from h to h'. If h' is reachable from h, we say that h' is a descendant of h and h is an ancestor of h' (note that, by definition, h is both a descendant and an ancestor of itself).

Example: Trust Game

[Game tree: player 1 at h0 chooses D → z1 with payoffs (5, 5), or T → h1; player 2 at h1 chooses K → z2 with payoffs (0, 20), or S → z3 with payoffs (7.5, 12.5).]

• Two players, both start with $5.
• Player 1 either distrusts (D) player 2 and keeps the money (payoffs (5, 5)), or trusts (T) player 2 and passes the $5 to player 2.
• If player 1 chooses to trust player 2, the money is tripled by the experimenter and sent to player 2.
• Player 2 may either keep (K) the additional $15 (resulting in (0, 20)), or share (S) it with player 1 (resulting in (7.5, 12.5)).

Example: Trust Game (Cont.)

[Game tree as above.]

• N = {1, 2}, A = {D, T, K, S}
• H = {h0, h1}, Z = {z1, z2, z3}
• χ(h0) = {D, T}, χ(h1) = {K, S}
• ρ(h0) = 1, ρ(h1) = 2
• π(h0, D) = z1, π(h0, T) = h1, π(h1, K) = z2, π(h1, S) = z3
• u1(z1) = 5, u1(z2) = 0, u1(z3) = 7.5, u2(z1) = 5, u2(z2) = 20, u2(z3) = 12.5
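As a side note (a small sketch of mine, not part of the lecture), the finite components listed above translate directly into Python dictionaries:

```python
# A minimal sketch of the Trust Game components chi, rho, pi, u as Python
# dictionaries, mirroring the listing above; variable names are my own.
N = (1, 2)
A = {"D", "T", "K", "S"}
H = {"h0", "h1"}                      # choice nodes
Z = {"z1", "z2", "z3"}                # terminal nodes
chi = {"h0": {"D", "T"}, "h1": {"K", "S"}}
rho = {"h0": 1, "h1": 2}
pi = {("h0", "D"): "z1", ("h0", "T"): "h1",
      ("h1", "K"): "z2", ("h1", "S"): "z3"}
u = {"z1": (5, 5), "z2": (0, 20), "z3": (7.5, 12.5)}   # (u1, u2)

# Sanity check: every enabled action leads to a node of H ∪ Z.
assert all(pi[(h, a)] in H | Z for h in H for a in chi[h])
```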
Stackelberg Competition

Very similar to the Cournot duopoly ...
• Two identical firms, players 1 and 2, produce some good. Denote by q1 and q2 the quantities produced by firms 1 and 2, respectively.
• The total quantity of products in the market is q1 + q2.
• The price of each item is κ − q1 − q2, where κ > 0 is fixed.
• The firms have a common per-item production cost c.
Except that ...
• As opposed to the Cournot duopoly, firm 1 moves first and chooses the quantity q1 ∈ [0, ∞).
• Afterwards, firm 2 chooses q2 ∈ [0, ∞) (knowing q1), and then the firms get their payoffs.

Stackelberg Competition – Extensive-Form Model

An extensive-form game model:
• N = {1, 2}
• A = [0, ∞)
• H = {h0} ∪ {h1^q1 | q1 ∈ [0, ∞)}
• Z = {zq1,q2 | q1, q2 ∈ [0, ∞)}
• χ(h0) = [0, ∞), χ(h1^q1) = [0, ∞)
• ρ(h0) = 1, ρ(h1^q1) = 2
• π(h0, q1) = h1^q1, π(h1^q1, q2) = zq1,q2
• The payoffs are
  u1(zq1,q2) = q1(κ − q1 − q2) − q1c
  u2(zq1,q2) = q2(κ − q1 − q2) − q2c

Example: Chess (a bit simplified)

There are infinitely many representations of chess; this one is different from the one presented in the lecture.
• N = {1, 2}
• Denoting by Boards the set of all (appropriately encoded) board positions, we define H̄ = B × {1, 2} where B = {w ∈ Boards^+ | no board repeats ≥ 3 times in w}. (Here Boards^+ is the set of all non-empty sequences of boards.)
• Z consists of all nodes (wb, i) (here b ∈ Boards) where either b is checkmate for player i, or i does not have a move in b, or every move of i in b leads to a board with two occurrences in w.
• χ(wb, i) is the set of all legal moves of player i in b.
• ρ(wb, i) = i
• π is defined by π((wb, i), a) = (wbb', 3 − i), where b' is obtained from b according to the move a and 3 − i is the other player.
• h0 = (b0, 1) where b0 is the initial board.
• uj(wb, i) ∈ {1, 0, −1}, where 1 means "win", 0 means "draw", and −1 means "loss" for player j.

Pure Strategies

Let G = (N, A, H, Z, χ, ρ, π, h0, u) be a perfect-information extensive-form game.

Definition 54
A pure strategy of player i in G is a function si : Hi → A such that for every h ∈ Hi we have si(h) ∈ χ(h).

We denote by Si the set of all pure strategies of player i in G, and by S = S1 × · · · × Sn the set of all pure strategy profiles.

Note that each pure strategy profile s ∈ S determines a unique path ws = h0 a1 h1 · · · hk−1 ak hk from h0 to a terminal node hk, given by aj = sρ(hj−1)(hj−1) for all 0 < j ≤ k.

Denote by O(s) the terminal node reached by ws. Abusing notation a bit, we denote by ui(s) the value ui(O(s)) of the payoff for player i when the terminal node O(s) is reached using the strategies of s.

Example: Trust Game

[Game tree of the Trust Game as above.]

The pure strategy profile (s1, s2) where s1(h0) = T and s2(h1) = K is usually written as TK (listing the choice nodes in BFS order, left to right). It determines the path h0 T h1 K z2.
The resulting payoffs: u1(s1, s2) = 0 and u2(s1, s2) = 20.

Extensive-Form vs Strategic-Form

The extensive-form game G determines the corresponding strategic-form game Ḡ = (N, (Si)i∈N, (ui)i∈N).

Here note that the set of players N and the sets of pure strategies Si are the same in G and in the corresponding game. The payoff functions ui in Ḡ are understood as functions on the pure strategy profiles of S = S1 × · · · × Sn.

With this definition, we may apply all solution concepts and algorithms developed for strategic-form games to extensive-form games. We often consider the extensive form to be only a different way of representing the corresponding strategic-form game and do not distinguish between them.

There are some issues, namely whether all notions from the strategic-form setting make sense in the extensive form. Also, a naive application of the algorithms may result in unnecessarily high complexity.

For now, let us consider pure strategies only!
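To illustrate the correspondence, here is a minimal sketch (my own encoding, not lecture code) that enumerates the pure strategies si : Hi → A of each player and tabulates ui(O(s)) for every profile. For the small example game with nodes h0, h1, h2 introduced earlier (writing K', U' for player 2's actions in h2), it prints a 2 × 4 payoff table, i.e., the induced strategic-form game used in the following examples.

```python
# A minimal sketch: enumerate pure strategies of a finite perfect-information
# extensive-form game and print the induced strategic-form payoff table.
# The dictionary-based game encoding is my own convention, not lecture code.
from itertools import product

owner = {"h0": 1, "h1": 2, "h2": 2}                                  # rho
actions = {"h0": ["L", "R"], "h1": ["K", "U"], "h2": ["K'", "U'"]}   # chi
succ = {("h0", "L"): "h1", ("h0", "R"): "h2",                        # pi
        ("h1", "K"): "z1", ("h1", "U"): "z2",
        ("h2", "K'"): "z3", ("h2", "U'"): "z4"}
payoff = {"z1": (3, 1), "z2": (1, 3), "z3": (2, 1), "z4": (0, 0)}    # u

def pure_strategies(player):
    """All functions s_i : H_i -> A with s_i(h) in chi(h), as dictionaries."""
    nodes = [h for h, p in owner.items() if p == player]
    return [dict(zip(nodes, choice))
            for choice in product(*(actions[h] for h in nodes))]

def outcome(profile):
    """Follow the unique path w_s determined by the profile; return u(O(s))."""
    node = "h0"
    while node in owner:                        # until a terminal node is hit
        node = succ[(node, profile[owner[node]][node])]
    return payoff[node]

cols = ["".join(s2.values()) for s2 in pure_strategies(2)]
print("     ", cols)
for s1 in pure_strategies(1):
    row = [outcome({1: s1, 2: s2}) for s2 in pure_strategies(2)]
    print("".join(s1.values()), row)
```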
Example: Trust Game

[Game tree of the Trust Game as above.]

Is any strategy strictly (weakly, very weakly) dominant? Is any strategy never a best response? Is there a Nash equilibrium in pure strategies?

Example

[Game tree: player 1 at h0 chooses L or R; after L, player 2 at h1 chooses K → (3, 1) or U → (1, 3); after R, player 2 at h2 chooses K' → (2, 1) or U' → (0, 0).]

Find all pure strategies of both players. Is any strategy (strictly, weakly, very weakly) dominant? Is any strategy (strictly, weakly, very weakly) dominated? Is any strategy never a best response? Are there Nash equilibria in pure strategies?

Example

[Game tree as above.] The induced strategic-form game:

       KK'    KU'    UK'    UU'
  L    3, 1   3, 1   1, 3   1, 3
  R    2, 1   0, 0   2, 1   0, 0

Find all pure strategies of both players. Is any strategy (strictly, weakly, very weakly) dominant? Is any strategy (strictly, weakly, very weakly) dominated? Is any strategy never a best response? Are there Nash equilibria in pure strategies?

Criticism of Nash Equilibria

[Game tree and induced strategic form as above.]

Two Nash equilibria in pure strategies: (L, UU') and (R, UK').

Examine (L, UU'):
• Player 2 threatens to play U' in h2;
• as a result, player 1 plays L;
• player 2 reacts to L by playing the best response, i.e., U.
However, the threat is not credible: once a play reaches h2, a rational player 2 chooses K'.

Criticism of Nash Equilibria

[Game tree and induced strategic form as above.]

Two Nash equilibria in pure strategies: (L, UU') and (R, UK').

Examine (R, UK'): This equilibrium is sensible in the following sense:
• Player 2 plays a best response in both h1 and h2.
• Player 1 plays the "best response" in h0, assuming that player 2 will play his best responses in the future.
This equilibrium is called subgame perfect.

Subgame Perfect Equilibria

Given h ∈ H, we denote by H̄^h the set of all nodes reachable from h.

Definition 55 (Subgame)
A subgame G^h of G rooted at h ∈ H is the restriction of G to the nodes reachable from h in the game tree. More precisely, G^h = (N, A, H^h, Z^h, χ^h, ρ^h, π^h, h, u^h) where
• H^h = H ∩ H̄^h, Z^h = Z ∩ H̄^h, and χ^h and ρ^h are the restrictions of χ and ρ to H^h, respectively
  (given a function f : A → B and C ⊆ A, the restriction of f to C is the function g : C → B such that g(x) = f(x) for all x ∈ C),
• π^h is defined for h' ∈ H^h and a ∈ χ^h(h') by π^h(h', a) = π(h', a),
• each u^h_i is the restriction of ui to Z^h.

Definition 56
A subgame perfect equilibrium (SPE) in pure strategies is a pure strategy profile s ∈ S such that for every subgame G^h of G, the restriction of s to H^h is a Nash equilibrium in pure strategies in G^h.

Here the restriction of s = (s1, . . . , sn) ∈ S to H^h is the strategy profile s^h = (s^h_1, . . . , s^h_n) where s^h_i(h') = si(h') for all i ∈ N and all h' ∈ Hi ∩ H^h.

Stackelberg Competition – SPE

• N = {1, 2}, A = [0, ∞)
• H = {h0} ∪ {h1^q1 | q1 ∈ [0, ∞)}, Z = {zq1,q2 | q1, q2 ∈ [0, ∞)}
• χ(h0) = [0, ∞), χ(h1^q1) = [0, ∞), ρ(h0) = 1, ρ(h1^q1) = 2
• π(h0, q1) = h1^q1, π(h1^q1, q2) = zq1,q2
• The payoffs are u1(zq1,q2) = q1(κ − c − q1 − q2) and u2(zq1,q2) = q2(κ − c − q1 − q2).

Denote θ = κ − c.

Player 1 chooses q1; we know that the best response of player 2 is q2 = (θ − q1)/2. Then

u1(zq1,q2) = q1(θ − q1 − (θ − q1)/2) = (θ/2)q1 − q1^2/2,

which is maximized by q1 = θ/2, giving q2 = θ/4. Then u1(zq1,q2) = θ^2/8 and u2(zq1,q2) = θ^2/16.

Note that firm 1 has an advantage as the leader.
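As a quick numerical sanity check of the formulas above (my own sketch, with κ and c chosen arbitrarily), one can grid-search the leader's quantity against the follower's best response and compare with the closed-form solution q1 = θ/2, u1 = θ^2/8.

```python
# A minimal sketch verifying the Stackelberg SPE formulas numerically.
# kappa and c are arbitrary example values (my assumption); theta = kappa - c.
kappa, c = 10.0, 2.0
theta = kappa - c

def follower_best_response(q1):
    """Maximizer of q2 * (theta - q1 - q2), i.e. q2 = (theta - q1) / 2."""
    return max(0.0, (theta - q1) / 2)

def leader_payoff(q1):
    q2 = follower_best_response(q1)
    return q1 * (theta - q1 - q2)

# Grid search over the leader's quantity q1 in [0, theta].
grid = [i * theta / 10000 for i in range(10001)]
q1_star = max(grid, key=leader_payoff)

print(q1_star, theta / 2)                      # both close to theta/2 = 4.0
print(leader_payoff(q1_star), theta**2 / 8)    # both close to theta^2/8 = 8.0
```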
Existence of SPE

From this moment on we consider only finite games!

Theorem 57
Every finite perfect-information extensive-form game has an SPE in pure strategies.

Proof: By induction on the number of nodes.

Base case: If |H̄| = 1, the only node is terminal, and the trivial pure strategy profile is an SPE.

Induction step: Consider a game with more than one node. Let K = {h1, . . . , hk} be the set of all children of the root h0. By the induction hypothesis, for every h' ∈ K there is an SPE s^{h'} in G^{h'}. For every i ∈ N, define a strategy si of player i in G as follows:
• for i = ρ(h0), we set si(h0) ∈ argmax_{h'∈K} u^{h'}_i(s^{h'}) (here we identify an action a ∈ χ(h0) with the child π(h0, a) it leads to),
• for all i ∈ N and all h ∈ Hi with h ≠ h0, we set si(h) = s^{h'}_i(h), where h' ∈ K is the child with h ∈ H^{h'} ∩ Hi.
We claim that s = (s1, . . . , sn) is an SPE in pure strategies. By the definition of s, it is a NE in all subgames except (possibly) G itself.

Existence of SPE (Cont.)

Let h' = sρ(h0)(h0). Consider a possible deviation of player i: let s̄ be another pure strategy profile in G obtained from s = (s1, . . . , sn) by changing si.

First, assume that i ≠ ρ(h0). Then

u_i(s) = u^{h'}_i(s^{h'}) ≥ u^{h'}_i(s̄^{h'}) = u_i(s̄).

Here the first equality follows from h' = sρ(h0)(h0) and the fact that s behaves as s^{h'} in G^{h'}, the inequality follows from the fact that s^{h'} is a NE in G^{h'}, and the second equality follows from h' = sρ(h0)(h0) = s̄ρ(h0)(h0).

Second, assume that i = ρ(h0). Let hr = s̄i(h0) = s̄ρ(h0)(h0). Then u^{h'}_i(s^{h'}) ≥ u^{hr}_i(s^{hr}) because h' maximizes the payoff of player i = ρ(h0) among the children of h0. But then

u_i(s) = u^{h'}_i(s^{h'}) ≥ u^{hr}_i(s^{hr}) ≥ u^{hr}_i(s̄^{hr}) = u_i(s̄).

Backward Induction

The proof of Theorem 57 gives an efficient procedure for computing an SPE of a finite perfect-information extensive-form game (a short code sketch is given at the end of the section).

Backward induction: We inductively "attach" to every node h an SPE s^h in G^h, together with a vector of expected payoffs u(h) = (u1(h), . . . , un(h)).
• Initially: Attach to each terminal node z ∈ Z the empty profile s^z = (∅, . . . , ∅) and the payoff vector u(z) = (u1(z), . . . , un(z)).
• While there is an unattached node h with all children attached:
  1. Let K be the set of all children of h.
  2. Let hmax ∈ argmax_{h'∈K} uρ(h)(h').
  3. Attach to h an SPE s^h where
     - s^h_ρ(h)(h) = hmax (identifying, as before, the action at h with the child it leads to),
     - for all i ∈ N and all h' ∈ H^h ∩ Hi with h' ≠ h, define s^h_i(h') = s^{h̄}_i(h'), where h̄ ∈ K is the child with h' ∈ H^{h̄} ∩ Hi (i.e., in each G^{h̄}, the profile s^h behaves as s^{h̄}: the restriction of s^h to G^{h̄} equals s^{h̄}).
  4. Attach to h the expected payoffs ui(h) = ui(hmax) for all i ∈ N.

Chess

Recall that in the model of chess, the payoffs are from {1, 0, −1} and u1 = −u2 (i.e., the game is zero-sum). By Theorem 57, there is an SPE in pure strategies (s*1, s*2). But then one of the following holds:
1. White has a winning strategy: if u1(s*1, s*2) = 1 and thus u2(s*1, s*2) = −1.
2. Black has a winning strategy: if u1(s*1, s*2) = −1 and thus u2(s*1, s*2) = 1.
3. Both players have strategies to force a draw: if u1(s*1, s*2) = 0 and thus u2(s*1, s*2) = 0.

Question: Which one is the right answer?
Answer: Nobody knows yet ... the tree is too big! Even with depth ∼ 200 and ∼ 5 moves per node, there are about 5^200 nodes!
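To close the section, here is a minimal sketch of the backward-induction procedure described above (my own dictionary-based encoding, not lecture code), run on the small example game with nodes h0, h1, h2; it returns the subgame perfect equilibrium (R, UK') with payoffs (2, 1).

```python
# A minimal sketch of backward induction on a finite perfect-information
# extensive-form game.  The dictionary-based game encoding is my own.
owner = {"h0": 1, "h1": 2, "h2": 2}
actions = {"h0": ["L", "R"], "h1": ["K", "U"], "h2": ["K'", "U'"]}
succ = {("h0", "L"): "h1", ("h0", "R"): "h2",
        ("h1", "K"): "z1", ("h1", "U"): "z2",
        ("h2", "K'"): "z3", ("h2", "U'"): "z4"}
payoff = {"z1": (3, 1), "z2": (1, 3), "z3": (2, 1), "z4": (0, 0)}

def backward_induction(node, strategy):
    """Attach an SPE action to every choice node below `node`;
    return the resulting payoff vector u(node)."""
    if node in payoff:                          # terminal node: just its payoffs
        return payoff[node]
    player = owner[node]
    best_action, best_payoffs = None, None
    for a in actions[node]:
        payoffs = backward_induction(succ[(node, a)], strategy)
        if best_payoffs is None or payoffs[player - 1] > best_payoffs[player - 1]:
            best_action, best_payoffs = a, payoffs
    strategy[node] = best_action                # SPE choice at this node
    return best_payoffs

spe = {}
print(backward_induction("h0", spe))            # (2, 1)
print(spe)                                      # {'h1': 'U', 'h2': "K'", 'h0': 'R'}
```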