IA168 Algorithmic Game Theory Tomáš Brázdil 1 Organization of This Course Sources: Lectures (slides, notes) based on several sources slides are prepared for lectures, some stuff on greenboard (⇒ attend the lectures) 2 Organization of This Course Sources: Lectures (slides, notes) based on several sources slides are prepared for lectures, some stuff on greenboard (⇒ attend the lectures) Books: Nisan/Roughgarden/Tardos/Vazirani, Algorithmic Game Theory, Cambridge University, 2007. Available online for free: http://www.cambridge.org/journals/nisan/downloads/Nisan_Non-printable.pdf Tadelis, Game Theory: An Introduction, Princeton University Press, 2013 (I use various resources, so please, attend the lectures) 2 Evaluation Oral exam Homework 4 times homework A "computer" game 3 What is Algorithmic Game Theory? First, what is the game theory? 4 What is Algorithmic Game Theory? First, what is the game theory? According to the Oxford dictionary it is "the branch of mathematics concerned with the analysis of strategies for dealing with competitive situations where the outcome of a participant’s choice of action depends critically on the actions of other participants" 4 What is Algorithmic Game Theory? First, what is the game theory? According to the Oxford dictionary it is "the branch of mathematics concerned with the analysis of strategies for dealing with competitive situations where the outcome of a participant’s choice of action depends critically on the actions of other participants" According to Myerson it is "the study of mathematical models of conflict and cooperation between intelligent rational decision-makers" 4 What is Algorithmic Game Theory? First, what is the game theory? According to the Oxford dictionary it is "the branch of mathematics concerned with the analysis of strategies for dealing with competitive situations where the outcome of a participant’s choice of action depends critically on the actions of other participants" According to Myerson it is "the study of mathematical models of conflict and cooperation between intelligent rational decision-makers" What does the "algorithmic" mean? 4 What is Algorithmic Game Theory? First, what is the game theory? According to the Oxford dictionary it is "the branch of mathematics concerned with the analysis of strategies for dealing with competitive situations where the outcome of a participant’s choice of action depends critically on the actions of other participants" According to Myerson it is "the study of mathematical models of conflict and cooperation between intelligent rational decision-makers" What does the "algorithmic" mean? It means that we are "concerned with the computational questions that arise in game theory, and that enlighten game theory. In particular, questions about finding efficient algorithms to ‘solve’ games.” Let’s have a look at some examples .... 4 Prisoner’s Dilemma Two suspects of a serious crime are arrested and imprisoned. 5 Prisoner’s Dilemma Two suspects of a serious crime are arrested and imprisoned. Police has enough evidence of only petty theft, and to nail the suspects for the serious crime they need testimony from at least one of them. 5 Prisoner’s Dilemma Two suspects of a serious crime are arrested and imprisoned. Police has enough evidence of only petty theft, and to nail the suspects for the serious crime they need testimony from at least one of them. The suspects are interrogated separately without any possibility of communication. 5 Prisoner’s Dilemma Two suspects of a serious crime are arrested and imprisoned. Police has enough evidence of only petty theft, and to nail the suspects for the serious crime they need testimony from at least one of them. The suspects are interrogated separately without any possibility of communication. Each of the suspects is offered a deal: If he confesses (C) to the crime, he is free to go. The alternative is not to confess, that is remain silent (S). 5 Prisoner’s Dilemma Two suspects of a serious crime are arrested and imprisoned. Police has enough evidence of only petty theft, and to nail the suspects for the serious crime they need testimony from at least one of them. The suspects are interrogated separately without any possibility of communication. Each of the suspects is offered a deal: If he confesses (C) to the crime, he is free to go. The alternative is not to confess, that is remain silent (S). Sentence depends on the behavior of both suspects. 5 Prisoner’s Dilemma Two suspects of a serious crime are arrested and imprisoned. Police has enough evidence of only petty theft, and to nail the suspects for the serious crime they need testimony from at least one of them. The suspects are interrogated separately without any possibility of communication. Each of the suspects is offered a deal: If he confesses (C) to the crime, he is free to go. The alternative is not to confess, that is remain silent (S). Sentence depends on the behavior of both suspects. The problem: What would the suspects do? 5 Prisoner’s Dilemma – Solution(?) C S C −5, −5 0, −20 S −20, 0 −1, −1 Rational "row" suspect (or his adviser) may reason as follows: 6 Prisoner’s Dilemma – Solution(?) C S C −5, −5 0, −20 S −20, 0 −1, −1 Rational "row" suspect (or his adviser) may reason as follows: If my colleague chooses C, then playing C gives me −5 and playing S gives −20. 6 Prisoner’s Dilemma – Solution(?) C S C −5, −5 0, −20 S −20, 0 −1, −1 Rational "row" suspect (or his adviser) may reason as follows: If my colleague chooses C, then playing C gives me −5 and playing S gives −20. If my colleague chooses S, then playing C gives me 0 and playing S gives −1. 6 Prisoner’s Dilemma – Solution(?) C S C −5, −5 0, −20 S −20, 0 −1, −1 Rational "row" suspect (or his adviser) may reason as follows: If my colleague chooses C, then playing C gives me −5 and playing S gives −20. If my colleague chooses S, then playing C gives me 0 and playing S gives −1. In both cases C is clearly better (it strictly dominates the other strategy). If the other suspect’s reasoning is the same, both choose C and get 5 years sentence. 6 Prisoner’s Dilemma – Solution(?) C S C −5, −5 0, −20 S −20, 0 −1, −1 Rational "row" suspect (or his adviser) may reason as follows: If my colleague chooses C, then playing C gives me −5 and playing S gives −20. If my colleague chooses S, then playing C gives me 0 and playing S gives −1. In both cases C is clearly better (it strictly dominates the other strategy). If the other suspect’s reasoning is the same, both choose C and get 5 years sentence. Where is the dilemma? There is a solution (S, S) which is better for both players but needs some “central” authority to control the players. 6 Prisoner’s Dilemma – Solution(?) C S C −5, −5 0, −20 S −20, 0 −1, −1 Rational "row" suspect (or his adviser) may reason as follows: If my colleague chooses C, then playing C gives me −5 and playing S gives −20. If my colleague chooses S, then playing C gives me 0 and playing S gives −1. In both cases C is clearly better (it strictly dominates the other strategy). If the other suspect’s reasoning is the same, both choose C and get 5 years sentence. Where is the dilemma? There is a solution (S, S) which is better for both players but needs some “central” authority to control the players. Are there always “dominant” strategies? 6 Nash equilibria – Battle of Sexes A couple agreed to meet this evening, but cannot recall if they will be attending the opera or a football match. 7 Nash equilibria – Battle of Sexes A couple agreed to meet this evening, but cannot recall if they will be attending the opera or a football match. The husband would like to go to the football game. The wife would like to go to the opera. Both would prefer to go to the same place rather than different ones. 7 Nash equilibria – Battle of Sexes A couple agreed to meet this evening, but cannot recall if they will be attending the opera or a football match. The husband would like to go to the football game. The wife would like to go to the opera. Both would prefer to go to the same place rather than different ones. If they cannot communicate, where should they go? 7 Nash equilibria – Battle of Sexes Battle of Sexes can be modeled as a game of two players (Wife, Husband) with the following payoffs: O F O 2, 1 0, 0 F 0, 0 1, 2 8 Nash equilibria – Battle of Sexes Battle of Sexes can be modeled as a game of two players (Wife, Husband) with the following payoffs: O F O 2, 1 0, 0 F 0, 0 1, 2 Apparently, no strategy of any player is dominant. A “solution”? 8 Nash equilibria – Battle of Sexes Battle of Sexes can be modeled as a game of two players (Wife, Husband) with the following payoffs: O F O 2, 1 0, 0 F 0, 0 1, 2 Apparently, no strategy of any player is dominant. A “solution”? Note that whenever both players play O, then neither of them wants to unilaterally deviate from his strategy! 8 Nash equilibria – Battle of Sexes Battle of Sexes can be modeled as a game of two players (Wife, Husband) with the following payoffs: O F O 2, 1 0, 0 F 0, 0 1, 2 Apparently, no strategy of any player is dominant. A “solution”? Note that whenever both players play O, then neither of them wants to unilaterally deviate from his strategy! (O, O) is an example of a Nash equilibrium (as is (F, F)) 8 Mixed Equilibria – Rock-Paper-Scissors R P S R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 S −1, 1 1, −1 0, 0 9 Mixed Equilibria – Rock-Paper-Scissors R P S R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 S −1, 1 1, −1 0, 0 This is an example of zero-sum games: whatever one of the players wins, the other one looses. 9 Mixed Equilibria – Rock-Paper-Scissors R P S R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 S −1, 1 1, −1 0, 0 This is an example of zero-sum games: whatever one of the players wins, the other one looses. What is an optimal behavior here? Is there a Nash equilibrium? 9 Mixed Equilibria – Rock-Paper-Scissors R P S R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 S −1, 1 1, −1 0, 0 This is an example of zero-sum games: whatever one of the players wins, the other one looses. What is an optimal behavior here? Is there a Nash equilibrium? Use mixed strategies: Each player plays each pure strategy with probability 1/3. The expected payoff of each player is 0 (even if one of the players changes his strategy, he still gets 0!). 9 Mixed Equilibria – Rock-Paper-Scissors R P S R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 S −1, 1 1, −1 0, 0 This is an example of zero-sum games: whatever one of the players wins, the other one looses. What is an optimal behavior here? Is there a Nash equilibrium? Use mixed strategies: Each player plays each pure strategy with probability 1/3. The expected payoff of each player is 0 (even if one of the players changes his strategy, he still gets 0!). How to algorithmically solve games in mixed strategies? (we shall use probability theory and linear programming) 9 Philosophical Issues in Games 10 Dynamic Games So far we have seen games in strategic form that are unable to capture games that unfold over time (such as chess). 11 Dynamic Games So far we have seen games in strategic form that are unable to capture games that unfold over time (such as chess). For such purpose we need to use extensive form games: P1 P2 (1, 2) C (1, −1) D (0, 2) E A P2 (2, 2) F (1, 3) G B 11 Dynamic Games So far we have seen games in strategic form that are unable to capture games that unfold over time (such as chess). For such purpose we need to use extensive form games: P1 P2 (1, 2) C (1, −1) D (0, 2) E A P2 (2, 2) F (1, 3) G B How to "solve" such games? 11 Dynamic Games So far we have seen games in strategic form that are unable to capture games that unfold over time (such as chess). For such purpose we need to use extensive form games: P1 P2 (1, 2) C (1, −1) D (0, 2) E A P2 (2, 2) F (1, 3) G B How to "solve" such games? What is their relationship to the strategic form games? 11 Chance and Imperfect Information Some decisions in the game tree may be by chance and controlled by neither player (e.g. Poker, Backgammon, etc.) 12 Chance and Imperfect Information Some decisions in the game tree may be by chance and controlled by neither player (e.g. Poker, Backgammon, etc.) Sometimes a player may not be able to distinguish between several “positions” because he does not know all the information in them (Think a card game with opponent’s cards hidden). 12 Chance and Imperfect Information Some decisions in the game tree may be by chance and controlled by neither player (e.g. Poker, Backgammon, etc.) Sometimes a player may not be able to distinguish between several “positions” because he does not know all the information in them (Think a card game with opponent’s cards hidden). F G D 1 2 F G E1 2 A H I J B P1 P1 Nature P2 (a, b) (c, d) (e, f) (g, h) (i, j) (k, ) (m, n) 12 Chance and Imperfect Information Some decisions in the game tree may be by chance and controlled by neither player (e.g. Poker, Backgammon, etc.) Sometimes a player may not be able to distinguish between several “positions” because he does not know all the information in them (Think a card game with opponent’s cards hidden). F G D 1 2 F G E1 2 A H I J B P1 P1 Nature P2 (a, b) (c, d) (e, f) (g, h) (i, j) (k, ) (m, n) Again, how to solve such games? 12 Games of Incomplete Information In all previous games the players knew all details of the game they played, and this fact was a “common knowledge”. This is not always the case. 13 Games of Incomplete Information In all previous games the players knew all details of the game they played, and this fact was a “common knowledge”. This is not always the case. Example: Sealed Bid Auction Two bidders are trying to purchase the same item. 13 Games of Incomplete Information In all previous games the players knew all details of the game they played, and this fact was a “common knowledge”. This is not always the case. Example: Sealed Bid Auction Two bidders are trying to purchase the same item. The bidders simultaneously submit bids b1 and b2 and the item is sold to the highest bidder at his bid price (first price auction) 13 Games of Incomplete Information In all previous games the players knew all details of the game they played, and this fact was a “common knowledge”. This is not always the case. Example: Sealed Bid Auction Two bidders are trying to purchase the same item. The bidders simultaneously submit bids b1 and b2 and the item is sold to the highest bidder at his bid price (first price auction) The payoff of the player 1 (and similarly for player 2) is calculated by u1(b1, b2) =    v1 − b1 b1 > b2 1 2 (v1 − b1) b1 = b2 0 b1 < b2 Here v1 is the private value that player 1 assigns to the item and so the player 2 does not know u1. 13 Games of Incomplete Information In all previous games the players knew all details of the game they played, and this fact was a “common knowledge”. This is not always the case. Example: Sealed Bid Auction Two bidders are trying to purchase the same item. The bidders simultaneously submit bids b1 and b2 and the item is sold to the highest bidder at his bid price (first price auction) The payoff of the player 1 (and similarly for player 2) is calculated by u1(b1, b2) =    v1 − b1 b1 > b2 1 2 (v1 − b1) b1 = b2 0 b1 < b2 Here v1 is the private value that player 1 assigns to the item and so the player 2 does not know u1. How to deal with such a game? Assume the “worst” private value? What if we have a partial knowledge about the private values? 13 Inefficiency of Equilibria In Prisoner’s Dilemma, the selfish behavior of suspects (the Nash equilibrium) results in somewhat worse than ideal situation. C S C −5, −5 0, −20 S −20, 0 −1, −1 14 Inefficiency of Equilibria In Prisoner’s Dilemma, the selfish behavior of suspects (the Nash equilibrium) results in somewhat worse than ideal situation. C S C −5, −5 0, −20 S −20, 0 −1, −1 Defining a welfare function W which to every pair of strategies assigns the sum of payoffs, we get W(C, C) = −10 but W(S, S) = −2. 14 Inefficiency of Equilibria In Prisoner’s Dilemma, the selfish behavior of suspects (the Nash equilibrium) results in somewhat worse than ideal situation. C S C −5, −5 0, −20 S −20, 0 −1, −1 Defining a welfare function W which to every pair of strategies assigns the sum of payoffs, we get W(C, C) = −10 but W(S, S) = −2. The ratio W(C,C) W(S,S) = 5 measures the inefficiency of "selfish-behavior" (C, C) w.r.t. the optimal “centralized” solution. 14 Inefficiency of Equilibria In Prisoner’s Dilemma, the selfish behavior of suspects (the Nash equilibrium) results in somewhat worse than ideal situation. C S C −5, −5 0, −20 S −20, 0 −1, −1 Defining a welfare function W which to every pair of strategies assigns the sum of payoffs, we get W(C, C) = −10 but W(S, S) = −2. The ratio W(C,C) W(S,S) = 5 measures the inefficiency of "selfish-behavior" (C, C) w.r.t. the optimal “centralized” solution. Price of Anarchy is the maximum ratio between values of equilibria and the value of an optimal solution. 14 Inefficiency of Equilibria – Selfish Routing Consider a transportation system where many agents are trying to get from some initial location to a destination. Consider the welfare to be the average time for an agent to reach the destination. There are two versions: 15 Inefficiency of Equilibria – Selfish Routing Consider a transportation system where many agents are trying to get from some initial location to a destination. Consider the welfare to be the average time for an agent to reach the destination. There are two versions: “Centralized”: A central authority tells each agent where to go. 15 Inefficiency of Equilibria – Selfish Routing Consider a transportation system where many agents are trying to get from some initial location to a destination. Consider the welfare to be the average time for an agent to reach the destination. There are two versions: “Centralized”: A central authority tells each agent where to go. “Decentralized”: Each agent selfishly minimizes his travel time. 15 Inefficiency of Equilibria – Selfish Routing Consider a transportation system where many agents are trying to get from some initial location to a destination. Consider the welfare to be the average time for an agent to reach the destination. There are two versions: “Centralized”: A central authority tells each agent where to go. “Decentralized”: Each agent selfishly minimizes his travel time. Price of Anarchy measure the ratio between average travel time in these two cases. 15 Inefficiency of Equilibria – Selfish Routing Consider a transportation system where many agents are trying to get from some initial location to a destination. Consider the welfare to be the average time for an agent to reach the destination. There are two versions: “Centralized”: A central authority tells each agent where to go. “Decentralized”: Each agent selfishly minimizes his travel time. Price of Anarchy measure the ratio between average travel time in these two cases. Problem: Bound the price of anarchy over all routing games? 15 Games in Computer Science Game theory is a core foundation of mathematical economics. But what does it have to do with CS? 16 Games in Computer Science Game theory is a core foundation of mathematical economics. But what does it have to do with CS? Games in AI: modeling of “rational” agents and their interactions. 16 Games in Computer Science Game theory is a core foundation of mathematical economics. But what does it have to do with CS? Games in AI: modeling of “rational” agents and their interactions. Games in Algorithms: several game theoretic problems have a very interesting algorithmic status and are solved by interesting algorithms 16 Games in Computer Science Game theory is a core foundation of mathematical economics. But what does it have to do with CS? Games in AI: modeling of “rational” agents and their interactions. Games in Algorithms: several game theoretic problems have a very interesting algorithmic status and are solved by interesting algorithms Games in modeling and analysis of reactive systems: program inputs viewed “adversarially”, bisimulation games, etc. 16 Games in Computer Science Game theory is a core foundation of mathematical economics. But what does it have to do with CS? Games in AI: modeling of “rational” agents and their interactions. Games in Algorithms: several game theoretic problems have a very interesting algorithmic status and are solved by interesting algorithms Games in modeling and analysis of reactive systems: program inputs viewed “adversarially”, bisimulation games, etc. Games in computational complexity: Many complexity classes are definable in terms of games: PSPACE, polynomial hierarchy, etc. 16 Games in Computer Science Game theory is a core foundation of mathematical economics. But what does it have to do with CS? Games in AI: modeling of “rational” agents and their interactions. Games in Algorithms: several game theoretic problems have a very interesting algorithmic status and are solved by interesting algorithms Games in modeling and analysis of reactive systems: program inputs viewed “adversarially”, bisimulation games, etc. Games in computational complexity: Many complexity classes are definable in terms of games: PSPACE, polynomial hierarchy, etc. Games in Logic: modal and temporal logics, Ehrenfeucht-Fraisse games, etc. 16 Games in Computer Science Games, the Internet and E-commerce: An extremely active research area at the intersection of CS and Economics Basic idea: “The internet is a HUGE experiment in interaction between agents (both human and automated)” How do we set up the rules of this game to harness “socially optimal” results? 17 Summary and Brief Overview This is a theoretical course aimed at some fundamental results of game theory, often related to computer science 18 Summary and Brief Overview This is a theoretical course aimed at some fundamental results of game theory, often related to computer science We start with strategic form games (such as the Prisoner’s dilemma), investigate several solution concepts (dominance, equilibria) and related algorithms. 18 Summary and Brief Overview This is a theoretical course aimed at some fundamental results of game theory, often related to computer science We start with strategic form games (such as the Prisoner’s dilemma), investigate several solution concepts (dominance, equilibria) and related algorithms. Then we consider repeated games which allow players to learn from history and/or to react to deviations of the other players. 18 Summary and Brief Overview This is a theoretical course aimed at some fundamental results of game theory, often related to computer science We start with strategic form games (such as the Prisoner’s dilemma), investigate several solution concepts (dominance, equilibria) and related algorithms. Then we consider repeated games which allow players to learn from history and/or to react to deviations of the other players. Subsequently, we move on to incomplete information games and auctions. 18 Summary and Brief Overview This is a theoretical course aimed at some fundamental results of game theory, often related to computer science We start with strategic form games (such as the Prisoner’s dilemma), investigate several solution concepts (dominance, equilibria) and related algorithms. Then we consider repeated games which allow players to learn from history and/or to react to deviations of the other players. Subsequently, we move on to incomplete information games and auctions. Finally, we consider (in)efficiency of equilibria (such as the Price of Anarchy) and its properties on important classes of routing and network formation games. 18 Summary and Brief Overview This is a theoretical course aimed at some fundamental results of game theory, often related to computer science We start with strategic form games (such as the Prisoner’s dilemma), investigate several solution concepts (dominance, equilibria) and related algorithms. Then we consider repeated games which allow players to learn from history and/or to react to deviations of the other players. Subsequently, we move on to incomplete information games and auctions. Finally, we consider (in)efficiency of equilibria (such as the Price of Anarchy) and its properties on important classes of routing and network formation games. Remaining time will be devoted to selected topics from extensive form games, games on graphs etc. 18 Static Games of Complete Information Strategic-Form Games Solution concepts 19 Static Games of Complete Information – Intuition Proceed in two steps: 1. Each player simultaneously and independently chooses a strategy. This means that players play without observing strategies chosen by other players. 20 Static Games of Complete Information – Intuition Proceed in two steps: 1. Each player simultaneously and independently chooses a strategy. This means that players play without observing strategies chosen by other players. 2. Conditional on the players’ strategies, payoffs are distributed to all players. 20 Static Games of Complete Information – Intuition Proceed in two steps: 1. Each player simultaneously and independently chooses a strategy. This means that players play without observing strategies chosen by other players. 2. Conditional on the players’ strategies, payoffs are distributed to all players. Complete information means that the following is common knowledge among players: all possible strategies of all players, what payoff is assigned to each combination of strategies. 20 Static Games of Complete Information – Intuition Proceed in two steps: 1. Each player simultaneously and independently chooses a strategy. This means that players play without observing strategies chosen by other players. 2. Conditional on the players’ strategies, payoffs are distributed to all players. Complete information means that the following is common knowledge among players: all possible strategies of all players, what payoff is assigned to each combination of strategies. Definition 1 A fact E is a common knowledge among players {1, . . . , n} if for every sequence i1, . . . , ik ∈ {1, . . . , n} we have that i1 knows that i2 knows that ... ik−1 knows that ik knows E. 20 Static Games of Complete Information – Intuition Proceed in two steps: 1. Each player simultaneously and independently chooses a strategy. This means that players play without observing strategies chosen by other players. 2. Conditional on the players’ strategies, payoffs are distributed to all players. Complete information means that the following is common knowledge among players: all possible strategies of all players, what payoff is assigned to each combination of strategies. Definition 1 A fact E is a common knowledge among players {1, . . . , n} if for every sequence i1, . . . , ik ∈ {1, . . . , n} we have that i1 knows that i2 knows that ... ik−1 knows that ik knows E. The goal of each player is to maximize his payoff (and this fact is common knowledge). 20 Strategic-Form Games To formally represent static games of complete information we define strategic-form games. Definition 2 A game in strategic-form (or normal-form) is an ordered triple G = (N, (Si)i∈N , (ui)i∈N), in which: N = {1, 2, . . . , n} is a finite set of players. Si is a set of (pure) strategies of player i, for every i ∈ N. A strategy profile is a vector of strategies of all players (s1, . . . , sn) ∈ S1 × · · · × Sn. We denote the set of all strategy profiles by S = S1 × · · · × Sn. ui : S → R is a function associating each strategy profile s = (s1, . . . , sn) ∈ S with the payoff ui(s) to player i, for every player i ∈ N. 21 Strategic-Form Games To formally represent static games of complete information we define strategic-form games. Definition 2 A game in strategic-form (or normal-form) is an ordered triple G = (N, (Si)i∈N , (ui)i∈N), in which: N = {1, 2, . . . , n} is a finite set of players. Si is a set of (pure) strategies of player i, for every i ∈ N. A strategy profile is a vector of strategies of all players (s1, . . . , sn) ∈ S1 × · · · × Sn. We denote the set of all strategy profiles by S = S1 × · · · × Sn. ui : S → R is a function associating each strategy profile s = (s1, . . . , sn) ∈ S with the payoff ui(s) to player i, for every player i ∈ N. Definition 3 A zero-sum game G is one in which for all s = (s1, . . . , sn) ∈ S we have u1(s) + u2(s) + · · · + un(s) = 0. 21 Example: Prisoner’s Dilemma N = {1, 2} S1 = S2 = {S, C} u1, u2 are defined as follows: u1(C, C) = −5, u1(C, S) = 0, u1(S, C) = −20, u1(S, S) = −1 u2(C, C) = −5, u2(C, S) = −20, u2(S, C) = 0, u2(S, S) = −1 (Is it zero sum?) 22 Example: Prisoner’s Dilemma N = {1, 2} S1 = S2 = {S, C} u1, u2 are defined as follows: u1(C, C) = −5, u1(C, S) = 0, u1(S, C) = −20, u1(S, S) = −1 u2(C, C) = −5, u2(C, S) = −20, u2(S, C) = 0, u2(S, S) = −1 (Is it zero sum?) We usually write payoffs in the following form: C S C −5, −5 0, −20 S −20, 0 −1, −1 or as two matrices: C S C −5 0 S −20 −1 C S C −5 −20 S 0 −1 22 Example: Cournot Duopoly Two identical firms, players 1 and 2, produce some good. Denote by q1 and q2 quantities produced by firms 1 and 2, resp. 23 Example: Cournot Duopoly Two identical firms, players 1 and 2, produce some good. Denote by q1 and q2 quantities produced by firms 1 and 2, resp. The total quantity of products in the market is q1 + q2. 23 Example: Cournot Duopoly Two identical firms, players 1 and 2, produce some good. Denote by q1 and q2 quantities produced by firms 1 and 2, resp. The total quantity of products in the market is q1 + q2. The price of each item is κ − q1 − q2 (here κ is a positive constant) 23 Example: Cournot Duopoly Two identical firms, players 1 and 2, produce some good. Denote by q1 and q2 quantities produced by firms 1 and 2, resp. The total quantity of products in the market is q1 + q2. The price of each item is κ − q1 − q2 (here κ is a positive constant) Firms 1 and 2 have per item production costs c1 and c2, resp. 23 Example: Cournot Duopoly Two identical firms, players 1 and 2, produce some good. Denote by q1 and q2 quantities produced by firms 1 and 2, resp. The total quantity of products in the market is q1 + q2. The price of each item is κ − q1 − q2 (here κ is a positive constant) Firms 1 and 2 have per item production costs c1 and c2, resp. Question: How these firms are going to behave? 23 Example: Cournot Duopoly Two identical firms, players 1 and 2, produce some good. Denote by q1 and q2 quantities produced by firms 1 and 2, resp. The total quantity of products in the market is q1 + q2. The price of each item is κ − q1 − q2 (here κ is a positive constant) Firms 1 and 2 have per item production costs c1 and c2, resp. Question: How these firms are going to behave? We may model the situation using a strategic-form game. 23 Example: Cournot Duopoly Two identical firms, players 1 and 2, produce some good. Denote by q1 and q2 quantities produced by firms 1 and 2, resp. The total quantity of products in the market is q1 + q2. The price of each item is κ − q1 − q2 (here κ is a positive constant) Firms 1 and 2 have per item production costs c1 and c2, resp. Question: How these firms are going to behave? We may model the situation using a strategic-form game. Strategic-form game model (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 u2(q1, q2) = q2(κ − q1 − q2) − q2c2 23 Solution Concepts A solution concept is a method of analyzing games with the objective of restricting the set of all possible outcomes to those that are more reasonable than others. 24 Solution Concepts A solution concept is a method of analyzing games with the objective of restricting the set of all possible outcomes to those that are more reasonable than others. We will use term equilibrium for any one of the strategy profiles that emerges as one of the solution concepts’ predictions. (I follow the approach of Steven Tadelis here, it is not completely standard) 24 Solution Concepts A solution concept is a method of analyzing games with the objective of restricting the set of all possible outcomes to those that are more reasonable than others. We will use term equilibrium for any one of the strategy profiles that emerges as one of the solution concepts’ predictions. (I follow the approach of Steven Tadelis here, it is not completely standard) Example 4 Nash equilibrium is a solution concept. That is, we “solve” games by finding Nash equilibria and declare them to be reasonable outcomes. 24 Assumptions Throughout the lecture we assume that: 1. Players are rational: a rational player is one who chooses his strategy to maximize his payoff. 25 Assumptions Throughout the lecture we assume that: 1. Players are rational: a rational player is one who chooses his strategy to maximize his payoff. 2. Players are intelligent: An intelligent player knows everything about the game (actions and payoffs) and can make any inferences about the situation that we can make. 25 Assumptions Throughout the lecture we assume that: 1. Players are rational: a rational player is one who chooses his strategy to maximize his payoff. 2. Players are intelligent: An intelligent player knows everything about the game (actions and payoffs) and can make any inferences about the situation that we can make. 3. Common knowledge: The fact that players are rational and intelligent is a common knowledge among them. 25 Assumptions Throughout the lecture we assume that: 1. Players are rational: a rational player is one who chooses his strategy to maximize his payoff. 2. Players are intelligent: An intelligent player knows everything about the game (actions and payoffs) and can make any inferences about the situation that we can make. 3. Common knowledge: The fact that players are rational and intelligent is a common knowledge among them. 4. Self-enforcement: Any prediction (or equilibrium) of a solution concept must be self-enforcing. 25 Assumptions Throughout the lecture we assume that: 1. Players are rational: a rational player is one who chooses his strategy to maximize his payoff. 2. Players are intelligent: An intelligent player knows everything about the game (actions and payoffs) and can make any inferences about the situation that we can make. 3. Common knowledge: The fact that players are rational and intelligent is a common knowledge among them. 4. Self-enforcement: Any prediction (or equilibrium) of a solution concept must be self-enforcing. Here 4. implies non-cooperative game theory: Each player is in control of his actions, and he will stick to an action only if he finds it to be in his best interest. 25 Evaluating Solution Concepts In order to evaluate our theory as a methodological tool we use the following criteria: 26 Evaluating Solution Concepts In order to evaluate our theory as a methodological tool we use the following criteria: 1. Existence (i.e. How often does it apply?): Solution concept should apply to a wide variety of games. E.g. We prove that mixed Nash equilibria exist in all two player finite strategic-form games. 26 Evaluating Solution Concepts In order to evaluate our theory as a methodological tool we use the following criteria: 1. Existence (i.e. How often does it apply?): Solution concept should apply to a wide variety of games. E.g. We prove that mixed Nash equilibria exist in all two player finite strategic-form games. 2. Uniqueness (How much does it restrict behavior?): We demand our solution concept to restrict the behavior as much as possible. E.g. So called strictly dominant strategy equilibria are always unique as opposed to Nash eq. 26 Evaluating Solution Concepts In order to evaluate our theory as a methodological tool we use the following criteria: 1. Existence (i.e. How often does it apply?): Solution concept should apply to a wide variety of games. E.g. We prove that mixed Nash equilibria exist in all two player finite strategic-form games. 2. Uniqueness (How much does it restrict behavior?): We demand our solution concept to restrict the behavior as much as possible. E.g. So called strictly dominant strategy equilibria are always unique as opposed to Nash eq. The basic notion for evaluating "social outcome" is the following Definition 5 A strategy profile s ∈ S Pareto dominates a strategy profile s ∈ S if ui(s) ≥ ui(s ) for all i ∈ N, and ui(s) > ui(s ) for at least one i ∈ N. A strategy profile s ∈ S is Pareto optimal if it is not Pareto dominated by any other strategy profile. We will see more measures of social outcome later. 26 Solution Concepts – Pure Strategies We will consider the following solution concepts: strict dominant strategy equilibrium iterated elimination of strictly dominated strategies (IESDS) rationalizability Nash equilibria 27 Solution Concepts – Pure Strategies We will consider the following solution concepts: strict dominant strategy equilibrium iterated elimination of strictly dominated strategies (IESDS) rationalizability Nash equilibria For now, let us concentrate on pure strategies only! I.e., no mixed strategies are allowed. We will generalize to mixed setting later. 27 Notation Let N = {1, . . . , n} be a finite set and for each i ∈ N let Xi be a set. Let X := i∈N Xi = {(x1, . . . , xn) | xj ∈ Xj, j ∈ N}. For i ∈ N we define X−i := j i Xj, i.e., X−i = {(x1, . . . , xi−1, xi+1, . . . , xn) | xj ∈ Xj, ∀j i} An element of X−i will be denoted by x−i = (x1, . . . , xi−1, xi+1, . . . , xn) We slightly abuse notation and write (xi, x−i) to denote (x1, . . . , xi, . . . , xn) ∈ X. 28 Strict Dominance in Pure Strategies Definition 6 Let si, si ∈ Si be strategies of player i. Then si is strictly dominated by si (write si si ) if for any possible combination of the other players’ strategies, s−i ∈ S−i, we have ui(si, s−i) > ui(si , s−i) for all s−i ∈ S−i 29 Strict Dominance in Pure Strategies Definition 6 Let si, si ∈ Si be strategies of player i. Then si is strictly dominated by si (write si si ) if for any possible combination of the other players’ strategies, s−i ∈ S−i, we have ui(si, s−i) > ui(si , s−i) for all s−i ∈ S−i Claim 1 An intelligent and rational player will never play a strictly dominated strategy. Clearly, intelligence implies that the player should recognize dominated strategies, rationality implies that the player will avoid playing them. 29 Strictly Dominant Strategy Equilibrium in Pure Str. Definition 7 si ∈ Si is strictly dominant if every other pure strategy of player i is strictly dominated by si. 30 Strictly Dominant Strategy Equilibrium in Pure Str. Definition 7 si ∈ Si is strictly dominant if every other pure strategy of player i is strictly dominated by si. Observe that every player has at most one strictly dominant strategy, and that strictly dominant strategies do not have to exist. 30 Strictly Dominant Strategy Equilibrium in Pure Str. Definition 7 si ∈ Si is strictly dominant if every other pure strategy of player i is strictly dominated by si. Observe that every player has at most one strictly dominant strategy, and that strictly dominant strategies do not have to exist. Claim 2 Any rational player will play the strictly dominant strategy (if it exists). 30 Strictly Dominant Strategy Equilibrium in Pure Str. Definition 7 si ∈ Si is strictly dominant if every other pure strategy of player i is strictly dominated by si. Observe that every player has at most one strictly dominant strategy, and that strictly dominant strategies do not have to exist. Claim 2 Any rational player will play the strictly dominant strategy (if it exists). Definition 8 A strategy profile s ∈ S is a strictly dominant strategy equilibrium if si ∈ Si is strictly dominant for all i ∈ N. 30 Strictly Dominant Strategy Equilibrium in Pure Str. Definition 7 si ∈ Si is strictly dominant if every other pure strategy of player i is strictly dominated by si. Observe that every player has at most one strictly dominant strategy, and that strictly dominant strategies do not have to exist. Claim 2 Any rational player will play the strictly dominant strategy (if it exists). Definition 8 A strategy profile s ∈ S is a strictly dominant strategy equilibrium if si ∈ Si is strictly dominant for all i ∈ N. Corollary 9 If the strictly dominant strategy equilibrium exists, it is unique and rational players will play it. 30 Strictly Dominant Strategy Equilibrium in Pure Str. Definition 7 si ∈ Si is strictly dominant if every other pure strategy of player i is strictly dominated by si. Observe that every player has at most one strictly dominant strategy, and that strictly dominant strategies do not have to exist. Claim 2 Any rational player will play the strictly dominant strategy (if it exists). Definition 8 A strategy profile s ∈ S is a strictly dominant strategy equilibrium if si ∈ Si is strictly dominant for all i ∈ N. Corollary 9 If the strictly dominant strategy equilibrium exists, it is unique and rational players will play it. Is the strictly dominant strategy equilibrium always Pareto optimal? 30 Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 31 Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the strictly dominant strategy equilibrium (the only profile that is not Pareto optimal!). 31 Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the strictly dominant strategy equilibrium (the only profile that is not Pareto optimal!). In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 31 Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the strictly dominant strategy equilibrium (the only profile that is not Pareto optimal!). In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 no strictly dominant strategies exist. 31 Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the strictly dominant strategy equilibrium (the only profile that is not Pareto optimal!). In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 no strictly dominant strategies exist. 31 Indiana Jones and the Last Crusade (Taken from Dixit & Nalebuff’s "The Art of Strategy" and a lecture of Robert Marks) Indiana Jones, his father, and the Nazis have all converged at the site of the Holy Grail. The two Joneses refuse to help the Nazis reach the last step. So the Nazis shoot Indiana’s dad. Only the healing power of the Holy Grail can save the senior Dr. Jones from his mortal wound. Suitably motivated, Indiana leads the way to the Holy Grail. But there is one final challenge. He must choose between literally scores of chalices, only one of which is the cup of Christ. While the right cup brings eternal life, the wrong choice is fatal. The Nazi leader impatiently chooses a beautiful gold chalice, drinks the holy water, and dies from the sudden death that follows from the wrong choice. Indiana picks a wooden chalice, the cup of a carpenter. Exclaiming "There’s only one way to find out" he dips the chalice into the font and drinks what he hopes is the cup of life. Upon discovering that he has chosen wisely, Indiana brings the cup to his father and the water heals the mortal wound. 32 Indiana Jones and the Last Crusade (cont.) Indy Goofed Although this scene adds excitement, it is somewhat embarrassing that such a distinguished professor as Dr. Indiana Jones would overlook his dominant strategy. He should have given the water to his father without testing it first. If Indiana has chosen the right cup, his father is still saved. If Indiana has chosen the wrong cup, then his father dies but Indiana is spared. Testing the cup before giving it to his father doesn’t help, since if Indiana has made the wrong choice, there is no second chance – Indiana dies from the water and his father dies from the wound. 33 Iterated Strict Dominance in Pure Strategies We know that no rational player ever plays strictly dominated strategies. 34 Iterated Strict Dominance in Pure Strategies We know that no rational player ever plays strictly dominated strategies. As each player knows that each player is rational, each player knows that his opponents will not play strictly dominated strategies and thus all opponents know that effectively they are facing a "smaller" game. 34 Iterated Strict Dominance in Pure Strategies We know that no rational player ever plays strictly dominated strategies. As each player knows that each player is rational, each player knows that his opponents will not play strictly dominated strategies and thus all opponents know that effectively they are facing a "smaller" game. As rationality is a common knowledge, everyone knows that everyone knows that the game is effectively smaller. 34 Iterated Strict Dominance in Pure Strategies We know that no rational player ever plays strictly dominated strategies. As each player knows that each player is rational, each player knows that his opponents will not play strictly dominated strategies and thus all opponents know that effectively they are facing a "smaller" game. As rationality is a common knowledge, everyone knows that everyone knows that the game is effectively smaller. Thus everyone knows, that nobody will play strictly dominated strategies in the smaller game (and such strategies may indeed exist). 34 Iterated Strict Dominance in Pure Strategies We know that no rational player ever plays strictly dominated strategies. As each player knows that each player is rational, each player knows that his opponents will not play strictly dominated strategies and thus all opponents know that effectively they are facing a "smaller" game. As rationality is a common knowledge, everyone knows that everyone knows that the game is effectively smaller. Thus everyone knows, that nobody will play strictly dominated strategies in the smaller game (and such strategies may indeed exist). Because it is a common knowledge that all players will perform this kind of reasoning again, the process can continue until no more strictly dominated strategies can be eliminated. 34 IESDS The previous reasoning yields the Iterated Elimination of Strictly Dominated Strategies (IESDS): Define a sequence D0 i , D1 i , D2 i , . . . of strategy sets of player i. (Denote by Gk DS the game obtained from G by restricting to Dk i , i ∈ N.) 35 IESDS The previous reasoning yields the Iterated Elimination of Strictly Dominated Strategies (IESDS): Define a sequence D0 i , D1 i , D2 i , . . . of strategy sets of player i. (Denote by Gk DS the game obtained from G by restricting to Dk i , i ∈ N.) 1. Initialize k = 0 and D0 i = Si for each i ∈ N. 35 IESDS The previous reasoning yields the Iterated Elimination of Strictly Dominated Strategies (IESDS): Define a sequence D0 i , D1 i , D2 i , . . . of strategy sets of player i. (Denote by Gk DS the game obtained from G by restricting to Dk i , i ∈ N.) 1. Initialize k = 0 and D0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Dk+1 i be the set of all pure strategies of Dk i that are not strictly dominated in Gk DS . 35 IESDS The previous reasoning yields the Iterated Elimination of Strictly Dominated Strategies (IESDS): Define a sequence D0 i , D1 i , D2 i , . . . of strategy sets of player i. (Denote by Gk DS the game obtained from G by restricting to Dk i , i ∈ N.) 1. Initialize k = 0 and D0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Dk+1 i be the set of all pure strategies of Dk i that are not strictly dominated in Gk DS . 3. Let k := k + 1 and go to 2. 35 IESDS The previous reasoning yields the Iterated Elimination of Strictly Dominated Strategies (IESDS): Define a sequence D0 i , D1 i , D2 i , . . . of strategy sets of player i. (Denote by Gk DS the game obtained from G by restricting to Dk i , i ∈ N.) 1. Initialize k = 0 and D0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Dk+1 i be the set of all pure strategies of Dk i that are not strictly dominated in Gk DS . 3. Let k := k + 1 and go to 2. We say that si ∈ Si survives IESDS if si ∈ Dk i for all k = 0, 1, 2, . . . 35 IESDS The previous reasoning yields the Iterated Elimination of Strictly Dominated Strategies (IESDS): Define a sequence D0 i , D1 i , D2 i , . . . of strategy sets of player i. (Denote by Gk DS the game obtained from G by restricting to Dk i , i ∈ N.) 1. Initialize k = 0 and D0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Dk+1 i be the set of all pure strategies of Dk i that are not strictly dominated in Gk DS . 3. Let k := k + 1 and go to 2. We say that si ∈ Si survives IESDS if si ∈ Dk i for all k = 0, 1, 2, . . . Definition 10 A strategy profile s = (s1, . . . , sn) ∈ S is an IESDS equilibrium if each si survives IESDS. 35 IESDS The previous reasoning yields the Iterated Elimination of Strictly Dominated Strategies (IESDS): Define a sequence D0 i , D1 i , D2 i , . . . of strategy sets of player i. (Denote by Gk DS the game obtained from G by restricting to Dk i , i ∈ N.) 1. Initialize k = 0 and D0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Dk+1 i be the set of all pure strategies of Dk i that are not strictly dominated in Gk DS . 3. Let k := k + 1 and go to 2. We say that si ∈ Si survives IESDS if si ∈ Dk i for all k = 0, 1, 2, . . . Definition 10 A strategy profile s = (s1, . . . , sn) ∈ S is an IESDS equilibrium if each si survives IESDS. A game is IESDS solvable if it has a unique IESDS equilibrium. 35 IESDS The previous reasoning yields the Iterated Elimination of Strictly Dominated Strategies (IESDS): Define a sequence D0 i , D1 i , D2 i , . . . of strategy sets of player i. (Denote by Gk DS the game obtained from G by restricting to Dk i , i ∈ N.) 1. Initialize k = 0 and D0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Dk+1 i be the set of all pure strategies of Dk i that are not strictly dominated in Gk DS . 3. Let k := k + 1 and go to 2. We say that si ∈ Si survives IESDS if si ∈ Dk i for all k = 0, 1, 2, . . . Definition 10 A strategy profile s = (s1, . . . , sn) ∈ S is an IESDS equilibrium if each si survives IESDS. A game is IESDS solvable if it has a unique IESDS equilibrium. Remark: If all Si are finite, then in 2. we may remove only some of the strictly dominated strategies (not necessarily all). The result is not affected by the order of elimination since strictly dominated strategies remain strictly dominated even after removing some other strictly dominated strategies. 35 IESDS Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 36 IESDS Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only one surviving the first round of IESDS. 36 IESDS Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only one surviving the first round of IESDS. In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 36 IESDS Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only one surviving the first round of IESDS. In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 all strategies survive all rounds (i.e. IESDS ≡ anything may happen, sorry) 36 A Bit More Interesting Example L C R L 4, 3 5, 1 6, 2 C 2, 1 8, 4 3, 6 R 3, 0 9, 6 2, 8 IESDS on greenboard! 37 Political Science Example: Median Voter Theorem Hotelling (1929) and Downs (1957) N = {1, 2} 38 Political Science Example: Median Voter Theorem Hotelling (1929) and Downs (1957) N = {1, 2} Si = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} (political and ideological spectrum) 38 Political Science Example: Median Voter Theorem Hotelling (1929) and Downs (1957) N = {1, 2} Si = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} (political and ideological spectrum) 10 voters belong to each position (Here 10 means ten percent in the real-world) 38 Political Science Example: Median Voter Theorem Hotelling (1929) and Downs (1957) N = {1, 2} Si = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} (political and ideological spectrum) 10 voters belong to each position (Here 10 means ten percent in the real-world) Voters vote for the closest candidate. If there is a tie, then 1 2 got to each candidate 38 Political Science Example: Median Voter Theorem Hotelling (1929) and Downs (1957) N = {1, 2} Si = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} (political and ideological spectrum) 10 voters belong to each position (Here 10 means ten percent in the real-world) Voters vote for the closest candidate. If there is a tie, then 1 2 got to each candidate Payoff: The number of voters for the candidate, each candidate (selfishly) strives to maximize this number 38 Political Science Example: Median Voter Theorem 39 Political Science Example: Median Voter Theorem 1 and 10 are the (only) strictly dominated strategies ⇒ D1 1 = D1 2 = {2, . . . , 9} 39 Political Science Example: Median Voter Theorem 1 and 10 are the (only) strictly dominated strategies ⇒ D1 1 = D1 2 = {2, . . . , 9} in G1 DS , 2 and 9 are the (only) strictly dominated strategies ⇒ D2 1 = D2 2 = {3, . . . , 8} 39 Political Science Example: Median Voter Theorem 1 and 10 are the (only) strictly dominated strategies ⇒ D1 1 = D1 2 = {2, . . . , 9} in G1 DS , 2 and 9 are the (only) strictly dominated strategies ⇒ D2 1 = D2 2 = {3, . . . , 8} . . . only 5, 6 survive IESDS 39 Belief & Best Response IESDS eliminated apparently unreasonable behavior (leaving "reasonable" behavior implicitly untouched). 40 Belief & Best Response IESDS eliminated apparently unreasonable behavior (leaving "reasonable" behavior implicitly untouched). What if we rather want to actively preserve reasonable behavior? What is reasonable? .... what we believe is reasonable :-). 40 Belief & Best Response IESDS eliminated apparently unreasonable behavior (leaving "reasonable" behavior implicitly untouched). What if we rather want to actively preserve reasonable behavior? What is reasonable? .... what we believe is reasonable :-). Intuition: 40 Belief & Best Response IESDS eliminated apparently unreasonable behavior (leaving "reasonable" behavior implicitly untouched). What if we rather want to actively preserve reasonable behavior? What is reasonable? .... what we believe is reasonable :-). Intuition: Imagine that your colleague did something stupid 40 Belief & Best Response IESDS eliminated apparently unreasonable behavior (leaving "reasonable" behavior implicitly untouched). What if we rather want to actively preserve reasonable behavior? What is reasonable? .... what we believe is reasonable :-). Intuition: Imagine that your colleague did something stupid What would you ask him? Usually something like "What were you thinking?" 40 Belief & Best Response IESDS eliminated apparently unreasonable behavior (leaving "reasonable" behavior implicitly untouched). What if we rather want to actively preserve reasonable behavior? What is reasonable? .... what we believe is reasonable :-). Intuition: Imagine that your colleague did something stupid What would you ask him? Usually something like "What were you thinking?" The colleague may respond with a reasonable description of his belief in which his action was (one of) the best he could do (You may of course question reasonableness of the belief) 40 Belief & Best Response IESDS eliminated apparently unreasonable behavior (leaving "reasonable" behavior implicitly untouched). What if we rather want to actively preserve reasonable behavior? What is reasonable? .... what we believe is reasonable :-). Intuition: Imagine that your colleague did something stupid What would you ask him? Usually something like "What were you thinking?" The colleague may respond with a reasonable description of his belief in which his action was (one of) the best he could do (You may of course question reasonableness of the belief) Let us formalize this type of reasoning .... 40 Belief & Best Response Definition 11 A belief of player i is a pure strategy profile s−i ∈ S−i of his opponents. 41 Belief & Best Response Definition 11 A belief of player i is a pure strategy profile s−i ∈ S−i of his opponents. Definition 12 A strategy si ∈ Si of player i is a best response to a belief s−i ∈ S−i if ui(si, s−i) ≥ ui(si , s−i) for all si ∈ Si 41 Belief & Best Response Definition 11 A belief of player i is a pure strategy profile s−i ∈ S−i of his opponents. Definition 12 A strategy si ∈ Si of player i is a best response to a belief s−i ∈ S−i if ui(si, s−i) ≥ ui(si , s−i) for all si ∈ Si Claim 3 A rational player who believes that his opponents will play s−i ∈ S−i always chooses a best response to s−i ∈ S−i. 41 Belief & Best Response Definition 11 A belief of player i is a pure strategy profile s−i ∈ S−i of his opponents. Definition 12 A strategy si ∈ Si of player i is a best response to a belief s−i ∈ S−i if ui(si, s−i) ≥ ui(si , s−i) for all si ∈ Si Claim 3 A rational player who believes that his opponents will play s−i ∈ S−i always chooses a best response to s−i ∈ S−i. Definition 13 A strategy si ∈ Si is never best response if it is not a best response to any belief s−i ∈ S−i. 41 Belief & Best Response Definition 11 A belief of player i is a pure strategy profile s−i ∈ S−i of his opponents. Definition 12 A strategy si ∈ Si of player i is a best response to a belief s−i ∈ S−i if ui(si, s−i) ≥ ui(si , s−i) for all si ∈ Si Claim 3 A rational player who believes that his opponents will play s−i ∈ S−i always chooses a best response to s−i ∈ S−i. Definition 13 A strategy si ∈ Si is never best response if it is not a best response to any belief s−i ∈ S−i. A rational player never plays any strategy that is never best response. 41 Best Response vs Strict Dominance Proposition 1 If si is strictly dominated for player i, then it is never best response. 42 Best Response vs Strict Dominance Proposition 1 If si is strictly dominated for player i, then it is never best response. The opposite does not have to be true in pure strategies: X Y A 1, 1 1, 1 B 2, 1 0, 1 C 0, 1 2, 1 Here A is never best response but is strictly dominated neither by B, nor by C. 42 Elimination of Stupid Strategies = Rationalizability Using similar iterated reasoning as for IESDS, strategies that are never best response can be iteratively eliminated. Define a sequence R0 i , R1 i , R2 i , . . . of strategy sets of player i. (Denote by Gk Rat the game obtained from G by restricting to Rk i , i ∈ N.) 43 Elimination of Stupid Strategies = Rationalizability Using similar iterated reasoning as for IESDS, strategies that are never best response can be iteratively eliminated. Define a sequence R0 i , R1 i , R2 i , . . . of strategy sets of player i. (Denote by Gk Rat the game obtained from G by restricting to Rk i , i ∈ N.) 1. Initialize k = 0 and R0 i = Si for each i ∈ N. 43 Elimination of Stupid Strategies = Rationalizability Using similar iterated reasoning as for IESDS, strategies that are never best response can be iteratively eliminated. Define a sequence R0 i , R1 i , R2 i , . . . of strategy sets of player i. (Denote by Gk Rat the game obtained from G by restricting to Rk i , i ∈ N.) 1. Initialize k = 0 and R0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Rk+1 i be the set of all strategies of Rk i that are best responses to some beliefs in Gk Rat . 43 Elimination of Stupid Strategies = Rationalizability Using similar iterated reasoning as for IESDS, strategies that are never best response can be iteratively eliminated. Define a sequence R0 i , R1 i , R2 i , . . . of strategy sets of player i. (Denote by Gk Rat the game obtained from G by restricting to Rk i , i ∈ N.) 1. Initialize k = 0 and R0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Rk+1 i be the set of all strategies of Rk i that are best responses to some beliefs in Gk Rat . 3. Let k := k + 1 and go to 2. We say that si ∈ Si is rationalizable if si ∈ Rk i for all k = 0, 1, 2, . . . 43 Elimination of Stupid Strategies = Rationalizability Using similar iterated reasoning as for IESDS, strategies that are never best response can be iteratively eliminated. Define a sequence R0 i , R1 i , R2 i , . . . of strategy sets of player i. (Denote by Gk Rat the game obtained from G by restricting to Rk i , i ∈ N.) 1. Initialize k = 0 and R0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Rk+1 i be the set of all strategies of Rk i that are best responses to some beliefs in Gk Rat . 3. Let k := k + 1 and go to 2. We say that si ∈ Si is rationalizable if si ∈ Rk i for all k = 0, 1, 2, . . . Definition 14 A strategy profile s = (s1, . . . , sn) ∈ S is a rationalizable equilibrium if each si is rationalizable. 43 Elimination of Stupid Strategies = Rationalizability Using similar iterated reasoning as for IESDS, strategies that are never best response can be iteratively eliminated. Define a sequence R0 i , R1 i , R2 i , . . . of strategy sets of player i. (Denote by Gk Rat the game obtained from G by restricting to Rk i , i ∈ N.) 1. Initialize k = 0 and R0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Rk+1 i be the set of all strategies of Rk i that are best responses to some beliefs in Gk Rat . 3. Let k := k + 1 and go to 2. We say that si ∈ Si is rationalizable if si ∈ Rk i for all k = 0, 1, 2, . . . Definition 14 A strategy profile s = (s1, . . . , sn) ∈ S is a rationalizable equilibrium if each si is rationalizable. We say that a game is solvable by rationalizability if it has a unique rationalizable equilibrium. 43 Elimination of Stupid Strategies = Rationalizability Using similar iterated reasoning as for IESDS, strategies that are never best response can be iteratively eliminated. Define a sequence R0 i , R1 i , R2 i , . . . of strategy sets of player i. (Denote by Gk Rat the game obtained from G by restricting to Rk i , i ∈ N.) 1. Initialize k = 0 and R0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Rk+1 i be the set of all strategies of Rk i that are best responses to some beliefs in Gk Rat . 3. Let k := k + 1 and go to 2. We say that si ∈ Si is rationalizable if si ∈ Rk i for all k = 0, 1, 2, . . . Definition 14 A strategy profile s = (s1, . . . , sn) ∈ S is a rationalizable equilibrium if each si is rationalizable. We say that a game is solvable by rationalizability if it has a unique rationalizable equilibrium. (Warning: For some reasons, rationalizable strategies are almost always defined using mixed strategies!) 43 Rationalizability Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 44 Rationalizability Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only rationalizable equilibrium. 44 Rationalizability Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only rationalizable equilibrium. In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 44 Rationalizability Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only rationalizable equilibrium. In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 all strategies are rationalizable. 44 Cournot Duopoly G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. What is a best response of player 1 to a given q2 ? 45 Cournot Duopoly G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. What is a best response of player 1 to a given q2 ? Solve δu1 δq1 = θ − 2q1 − q2 = 0, which gives that q1 = (θ − q2)/2 is the only best response of player 1 to q2. Similarly, q2 = (θ − q1)/2 is the only best response of player 2 to q1. 45 Cournot Duopoly G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. What is a best response of player 1 to a given q2 ? Solve δu1 δq1 = θ − 2q1 − q2 = 0, which gives that q1 = (θ − q2)/2 is the only best response of player 1 to q2. Similarly, q2 = (θ − q1)/2 is the only best response of player 2 to q1. Since q2 ≥ 0, we obtain that q1 is never best response iff q1 > θ/2. Similarly q2 is never best response iff q2 > θ/2. 45 Cournot Duopoly G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. What is a best response of player 1 to a given q2 ? Solve δu1 δq1 = θ − 2q1 − q2 = 0, which gives that q1 = (θ − q2)/2 is the only best response of player 1 to q2. Similarly, q2 = (θ − q1)/2 is the only best response of player 2 to q1. Since q2 ≥ 0, we obtain that q1 is never best response iff q1 > θ/2. Similarly q2 is never best response iff q2 > θ/2. Thus R1 1 = R1 2 = [0, θ/2]. 45 Cournot Duopoly G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. Now, in G1 Rat , we still have that q1 = (θ − q2)/2 is the best response to q2, and q2 = (θ − q1)/2 the best resp. to q1 46 Cournot Duopoly G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. Now, in G1 Rat , we still have that q1 = (θ − q2)/2 is the best response to q2, and q2 = (θ − q1)/2 the best resp. to q1 Since q2 ∈ R1 2 = [0, θ/2], we obtain that q1 is never best response iff q1 ∈ [0, θ/4) Similarly q2 is never best response iff q2 ∈ [0, θ/4) 46 Cournot Duopoly G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. Now, in G1 Rat , we still have that q1 = (θ − q2)/2 is the best response to q2, and q2 = (θ − q1)/2 the best resp. to q1 Since q2 ∈ R1 2 = [0, θ/2], we obtain that q1 is never best response iff q1 ∈ [0, θ/4) Similarly q2 is never best response iff q2 ∈ [0, θ/4) Thus R2 1 = R2 2 = [θ/4, θ/2]. .... 46 Cournot Duopoly (cont.) G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. In general, after 2k iterations we have R2k i = R2k i = [ k , rk ] where rk = (θ − k−1)/2 for k ≥ 1 k = (θ − rk )/2 for k ≥ 1 and 0 = 0 47 Cournot Duopoly (cont.) G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. In general, after 2k iterations we have R2k i = R2k i = [ k , rk ] where rk = (θ − k−1)/2 for k ≥ 1 k = (θ − rk )/2 for k ≥ 1 and 0 = 0 Solving the recurrence we obtain k = θ/3 − 1 4 k θ/3 rk = θ/3 + 1 4 k−1 θ/6 47 Cournot Duopoly (cont.) G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. In general, after 2k iterations we have R2k i = R2k i = [ k , rk ] where rk = (θ − k−1)/2 for k ≥ 1 k = (θ − rk )/2 for k ≥ 1 and 0 = 0 Solving the recurrence we obtain k = θ/3 − 1 4 k θ/3 rk = θ/3 + 1 4 k−1 θ/6 Hence, limk→∞ k = limk→∞ rk = θ/3 and thus (θ/3, θ/3) is the only rationalizable equilibrium. 47 Cournot Duopoly (cont.) G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. Are qi = θ/3 Pareto optimal? 48 Cournot Duopoly (cont.) G = (N, (Si)i∈N , (ui)i∈N) N = {1, 2} Si = [0, ∞) u1(q1, q2) = q1(κ − q1 − q2) − q1c1 = (κ − c1)q1 − q2 1 − q1q2 u2(q1, q2) = q2(κ − q2 − q1) − q2c2 = (κ − c2)q2 − q2 2 − q2q1 Assume for simplicity that c1 = c2 = c and denote θ = κ − c. Are qi = θ/3 Pareto optimal? NO! u1(θ/3, θ/3) = u2(θ/3, θ/3) = θ2 /9 but u1(θ/4, θ/4) = u2(θ/4, θ/4) = θ2 /8 48 IESDS vs Rationalizability in Pure Strategies Theorem 15 Assume that S is finite. Then for all k we have that Rk i ⊆ Dk i . That is, in particular, all rationalizable strategies survive IESDS. 49 IESDS vs Rationalizability in Pure Strategies Theorem 15 Assume that S is finite. Then for all k we have that Rk i ⊆ Dk i . That is, in particular, all rationalizable strategies survive IESDS. The opposite inclusion does not have to be true in pure strategies: X Y A 1, 1 1, 1 B 2, 1 0, 1 C 0, 1 2, 1 49 IESDS vs Rationalizability in Pure Strategies Theorem 15 Assume that S is finite. Then for all k we have that Rk i ⊆ Dk i . That is, in particular, all rationalizable strategies survive IESDS. The opposite inclusion does not have to be true in pure strategies: X Y A 1, 1 1, 1 B 2, 1 0, 1 C 0, 1 2, 1 Recall that A is never best response but is strictly dominated by neither B, nor C. That is, A survives IESDS but is not rationalizable. 49 IESDS vs Rationalizability in Pure Strategies Theorem 15 Assume that S is finite. Then for all k we have that Rk i ⊆ Dk i . That is, in particular, all rationalizable strategies survive IESDS. The opposite inclusion does not have to be true in pure strategies: X Y A 1, 1 1, 1 B 2, 1 0, 1 C 0, 1 2, 1 Recall that A is never best response but is strictly dominated by neither B, nor C. That is, A survives IESDS but is not rationalizable. 49 Proof of Theorem 15 By induction on k. 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! Indeed, let si be a best response to s−i in Gk−1 Rat . 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! Indeed, let si be a best response to s−i in Gk−1 Rat . Then si ∈ Rk i since si is not eliminated in Gk−1 Rat . 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! Indeed, let si be a best response to s−i in Gk−1 Rat . Then si ∈ Rk i since si is not eliminated in Gk−1 Rat . But then ui(si, s−i) ≥ ui(si , s−i) since si is a best response to s−i in Gk Rat . 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! Indeed, let si be a best response to s−i in Gk−1 Rat . Then si ∈ Rk i since si is not eliminated in Gk−1 Rat . But then ui(si, s−i) ≥ ui(si , s−i) since si is a best response to s−i in Gk Rat . Thus si is a best response to s−i in Gk−1 Rat . 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! Indeed, let si be a best response to s−i in Gk−1 Rat . Then si ∈ Rk i since si is not eliminated in Gk−1 Rat . But then ui(si, s−i) ≥ ui(si , s−i) since si is a best response to s−i in Gk Rat . Thus si is a best response to s−i in Gk−1 Rat . By the same reason, si is a best response to s−i in Gk−2 Rat . 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! Indeed, let si be a best response to s−i in Gk−1 Rat . Then si ∈ Rk i since si is not eliminated in Gk−1 Rat . But then ui(si, s−i) ≥ ui(si , s−i) since si is a best response to s−i in Gk Rat . Thus si is a best response to s−i in Gk−1 Rat . By the same reason, si is a best response to s−i in Gk−2 Rat . By the same reason, si is a best response to s−i in Gk−3 Rat . 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! Indeed, let si be a best response to s−i in Gk−1 Rat . Then si ∈ Rk i since si is not eliminated in Gk−1 Rat . But then ui(si, s−i) ≥ ui(si , s−i) since si is a best response to s−i in Gk Rat . Thus si is a best response to s−i in Gk−1 Rat . By the same reason, si is a best response to s−i in Gk−2 Rat . By the same reason, si is a best response to s−i in Gk−3 Rat . · · · By the same reason, si is a best response to s−i in G0 Rat = G. 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! Indeed, let si be a best response to s−i in Gk−1 Rat . Then si ∈ Rk i since si is not eliminated in Gk−1 Rat . But then ui(si, s−i) ≥ ui(si , s−i) since si is a best response to s−i in Gk Rat . Thus si is a best response to s−i in Gk−1 Rat . By the same reason, si is a best response to s−i in Gk−2 Rat . By the same reason, si is a best response to s−i in Gk−3 Rat . · · · By the same reason, si is a best response to s−i in G0 Rat = G. However, then si is a best response to s−i in Gk DS . (This follows from the fact that the “best response” relationship of si and s−i is preserved by removing arbitrarily many other strategies.) 50 Proof of Theorem 15 By induction on k. For k = 0 we have that R0 i = Si = D0 i by definition. Assume that Rk i ⊆ Dk i for some k ≥ 0 and prove that Rk+1 i ⊆ Dk+1 i . Let si ∈ Rk+1 i . Then there must be s−i ∈ Rk −i such that si is a best response to s−i in Gk Rat (This follows from the fact that si has not been eliminated in Gk Rat .) But then si is a best response to s−i in Gk−1 Rat as well! Indeed, let si be a best response to s−i in Gk−1 Rat . Then si ∈ Rk i since si is not eliminated in Gk−1 Rat . But then ui(si, s−i) ≥ ui(si , s−i) since si is a best response to s−i in Gk Rat . Thus si is a best response to s−i in Gk−1 Rat . By the same reason, si is a best response to s−i in Gk−2 Rat . By the same reason, si is a best response to s−i in Gk−3 Rat . · · · By the same reason, si is a best response to s−i in G0 Rat = G. However, then si is a best response to s−i in Gk DS . (This follows from the fact that the “best response” relationship of si and s−i is preserved by removing arbitrarily many other strategies.) Thus si is not strictly dominated in Gk DS and si ∈ Dk+1 i . 50 Pinning Down Beliefs – Nash Equilibria Criticism of previous approaches: Strictly dominant strategy equilibria often do not exist IESDS and rationalizability may not remove any strategies 51 Pinning Down Beliefs – Nash Equilibria Criticism of previous approaches: Strictly dominant strategy equilibria often do not exist IESDS and rationalizability may not remove any strategies Typical example is Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 Here all strategies are equally reasonable according to the above concepts. 51 Pinning Down Beliefs – Nash Equilibria Criticism of previous approaches: Strictly dominant strategy equilibria often do not exist IESDS and rationalizability may not remove any strategies Typical example is Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 Here all strategies are equally reasonable according to the above concepts. But are all strategy profiles really equally reasonable? 51 Pinning Down Beliefs – Nash Equilibria O F O 2, 1 0, 0 F 0, 0 1, 2 Assume that each player has a belief about strategies of other players. 52 Pinning Down Beliefs – Nash Equilibria O F O 2, 1 0, 0 F 0, 0 1, 2 Assume that each player has a belief about strategies of other players. By Claim 3, each player plays a best response to his beliefs. 52 Pinning Down Beliefs – Nash Equilibria O F O 2, 1 0, 0 F 0, 0 1, 2 Assume that each player has a belief about strategies of other players. By Claim 3, each player plays a best response to his beliefs. Is (O, F) as reasonable as (O, O) in this respect? 52 Pinning Down Beliefs – Nash Equilibria O F O 2, 1 0, 0 F 0, 0 1, 2 Assume that each player has a belief about strategies of other players. By Claim 3, each player plays a best response to his beliefs. Is (O, F) as reasonable as (O, O) in this respect? Note that if player 1 believes that player 2 plays O, then playing O is reasonable, and if player 2 believes that player 1 plays F, then playing F is reasonable. But such beliefs cannot be correct together! 52 Pinning Down Beliefs – Nash Equilibria O F O 2, 1 0, 0 F 0, 0 1, 2 Assume that each player has a belief about strategies of other players. By Claim 3, each player plays a best response to his beliefs. Is (O, F) as reasonable as (O, O) in this respect? Note that if player 1 believes that player 2 plays O, then playing O is reasonable, and if player 2 believes that player 1 plays F, then playing F is reasonable. But such beliefs cannot be correct together! (O, O) can be obtained as a profile where each player plays the best response to his belief and the beliefs are correct. 52 Nash Equilibrium Nash equilibrium can be defined as a set of beliefs (one for each player) and a strategy profile in which every player plays a best response to his belief and each strategy of each player is consistent with beliefs of his opponents. 53 Nash Equilibrium Nash equilibrium can be defined as a set of beliefs (one for each player) and a strategy profile in which every player plays a best response to his belief and each strategy of each player is consistent with beliefs of his opponents. A usual definition is following: Definition 16 A pure-strategy profile s∗ = (s∗ 1 , . . . , s∗ n) ∈ S is a (pure) Nash equilibrium if s∗ i is a best response to s∗ −i for each i ∈ N, that is ui(s∗ i , s∗ −i) ≥ ui(si, s∗ −i) for all si ∈ Si and all i ∈ N 53 Nash Equilibrium Nash equilibrium can be defined as a set of beliefs (one for each player) and a strategy profile in which every player plays a best response to his belief and each strategy of each player is consistent with beliefs of his opponents. A usual definition is following: Definition 16 A pure-strategy profile s∗ = (s∗ 1 , . . . , s∗ n) ∈ S is a (pure) Nash equilibrium if s∗ i is a best response to s∗ −i for each i ∈ N, that is ui(s∗ i , s∗ −i) ≥ ui(si, s∗ −i) for all si ∈ Si and all i ∈ N Note that this definition is equivalent to the previous one in the sense that s∗ −i may be considered as the (consistent) belief of player i to which he plays a best response s∗ i 53 Nash Equilibria Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 54 Nash Equilibria Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only Nash equilibrium. 54 Nash Equilibria Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only Nash equilibrium. In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 54 Nash Equilibria Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only Nash equilibrium. In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 only (O, O) and (F, F) are Nash equilibria. 54 Nash Equilibria Examples In the Prisoner’s dilemma: C S C −5, −5 0, −20 S −20, 0 −1, −1 (C, C) is the only Nash equilibrium. In the Battle of Sexes: O F O 2, 1 0, 0 F 0, 0 1, 2 only (O, O) and (F, F) are Nash equilibria. In Cournot Duopoly, (θ/3, θ/3) is the only Nash equilibrium. (Best response relations: q1 = (θ − q2)/2 and q2 = (θ − q1)/2 are both satisfied only by q1 = q2 = θ/3) 54 Example: Stag Hunt Story: Two (in some versions more than two) hunters, players 1 and 2, can each choose to hunt stag (S) = a large tasty meal hare (H) = also tasty but small 55 Example: Stag Hunt Story: Two (in some versions more than two) hunters, players 1 and 2, can each choose to hunt stag (S) = a large tasty meal hare (H) = also tasty but small Hunting stag is much more demanding and forces of both players need to be joined (hare can be hunted individually) 55 Example: Stag Hunt Story: Two (in some versions more than two) hunters, players 1 and 2, can each choose to hunt stag (S) = a large tasty meal hare (H) = also tasty but small Hunting stag is much more demanding and forces of both players need to be joined (hare can be hunted individually) Strategy-form game model: N = {1, 2}, S1 = S2 = {S, H}, the payoff: S H S 5, 5 0, 3 H 3, 0 3, 3 55 Example: Stag Hunt Story: Two (in some versions more than two) hunters, players 1 and 2, can each choose to hunt stag (S) = a large tasty meal hare (H) = also tasty but small Hunting stag is much more demanding and forces of both players need to be joined (hare can be hunted individually) Strategy-form game model: N = {1, 2}, S1 = S2 = {S, H}, the payoff: S H S 5, 5 0, 3 H 3, 0 3, 3 Two NE: (S, S), and (H, H), where the former Pareto dominates the latter! Which one is more reasonable? 55 Example: Stag Hunt Strategy-form game model: N = {1, 2}, S1 = S2 = {S, H}, the payoff: S H S 5, 5 0, 3 H 3, 0 3, 3 Two NE: (S, S), and (H, H), where the former Pareto dominates the latter! Which one is more reasonable? If each player believes that the other one will go for hare, then (H, H) is a reasonable outcome ⇒ a society of individualists who do not cooperate at all. 56 Example: Stag Hunt Strategy-form game model: N = {1, 2}, S1 = S2 = {S, H}, the payoff: S H S 5, 5 0, 3 H 3, 0 3, 3 Two NE: (S, S), and (H, H), where the former Pareto dominates the latter! Which one is more reasonable? If each player believes that the other one will go for hare, then (H, H) is a reasonable outcome ⇒ a society of individualists who do not cooperate at all. If each player believes that the other will cooperate, then this anticipation is self-fulfilling and results in what can be called a cooperative society. 56 Example: Stag Hunt Strategy-form game model: N = {1, 2}, S1 = S2 = {S, H}, the payoff: S H S 5, 5 0, 3 H 3, 0 3, 3 Two NE: (S, S), and (H, H), where the former Pareto dominates the latter! Which one is more reasonable? If each player believes that the other one will go for hare, then (H, H) is a reasonable outcome ⇒ a society of individualists who do not cooperate at all. If each player believes that the other will cooperate, then this anticipation is self-fulfilling and results in what can be called a cooperative society. This is supposed to explain that in real world there are societies that have similar endowments, access to technology and physical environment but have very different achievements, all because of self-fulfilling beliefs (or norms of behavior). 56 Example: Stag Hunt Strategy-form game model: N = {1, 2}, S1 = S2 = {S, H}, the payoff: S H S 5, 5 0, 3 H 3, 0 3, 3 Two NE: (S, S), and (H, H), where the former Pareto dominates the latter! Which one is more reasonable? Another point of view: (H, H) is less risky 57 Example: Stag Hunt Strategy-form game model: N = {1, 2}, S1 = S2 = {S, H}, the payoff: S H S 5, 5 0, 3 H 3, 0 3, 3 Two NE: (S, S), and (H, H), where the former Pareto dominates the latter! Which one is more reasonable? Another point of view: (H, H) is less risky Minimum secured by playing S is 0 as opposed to 3 by playing H (We will get to this minimax principle later) 57 Example: Stag Hunt Strategy-form game model: N = {1, 2}, S1 = S2 = {S, H}, the payoff: S H S 5, 5 0, 3 H 3, 0 3, 3 Two NE: (S, S), and (H, H), where the former Pareto dominates the latter! Which one is more reasonable? Another point of view: (H, H) is less risky Minimum secured by playing S is 0 as opposed to 3 by playing H (We will get to this minimax principle later) So it seems to be rational to expect (H, H) (?) 57 Nash Equilibria vs Previous Concepts Theorem 17 1. If s∗ is a strictly dominant strategy equilibrium, then it is the unique Nash equilibrium. 2. Each Nash equilibrium is rationalizable and survives IESDS. 3. If S is finite, neither rationalizability, nor IESDS creates new Nash equilibria. Proof: Homework! 58 Nash Equilibria vs Previous Concepts Theorem 17 1. If s∗ is a strictly dominant strategy equilibrium, then it is the unique Nash equilibrium. 2. Each Nash equilibrium is rationalizable and survives IESDS. 3. If S is finite, neither rationalizability, nor IESDS creates new Nash equilibria. Proof: Homework! Corollary 18 Assume that S is finite. If rationalizability or IESDS result in a unique strategy profile, then this profile is a Nash equilibrium. 58 Interpretations of Nash Equilibria Except the two definitions, usual interpretations are following: When the goal is to give advice to all of the players in a game (i.e., to advise each player what strategy to choose), any advice that was not an equilibrium would have the unsettling property that there would always be some player for whom the advice was bad, in the sense that, if all other players followed the parts of the advice directed to them, it would be better for some player to do differently than he was advised. If the advice is an equilibrium, however, this will not be the case, because the advice to each player is the best response to the advice given to the other players. 59 Interpretations of Nash Equilibria Except the two definitions, usual interpretations are following: When the goal is to give advice to all of the players in a game (i.e., to advise each player what strategy to choose), any advice that was not an equilibrium would have the unsettling property that there would always be some player for whom the advice was bad, in the sense that, if all other players followed the parts of the advice directed to them, it would be better for some player to do differently than he was advised. If the advice is an equilibrium, however, this will not be the case, because the advice to each player is the best response to the advice given to the other players. When the goal is prediction rather than prescription, a Nash equilibrium can also be interpreted as a potential stable point of a dynamic adjustment process in which individuals adjust their behavior to that of the other players in the game, searching for strategy choices that will give them better results. 59 Static Games of Complete Information Mixed Strategies 60 Let’s Mix It As pointed out before, neither of the solution concepts has to exist in pure strategies 61 Let’s Mix It As pointed out before, neither of the solution concepts has to exist in pure strategies Example: Rock-Paper-sCissors R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 61 Let’s Mix It As pointed out before, neither of the solution concepts has to exist in pure strategies Example: Rock-Paper-sCissors R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 There are no strictly dominant pure strategies 61 Let’s Mix It As pointed out before, neither of the solution concepts has to exist in pure strategies Example: Rock-Paper-sCissors R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 There are no strictly dominant pure strategies No strategy is strictly dominated (IESDS removes nothing) 61 Let’s Mix It As pointed out before, neither of the solution concepts has to exist in pure strategies Example: Rock-Paper-sCissors R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 There are no strictly dominant pure strategies No strategy is strictly dominated (IESDS removes nothing) Each strategy is a best response to some strategy of the opponent (rationalizability removes nothing) 61 Let’s Mix It As pointed out before, neither of the solution concepts has to exist in pure strategies Example: Rock-Paper-sCissors R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 There are no strictly dominant pure strategies No strategy is strictly dominated (IESDS removes nothing) Each strategy is a best response to some strategy of the opponent (rationalizability removes nothing) No pure Nash equilibria: No pure strategy profile allows each player to play a best response to the strategy of the other player 61 Let’s Mix It As pointed out before, neither of the solution concepts has to exist in pure strategies Example: Rock-Paper-sCissors R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 There are no strictly dominant pure strategies No strategy is strictly dominated (IESDS removes nothing) Each strategy is a best response to some strategy of the opponent (rationalizability removes nothing) No pure Nash equilibria: No pure strategy profile allows each player to play a best response to the strategy of the other player How to solve this? Let the players randomize their choice of pure strategies .... 61 Probability Distributions Definition 19 Let A be a finite set. A probability distribution over A is a function σ : A → [0, 1] such that a∈A σ(a) = 1. 62 Probability Distributions Definition 19 Let A be a finite set. A probability distribution over A is a function σ : A → [0, 1] such that a∈A σ(a) = 1. We denote by ∆(A) the set of all probability distributions over A. 62 Probability Distributions Definition 19 Let A be a finite set. A probability distribution over A is a function σ : A → [0, 1] such that a∈A σ(a) = 1. We denote by ∆(A) the set of all probability distributions over A. We denote by supp(σ) the support of σ, that is the set of all a ∈ A satisfying σ(a) > 0. 62 Probability Distributions Definition 19 Let A be a finite set. A probability distribution over A is a function σ : A → [0, 1] such that a∈A σ(a) = 1. We denote by ∆(A) the set of all probability distributions over A. We denote by supp(σ) the support of σ, that is the set of all a ∈ A satisfying σ(a) > 0. Example 20 Consider A = {a, b, c} and a function σ : A → [0, 1] such that σ(a) = 1 4 , σ(b) = 3 4 , and σ(c) = 0. Then σ ∈ ∆(A) and supp(σ) = {a, b}. 62 Mixed Strategies Let us fix a strategic-form game G = (N, (Si)i∈N , (ui)i∈N). 63 Mixed Strategies Let us fix a strategic-form game G = (N, (Si)i∈N , (ui)i∈N). From now on, assume that all Si are finite! 63 Mixed Strategies Let us fix a strategic-form game G = (N, (Si)i∈N , (ui)i∈N). From now on, assume that all Si are finite! Definition 21 A mixed strategy of player i is a probability distribution σ ∈ ∆(Si) over Si. We denote by Σi = ∆(Si) the set of all mixed strategies of player i. 63 Mixed Strategies Let us fix a strategic-form game G = (N, (Si)i∈N , (ui)i∈N). From now on, assume that all Si are finite! Definition 21 A mixed strategy of player i is a probability distribution σ ∈ ∆(Si) over Si. We denote by Σi = ∆(Si) the set of all mixed strategies of player i. We define Σ := Σ1 × · · · × Σn, the set of all mixed strategy profiles. 63 Mixed Strategies Let us fix a strategic-form game G = (N, (Si)i∈N , (ui)i∈N). From now on, assume that all Si are finite! Definition 21 A mixed strategy of player i is a probability distribution σ ∈ ∆(Si) over Si. We denote by Σi = ∆(Si) the set of all mixed strategies of player i. We define Σ := Σ1 × · · · × Σn, the set of all mixed strategy profiles. Recall that by Σ−i we denote the set Σ1 × · · · Σi−1 × Σi+1 × · · · × Σn 63 Mixed Strategies Let us fix a strategic-form game G = (N, (Si)i∈N , (ui)i∈N). From now on, assume that all Si are finite! Definition 21 A mixed strategy of player i is a probability distribution σ ∈ ∆(Si) over Si. We denote by Σi = ∆(Si) the set of all mixed strategies of player i. We define Σ := Σ1 × · · · × Σn, the set of all mixed strategy profiles. Recall that by Σ−i we denote the set Σ1 × · · · Σi−1 × Σi+1 × · · · × Σn Elements of Σ−i are denoted by σ−i = (σ1, . . . , σi−1, σi+1, . . . , σn). 63 Mixed Strategies Let us fix a strategic-form game G = (N, (Si)i∈N , (ui)i∈N). From now on, assume that all Si are finite! Definition 21 A mixed strategy of player i is a probability distribution σ ∈ ∆(Si) over Si. We denote by Σi = ∆(Si) the set of all mixed strategies of player i. We define Σ := Σ1 × · · · × Σn, the set of all mixed strategy profiles. Recall that by Σ−i we denote the set Σ1 × · · · Σi−1 × Σi+1 × · · · × Σn Elements of Σ−i are denoted by σ−i = (σ1, . . . , σi−1, σi+1, . . . , σn). We identify each si ∈ Si with a mixed strategy σ that assigns probability one to si (and zero to other pure strategies). 63 Mixed Strategies Let us fix a strategic-form game G = (N, (Si)i∈N , (ui)i∈N). From now on, assume that all Si are finite! Definition 21 A mixed strategy of player i is a probability distribution σ ∈ ∆(Si) over Si. We denote by Σi = ∆(Si) the set of all mixed strategies of player i. We define Σ := Σ1 × · · · × Σn, the set of all mixed strategy profiles. Recall that by Σ−i we denote the set Σ1 × · · · Σi−1 × Σi+1 × · · · × Σn Elements of Σ−i are denoted by σ−i = (σ1, . . . , σi−1, σi+1, . . . , σn). We identify each si ∈ Si with a mixed strategy σ that assigns probability one to si (and zero to other pure strategies). For example, in rock-paper-scissors, the pure strategy R corresponds to σi which satisfies σi(X) =    1 X = R 0 otherwise 63 Mixed Strategy Profiles Let σ = (σ1, . . . , σn) be a mixed strategy profile. 64 Mixed Strategy Profiles Let σ = (σ1, . . . , σn) be a mixed strategy profile. Intuitively, we assume that each player i randomly chooses his pure strategy according to σi and independently of his opponents. 64 Mixed Strategy Profiles Let σ = (σ1, . . . , σn) be a mixed strategy profile. Intuitively, we assume that each player i randomly chooses his pure strategy according to σi and independently of his opponents. Thus for s = (s1, . . . , sn) ∈ S = S1 × · · · × Sn we have that σ(s) := n i=1 σi(si) is the probability that the players choose the pure strategy profile s according to the mixed strategy profile σ, 64 Mixed Strategy Profiles Let σ = (σ1, . . . , σn) be a mixed strategy profile. Intuitively, we assume that each player i randomly chooses his pure strategy according to σi and independently of his opponents. Thus for s = (s1, . . . , sn) ∈ S = S1 × · · · × Sn we have that σ(s) := n i=1 σi(si) is the probability that the players choose the pure strategy profile s according to the mixed strategy profile σ, and σ−i(s−i) := n k i σk (sk ) is the probability that the opponents of player i choose s−i ∈ S−i when they play according to the mixed strategy profile σ−i ∈ Σ−i. (We abuse notation a bit here: σ denotes two things, a vector of mixed strategies as well as a probability distribution on S (the same for σ−i) 64 Mixed Strategies – Example R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 65 Mixed Strategies – Example R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 An example of a mixed strategy σ1: σ1(R) = 1 2 , σ1(P) = 1 3 , σ1(C) = 1 6 . 65 Mixed Strategies – Example R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 An example of a mixed strategy σ1: σ1(R) = 1 2 , σ1(P) = 1 3 , σ1(C) = 1 6 . Sometimes we write σ1 as (1 2 (R), 1 3 (P), 1 6 (C)), or only (1 2 , 1 3 , 1 6 ) if the order of pure strategies is fixed. 65 Mixed Strategies – Example R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 An example of a mixed strategy σ1: σ1(R) = 1 2 , σ1(P) = 1 3 , σ1(C) = 1 6 . Sometimes we write σ1 as (1 2 (R), 1 3 (P), 1 6 (C)), or only (1 2 , 1 3 , 1 6 ) if the order of pure strategies is fixed. Consider a mixed strategy profile (σ1, σ2) where σ1 = (1 2 (R), 1 3 (P), 1 6 (C)) and σ2 = (1 3 (R), 2 3 (P), 0(C)). 65 Mixed Strategies – Example R P C R 0, 0 −1, 1 1, −1 P 1, −1 0, 0 −1, 1 C −1, 1 1, −1 0, 0 An example of a mixed strategy σ1: σ1(R) = 1 2 , σ1(P) = 1 3 , σ1(C) = 1 6 . Sometimes we write σ1 as (1 2 (R), 1 3 (P), 1 6 (C)), or only (1 2 , 1 3 , 1 6 ) if the order of pure strategies is fixed. Consider a mixed strategy profile (σ1, σ2) where σ1 = (1 2 (R), 1 3 (P), 1 6 (C)) and σ2 = (1 3 (R), 2 3 (P), 0(C)). Then the probability σ(R, P) that the pure strategy profile (R, P) will be chosen by players playing the mixed profile (σ1, σ2) is σ1(R) · σ2(P) = 1 2 · 2 3 = 1 3 65 Expected Payoff ... but now what is the suitable notion of payoff? 66 Expected Payoff ... but now what is the suitable notion of payoff? Definition 22 The expected payoff of player i under a mixed strategy profile σ ∈ Σ is ui(σ) := s∈S σ(s)ui(s)  = s∈S n k=1 σk (sk )ui(s)   I.e., it is the "weighted average" of what player i wins under each pure strategy profile s, weighted by the probability of that profile. 66 Expected Payoff ... but now what is the suitable notion of payoff? Definition 22 The expected payoff of player i under a mixed strategy profile σ ∈ Σ is ui(σ) := s∈S σ(s)ui(s)  = s∈S n k=1 σk (sk )ui(s)   I.e., it is the "weighted average" of what player i wins under each pure strategy profile s, weighted by the probability of that profile. Assumption: Every rational player strives to maximize his own expected payoff. (This assumption is not always completely convincing ...) 66 Expected Payoff – Example Matching Pennies: H T H 1, −1 −1, 1 T −1, 1 1, −1 Each player secretly turns a penny to heads or tails, and then they reveal their choices simultaneously. If the pennies match, player 1 (row) wins, if they do not match, player 2 (column) wins. Consider σ1 = (1 3 (H), 2 3 (T)) and σ2 = (1 4 (H), 3 4 (T)) u1(σ1, σ2) = (X,Y)∈{H,T}2 σ1(X)σ2(Y)u1(X, Y) = 1 3 1 4 1 + 1 3 3 4 (−1) + 2 3 1 4 (−1) + 2 3 3 4 1 = 1 6 u2(σ1, σ2) = (X,Y)∈{H,T}2 σ1(X)σ2(Y)u2(X, Y) = 1 3 1 4 (−1) + 1 3 3 4 1 + 2 3 1 4 1 + 2 3 3 4 (−1) = − 1 6 67 "Decomposition" of Expected Payoff Consider the matching pennies example from the previous slide: H T H 1, −1 −1, 1 T −1, 1 1, −1 together with some mixed strategies σ1 and σ2. We prove the following important property of the expected payoff: u1(σ1, σ2) = X∈{H,T} σ1(X)u1(X, σ2) An intuition behind this equality is following: u1(σ1, σ2) is the expected payoff of player 1 in the following experiment: Both players simultaneously and independently choose their pure strategies X, Y according to σ1, σ2, resp., and then player 1 collects his payoff u1(X, Y). X∈{H,T} σ1(X)u1(X, σ2) is the expected payoff of player 1 in the following: Player 1 chooses his pure strategy X and then uses it against the mixed strategy σ2 of player 2. Then player 2 chooses Y according to σ2 independently of X, and player 1 collects the payoff u1(X, Y). As Y does not depend on X in neither experiment, we obtain the above equality of expected payoffs. 68 "Decomposition" of Expected Payoff Consider the matching pennies example from the previous slide: H T H 1, −1 −1, 1 T −1, 1 1, −1 together with some mixed strategies σ1 and σ2. A formal proof is straightforward: u1(σ1, σ2) = (X,Y)∈{H,T}2 σ1(X)σ2(Y)u1(X, Y) = X∈{H,T} Y∈{H,T} σ1(X)σ2(Y)u1(X, Y) = X∈{H,T} σ1(X) Y∈{H,T} σ2(Y)u1(X, Y) = X∈{H,T} σ1(X)u1(X, σ2) (In the last equality we used the fact that X is identified with a mixed strategy assigning one to X.) 69 "Decomposition" of Expected Payoff Consider the matching pennies example from the previous slide: H T H 1, −1 −1, 1 T −1, 1 1, −1 together with some mixed strategies σ1 and σ2. Similarly, u1(σ1, σ2) = (X,Y)∈{H,T}2 σ1(X)σ2(Y)u1(X, Y) = X∈{H,T} Y∈{H,T} σ1(X)σ2(Y)u1(X, Y) = Y∈{H,T} X∈{H,T} σ1(X)σ2(Y)u1(X, Y) = Y∈{H,T} σ2(Y) X∈{H,T} σ1(X)u1(X, Y) = Y∈{H,T} σ2(Y)u1(σ1, Y) 70 Expected Payoff – "Decomposition" in General Lemma 23 For every mixed strategy profile σ ∈ Σ and every k ∈ N we have ui(σ) = sk ∈Sk σk (sk ) · ui(sk , σ−k ) = s−k ∈S−k σ−k (s−k ) · ui(σk , s−k ) 71 Expected Payoff – "Decomposition" in General Lemma 23 For every mixed strategy profile σ ∈ Σ and every k ∈ N we have ui(σ) = sk ∈Sk σk (sk ) · ui(sk , σ−k ) = s−k ∈S−k σ−k (s−k ) · ui(σk , s−k ) Proof: ui(σ) = s∈S σ(s)ui(s) = s∈S n =1 σ (s )ui(s) = s∈S σk (sk ) n k σ (s )ui(s) = sk ∈Sk s−k ∈S−k σk (sk ) n k σ (s )ui(sk , s−k ) = sk ∈Sk s−k ∈S−k σk (sk )σ−k (s−k )ui(sk , s−k ) 71 Proof of Lemma 23 (cont.) The first equality: ui(σ) = sk ∈Sk s−k ∈S−k σk (sk )σ−k (s−k )ui(sk , s−k ) = sk ∈Sk σk (sk ) s−k ∈S−k σ−k (s−k )ui(sk , s−k ) = sk ∈Sk σk (sk )ui(sk , σ−k ) 72 Proof of Lemma 23 (cont.) The first equality: ui(σ) = sk ∈Sk s−k ∈S−k σk (sk )σ−k (s−k )ui(sk , s−k ) = sk ∈Sk σk (sk ) s−k ∈S−k σ−k (s−k )ui(sk , s−k ) = sk ∈Sk σk (sk )ui(sk , σ−k ) The second equality: ui(σ) = sk ∈Sk s−k ∈S−k σk (sk )σ−k (s−k )ui(sk , s−k ) = s−k ∈S−k sk ∈Sk σk (sk )σ−k (s−k )ui(sk , s−k ) = s−k ∈S−k σ−k (s−k ) sk ∈Sk σk (sk )ui(sk , s−k ) = s−k ∈S−k σ−k (s−k )ui(σk , s−k ) 72 Expected Payoff – Pure Strategy Bounds Corollary 24 For all i, k ∈ N and σ ∈ Σ we have that minsk ∈Sk ui(sk , σ−k ) ≤ ui(σ) ≤ maxsk ∈Sk ui(sk , σ−k ) mins−k ∈S−k ui(σk , s−k ) ≤ ui(σ) ≤ maxs−k ∈S−k ui(σk , s−k ) 73 Expected Payoff – Pure Strategy Bounds Corollary 24 For all i, k ∈ N and σ ∈ Σ we have that minsk ∈Sk ui(sk , σ−k ) ≤ ui(σ) ≤ maxsk ∈Sk ui(sk , σ−k ) mins−k ∈S−k ui(σk , s−k ) ≤ ui(σ) ≤ maxs−k ∈S−k ui(σk , s−k ) Proof. We prove ui(σ) ≤ maxsk ∈Sk ui(sk , σ−k ) the rest is similar. Define B := maxsk ∈Sk ui(sk , σ−k ). Then ui(σ) = sk ∈Sk σk (sk ) · ui(sk , σ−k ) ≤ sk ∈Sk σk (sk ) · B = B 73 Solution Concepts We revisit the following solution concepts in mixed strategies: strict dominant strategy equilibrium IESDS equilibrium rationalizable equilibria Nash equilibria 74 Solution Concepts We revisit the following solution concepts in mixed strategies: strict dominant strategy equilibrium IESDS equilibrium rationalizable equilibria Nash equilibria From now on, when I say a strategy I implicitly mean a mixed strategy. 74 Solution Concepts We revisit the following solution concepts in mixed strategies: strict dominant strategy equilibrium IESDS equilibrium rationalizable equilibria Nash equilibria From now on, when I say a strategy I implicitly mean a mixed strategy. In order to deal with efficiency issues we assume that the size of the game G is defined by |G| := |N| + i∈N |Si| + i∈N |ui| where |ui| = s∈S |ui(s)| and |ui(s)| is the length of a binary encoding of ui(s) (we assume that rational numbers are encoded as quotients of two binary integers) Note that, in particular, |G| > |S|. 74 Strict Dominance in Mixed Strategies Definition 25 Let σi, σi ∈ Σi be (mixed) strategies of player i. Then σi is strictly dominated by σi (write σi σi) if ui(σi, σ−i) > ui(σi , σ−i) for all σ−i ∈ Σ−i 75 Strict Dominance in Mixed Strategies Definition 25 Let σi, σi ∈ Σi be (mixed) strategies of player i. Then σi is strictly dominated by σi (write σi σi) if ui(σi, σ−i) > ui(σi , σ−i) for all σ−i ∈ Σ−i Example 26 X Y A 3 0 B 0 3 C 1 1 Is there a strictly dominated strategy? 75 Strict Dominance in Mixed Strategies Definition 25 Let σi, σi ∈ Σi be (mixed) strategies of player i. Then σi is strictly dominated by σi (write σi σi) if ui(σi, σ−i) > ui(σi , σ−i) for all σ−i ∈ Σ−i Example 26 X Y A 3 0 B 0 3 C 1 1 Is there a strictly dominated strategy? Question: Is there a game with at least one strictly dominated strategy but without strictly dominated pure strategies? 75 Strictly Dominant Strategy Equilibrium Definition 27 σi ∈ Σi is strictly dominant if every other mixed strategy of player i is strictly dominated by σi. 76 Strictly Dominant Strategy Equilibrium Definition 27 σi ∈ Σi is strictly dominant if every other mixed strategy of player i is strictly dominated by σi. Definition 28 A strategy profile σ ∈ Σ is a strictly dominant strategy equilibrium if σi ∈ Σi is strictly dominant for all i ∈ N. 76 Strictly Dominant Strategy Equilibrium Definition 27 σi ∈ Σi is strictly dominant if every other mixed strategy of player i is strictly dominated by σi. Definition 28 A strategy profile σ ∈ Σ is a strictly dominant strategy equilibrium if σi ∈ Σi is strictly dominant for all i ∈ N. Proposition 2 If the strictly dominant strategy equilibrium exists, it is unique, all its strategies are pure, and rational players will play it. Proof. Let σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σi be a strictly dominant strategy equilibrium. By Corollary 24, for every i ∈ N, there must exist si ∈ Si such that ui(σ∗ ) ≤ ui(si, σ∗ −i ). But then σ∗ i = si since σ∗ i is strictly dominant. 76 Computing Strictly Dominant Strategy Equilibrium How to decide whether there is a strictly dominant strategy equilibrium s = (s1, . . . , sn) ∈ S ? I.e. whether for a given si ∈ Si, all σi ∈ Σi {si} and all σ−i ∈ Σ−i : ui(si, σ−i) > ui(σi, σ−i) 77 Computing Strictly Dominant Strategy Equilibrium How to decide whether there is a strictly dominant strategy equilibrium s = (s1, . . . , sn) ∈ S ? I.e. whether for a given si ∈ Si, all σi ∈ Σi {si} and all σ−i ∈ Σ−i : ui(si, σ−i) > ui(σi, σ−i) There are some serious issues here: Obviously there are uncountably many possible σi and σ−i. ui(σi, σ−i) is nonlinear, and for more than two players even ui(si, σ−i) is nonlinear in probabilities assigned to pure strategies. 77 Computing Strictly Dominant Strategy Equilibrium First, we prove the following useful proposition using Lemma 23: Lemma 29 σi strictly dominates σi iff for all pure strategy profiles s−i ∈ S−i: ui(σi , s−i) > ui(σi, s−i) (1) 78 Computing Strictly Dominant Strategy Equilibrium First, we prove the following useful proposition using Lemma 23: Lemma 29 σi strictly dominates σi iff for all pure strategy profiles s−i ∈ S−i: ui(σi , s−i) > ui(σi, s−i) (1) Proof. ‘⇒’ direction is trivial, let us prove ‘⇐’. Assume that (1) is true for all pure strategy profiles s−i ∈ S−i. Then, by Lemma 23, ui(σi, σ−i) = s−i ∈S−i σ−i(s−i)ui(σi, s−i) < s−i ∈S−i σ−i(s−i)ui(σi , s−i) = ui(σi , σ−i) holds for all mixed strategy profiles σ−i ∈ Σ−i. In other words, it suffices to check the strict dominance only with respect to all pure profiles of opponents. 78 Computing Strictly Dominant Strategy Equilibrium How to decide whether for a given si ∈ Si, all σi ∈ Σi {si} and all s−i ∈ S−i we have ui(si, s−i) > ui(σi, s−i) ? Lemma 30 ui(si, s−i) > ui(σi, s−i) for all σi ∈ Σi {si} and all s−i ∈ S−i iff ui(si, s−i) > ui(si , s−i) for all si ∈ Si {si} and all s−i ∈ S−i. Proof. ‘⇒’ direction is trivial, let us prove ‘⇐’. Assume ui(si, s−i) > ui(si , s−i) for all si ∈ Si {si} and all s−i ∈ S−i. Given σi ∈ Σi {si}, we have by Lemma 23, ui(σi, s−i) = si ∈Si σi(si )ui(si , s−i) < si ∈Si σi(si )ui(si, s−i) = ui(si, s−i) The inequality follows from our assumption and the fact that σi(si ) > 0 for at least one si si (due to σi ∈ Σi {si}). 79 Computing Strictly Dominant Strategy Equilibrium How to decide whether for a given si ∈ Si, all σi ∈ Σi {si} and all s−i ∈ S−i we have ui(si, s−i) > ui(σi, s−i) ? Lemma 30 ui(si, s−i) > ui(σi, s−i) for all σi ∈ Σi {si} and all s−i ∈ S−i iff ui(si, s−i) > ui(si , s−i) for all si ∈ Si {si} and all s−i ∈ S−i. Proof. ‘⇒’ direction is trivial, let us prove ‘⇐’. Assume ui(si, s−i) > ui(si , s−i) for all si ∈ Si {si} and all s−i ∈ S−i. Given σi ∈ Σi {si}, we have by Lemma 23, ui(σi, s−i) = si ∈Si σi(si )ui(si , s−i) < si ∈Si σi(si )ui(si, s−i) = ui(si, s−i) The inequality follows from our assumption and the fact that σi(si ) > 0 for at least one si si (due to σi ∈ Σi {si}). Thus it suffices to check whether ui(si, s−i) > ui(si , s−i) for all si ∈ Si and all s−i ∈ S−i. This can easily be done in time polynomial w.r.t. |G|. 79 IESDS in Mixed Strategies Define a sequence D0 i , D1 i , D2 i , . . . of strategy sets of player i. (Denote by Gk DS the game obtained from G by restricting the pure strategy sets to Dk i , i ∈ N.) 1. Initialize k = 0 and D0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Dk+1 i be the set of all pure strategies of Dk i that are not strictly dominated in Gk DS by mixed strategies. 3. Let k := k + 1 and go to 2. We say that si ∈ Si survives IESDS if si ∈ Dk i for all k = 0, 1, 2, . . . Definition 31 A strategy profile s = (s1, . . . , sn) ∈ S is an IESDS equilibrium if each si survives IESDS. 80 IESDS – Algorithm Note that in step 2 it is not sufficient to consider pure strategies. Consider the following zero sum game: X Y A 3 0 B 0 3 C 1 1 81 IESDS – Algorithm Note that in step 2 it is not sufficient to consider pure strategies. Consider the following zero sum game: X Y A 3 0 B 0 3 C 1 1 C is strictly dominated by (σ1(A), σ1(B), σ1(C)) = (1 2 , 1 2 , 0) but no strategy is strictly dominated in pure strategies. 81 IESDS – Algorithm However, there are uncountably many mixed strategies that may dominate a given pure strategy ... 82 IESDS – Algorithm However, there are uncountably many mixed strategies that may dominate a given pure strategy ... But ui(σ) = ui(σ1, . . . , σn) is linear in each σk (if σ−k is kept fixed)! Indeed, assuming w.l.o.g. that Sk = {1, . . . , mk }, ui(σ) = sk ∈Sk σk (sk ) · ui(sk , σ−k ) = mk =1 σk ( ) · ui( , σ−k ) is the scalar product of the vector σk = (σk (1), . . . , σk (mk )) with the vector (ui(1, σ−k ), . . . , ui(mk , σ−k )), which is linear. So to decide strict dominance, we use linear programming ... 82 Intermezzo: Linear Programming Linear programming is a technique for optimization of a linear objective function, subject to linear (non-strict) inequality constraints. Formally, a linear program in so called canonical form looks like this: maximize m j=1 cjxj subject to m j=1 aijxj ≤ bi 1 ≤ i ≤ n xj ≥ 0 1 ≤ j ≤ m (objective function) (constraints) Here aij, bk and cj are real numbers and xj’s are real variables. A feasible solution is an assignment of real numbers to the variables xj, 1 ≤ j ≤ m, so that the constraints are satisfied. An optimal solution is a feasible solution which maximizes the objective function m j=1 cjxj. 83 Intermezzo: Complexity of Linear Programming We assume that coefficients aij, bk and cj are encoded in binary (more precisely, as fractions of two integers encoded in binary). 84 Intermezzo: Complexity of Linear Programming We assume that coefficients aij, bk and cj are encoded in binary (more precisely, as fractions of two integers encoded in binary). Theorem 32 (Khachiyan, Doklady Akademii Nauk SSSR, 1979) There is an algorithm which for any linear program computes an optimal solution in polynomial time. The algorithm uses so called ellipsoid method. 84 Intermezzo: Complexity of Linear Programming We assume that coefficients aij, bk and cj are encoded in binary (more precisely, as fractions of two integers encoded in binary). Theorem 32 (Khachiyan, Doklady Akademii Nauk SSSR, 1979) There is an algorithm which for any linear program computes an optimal solution in polynomial time. The algorithm uses so called ellipsoid method. In practice, the Khachiyan’s is not used. Usually simplex algorithm is used even though its theoretical complexity is exponential. 84 Intermezzo: Complexity of Linear Programming We assume that coefficients aij, bk and cj are encoded in binary (more precisely, as fractions of two integers encoded in binary). Theorem 32 (Khachiyan, Doklady Akademii Nauk SSSR, 1979) There is an algorithm which for any linear program computes an optimal solution in polynomial time. The algorithm uses so called ellipsoid method. In practice, the Khachiyan’s is not used. Usually simplex algorithm is used even though its theoretical complexity is exponential. There is also a polynomial time algorithm (by Karmarkar) which has better complexity upper bounds than the Khachiyan’s and sometimes works even better than the simplex. 84 Intermezzo: Complexity of Linear Programming We assume that coefficients aij, bk and cj are encoded in binary (more precisely, as fractions of two integers encoded in binary). Theorem 32 (Khachiyan, Doklady Akademii Nauk SSSR, 1979) There is an algorithm which for any linear program computes an optimal solution in polynomial time. The algorithm uses so called ellipsoid method. In practice, the Khachiyan’s is not used. Usually simplex algorithm is used even though its theoretical complexity is exponential. There is also a polynomial time algorithm (by Karmarkar) which has better complexity upper bounds than the Khachiyan’s and sometimes works even better than the simplex. There exist several advanced linear programming solvers (usually parts of larger optimization packages) implementing various heuristics for solving large scale problems, sensitivity analysis, etc. 84 Intermezzo: Complexity of Linear Programming We assume that coefficients aij, bk and cj are encoded in binary (more precisely, as fractions of two integers encoded in binary). Theorem 32 (Khachiyan, Doklady Akademii Nauk SSSR, 1979) There is an algorithm which for any linear program computes an optimal solution in polynomial time. The algorithm uses so called ellipsoid method. In practice, the Khachiyan’s is not used. Usually simplex algorithm is used even though its theoretical complexity is exponential. There is also a polynomial time algorithm (by Karmarkar) which has better complexity upper bounds than the Khachiyan’s and sometimes works even better than the simplex. There exist several advanced linear programming solvers (usually parts of larger optimization packages) implementing various heuristics for solving large scale problems, sensitivity analysis, etc. For more info see http://en.wikipedia.org/wiki/Linear_programming#Solvers_and_scripting_.28programming.29_languages 84 IESDS Algorithm – Strict Dominance Step So how do we use linear programming to decide strict dominance in step 2 of IESDS procedure? I.e. whether for a given si there exists σi such that for all σ−i we have ui(σi, σ−i) > ui(si, σ−i) 85 IESDS Algorithm – Strict Dominance Step So how do we use linear programming to decide strict dominance in step 2 of IESDS procedure? I.e. whether for a given si there exists σi such that for all σ−i we have ui(σi, σ−i) > ui(si, σ−i) Recall that by Lemma 29 we have that σi strictly dominates si iff for all pure strategy profiles s−i ∈ S−i: ui(σi, s−i) > ui(si, s−i) In other words, it suffices to check the strict dominance only with respect to all pure profiles of opponents. 85 IESDS Algorithm – Strict Dominance Step Recall that ui(σi, s−i) = si ∈Si σi(si )ui(si , s−i). 86 IESDS Algorithm – Strict Dominance Step Recall that ui(σi, s−i) = si ∈Si σi(si )ui(si , s−i). So to decide whether si ∈ Si is strictly dominated by some mixed strategy σi, it suffices to solve the following system: si ∈Si xsi · ui(si , s−i) > ui(si, s−i) s−i ∈ S−i xsi ≥ 0 si ∈ Si si ∈Si xsi = 1 (Here each variable xsi corresponds to the probability σi(si ) assigned by the strictly dominant strategy σi to si ) 86 IESDS Algorithm – Strict Dominance Step Recall that ui(σi, s−i) = si ∈Si σi(si )ui(si , s−i). So to decide whether si ∈ Si is strictly dominated by some mixed strategy σi, it suffices to solve the following system: si ∈Si xsi · ui(si , s−i) > ui(si, s−i) s−i ∈ S−i xsi ≥ 0 si ∈ Si si ∈Si xsi = 1 (Here each variable xsi corresponds to the probability σi(si ) assigned by the strictly dominant strategy σi to si ) Unfortunately, this is a "strict linear program" ... How to deal with the strict inequality? 86 IESDS Algorithm – Complexity Introduce a new variable y to be maximized under the following constraints: si ∈Si xsi · ui(si , s−i) ≥ ui(si, s−i) + y s−i ∈ S−i xsi ≥ 0 si ∈ Si si ∈Si xsi = 1 y ≥ 0 Now si is strictly dominated iff a solution maximizing y satisfies y > 0 87 IESDS Algorithm – Complexity Introduce a new variable y to be maximized under the following constraints: si ∈Si xsi · ui(si , s−i) ≥ ui(si, s−i) + y s−i ∈ S−i xsi ≥ 0 si ∈ Si si ∈Si xsi = 1 y ≥ 0 Now si is strictly dominated iff a solution maximizing y satisfies y > 0 The size of the above program is polynomial in |G|. 87 IESDS Algorithm – Complexity Introduce a new variable y to be maximized under the following constraints: si ∈Si xsi · ui(si , s−i) ≥ ui(si, s−i) + y s−i ∈ S−i xsi ≥ 0 si ∈ Si si ∈Si xsi = 1 y ≥ 0 Now si is strictly dominated iff a solution maximizing y satisfies y > 0 The size of the above program is polynomial in |G|. So the step 2 of IESDS can be executed in polynomial time. 87 IESDS Algorithm – Complexity Introduce a new variable y to be maximized under the following constraints: si ∈Si xsi · ui(si , s−i) ≥ ui(si, s−i) + y s−i ∈ S−i xsi ≥ 0 si ∈ Si si ∈Si xsi = 1 y ≥ 0 Now si is strictly dominated iff a solution maximizing y satisfies y > 0 The size of the above program is polynomial in |G|. So the step 2 of IESDS can be executed in polynomial time. As every iteration of IESDS removes at least one pure strategy, IESDS runs in time polynomial in |G|. 87 IESDS in Mixed Strategie – Example X Y A 3 0 B 0 3 C 1 1 Let us have a look at the first iteration of IESDS. 88 IESDS in Mixed Strategie – Example X Y A 3 0 B 0 3 C 1 1 Let us have a look at the first iteration of IESDS. Observe that A, B are not strictly dominated by any mixed strategy. 88 IESDS in Mixed Strategie – Example X Y A 3 0 B 0 3 C 1 1 Let us have a look at the first iteration of IESDS. Observe that A, B are not strictly dominated by any mixed strategy. Let us construct the linear program for deciding whether C is strictly dominated: The program maximizes y under the following constraints: 3xA + 0xB + xC ≥ 1 + y Row’s payoff against X 0xA + 3xB + xC ≥ 1 + y Row’s payoff against Y xA , xB , xC ≥ 0 xA + xB + xC = 1 x’s must make a distribution y ≥ 0 88 IESDS in Mixed Strategie – Example X Y A 3 0 B 0 3 C 1 1 Let us have a look at the first iteration of IESDS. Observe that A, B are not strictly dominated by any mixed strategy. Let us construct the linear program for deciding whether C is strictly dominated: The program maximizes y under the following constraints: 3xA + 0xB + xC ≥ 1 + y Row’s payoff against X 0xA + 3xB + xC ≥ 1 + y Row’s payoff against Y xA , xB , xC ≥ 0 xA + xB + xC = 1 x’s must make a distribution y ≥ 0 The maximum y = 1 2 is attained at xA = 1 2 and xB = 1 2 . 88 Best Response Definition 33 A strategy σi ∈ Σi of player i is a best response to a strategy profile σ−i ∈ Σ−i of his opponents if ui(σi, σ−i) ≥ ui(σi , σ−i) for all σi ∈ Σi We denote by BRi(σ−i) ⊆ Σi the set of all best responses of player i to the strategy profile of opponents σ−i ∈ Σ−i. 89 Best Response – Example Consider a game with the following payoffs of player 1: X Y A 2 0 B 0 2 C 1 1 Player 1 (row) plays σ1 = (a(A), b(B), c(C)). Player 2 (column) plays (q(X), (1 − q)(Y)) (we write just q). Compute BR1(q). 90 Rationalizability in Mixed Strategies (Two Players) For simplicity, we temporarily switch to two-player setting N = {1, 2}. 91 Rationalizability in Mixed Strategies (Two Players) For simplicity, we temporarily switch to two-player setting N = {1, 2}. Definition 34 A (mixed) belief of player i ∈ {1, 2} is a mixed strategy σ−i of his opponent. (A general definition works with so called correlated beliefs that are arbitrary distributions on S−i, the notion of the expected payoff needs to be adjusted, we are not going in this direction ....) 91 Rationalizability in Mixed Strategies (Two Players) For simplicity, we temporarily switch to two-player setting N = {1, 2}. Definition 34 A (mixed) belief of player i ∈ {1, 2} is a mixed strategy σ−i of his opponent. (A general definition works with so called correlated beliefs that are arbitrary distributions on S−i, the notion of the expected payoff needs to be adjusted, we are not going in this direction ....) Assumption: Any rational player with a belief σ−i always plays a best response to σ−i. 91 Rationalizability in Mixed Strategies (Two Players) For simplicity, we temporarily switch to two-player setting N = {1, 2}. Definition 34 A (mixed) belief of player i ∈ {1, 2} is a mixed strategy σ−i of his opponent. (A general definition works with so called correlated beliefs that are arbitrary distributions on S−i, the notion of the expected payoff needs to be adjusted, we are not going in this direction ....) Assumption: Any rational player with a belief σ−i always plays a best response to σ−i. Definition 35 A strategy σi ∈ Σi of player i ∈ {1, 2} is never best response if it is not a best response to any belief σ−i. No rational player plays a strategy that is never best response. 91 Rationalizability in Mixed Strategies (Two Players) Define a sequence R0 i , R1 i , R2 i , . . . of strategy sets of player i. (Denote by Gk Rat the game obtained from G by restricting the pure strategy sets to Rk i , i ∈ N.) 1. Initialize k = 0 and R0 i = Si for each i ∈ N. 2. For all players i ∈ N: Let Rk+1 i be the set of all strategies of Rk i that are best responses to some (mixed) beliefs in Gk Rat . 3. Let k := k + 1 and go to 2. We say that si ∈ Si is rationalizable if si ∈ Rk i for all k = 0, 1, 2, . . . Definition 36 A strategy profile s = (s1, . . . , sn) ∈ S is a rationalizable equilibrium if each si is rationalizable. 92 Rationalizability vs IESDS (Two Players) X Y A 3 0 B 0 3 C 1 1 Player 1 (row) plays σ1 = (a(A), b(B), c(C)) player 2 (column) plays (q(X), (1 − q)(Y)) (we write just q) 93 Rationalizability vs IESDS (Two Players) X Y A 3 0 B 0 3 C 1 1 Player 1 (row) plays σ1 = (a(A), b(B), c(C)) player 2 (column) plays (q(X), (1 − q)(Y)) (we write just q) What strategies of player 1 are never best responses? 93 Rationalizability vs IESDS (Two Players) X Y A 3 0 B 0 3 C 1 1 Player 1 (row) plays σ1 = (a(A), b(B), c(C)) player 2 (column) plays (q(X), (1 − q)(Y)) (we write just q) What strategies of player 1 are never best responses? What strategies of player 1 are strictly dominated? 93 Rationalizability vs IESDS (Two Players) X Y A 3 0 B 0 3 C 1 1 Player 1 (row) plays σ1 = (a(A), b(B), c(C)) player 2 (column) plays (q(X), (1 − q)(Y)) (we write just q) What strategies of player 1 are never best responses? What strategies of player 1 are strictly dominated? Observation: The set of strictly dominated strategies coincides with the set of never best responses! 93 Rationalizability vs IESDS (Two Players) X Y A 3 0 B 0 3 C 1 1 Player 1 (row) plays σ1 = (a(A), b(B), c(C)) player 2 (column) plays (q(X), (1 − q)(Y)) (we write just q) What strategies of player 1 are never best responses? What strategies of player 1 are strictly dominated? Observation: The set of strictly dominated strategies coincides with the set of never best responses! ... and this holds in general for two player games: Theorem 37 Assume N = {1, 2}. A pure strategy si is never best response to any belief σ−i ∈ Σ−i iff si is strictly dominated by a strategy σi ∈ Σi. It follows that a strategy of Si survives IESDS iff it is rationalizable. (The theorem is true also for an arbitrary number of players but correlated beliefs need to be used.) 93 Mixed Nash Equilibrium Definition 38 A mixed-strategy profile σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a (mixed) Nash equilibrium if σ∗ i is a best response to σ∗ −i for each i ∈ N, that is ui(σ∗ i , σ∗ −i) ≥ ui(σi, σ∗ −i) for all σi ∈ Σi and all i ∈ N An interpretation: each σ∗ −i can be seen as a belief of player i against which he plays a best response σ∗ i . 94 Mixed Nash Equilibrium Definition 38 A mixed-strategy profile σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a (mixed) Nash equilibrium if σ∗ i is a best response to σ∗ −i for each i ∈ N, that is ui(σ∗ i , σ∗ −i) ≥ ui(σi, σ∗ −i) for all σi ∈ Σi and all i ∈ N An interpretation: each σ∗ −i can be seen as a belief of player i against which he plays a best response σ∗ i . Given a mixed strategy profile of opponents σ−i ∈ Σ−i, we denote by BRi(σ−i) the set of all σi ∈ Σi that are best responses to σ−i. 94 Mixed Nash Equilibrium Definition 38 A mixed-strategy profile σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a (mixed) Nash equilibrium if σ∗ i is a best response to σ∗ −i for each i ∈ N, that is ui(σ∗ i , σ∗ −i) ≥ ui(σi, σ∗ −i) for all σi ∈ Σi and all i ∈ N An interpretation: each σ∗ −i can be seen as a belief of player i against which he plays a best response σ∗ i . Given a mixed strategy profile of opponents σ−i ∈ Σ−i, we denote by BRi(σ−i) the set of all σi ∈ Σi that are best responses to σ−i. Then σ∗ is a Nash equilibrium iff σ∗ i ∈ BRi(σ∗ −i ) for all i ∈ N. 94 Mixed Nash Equilibrium Definition 38 A mixed-strategy profile σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a (mixed) Nash equilibrium if σ∗ i is a best response to σ∗ −i for each i ∈ N, that is ui(σ∗ i , σ∗ −i) ≥ ui(σi, σ∗ −i) for all σi ∈ Σi and all i ∈ N An interpretation: each σ∗ −i can be seen as a belief of player i against which he plays a best response σ∗ i . Given a mixed strategy profile of opponents σ−i ∈ Σ−i, we denote by BRi(σ−i) the set of all σi ∈ Σi that are best responses to σ−i. Then σ∗ is a Nash equilibrium iff σ∗ i ∈ BRi(σ∗ −i ) for all i ∈ N. Theorem 39 (Nash 1950) Every finite game in strategic form has a Nash equilibrium. This is THE fundamental theorem of game theory. 94 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. 95 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. What are the expected payoffs of playing pure strategies for player 1? u1(H, q) = 2q − 1 and u1(T, q) = 1 − 2q 95 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. What are the expected payoffs of playing pure strategies for player 1? u1(H, q) = 2q − 1 and u1(T, q) = 1 − 2q Then u1(p, q) = pu1(H, q) + (1 − p)u1(T, q) = p(2q − 1) + (1 − p)(1 − 2q). 95 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. What are the expected payoffs of playing pure strategies for player 1? u1(H, q) = 2q − 1 and u1(T, q) = 1 − 2q Then u1(p, q) = pu1(H, q) + (1 − p)u1(T, q) = p(2q − 1) + (1 − p)(1 − 2q). We obtain the best-response correspondence BR1: BR1(q) =    p = 0 if q < 1 2 p ∈ [0, 1] if q = 1 2 p = 1 if q > 1 2 95 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. Similarly for player 2 : u2(p, H) = 1 − 2p and u2(p, T) = 2p − 1 96 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. Similarly for player 2 : u2(p, H) = 1 − 2p and u2(p, T) = 2p − 1 u2(p, q) = qu2(p, H) + (1 − q)u2(p, T) = q(1 − 2p) + (1 − q)(2p − 1) 96 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. Similarly for player 2 : u2(p, H) = 1 − 2p and u2(p, T) = 2p − 1 u2(p, q) = qu2(p, H) + (1 − q)u2(p, T) = q(1 − 2p) + (1 − q)(2p − 1) We obtain best-response relation BR2: BR2(p) =    q = 1 if p < 1 2 q ∈ [0, 1] if p = 1 2 q = 0 if p > 1 2 96 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. Similarly for player 2 : u2(p, H) = 1 − 2p and u2(p, T) = 2p − 1 u2(p, q) = qu2(p, H) + (1 − q)u2(p, T) = q(1 − 2p) + (1 − q)(2p − 1) We obtain best-response relation BR2: BR2(p) =    q = 1 if p < 1 2 q ∈ [0, 1] if p = 1 2 q = 0 if p > 1 2 The only "intersection" of BR1 and BR2 is the only Nash equilibrium σ1 = σ2 = (1 2 , 1 2 ). 96 Static Games of Complete Information Mixed Strategies Computing Nash Equilibria – Support Enumeration 97 Computing Mixed Nash Equilibria Lemma 40 σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds: For all i ∈ N and all si ∈ supp(σ∗ i ) we have ui(si, σ∗ −i ) = wi. For all i ∈ N and all si supp(σ∗ i ) we have ui(si, σ∗ −i ) ≤ wi. Here, the right hand side implies ui(σ∗ ) = wi. Proof. The fact that the right hand side implies ui(σ∗ ) = wi follows immediately from Lemma 23: ui(σ∗ ) = si ∈Si σ∗ (si)ui(si, σ∗ −i) = si ∈supp(σ∗ i ) σ∗ (si)ui(si, σ∗ −i) = si ∈supp(σ∗ i ) σ∗ (si)wi = wi si ∈supp(σ∗ i ) σ∗ (si) = wi 98 Computing Mixed Nash Equilibria Lemma 41 σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds: For all i ∈ N and all si ∈ supp(σ∗ i ) we have ui(si, σ∗ −i ) = wi. For all i ∈ N and all si supp(σ∗ i ) we have ui(si, σ∗ −i ) ≤ wi. Here, the right hand side implies ui(σ∗ ) = wi. Proof. (Cont.) "⇐": Use the first equality of Lemma 23 to obtain for every i ∈ N and every σi ∈ Σi ui(σi , σ∗ −i) = si ∈Si σi (si)ui(si, σ∗ −i) ≤ ≤ si ∈Si σi (si)wi = si ∈Si σi (si)ui(σ∗ ) = ui(σ∗ ) Thus σ∗ is a Nash equilibrium. 99 Computing Mixed Nash Equilibria Lemma 42 σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds: For all i ∈ N and all si ∈ supp(σ∗ i ) we have ui(si, σ∗ −i ) = wi. For all i ∈ N and all si supp(σ∗ i ) we have ui(si, σ∗ −i ) ≤ wi. Here, the right hand side implies ui(σ∗ ) = wi. Proof (Cont.) Idea for "⇒": Let wi := ui(σ∗ ). 100 Computing Mixed Nash Equilibria Lemma 42 σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds: For all i ∈ N and all si ∈ supp(σ∗ i ) we have ui(si, σ∗ −i ) = wi. For all i ∈ N and all si supp(σ∗ i ) we have ui(si, σ∗ −i ) ≤ wi. Here, the right hand side implies ui(σ∗ ) = wi. Proof (Cont.) Idea for "⇒": Let wi := ui(σ∗ ). Clearly, every i ∈ N and si ∈ Si satisfy ui(si, σ∗ −i ) ≤ ui(σ∗ ) = wi. 100 Computing Mixed Nash Equilibria Lemma 42 σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds: For all i ∈ N and all si ∈ supp(σ∗ i ) we have ui(si, σ∗ −i ) = wi. For all i ∈ N and all si supp(σ∗ i ) we have ui(si, σ∗ −i ) ≤ wi. Here, the right hand side implies ui(σ∗ ) = wi. Proof (Cont.) Idea for "⇒": Let wi := ui(σ∗ ). Clearly, every i ∈ N and si ∈ Si satisfy ui(si, σ∗ −i ) ≤ ui(σ∗ ) = wi. By Corollary 24, there is at least one si ∈ supp(σ∗ i ) satisfying ui(si, σ∗ −i ) = ui(σ∗ ) = wi. 100 Computing Mixed Nash Equilibria Lemma 42 σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds: For all i ∈ N and all si ∈ supp(σ∗ i ) we have ui(si, σ∗ −i ) = wi. For all i ∈ N and all si supp(σ∗ i ) we have ui(si, σ∗ −i ) ≤ wi. Here, the right hand side implies ui(σ∗ ) = wi. Proof (Cont.) Idea for "⇒": Let wi := ui(σ∗ ). Clearly, every i ∈ N and si ∈ Si satisfy ui(si, σ∗ −i ) ≤ ui(σ∗ ) = wi. By Corollary 24, there is at least one si ∈ supp(σ∗ i ) satisfying ui(si, σ∗ −i ) = ui(σ∗ ) = wi. Now if there is si ∈ supp(σ∗ i ) such that ui(si , σ∗ −i) < ui(σ∗ ) (= ui(si, σ∗ −i)) then increasing the probability σ∗ i (si) and decreasing (in proportion) σ∗ i (si ) strictly increases of ui(σ∗ ), a contradiction with σ∗ being NE. 100 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. 101 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. There are no pure strategy equilibria. 101 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. There are no pure strategy equilibria. There are no equilibria where only player 1 randomizes: 101 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. There are no pure strategy equilibria. There are no equilibria where only player 1 randomizes: Indeed, assume that (p, H) is such an equilibrium. Then by Lemma 42, 1 = u1(H, H) = u1(T, H) = −1 a contradiction. Also, (p, T) cannot be an equilibrium. Similarly, there is no NE where only player 2 randomizes. 101 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. Assume that both players randomize, i.e., p, q ∈ (0, 1). 102 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. Assume that both players randomize, i.e., p, q ∈ (0, 1). The expected payoffs of playing pure strategies for player 1: u1(H, q) = 2q − 1 and u1(T, q) = 1 − 2q Similarly for player 2 : u2(p, H) = 1 − 2p and u1(p, T) = 2p − 1 102 Example: Matching Pennies H T H 1, −1 −1, 1 T −1, 1 1, −1 Player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q). Compute all Nash equilibria. Assume that both players randomize, i.e., p, q ∈ (0, 1). The expected payoffs of playing pure strategies for player 1: u1(H, q) = 2q − 1 and u1(T, q) = 1 − 2q Similarly for player 2 : u2(p, H) = 1 − 2p and u1(p, T) = 2p − 1 By Lemma 42, Nash equilibria must satisfy: 2q − 1 = 1 − 2q and 1 − 2p = 2p − 1 That is p = q = 1 2 is the only Nash equilibrium. 102 Example: Battle of Sexes O F O 2, 1 0, 0 F 0, 0 1, 2 Player 1 (row) plays (p(O), (1 − p)(F)) (we write just p) and player 2 (column) plays (q(O), (1 − q)(F)) (we write q). Compute all Nash equilibria. 103 Example: Battle of Sexes O F O 2, 1 0, 0 F 0, 0 1, 2 Player 1 (row) plays (p(O), (1 − p)(F)) (we write just p) and player 2 (column) plays (q(O), (1 − q)(F)) (we write q). Compute all Nash equilibria. There are two pure strategy equilibria (2, 1) and (1, 2), no Nash equilibrium where only one player randomizes. 103 Example: Battle of Sexes O F O 2, 1 0, 0 F 0, 0 1, 2 Player 1 (row) plays (p(O), (1 − p)(F)) (we write just p) and player 2 (column) plays (q(O), (1 − q)(F)) (we write q). Compute all Nash equilibria. There are two pure strategy equilibria (2, 1) and (1, 2), no Nash equilibrium where only one player randomizes. Now assume that player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q) where p, q ∈ (0, 1). 103 Example: Battle of Sexes O F O 2, 1 0, 0 F 0, 0 1, 2 Player 1 (row) plays (p(O), (1 − p)(F)) (we write just p) and player 2 (column) plays (q(O), (1 − q)(F)) (we write q). Compute all Nash equilibria. There are two pure strategy equilibria (2, 1) and (1, 2), no Nash equilibrium where only one player randomizes. Now assume that player 1 (row) plays (p(H), (1 − p)(T)) (we write just p) and player 2 (column) plays (q(H), (1 − q)(T)) (we write q) where p, q ∈ (0, 1). By Lemma 42, any Nash equilibrium must satisfy: 2q = 1 − q and p = 2(1 − p) This holds only for q = 1 3 and p = 2 3 . 103 An Algorithm? What did we do in the previous examples? 104 An Algorithm? What did we do in the previous examples? We went through all support combinations for both players. (pure, one player mixing, both mixing) 104 An Algorithm? What did we do in the previous examples? We went through all support combinations for both players. (pure, one player mixing, both mixing) For each pair of supports we tried to find equilibria in strategies with these supports. (in Battle of Sexes: two pure, no equilibrium with just one player mixing, one equilibrium when both mixing) 104 An Algorithm? What did we do in the previous examples? We went through all support combinations for both players. (pure, one player mixing, both mixing) For each pair of supports we tried to find equilibria in strategies with these supports. (in Battle of Sexes: two pure, no equilibrium with just one player mixing, one equilibrium when both mixing) Whenever one of the supports was non-singleton, we reduced computation of Nash equilibria to linear equations. 104 Support Enumeration (Idea) Recall Lemma 42: σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds: For all i ∈ N and all si ∈ supp(σ∗ i ) we have ui(si, σ∗ −i ) = wi. For all i ∈ N and all si supp(σ∗ i ) we have ui(si, σ∗ −i ) ≤ wi. 105 Support Enumeration (Idea) Recall Lemma 42: σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds: For all i ∈ N and all si ∈ supp(σ∗ i ) we have ui(si, σ∗ −i ) = wi. For all i ∈ N and all si supp(σ∗ i ) we have ui(si, σ∗ −i ) ≤ wi. Suppose that we somehow know the supports supp(σ∗ 1 ), . . . , supp(σ∗ n) for some Nash equilibrium σ∗ 1 , . . . , σ∗ n (which itself is unknown to us). 105 Support Enumeration (Idea) Recall Lemma 42: σ∗ = (σ∗ 1 , . . . , σ∗ n) ∈ Σ is a Nash equilibrium iff there exist w1, . . . , wn ∈ R such that the following holds: For all i ∈ N and all si ∈ supp(σ∗ i ) we have ui(si, σ∗ −i ) = wi. For all i ∈ N and all si supp(σ∗ i ) we have ui(si, σ∗ −i ) ≤ wi. Suppose that we somehow know the supports supp(σ∗ 1 ), . . . , supp(σ∗ n) for some Nash equilibrium σ∗ 1 , . . . , σ∗ n (which itself is unknown to us). Now we may consider all σ∗ i (si)’s and all wi’s as variables and use the above conditions to design a system of inequalities capturing Nash equilibria with the given support sets supp(σ∗ 1 ), . . . , supp(σ∗ n). 105 Support Enumeration To simplify notation, assume that for every i we have Si = {1, . . . , mi}. Then σi(j) is the probability of the pure strategy j in the mixed strategy σi. 106 Support Enumeration To simplify notation, assume that for every i we have Si = {1, . . . , mi}. Then σi(j) is the probability of the pure strategy j in the mixed strategy σi. Fix supports suppi ⊆ Si for every i ∈ N and consider the following system of constraints with variables σ1(1), . . . , σ1(m1), . . . , σn(1), . . . , σn(mn), w1, . . . , wn: 106 Support Enumeration To simplify notation, assume that for every i we have Si = {1, . . . , mi}. Then σi(j) is the probability of the pure strategy j in the mixed strategy σi. Fix supports suppi ⊆ Si for every i ∈ N and consider the following system of constraints with variables σ1(1), . . . , σ1(m1), . . . , σn(1), . . . , σn(mn), w1, . . . , wn: 1. For all i ∈ N and all k ∈ suppi we have (ui(k, σ−i) = ) s∈S∧si =k   j i σj(sj)   ui(s) = wi 2. For all i ∈ N and all k suppi we have (ui(k, σ−i) = ) s∈S∧si =k   j i σj(sj)   ui(s) ≤ wi 3. For all i ∈ N: σi(1) + · · · + σi(mi) = 1. 4. For all i ∈ N and all k ∈ suppi: σi(k) ≥ 0. 5. For all i ∈ N and all k suppi: σi(k) = 0. 106 Support Enumeration Consider the system of constraints from the previous slide. The following lemma follows immediately from Lemma 42. Lemma 43 Let σ∗ ∈ Σ be a strategy profile. If σ∗ is a Nash equilibrium and supp(σ∗ i ) = suppi for all i ∈ N, then assigning σi(k) := σ∗ i (k) and wi := ui(σ∗ ) solves the system. If σi(k) := σ∗ i (k) and wi := ui(σ∗ ) solves the system, then σ∗ is a Nash equilibrium with supp(σ∗ i ) ⊆ suppi for all i ∈ N. 107 Support Enumeration (Two Players) The constraints are non-linear in general, but linear for two player games! Let us stick to two players. 108 Support Enumeration (Two Players) The constraints are non-linear in general, but linear for two player games! Let us stick to two players. How to find supp1 and supp2? 108 Support Enumeration (Two Players) The constraints are non-linear in general, but linear for two player games! Let us stick to two players. How to find supp1 and supp2? ... Just guess! 108 Support Enumeration (Two Players) The constraints are non-linear in general, but linear for two player games! Let us stick to two players. How to find supp1 and supp2? ... Just guess! Input: A two-player strategic-form game G with strategy sets S1 = {1, . . . , m1} and S2 = {1, . . . , m2} and rational payoffs u1, u2. Output: A Nash equilibrium σ∗ . 108 Support Enumeration (Two Players) The constraints are non-linear in general, but linear for two player games! Let us stick to two players. How to find supp1 and supp2? ... Just guess! Input: A two-player strategic-form game G with strategy sets S1 = {1, . . . , m1} and S2 = {1, . . . , m2} and rational payoffs u1, u2. Output: A Nash equilibrium σ∗ . Algorithm: For all possible supp1 ⊆ S1 and supp2 ⊆ S2: Check if the corresponding system of linear constraints (from the previous slide) has a feasible solution σ∗ , w∗ 1 , . . . , w∗ n. If so, STOP: the feasible solution σ∗ is a Nash equilibrium satisfying ui(σ∗ ) = w∗ i . 108 Support Enumeration (Two Players) The constraints are non-linear in general, but linear for two player games! Let us stick to two players. How to find supp1 and supp2? ... Just guess! Input: A two-player strategic-form game G with strategy sets S1 = {1, . . . , m1} and S2 = {1, . . . , m2} and rational payoffs u1, u2. Output: A Nash equilibrium σ∗ . Algorithm: For all possible supp1 ⊆ S1 and supp2 ⊆ S2: Check if the corresponding system of linear constraints (from the previous slide) has a feasible solution σ∗ , w∗ 1 , . . . , w∗ n. If so, STOP: the feasible solution σ∗ is a Nash equilibrium satisfying ui(σ∗ ) = w∗ i . Question: How many possible subsets supp1, supp2 are there to try? 108 Support Enumeration (Two Players) The constraints are non-linear in general, but linear for two player games! Let us stick to two players. How to find supp1 and supp2? ... Just guess! Input: A two-player strategic-form game G with strategy sets S1 = {1, . . . , m1} and S2 = {1, . . . , m2} and rational payoffs u1, u2. Output: A Nash equilibrium σ∗ . Algorithm: For all possible supp1 ⊆ S1 and supp2 ⊆ S2: Check if the corresponding system of linear constraints (from the previous slide) has a feasible solution σ∗ , w∗ 1 , . . . , w∗ n. If so, STOP: the feasible solution σ∗ is a Nash equilibrium satisfying ui(σ∗ ) = w∗ i . Question: How many possible subsets supp1, supp2 are there to try? Answer: 2(m1+m2) So, unfortunately, the algorithm requires worst-case exponential time. 108 Remarks on Support Enumeration The algorithm combined with Theorem 39 and properties of linear programming imply that every finite two-player game has a rational Nash equilibrium (furthermore, the rational numbers have polynomial representation in binary). 109 Remarks on Support Enumeration The algorithm combined with Theorem 39 and properties of linear programming imply that every finite two-player game has a rational Nash equilibrium (furthermore, the rational numbers have polynomial representation in binary). The algorithm can be used to compute all Nash equilibria. (There are algorithms for computing (a finite representation of) a set of all feasible solutions of a given linear constraint system.) 109 Remarks on Support Enumeration The algorithm combined with Theorem 39 and properties of linear programming imply that every finite two-player game has a rational Nash equilibrium (furthermore, the rational numbers have polynomial representation in binary). The algorithm can be used to compute all Nash equilibria. (There are algorithms for computing (a finite representation of) a set of all feasible solutions of a given linear constraint system.) The algorithm can be used to compute "good" equilibria. 109 Remarks on Support Enumeration The algorithm combined with Theorem 39 and properties of linear programming imply that every finite two-player game has a rational Nash equilibrium (furthermore, the rational numbers have polynomial representation in binary). The algorithm can be used to compute all Nash equilibria. (There are algorithms for computing (a finite representation of) a set of all feasible solutions of a given linear constraint system.) The algorithm can be used to compute "good" equilibria. For example, to find a Nash equilibrium maximizing the sum of all expected payoffs (the "social welfare") it suffices to solve the system of constraints while maximizing w1 + · · · + wn. More precisely, the algorithm can be modified as follows: Initialize W := −∞ (W stores the current maximum welfare) For all possible supp1 ⊆ S1 and supp2 ⊆ S2: Find the maximum value max( wi) of w1 + · · · + wn so that the constraints are satisfiable (using linear programming). Put W := max{W, max( wi)}. Return W. 109 Remarks on Support Enumeration (Cont.) Similar trick works for any notion of "good" NE that can be expressed using a linear objective function and (additional) linear constraints in variables σi(j) and wi. (e.g., maximize payoff of player 1, minimize payoff of player 2 and keep probability of playing the strategy 1 below 1/2, etc.) 110 Complexity Results – (Two Players) Theorem 44 All the following problems are NP-complete: Given a two-player game in strategic form, does it have 1. a NE in which player 1 has utility at least a given amount v ? 2. a NE in which the sum of expected payoffs of the two players is at least a given amount v ? 3. a NE with a support of size greater than a given number? 4. a NE whose support contains a given strategy s ? 5. a NE whose support does not contain a given strategy s ? 6. .... 111 Complexity Results – (Two Players) Theorem 44 All the following problems are NP-complete: Given a two-player game in strategic form, does it have 1. a NE in which player 1 has utility at least a given amount v ? 2. a NE in which the sum of expected payoffs of the two players is at least a given amount v ? 3. a NE with a support of size greater than a given number? 4. a NE whose support contains a given strategy s ? 5. a NE whose support does not contain a given strategy s ? 6. .... Membership to NP follows from the support enumeration: For example, for 1., it suffices to guess supports supp1, supp2 and add w1 ≥ v to the constraints; the resulting NE σ∗ satisfies u1(σ∗ ) ≥ v. 111 Complexity Results (Two Players) Theorem 45 All the following problems are NP-complete: Given a two-player game in strategic form, does it have 1. a NE in which player 1 has utility at least a given amount v ? 2. a NE in which the sum of expected payoffs of the two players is at least a given amount v ? 3. a NE with a support of size greater than a given number? 4. a NE whose support contains a given strategy s ? 5. a NE whose support does not contain a given strategy s ? 6. .... 112 Complexity Results (Two Players) Theorem 45 All the following problems are NP-complete: Given a two-player game in strategic form, does it have 1. a NE in which player 1 has utility at least a given amount v ? 2. a NE in which the sum of expected payoffs of the two players is at least a given amount v ? 3. a NE with a support of size greater than a given number? 4. a NE whose support contains a given strategy s ? 5. a NE whose support does not contain a given strategy s ? 6. .... NP-hardness can be proved using reduction from SAT (The reduction is not difficult but we are not going into it. It is presented in "New Complexity Results about Nash Equilibria" by V. Conitzer and T. Sandholm (pages 6–8) ) 112 The Reduction (It’s Short and Sweet) 113 ... But What is The Exact Complexity of Computing Nash Equilibria in Two Player Games? Let us concentrate on the problem of computing one Nash equilibrium (sometimes called the sample equilibrium problem). 114 ... But What is The Exact Complexity of Computing Nash Equilibria in Two Player Games? Let us concentrate on the problem of computing one Nash equilibrium (sometimes called the sample equilibrium problem). As the class NP consists of decision problems, it cannot be directly used to characterize complexity of the sample equilibrium problem. 114 ... But What is The Exact Complexity of Computing Nash Equilibria in Two Player Games? Let us concentrate on the problem of computing one Nash equilibrium (sometimes called the sample equilibrium problem). As the class NP consists of decision problems, it cannot be directly used to characterize complexity of the sample equilibrium problem. We use complexity classes of function problems such as FP, FNP, etc. 114 ... But What is The Exact Complexity of Computing Nash Equilibria in Two Player Games? Let us concentrate on the problem of computing one Nash equilibrium (sometimes called the sample equilibrium problem). As the class NP consists of decision problems, it cannot be directly used to characterize complexity of the sample equilibrium problem. We use complexity classes of function problems such as FP, FNP, etc. The support enumeration gives a deterministic algorithm which runs in exponential time. Can we do better? 114 ... But What is The Exact Complexity of Computing Nash Equilibria in Two Player Games? Let us concentrate on the problem of computing one Nash equilibrium (sometimes called the sample equilibrium problem). As the class NP consists of decision problems, it cannot be directly used to characterize complexity of the sample equilibrium problem. We use complexity classes of function problems such as FP, FNP, etc. The support enumeration gives a deterministic algorithm which runs in exponential time. Can we do better? In what follows we show that the sample equilibrium problem can be solved in polynomial time for zero-sum two-player games, (Using a beautiful characterization of all Nash equilibria) 114 ... But What is The Exact Complexity of Computing Nash Equilibria in Two Player Games? Let us concentrate on the problem of computing one Nash equilibrium (sometimes called the sample equilibrium problem). As the class NP consists of decision problems, it cannot be directly used to characterize complexity of the sample equilibrium problem. We use complexity classes of function problems such as FP, FNP, etc. The support enumeration gives a deterministic algorithm which runs in exponential time. Can we do better? In what follows we show that the sample equilibrium problem can be solved in polynomial time for zero-sum two-player games, (Using a beautiful characterization of all Nash equilibria) the sample equilibrium problem belongs to the complexity class PPAD (which is a subclass of FNP) for two-player games. (... to be defined later) 114 MaxMin Is there a better characterization of Nash equilibria than Lemma 42 ? 115 MaxMin Is there a better characterization of Nash equilibria than Lemma 42 ? Definition 46 σ∗ i ∈ Σi is a maxmin strategy of player i if σ∗ i ∈ argmax σi ∈Σi min σ−i ∈Σ−i ui(σi, σ−i) (Intuitively, a maxmin strategy σ∗ 1 maximizes player 1’s worst-case payoff in the situation where player 2 strives to cause the greatest harm to player 1.) (Since ui is continuous and Σ−i compact, minσ−i ∈Σ−i ui(σi, σ−i) is well defined and continuous on Σi, which implies that there is at least one maxmin strategy.) 115 MaxMin Lemma 47 σ∗ i is maxmin iff σ∗ i ∈ argmax σi ∈Σi min s−i ∈S−i ui(σi, s−i) 116 MaxMin Lemma 47 σ∗ i is maxmin iff σ∗ i ∈ argmax σi ∈Σi min s−i ∈S−i ui(σi, s−i) Proof. By Corollary 24, for every σ ∈ Σ we have ui(σi, σ−i) ≥ ui(σi, s−i) for some s−i ∈ S−i. 116 MaxMin Lemma 47 σ∗ i is maxmin iff σ∗ i ∈ argmax σi ∈Σi min s−i ∈S−i ui(σi, s−i) Proof. By Corollary 24, for every σ ∈ Σ we have ui(σi, σ−i) ≥ ui(σi, s−i) for some s−i ∈ S−i. Thus minσ−i ∈Σ−i ui(σi, σ−i) = mins−i ∈S−i ui(σi, s−i). 116 MaxMin Lemma 47 σ∗ i is maxmin iff σ∗ i ∈ argmax σi ∈Σi min s−i ∈S−i ui(σi, s−i) Proof. By Corollary 24, for every σ ∈ Σ we have ui(σi, σ−i) ≥ ui(σi, s−i) for some s−i ∈ S−i. Thus minσ−i ∈Σ−i ui(σi, σ−i) = mins−i ∈S−i ui(σi, s−i). Hence, argmax σi ∈Σi min σ−i ∈Σ−i ui(σi, σ−i) = argmax σi ∈Σi min s−i ∈S−i ui(σi, s−i) 116 MaxMin Lemma 47 σ∗ i is maxmin iff σ∗ i ∈ argmax σi ∈Σi min s−i ∈S−i ui(σi, s−i) Proof. By Corollary 24, for every σ ∈ Σ we have ui(σi, σ−i) ≥ ui(σi, s−i) for some s−i ∈ S−i. Thus minσ−i ∈Σ−i ui(σi, σ−i) = mins−i ∈S−i ui(σi, s−i). Hence, argmax σi ∈Σi min σ−i ∈Σ−i ui(σi, σ−i) = argmax σi ∈Σi min s−i ∈S−i ui(σi, s−i) Question: Assume a strategy profile where both players play their maxmin strategies? Does it have to be a Nash equilibrium? 116 Zero-Sum Games: von Neumann’s Theorem Assume that G is zero sum, i.e., u1 = −u2. Then σ∗ 2 ∈ Σ2 is maxmin of player 2 iff σ∗ 2 ∈ argmin σ2∈Σ2 max σ1∈Σ1 u1(σ1, σ2) (= argmin σ2∈Σ2 max s1∈S1 u1(s1, σ2)) (Intuitively, maxmin of player 2 minimizes the payoff of player 1 when player 1 plays his best responses. Such strategy of player 2 is often called minmax.) 117 Zero-Sum Games: von Neumann’s Theorem Assume that G is zero sum, i.e., u1 = −u2. Then σ∗ 2 ∈ Σ2 is maxmin of player 2 iff σ∗ 2 ∈ argmin σ2∈Σ2 max σ1∈Σ1 u1(σ1, σ2) (= argmin σ2∈Σ2 max s1∈S1 u1(s1, σ2)) (Intuitively, maxmin of player 2 minimizes the payoff of player 1 when player 1 plays his best responses. Such strategy of player 2 is often called minmax.) Theorem 48 (von Neumann) Assume a two-player zero-sum game. Then max σ1∈Σ1 min σ2∈Σ2 u1(σ1, σ2) = min σ2∈Σ2 max σ1∈Σ1 u1(σ1, σ2) Morever, σ∗ = (σ∗ 1 , σ∗ 2 ) ∈ Σ is a Nash equilibrium iff both σ∗ 1 and σ∗ 2 are maxmin. 117 Zero-Sum Games: von Neumann’s Theorem Assume that G is zero sum, i.e., u1 = −u2. Then σ∗ 2 ∈ Σ2 is maxmin of player 2 iff σ∗ 2 ∈ argmin σ2∈Σ2 max σ1∈Σ1 u1(σ1, σ2) (= argmin σ2∈Σ2 max s1∈S1 u1(s1, σ2)) (Intuitively, maxmin of player 2 minimizes the payoff of player 1 when player 1 plays his best responses. Such strategy of player 2 is often called minmax.) Theorem 48 (von Neumann) Assume a two-player zero-sum game. Then max σ1∈Σ1 min σ2∈Σ2 u1(σ1, σ2) = min σ2∈Σ2 max σ1∈Σ1 u1(σ1, σ2) Morever, σ∗ = (σ∗ 1 , σ∗ 2 ) ∈ Σ is a Nash equilibrium iff both σ∗ 1 and σ∗ 2 are maxmin. So to compute a Nash equilibrium it suffices to compute (arbitrary) maxmin strategies for both players. 117 Proof of Theorem 48 (Homework) Homework: Prove von Neumann’s Theorem in 4 easy steps: 1. Prove this inequality: max σ1∈Σ1 min σ2∈Σ2 u1(σ1, σ2) ≤ min σ2∈Σ2 max σ1∈Σ1 u1(σ1, σ2) 2. Prove that (σ∗ 1 , σ∗ 2 ) is a Nash equilibrium iff min σ2∈Σ2 u1(σ∗ 1, σ2) ≥ u1(σ∗ 1, σ∗ 2) ≥ max σ1∈Σ1 u1(σ1, σ∗ 2) Hint: One of the inequalities is trivial and the other one almost. 3. Use 1. and 2. together with Theorem 39 to prove max σ1∈Σ1 min σ2∈Σ2 u1(σ1, σ2) ≥ min σ2∈Σ2 max σ1∈Σ1 u1(σ1, σ2) 4. Use the above to prove the rest of the theorem. Hint: Use the characterization of NE from 2., do not forget that you already have maxσ1∈Σ1 minσ2∈Σ2 u1(σ1, σ2) = minσ2∈Σ2 maxσ1∈Σ1 u1(σ1, σ2) You may already have proved one of the implications when proving 3. 118 Zero-Sum Two-Player Games – Computing NE Assume S1 = {1, . . . , m1} and S2 = {1, . . . , m2}. 119 Zero-Sum Two-Player Games – Computing NE Assume S1 = {1, . . . , m1} and S2 = {1, . . . , m2}. We want to compute σ∗ 1 ∈ argmax σ1∈Σ1 min ∈S2 u1(σ1, ) 119 Zero-Sum Two-Player Games – Computing NE Assume S1 = {1, . . . , m1} and S2 = {1, . . . , m2}. We want to compute σ∗ 1 ∈ argmax σ1∈Σ1 min ∈S2 u1(σ1, ) Consider a linear program with variables σ1(1), . . . , σ1(m1), v: maximize: v subject to: m1 k=1 σ1(k) · u1(k, ) ≥ v = 1, . . . , m2 m1 k=1 σ1(k) = 1 σ1(k) ≥ 0 k = 1, . . . , m1 119 Zero-Sum Two-Player Games – Computing NE Assume S1 = {1, . . . , m1} and S2 = {1, . . . , m2}. We want to compute σ∗ 1 ∈ argmax σ1∈Σ1 min ∈S2 u1(σ1, ) Consider a linear program with variables σ1(1), . . . , σ1(m1), v: maximize: v subject to: m1 k=1 σ1(k) · u1(k, ) ≥ v = 1, . . . , m2 m1 k=1 σ1(k) = 1 σ1(k) ≥ 0 k = 1, . . . , m1 Lemma 49 σ∗ 1 ∈ argmaxσ1∈Σ1 min ∈S2 u1(σ1, ) iff assigning σ1(k) := σ∗ 1 (k) and v := min ∈S2 u1(σ∗ 1 , ) gives an optimal solution. 119 Zero-Sum Two-Player Games – Computing NE Summary: We have reduced computation of NE to computation of maxmin strategies for both players. Maxmin strategies can be computed using linear programming in polynomial time. That is, Nash equilibria in zero-sum two-player games can be computed in polynomial time. 120