IA168 Algorithmic Game Theory
Tomáš Brázdil

Organization of This Course

Sources:
- Lectures (slides, notes), based on several sources. Slides are prepared for the lectures; some material appears only on the greenboard (⇒ attend the lectures).
- Books:
  Nisan/Roughgarden/Tardos/Vazirani, Algorithmic Game Theory, Cambridge University Press, 2007. Available online for free: http://www.cambridge.org/journals/nisan/downloads/Nisan_Non-printable.pdf
  Tadelis, Game Theory: An Introduction, Princeton University Press, 2013.
(I use various resources, so please attend the lectures.)

Evaluation

- Oral exam
- Homework (occasionally)

What is Algorithmic Game Theory?

First, what is game theory?

According to the Oxford dictionary, it is "the branch of mathematics concerned with the analysis of strategies for dealing with competitive situations where the outcome of a participant’s choice of action depends critically on the actions of other participants".

According to Myerson, it is "the study of mathematical models of conflict and cooperation between intelligent rational decision-makers".

What does "algorithmic" mean? It means that we are "concerned with the computational questions that arise in game theory, and that enlighten game theory. In particular, questions about finding efficient algorithms to ‘solve’ games."

Let’s have a look at some examples.

Prisoner’s Dilemma

Two suspects of a serious crime are arrested and imprisoned. The police have enough evidence only for petty theft, and to nail the suspects for the serious crime they need testimony from at least one of them.

The suspects are interrogated separately, without any possibility of communication. Each suspect is offered a deal: if he confesses (C) to the crime, he is free to go. The alternative is not to confess, that is, to remain silent (S). The sentence depends on the behavior of both suspects.

The problem: What would the suspects do?

Prisoner’s Dilemma – Solution(?)
         C          S
C     −5, −5     0, −20
S    −20, 0     −1, −1

A rational "row" suspect (or his adviser) may reason as follows:
- If my colleague chooses C, then playing C gives me −5 and playing S gives −20.
- If my colleague chooses S, then playing C gives me 0 and playing S gives −1.
In both cases C is clearly better (it strictly dominates the other strategy). If the other suspect reasons in the same way, both choose C and each gets a 5-year sentence.

Where is the dilemma? There is a solution (S, S) which is better for both players but needs some “central” authority to control the players.

Are there always “dominant” strategies?

Nash equilibria – Battle of Sexes

A couple agreed to meet this evening, but cannot recall whether they will be attending the opera or a football match.
- The husband would like to go to the football game.
- The wife would like to go to the opera.
- Both would prefer to go to the same place rather than to different ones.
If they cannot communicate, where should they go?

Nash equilibria – Battle of Sexes

Battle of Sexes can be modeled as a game of two players (Wife as the row player, Husband as the column player) with the following payoffs:

        O        F
O     2, 1     0, 0
F     0, 0     1, 2

Apparently, no strategy of any player is dominant. A “solution”?

Note that whenever both players play O, neither of them wants to unilaterally deviate from his strategy! (O, O) is an example of a Nash equilibrium (as is (F, F)).

Mixed Equilibria – Rock-Paper-Scissors

        R         P         S
R     0, 0     −1, 1     1, −1
P     1, −1     0, 0    −1, 1
S    −1, 1     1, −1     0, 0

This is an example of a zero-sum game: whatever one of the players wins, the other one loses.

What is an optimal behavior here? Is there a Nash equilibrium?

Use mixed strategies: each player plays each pure strategy with probability 1/3. The expected payoff of each player is 0 (even if one of the players changes his strategy, he still gets 0!).

How to algorithmically solve games in mixed strategies?
(We shall use probability theory and linear programming.)

Philosophical Issues in Games

Games of Incomplete Information

In all previous games the players knew all details of the game they played, and this fact was “common knowledge”. This is not always the case.

Example: Sealed Bid Auction
Two bidders are trying to purchase the same item. The bidders simultaneously submit bids b1 and b2, and the item is sold to the highest bidder at his bid price (first price auction).

The payoff of player 1 (and similarly for player 2) is calculated by

\[
u_1(b_1, b_2) =
\begin{cases}
v_1 - b_1 & \text{if } b_1 > b_2 \\
\frac{1}{2}(v_1 - b_1) & \text{if } b_1 = b_2 \\
0 & \text{if } b_1 < b_2
\end{cases}
\]

Here v1 is the private value that player 1 assigns to the item, and so player 2 does not know u1.

How to deal with such a game? Assume the “worst” private value? What if we have partial knowledge about the private values?

Mechanism Design

Suppose you are the game designer. How would you design the game so that the “solutions” satisfy certain “global objectives”?

Examples:
- Sealed Bid Auctions: How would you design auction rules so that, for every bidder, bidding the private value is a dominant strategy?
- How would you design protocols (such as network protocols) to encourage “cooperation” (e.g., diminish congestion)?
This is an extremely hot topic of current research!

Inefficiency of Equilibria

In the Prisoner’s Dilemma, the selfish behavior of the suspects (the Nash equilibrium) results in a somewhat worse than ideal situation.

         C          S
C     −5, −5     0, −20
S    −20, 0     −1, −1

Defining a welfare function W which assigns to every pair of strategies the sum of payoffs, we get W(C, C) = −10 but W(S, S) = −2.

The ratio W(C, C)/W(S, S) = 5 measures the inefficiency of the "selfish behavior" (C, C) w.r.t. the optimal “centralized” solution.

Price of Anarchy is the maximum ratio between values of equilibria and the value of an optimal solution.
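
The welfare comparison for the Prisoner’s Dilemma can be checked numerically. The following is only a minimal sketch in Python, assuming the payoff table from the slide; the names `PAYOFFS`, `welfare`, `best_profile`, and `ratio` are illustrative and not part of the course material.

```python
from fractions import Fraction

# Payoffs (row player, column player) of the Prisoner's Dilemma from the slide.
PAYOFFS = {
    ("C", "C"): (-5, -5),
    ("C", "S"): (0, -20),
    ("S", "C"): (-20, 0),
    ("S", "S"): (-1, -1),
}


def welfare(profile):
    """Welfare W of a strategy profile: the sum of both players' payoffs."""
    return sum(PAYOFFS[profile])


# Profile with the highest welfare, i.e. the "centralized" optimum.
best_profile = max(PAYOFFS, key=welfare)

# Ratio between the welfare of the selfish outcome (C, C) and the optimum.
ratio = Fraction(welfare(("C", "C")), welfare(best_profile))

print(best_profile, welfare(("C", "C")), welfare(best_profile), ratio)
```

Here the optimum is (S, S) and the ratio evaluates to 5, matching the value on the slide. Because (C, C) is the only equilibrium considered in this example, taking the maximum over equilibria is trivial.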
Inefficiency of Equilibria – Selfish Routing

Consider a transportation system where many agents are trying to get from some initial location to a destination. Consider the welfare to be the average time for an agent to reach the destination. There are two versions:
- “Centralized”: A central authority tells each agent where to go.
- “Decentralized”: Each agent selfishly minimizes his travel time.
Price of Anarchy measures the ratio between the average travel times in these two cases.

Problem: Bound the price of anarchy over all routing games.

Dynamic Games

So far we have seen games in strategic form, which are unable to capture games that unfold over time (such as chess). For this purpose we need to use extensive form games:

P1
  A → P2
        C → (1, 2)
        D → (1, −1)
        E → (0, 2)
  B → P2
        F → (2, 2)
        G → (1, 3)

How to "solve" such games? What is their relationship to the strategic form games?

Chance and Imperfect Information

Some decisions in the game tree may be made by chance and controlled by neither player (e.g. Poker, Backgammon, etc.).

Sometimes a player may not be able to distinguish between several “positions” because he does not know all the information in them (think of a card game with the opponent’s cards hidden).

[Figure: game tree with decision nodes of P1, P2 and Nature; edge labels A, B, D, E, F, G, H, I, J and 1/2; leaf payoffs (a, b), (c, d), (e, f), (g, h), (i, j), (k, ), (m, n).]

Again, how to solve such games?

Games in Computer Science

Game theory is a core foundation of mathematical economics. But what does it have to do with CS?
- Games in AI: modeling of “rational” agents and their interactions.
- Games in Algorithms: several game theoretic problems have a very interesting algorithmic status and are solved by interesting algorithms.
- Games in modeling and analysis of reactive systems: program inputs viewed “adversarially”, bisimulation games, etc.
- Games in computational complexity: many complexity classes are definable in terms of games: PSPACE, polynomial hierarchy, etc.
- Games in Logic: modal and temporal logics, Ehrenfeucht-Fraisse games, etc.
Games in Computer Science

Games, the Internet and E-commerce: an extremely active research area at the intersection of CS and Economics.

Basic idea: “The internet is a HUGE experiment in interaction between agents (both human and automated)”.

How do we set up the rules of this game to harness “socially optimal” results?

Summary and Brief Overview

- This is a theoretical course aimed at some fundamental results of game theory, often related to computer science.
- We start with strategic form games (such as the Prisoner’s dilemma), investigate several solution concepts (dominance, equilibria) and related algorithms (in particular, the Lemke-Howson algorithm for computing Nash equilibria).
- Subsequently, we move on to incomplete information games, auctions, and mechanism design.
- Then we consider the (in)efficiency of equilibria (such as the Price of Anarchy) and its properties on important classes of routing and network formation games.
- Then we consider repeated games, which allow players to learn from history and/or to react to deviations of the other players.
- The remaining time will be devoted to selected topics from extensive form games, games on graphs, etc.

Static Games of Complete Information
Strategic-Form Games
Solution concepts

Static Games of Complete Information – Intuition

Proceed in two steps:
1. Each player simultaneously and independently chooses a strategy. This means that players play without observing strategies chosen by other players.
2. Conditional on the players’ strategies, payoffs are distributed to all players.

Complete information means that the following is common knowledge among players:
- all possible strategies of all players,
- what payoff is assigned to each combination of strategies.

Definition 1
A fact E is common knowledge among players {1, . . . , n} if for every sequence i1, . . . , ik ∈ {1, . . . , n} we have that i1 knows that i2 knows that ... ik−1 knows that ik knows E.
The goal of each player is to maximize his payoff (and this fact is common knowledge).

Strategic-Form Games

To formally represent static games of complete information we define strategic-form games.

Definition 2
A game in strategic form (or normal form) is an ordered triple $G = (N, (S_i)_{i \in N}, (u_i)_{i \in N})$, in which:
- N = {1, 2, . . . , n} is a finite set of players.
- Si is a set of (pure) strategies of player i, for every i ∈ N. A strategy profile is a vector of strategies of all players (s1, . . . , sn) ∈ S1 × · · · × Sn. We denote the set of all strategy profiles by S = S1 × · · · × Sn.
- ui : S → R is a function associating each strategy profile s = (s1, . . . , sn) ∈ S with the payoff ui(s) to player i, for every player i ∈ N.

Definition 3
A zero-sum game G is one in which for all s = (s1, . . . , sn) ∈ S we have u1(s) + u2(s) + · · · + un(s) = 0.

Example: Prisoner’s Dilemma

N = {1, 2}
S1 = S2 = {S, C}
u1, u2 are defined as follows:
u1(C, C) = −5, u1(C, S) = 0, u1(S, C) = −20, u1(S, S) = −1
u2(C, C) = −5, u2(C, S) = −20, u2(S, C) = 0, u2(S, S) = −1
(Is it zero sum?)

We usually write payoffs in the following form:

         C          S
C     −5, −5     0, −20
S    −20, 0     −1, −1

or as two matrices, one with the values of u1 and one with the values of u2:

        C      S              C      S
C      −5      0      C      −5    −20
S     −20     −1      S       0     −1

Example: Cournot Duopoly

- Two identical firms, players 1 and 2, produce some good. Denote by q1 and q2 the quantities produced by firms 1 and 2, resp.
- The total quantity of products in the market is q1 + q2.
- The price of each item is κ − q1 − q2 (here κ is a positive constant).
- Firms 1 and 2 have per-item production costs c1 and c2, resp.

Question: How are these firms going to behave? We may model the situation using a strategic-form game.

Strategic-form game model $(N, (S_i)_{i \in N}, (u_i)_{i \in N})$:
N = {1, 2}
Si = [0, ∞)
u1(q1, q2) = q1(κ − q1 − q2) − q1c1
u2(q1, q2) = q2(κ − q1 − q2) − q2c2

Solution Concepts

A solution concept is a method of analyzing games with the objective of restricting the set of all possible outcomes to those that are more reasonable than others.
We will use the term equilibrium for any one of the strategy profiles that emerges as one of the solution concepts’ predictions. (I follow the approach of Steven Tadelis here; it is not completely standard.)

Example 4
Nash equilibrium is a solution concept. That is, we “solve” games by finding Nash equilibria and declare them to be reasonable outcomes.

Assumptions

Throughout the lecture we assume that:
1. Players are rational: a rational player is one who chooses his strategy to maximize his payoff.
2. Players are intelligent: an intelligent player knows everything about the game (actions and payoffs) and can make any inferences about the situation that we can make.
3. Common knowledge: the fact that players are rational and intelligent is common knowledge among them.
4. Self-enforcement: any prediction (or equilibrium) of a solution concept must be self-enforcing.
Here 4. implies non-cooperative game theory: each player is in control of his actions, and he will stick to an action only if he finds it to be in his best interest.

Evaluating Solution Concepts

In order to evaluate our theory as a methodological tool we use the following criteria:
1. Existence (i.e., how often does it apply?): A solution concept should apply to a wide variety of games. E.g., we prove that mixed Nash equilibria exist in all two-player finite strategic-form games.
2. Uniqueness (how much does it restrict behavior?): We demand that our solution concept restricts the behavior as much as possible. E.g., so-called strictly dominant strategy equilibria are always unique, as opposed to Nash equilibria.

The basic notion for evaluating the "social outcome" is the following.

Definition 5
A strategy profile s ∈ S Pareto dominates a strategy profile s′ ∈ S if ui(s) ≥ ui(s′) for all i ∈ N, and ui(s) > ui(s′) for at least one i ∈ N.
A strategy profile s ∈ S is Pareto optimal if it is not Pareto dominated by any other strategy profile.

We will see more measures of social outcome later.
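
Definition 5 translates directly into a brute-force check for finite games. The sketch below is illustrative only: it assumes the game is given as a dictionary from strategy profiles to payoff tuples, and the names `PAYOFFS`, `pareto_dominates`, and `pareto_optimal` are hypothetical, not taken from the course.

```python
# Prisoner's Dilemma payoffs, indexed by (strategy of player 1, strategy of player 2).
PAYOFFS = {
    ("C", "C"): (-5, -5),
    ("C", "S"): (0, -20),
    ("S", "C"): (-20, 0),
    ("S", "S"): (-1, -1),
}


def pareto_dominates(s, t, payoffs):
    """True if profile s Pareto dominates profile t (Definition 5)."""
    us, ut = payoffs[s], payoffs[t]
    # Nobody is worse off in s than in t ...
    nobody_worse = all(a >= b for a, b in zip(us, ut))
    # ... and at least one player is strictly better off.
    somebody_better = any(a > b for a, b in zip(us, ut))
    return nobody_worse and somebody_better


def pareto_optimal(payoffs):
    """All profiles that are not Pareto dominated by any other profile."""
    return [
        s for s in payoffs
        if not any(pareto_dominates(t, s, payoffs) for t in payoffs)
    ]


print(pareto_optimal(PAYOFFS))
```

For the Prisoner’s Dilemma this reports (C, S), (S, C) and (S, S) as Pareto optimal, while (C, C) is dominated by (S, S). The check compares every pair of profiles, so it is only practical for small games.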
Solution Concepts – Pure Strategies

We will consider the following solution concepts:
- strictly dominant strategy equilibrium
- iterated elimination of strictly dominated strategies (IESDS)
- rationalizability
- Nash equilibria

For now, let us concentrate on pure strategies only! I.e., no mixed strategies are allowed. We will generalize to the mixed setting later.

Notation

Let N = {1, . . . , n} be a finite set and for each i ∈ N let Xi be a set. Let

\[
X := \prod_{i \in N} X_i = \{(x_1, \ldots, x_n) \mid x_j \in X_j,\ j \in N\}.
\]

For i ∈ N we define $X_{-i} := \prod_{j \neq i} X_j$, i.e.,

\[
X_{-i} = \{(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n) \mid x_j \in X_j,\ \forall j \neq i\}.
\]

An element of X−i will be denoted by x−i = (x1, . . . , xi−1, xi+1, . . . , xn).
We slightly abuse notation and write (xi, x−i) to denote (x1, . . . , xi, . . . , xn) ∈ X.

Strict Dominance in Pure Strategies

Definition 6
Let si, si′ ∈ Si be strategies of player i. Then si′ is strictly dominated by si (write $s_i \succ s_i'$) if, for every possible combination of the other players’ strategies s−i ∈ S−i, we have
ui(si, s−i) > ui(si′, s−i).

Claim 1
An intelligent and rational player will never play a strictly dominated strategy.

Clearly, intelligence implies that the player should recognize dominated strategies, and rationality implies that the player will avoid playing them.

Strictly Dominant Strategy Equilibrium in Pure Str.

Definition 7
si ∈ Si is strictly dominant if every other pure strategy of player i is strictly dominated by si.

Observe that every player has at most one strictly dominant strategy, and that strictly dominant strategies do not have to exist.

Claim 2
Any rational player will play the strictly dominant strategy (if it exists).

Definition 8
A strategy profile s ∈ S is a strictly dominant strategy equilibrium if si ∈ Si is strictly dominant for all i ∈ N.

Corollary 9
If the strictly dominant strategy equilibrium exists, it is unique and rational players will play it.

Is the strictly dominant strategy equilibrium always Pareto optimal?
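
Definitions 6 to 8 can be tested by exhaustive search in a finite game. The following is a minimal sketch under stated assumptions, not a prescribed algorithm: players are indexed from 0 rather than 1, a game is given as a list of strategy lists plus a dictionary of payoff tuples, and all function and variable names are hypothetical.

```python
from itertools import product


def strictly_dominates(strategies, payoff, i, s, t):
    """True if strategy s of player i strictly dominates strategy t (Definition 6)."""
    others = [strategies[j] for j in range(len(strategies)) if j != i]
    for rest in product(*others):
        # Re-insert player i's strategy at position i of the profile.
        with_s = rest[:i] + (s,) + rest[i:]
        with_t = rest[:i] + (t,) + rest[i:]
        if payoff[with_s][i] <= payoff[with_t][i]:
            return False
    return True


def strictly_dominant_strategy(strategies, payoff, i):
    """The strictly dominant strategy of player i, or None (Definition 7)."""
    for s in strategies[i]:
        if all(strictly_dominates(strategies, payoff, i, s, t)
               for t in strategies[i] if t != s):
            return s
    return None


def strictly_dominant_equilibrium(strategies, payoff):
    """The strictly dominant strategy equilibrium, or None (Definition 8)."""
    profile = tuple(strictly_dominant_strategy(strategies, payoff, i)
                    for i in range(len(strategies)))
    return None if None in profile else profile


PRISONERS = {("C", "C"): (-5, -5), ("C", "S"): (0, -20),
             ("S", "C"): (-20, 0), ("S", "S"): (-1, -1)}
BATTLE = {("O", "O"): (2, 1), ("O", "F"): (0, 0),
          ("F", "O"): (0, 0), ("F", "F"): (1, 2)}

print(strictly_dominant_equilibrium([["C", "S"], ["C", "S"]], PRISONERS))
print(strictly_dominant_equilibrium([["O", "F"], ["O", "F"]], BATTLE))
```

On these two games the search finds (C, C) for the Prisoner’s Dilemma and no strictly dominant strategy equilibrium for the Battle of Sexes.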
Examples

In the Prisoner’s dilemma:

         C          S
C     −5, −5     0, −20
S    −20, 0     −1, −1

(C, C) is the strictly dominant strategy equilibrium (the only profile that is not Pareto optimal!).

In the Battle of Sexes:

        O        F
O     2, 1     0, 0
F     0, 0     1, 2

no strictly dominant strategies exist.

Indiana Jones and the Last Crusade

(Taken from Dixit & Nalebuff’s "The Art of Strategy" and a lecture of Robert Marks)

Indiana Jones, his father, and the Nazis have all converged at the site of the Holy Grail. The two Joneses refuse to help the Nazis reach the last step. So the Nazis shoot Indiana’s dad. Only the healing power of the Holy Grail can save the senior Dr. Jones from his mortal wound. Suitably motivated, Indiana leads the way to the Holy Grail. But there is one final challenge. He must choose between literally scores of chalices, only one of which is the cup of Christ. While the right cup brings eternal life, the wrong choice is fatal. The Nazi leader impatiently chooses a beautiful gold chalice, drinks the holy water, and dies from the sudden death that follows from the wrong choice. Indiana picks a wooden chalice, the cup of a carpenter. Exclaiming "There’s only one way to find out" he dips the chalice into the font and drinks what he hopes is the cup of life. Upon discovering that he has chosen wisely, Indiana brings the cup to his father and the water heals the mortal wound.

Indiana Jones and the Last Crusade (cont.)

Indy Goofed

Although this scene adds excitement, it is somewhat embarrassing that such a distinguished professor as Dr. Indiana Jones would overlook his dominant strategy. He should have given the water to his father without testing it first.
- If Indiana has chosen the right cup, his father is still saved.
- If Indiana has chosen the wrong cup, then his father dies but Indiana is spared.
Testing the cup before giving it to his father doesn’t help, since if Indiana has made the wrong choice, there is no second chance – Indiana dies from the water and his father dies from the wound.