Selfish Routing Congestion Games Potential Games 332 Selfish Routing – Motivation Many agents want to use shared resources Each of them is selfish and rational (i.e. maximizes his profit) Examples: Users of a computer network, drivers on roads How they are going to behave? How much is lost by letting agents behave selfishly on their own? 333 Example: Routing in Computer Networks Imagine a computer network, i.e., computers connected by links. There are several users, each user wants to route packets from a source computer zi to a target computer ti. For this, each user i needs to choose a path in the network from zi to ti. We assume that the more agents try to route their messages through the same link, the more the link gets congested and the more costly the transmission is. Now assume that the users are selfish and try to minimize their cost (typically transmission time). How would they behave? 334 Atomic Routing Games The network routing can be formalized using an atomic routing game that consists of � a directed multi-graph G = (V, E, δ), Here V is a set of vertices, E is a set of edges, δ : E → V × V so that if δ(e) = (u, v) then e leads from u to v. The multigraph G models the network. � n pairs of source-target vertices (z1, t1), . . . , (zn, tn) where z1, . . . , zn, t1, . . . , tn ∈ V, (Each pair (zi, ti) corresponds to a user who wants to route from zi to ti) � for each e ∈ E a cost function ce : N → R such that ce(m) is the cost of routing through the link e if the amount of traffic through e is m. Each user i chooses a simple path from zi to ti and pays the sum of the costs of the links on the path. An atomic routing game is symmetric if z1 = · · · = zn and t1 = · · · = tn. 335 Atomic Routing Games Here we assume at most three users. Each edge is labeled by the cost if one, two, or all three users route through the edge, respectively. Here we consider a symmetric case with three users, each has the source z and target t. 336 Atomic Routing Games Here, e.g., the red user pays 3 + 2 = 5 : � 3 for the first step from z (he shares the edge with the blue one) � 2 for the second step to t (he is the only user of the edge) Atomic routing games are usually studied as a special case of so called (atomic) congestion games. 337 Congestion Games A congestion game is a tuple G = (N, R, (Si)i∈N , (cr )r∈R) where � N = {1, . . . , n} is a set of players, � R is a set of resources, � each Si ⊆ 2R � {∅} is a set of pure strategies for player i, � each cr : N → R is a cost function for a resource r ∈ R. Notation: S = S1 × · · · × Sn and c = (c1, . . . , cn). Intuition: � Each player allocates a set of resources by playing a pure strategy si ⊆ R. � Then each player "pays" for every allocated resource r ∈ si based on cr and the number of other players who demand the same resource r : � If � players use the resource r, then each of them pays cr (�) for this particular resource r. 338 Congestion Games: Payoffs and Nash Equilibria Let # : R × S → N be a function defined for r ∈ R and s = (s1, . . . , sn) ∈ S by #(r, s) = | {i ∈ N | r ∈ si} |. I.e., #(r, s) is the number of players using the resource r in the strategy profile s. We define the payoff for player i by ui(s) = − � r∈si cr (#(r, s)) (33) Intuitively, the more congested a resource r ∈ si is, the more player i has to pay for it. Definition 94 Nash equilibria are defined as usual, a pure strategy profile (s1, . . . , sn) ∈ S is a Nash equilibrium if for every player i and every s� i ∈ Si we have ui(si, s−i) ≥ ui(s� i , s−i). 339 Atomic Routing Games and Congestion Games Given an atomic routing game we may model it as a congestion game (N, R, (Si)i∈N , (cr )r∈R) : � Players N = {1, . . . , n} correspond to the pairs of source-target vertices (z1, t1), . . . , (zn, tn), � resources are edges in the multigraph G, i.e, R = E, � the set of pure strategies Si of player i consists of all simple paths (i.e., sets of edges) in the multigraph G from his source zi to his target ti, � the cost function ce of each edge e ∈ E has to be determined according to the properties of the network. Often (but not always) a linear (affine) function ce(x) = aex + be is used (here x is the number of players using the edge e). Now each Nash equilibrium in G corresponds to a stable situation where no user wants to change his behavior. 340 Solving Congestion Games We consider the following questions: � Are there pure strategy Nash equilibria? � Can the agents "learn" to use the network? � How difficult is to compute an equilibrium? 341 Learning: Myopic Best-Response Given a pure strategy profile s = (s1, . . . , sn), suppose that some player i has an alternative strategy s� i such that ui(s� i , s−i) > ui(si, s−i). Player i can switch (unilaterally) from si to s� i improving thus his payoff. Iterating such improvement steps, we obtain the following: Myopic best response procedure: � Start with an arbitrary pure strategy profile s = (s1, . . . , sn). � While there exists a player i for whom si is not a best response to s−i do � s� i := a best-response by player i to s−i � s := (s� i , s−i) � return s By definition, if the myopic best response terminates, the resulting strategy profile s is a Nash equilibrium. It may not terminate in general (see the green board). Theorem 95 For every congestion game, the myopic best response terminates in a Nash equilibrium for an arbitrary starting pure strategy profile. 342 Potential Games We prove Theorem 95 by reduction to the following potential games. Definition 96 A strategic form game G = (N, (Si)i∈N , (ui)i∈N) is a potential game if there exists a function P : S1 × · · · × Sn → R such that for all i ∈ N, all s−i ∈ S−i and all si, s� i ∈ Si we have that ui(si, s−i) − ui(s� i , s−i) = P(si, s−i) − P(s� i , s−i) Theorem 97 For every finite potential game, the myopic best-response terminates in a Nash equilibrium for an arbitrary starting pure strategy profile. Proof. Note that every iteration of the myopic best-response procedure strictly increases ui(s) for some i, which in effect strictly increases P(s) by the same amount. As there are only finitely many strategy profiles, the procedure must terminate. The resulting profile is clearly a Nash equilibrium. � 343 Congestion Games as Potential Games Theorem 98 Let G = (N, R, (Si)i∈N , (cr )r∈R ) be a congestion game and for each i ∈ N, let ui be the payoff of player i in G defined by the equation (33). Then (N, (Si)i∈N , (ui)i∈N) is a potential game. Recall that ui(s) = − � r∈si cr (#(r, s)) where #(r, s) is the number of players using the resource r in the strategy profile s. Note that we obtain Theorem 95 as a corollary of Theorem 98 and Theorem 97. Proof of Theorem 98. Given s ∈ S = S1 × · · · × Sn, define P(s) = − � r∈R #(r,s)� j=1 cr (j) We show that P is a potential function, i.e., prove that for any two strategy profiles (si, s−i) and (s� i , s−i) we have P(si, s−i) − P(s� i , s−i) = ui(si, s−i) − ui(s� i , s−i) 344 Illustration of the potential Intuitively, the potential corresponds to the total cost paid by players when they choose their strategies one after the other. Consider two players: � First, player 1 chooses a strategy s1 and pays � r∈s1 cr (1) � Then, player 2 chooses a strategy s2 and pays � r∈s2�s1 cr (1) + � r∈s2∩s1 cr (2) Summing we get � r∈s1 cr (1) + � r∈s2�s1 cr (1) + � r∈s2∩s1 cr (2) = � r∈s1�s2 cr (1) + � r∈s2∩s1 cr (1) + � r∈s2�s1 cr (1) + � r∈s2∩s1 cr (2) = � r∈s1�s2 cr (1) + � r∈s2�s1 cr (1) + � r∈s2∩s1 cr (1) + cr (2) = � r∈R #(r,(s1,s2))� j=1 cr (j) 345 Illustration of Potential Let us compute the potential P. 346 Illustration of Potential First, add the red player ... 347 Illustration of Potential The red player pays 2 + 2 = 4. Second, add the green player ... 348 Illustration of Potential The green player pays 2 + 4 = 6. Third, add the blue player ... 349 Illustration of Potential The blue player pays 3 + 1 + 6 = 10. In total, they pay 4 + 6 + 10 = 20. We get the same number by using the expression for P : (2 + 3) + 2 + 1 + 2 + (4 + 6) = 20 The potential is thus P = −20. 350 Illustration of Potential ⇒ The blue player changes his strategy. What is the change in the potential? Recall that on the left hand side, the blue player paid 10 which gave the potential −20. Now he pays 3 + 3 = 6 on the right hand side. So the potential on the right hand side is −16. The difference between potentials is −20 − (−16) = −4. The difference between payoffs for the blue player is −10 − (−6) = −4. (the right hand side is cheaper and thus better for the blue player) 351 Illustration of Potential ⇒ The crucial observation is that we may consider players coming in an arbitrary order. In particular, to prove P(si, s−i) − P(s� i , s−i) = ui(si, s−i) − ui(s� i , s−i) we may assume that player i came last. 352 Proof of Theorem 98 (Cont.) Let (si, s−i) and (s� i , s−i) be two strategy profiles. Recall that we need to prove P(si, s−i) − P(s� i , s−i) = ui(si, s−i) − ui(s� i , s−i) By definition, P(si, s−i) − P(s� i , s−i) =   � r∈R #(r,(s� i ,s−i )) � j=1 cr (j)   −   � r∈R #(r,(si ,s−i ))� j=1 cr (j)   Note that #(r, (si, s−i)) =    #(r, s−i) + 1 if r ∈ si #(r, s−i) if r � si We obtain ... 353 Proof of Theorem 98 (Cont.) −P(si, s−i) = � r∈R #(r,(si ,s−i ))� j=1 cr (j) = � r∈R�si #(r,(si ,s−i ))� j=1 cr (j) + � r∈si #(r,(si ,s−i ))� j=1 cr (j) = � r∈R�si #(r,s−i )� j=1 cr (j) + � r∈si #(r,s−i )+1� j=1 cr (j) = � r∈R�si #(r,s−i )� j=1 cr (j) + � r∈si #(r,s−i )� j=1 cr (j) + � r∈si cr (#(r, s−i) + 1) = � r∈R #(r,s−i )� j=1 cr (j) + � r∈si cr (#(r, s−i) + 1) Similarly, −P(s� i , s−i) = � r∈R #(r,s−i )� j=1 cr (j) + � r∈s� i cr (#(r, s−i) + 1) 354 Proof of Theorem 98 (Cont.) Now we can easily finish the proof of Theorem 98 P(si, s−i) − P(s� i , s−i) = =   � r∈R #(r,s−i )� j=1 cr (j) + � r∈s� i cr (#(r, s−i) + 1)   −   � r∈R #(r,s−i )� j=1 cr (j) + � r∈si cr (#(r, s−i) + 1)   = � r∈s� i cr (#(r, s−i) + 1)) − � r∈si cr (#(r, s−i) + 1) = � r∈s� i cr (#(r, (s� i , s−i))) − � r∈si cr (#(r, (si, s−i))) = ui(si, s−i) − ui(s� i , s−i) � 355 Complexity of Congestion Games For concreteness, assume cr (j) = ar · j + br where ar , br are some non-negative constants. Myopic best response can be used to compute Nash equilibria but how many steps it makes? A naive bound would be the number of strategy profiles which is exponential in the number of players. Assume that the cost functions have values in N. Then every step of the myopic best response increases P by at least one, which means that the procedure starting in s stops after at most −P(s) = � r∈R �#(r,s) j=1 cr (j) steps. This gives a pseudo-polynomial time procedure. How many steps are really needed? On some instances any sequence of improvement steps to NE is of exponential length. In fact, the problem of computing NE in congestion games is PLS-complete. PLS class (Polynomial Local Search) models the difficulty of finding a locally optimal solution to an optimization problem (e.g. travelling salesman is PLS-complete). 356 Complexity of Atomic Routing Games Finding Nash equilibria in Atomic Routing Games is PLS-complete even if the cost functions are linear. There is a polynomial time algorithm for symmetric atomic routing games with non-decreasing cost functions based on a reduction to the minimum-cost flow problem. Here symmetric means that all players have the same source z and the same target t. Hence they also choose among the same simple paths. ⇒ For every edge in the routing graph G (left), there are n = 3 edges of capacity one in the minimum-cost flow network (right), each with one of the possible costs of the edge in G. 357 Non-Atomic Selfish Routing � So far we have considered situations where each player (user, driver) has enough "weight" to explicitly influence payoffs of others (so a deviation of one player causes changes in payoffs of other players). � In many applications, especially in the case of highway traffic problems, individual drivers have negligible influence on each other. What matters is a "distribution" of drivers on highways. � To model such situations we use non-atomic routing games that can be seen as a limiting case of atomic selfish routing with the number of players going to ∞. 358 Non-Atomic Routing Games A Non-Atomic Routing Game consists of � a directed multigraph G = (V, E, δ), � n source-target pairs (z1, t1), . . . , (zn, tn), � for each i = 1, . . . , n, the amount of traffic µi ∈ R≥0 from zi to ti, � for each e ∈ E a cost function ce : R≥0 → R such that ce(x) is the cost of routing through the link e if the amount of traffic on e is x ∈ R≥0. For i = 1, . . . , n, let Pi be the set of all simple paths from zi to ti. Intuitively, there are uncountably many players, represented by [0, µi], going from zi to ti, each player chooses his path so that his total cost is minimized. Assume that Pi ∩ Pj = ∅ for i � j. (This also implies that for all i � j we have that either zi � zj, or ti � tj.) Denote by P the set of all "relevant" simple paths �n i=1 Pi. Question: What is a "stable" distribution of the traffic among paths of P ? 359 Non-Atomic Routing Games A traffic distribution d is a function d : P → R≥0 such that� p∈Pi d(p) = µi. Denote by D the set of all traffic distributions. Let us fix a traffic distribution d ∈ D. Given an edge e ∈ E, we denote by g(d, e) the amount of congestion on the edge e : g(d, e) = � p∈P : e∈p d(p) Given p ∈ P, the payoff for players routing through p ∈ P is defined by u(d, p) = − � e∈p ce(g(d, e)) Definition 99 A traffic distribution d ∈ D is a Nash equilibrium if for every i = 1, . . . , n and every path p ∈ Pi such that d(p) > 0 the following holds: u(d, p) ≥ u(d, p� ) for all p� ∈ Pi 360 Price of Anarchy Theorem 100 Every non-atomic routing game has a Nash equilibrium. We define a social cost of a traffic distribution d by C(d) = � p∈P d(p) · (−u(d, p)) = � p∈P d(p) · � e∈p ce(g(d, e)) Theorem 101 All Nash equilibria in non-atomic routing games have the same social cost. A price of anarchy is defined by PoA = C(d∗) mind C(d) where d∗ is a (arbitrary) Nash equilibrium Intuitively, PoA is the proportion of additional social cost that is incurred because of agents’ self-interested behavior. 361 Price of Anarchy Theorem 102 (Roughgarden-Tardos’2000) For all non-atomic routing games with linear cost functions holds PoA ≤ 4 3 and this bound is tight (e.g. the Pigou’s example). The price of anarchy can be defined also for atomic routing games: PoAnon−atom := maxs∗ is NE �n i=1(−ui(s∗ )) mins∈S �n i=1(−ui(s)) (Intuitively, �n i=1(−ui(s)) is the total amount paid by all players playing the strategy profile s.) Theorem 103 (Christodoulou-Koutsoupias’2005) For all atomic routing games with linear cost functions holds PoAnon−atom ≤ 5 2 (which is again tight, just like 4 3 for non-atomic routing.) 362 Braess’s Paradox For an example see the green board. Real-world occurences (Wikipedia): � In Seoul, South Korea, a speeding-up in traffic around the city was seen when a motorway was removed as part of the Cheonggyecheon restoration project. � In Stuttgart, Germany after investments into the road network in 1969, the traffic situation did not improve until a section of newly built road was closed for traffic again. � In 1990 the closing of 42nd street in New York City reduced the amount of congestion in the area. � In 2012, scientists at the Max Planck Institute for Dynamics and Self-Organization demonstrated through computational modeling the potential for this phenomenon to occur in power transmission networks where power generation is decentralized. � In 2012, a team of researchers published in Physical Review Letters a paper showing that Braess paradox may occur in mesoscopic electron systems. They showed that adding a path for electrons in a nanoscopic network paradoxically reduced its conductance. 363