Worlds with many intelligent agents An important consideration in AI, as well as games, distributed systems and networking, economics, sociology, political science, international relations, and other disciplines

Multi-agent Systems, or “Games” Worlds with multiple agents, sometimes called “games”, are VASTLY more complicated than worlds with single agents. They’re also more interesting and fun! There are many different aspects to learn about:
- Representations (we will consider two, but there are MANY)
- Evaluation metrics (I’ll show you a few standard ones, but again there are MANY)
- Inference and planning (we will talk about a few techniques)
- Learning (we won’t really cover this)
- Communication (we won’t directly cover this)
- Computational complexity (I’ll mention this in passing)
- Applications (we’ll look at several examples, but there are TONS)

Example Game (Prisoner’s Dilemma) Student A and Student B have been naughty, cheating on an exam. Their teacher finds some evidence, and sends them to the dean’s office, where the dean grills them individually to find out more. The possible outcomes (columns are A’s actions, rows are B’s actions):

                    A rats out B       A says nothing
  B rats out A      A: -7,  B: -7      A: -10, B: -1
  B says nothing    A: -1,  B: -10     A: -3,  B: -3

- A rats out B, B rats out A: the dean now has lots of evidence on both students; both are suspended for the semester.
- A says nothing, B rats out A: the dean has lots of evidence on A, and gives B a light punishment for cooperating.
- A rats out B, B says nothing: the dean has lots of evidence on B, and gives A a light punishment for cooperating.
- A says nothing, B says nothing: the dean has little evidence on both, and gives both some penalty.

Example Game (Prisoner’s Dilemma) Definitions: the set of choices available to each agent is called their action set, denoted Actions(A) or Actions(B). Each square in the grid is called an outcome. Mathematically, let a be an action in Actions(A), and b an action in Actions(B). O(a, b) is the outcome square, O_A(a, b) is the reward for A, and O_B(a, b) is the reward for B. The grid above is called a normal form representation of the game.
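To make the representation concrete, here is a minimal sketch in Python (not from the slides) of how this normal-form game could be stored; the dictionary keys, action labels, and helper names are just illustrative choices for the table above.

```python
# Illustrative sketch: the Prisoner's Dilemma above as a normal-form game.
# Each key is an (action_of_A, action_of_B) pair; each value is the payoff
# pair (O_A(a, b), O_B(a, b)).

ACTIONS_A = ["rat out B", "say nothing"]
ACTIONS_B = ["rat out A", "say nothing"]

O = {
    ("rat out B",   "rat out A"):   (-7,  -7),
    ("say nothing", "rat out A"):   (-10, -1),
    ("rat out B",   "say nothing"): (-1,  -10),
    ("say nothing", "say nothing"): (-3,  -3),
}

def O_A(a, b):
    """Reward for player A at outcome O(a, b)."""
    return O[(a, b)][0]

def O_B(a, b):
    """Reward for player B at outcome O(a, b)."""
    return O[(a, b)][1]

print(O_A("rat out B", "say nothing"))  # -1
print(O_B("rat out B", "say nothing"))  # -10
```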

Quiz: Best outcome? Which outcome is the best, and why? (Payoff table as above.)

Answer: Best outcome? The answer depends. Let’s look at a few definitions of “best”.

Definition: Dominant Strategy Definition: A strategy (or pure strategy) for player A (ditto for B) is a selection of one of the actions from Actions(A). Definition: A’s strategy a ∊ Actions(A) is dominant if ∀ b ∊ Actions(B), ∀ a’ ≠ a ∊ Actions(A): O_A(a, b) > O_A(a’, b). Another way of saying this is: a is a dominant strategy for A if it is better than any other strategy a’ for A, no matter what strategy B chooses.
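As an illustration of the definition (not part of the slides), a brute-force check for a strictly dominant strategy might look like the sketch below; the game encoding mirrors the Prisoner’s Dilemma table above, and the function name is made up.

```python
# Illustrative sketch: a is a (strictly) dominant strategy for A if it beats
# every other action a' against every possible action b of the opponent.

ACTIONS_A = ["rat out B", "say nothing"]
ACTIONS_B = ["rat out A", "say nothing"]
O = {
    ("rat out B",   "rat out A"):   (-7,  -7),
    ("say nothing", "rat out A"):   (-10, -1),
    ("rat out B",   "say nothing"): (-1,  -10),
    ("say nothing", "say nothing"): (-3,  -3),
}

def dominant_strategy_for_A(O, actions_A, actions_B):
    for a in actions_A:
        if all(O[(a, b)][0] > O[(a_other, b)][0]
               for b in actions_B
               for a_other in actions_A
               if a_other != a):
            return a
    return None  # no strictly dominant strategy exists

print(dominant_strategy_for_A(O, ACTIONS_A, ACTIONS_B))  # rat out B
```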

Quiz: Dominant Strategy Definition: A’s strategy a ∊ Actions(A) is dominant if ∀ b ∊ Actions(B), ∀ a’ ≠ a ∊ Actions(A): O_A(a, b) > O_A(a’, b). Does A have a dominant strategy for the game above? What is it? What about B?

Answer: Dominant Strategy Definition: A’s strategy a ∊ Actions(A) is dominant if ∀ b ∊ Actions(B), ∀ a’ ≠ a ∊ Actions(A): O_A(a, b) > O_A(a’, b). Does A have a dominant strategy for the game above? What is it? What about B? Yes: “A rats out B” is a dominant strategy for A, and “B rats out A” is a dominant strategy for B.

Definition: Pareto Optimal Definition: An outcome o is called Pareto optimal if there is no other outcome o’ such that all players prefer o’ to o. Notice: Pareto optimality can lead to lots of optimal outcomes, not just one. That’s because o’ is considered better than o only if ALL players prefer o’ to o.
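As a small illustration (not from the slides) of exactly this definition: keep an outcome unless some other outcome is strictly preferred by all players. The Prisoner’s Dilemma encoding below reuses the earlier illustrative dictionary.

```python
# Illustrative sketch: an outcome is Pareto optimal (by the slide's definition)
# if no other outcome gives EVERY player a strictly higher payoff.

O = {
    ("rat out B",   "rat out A"):   (-7,  -7),
    ("say nothing", "rat out A"):   (-10, -1),
    ("rat out B",   "say nothing"): (-1,  -10),
    ("say nothing", "say nothing"): (-3,  -3),
}

def pareto_optimal_outcomes(O):
    optimal = []
    for cell, payoffs in O.items():
        dominated = any(
            all(other_p > p for p, other_p in zip(payoffs, other_payoffs))
            for other_cell, other_payoffs in O.items()
            if other_cell != cell
        )
        if not dominated:
            optimal.append(cell)
    return optimal

print(pareto_optimal_outcomes(O))  # every outcome except ('rat out B', 'rat out A')
```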

Quiz: Pareto Optimal Definition: An outcome o is called Pareto optimal if there is no other outcome o’ such that all players prefer o’ to o. Quiz: which outcome(s) is (are) Pareto optimal? (Payoff table as above.)

Answer: Pareto Optimal Definition: An outcome o is called Pareto optimal if there is no other outcome o’ such that all players prefer o’ to o. Which outcome(s) is (are) Pareto optimal? Every outcome except (A rats out B, B rats out A) is Pareto optimal: both students prefer the A: -3, B: -3 outcome to the A: -7, B: -7 outcome, while for each of the other three outcomes there is no alternative that both students prefer.

Definition: (Nash) Equilibrium Definition: An outcome O(a, b) is called a Nash Equilibrium (or just equilibrium) if
1. ∀ a’ ≠ a ∊ Actions(A): O_A(a, b) >= O_A(a’, b), and
2. ∀ b’ ≠ b ∊ Actions(B): O_B(a, b) >= O_B(a, b’).
Another way of saying this is that an outcome is an equilibrium if each player is as happy with this outcome as with any other available option, assuming the other player sticks with his equilibrium strategy. Notice that this means an equilibrium outcome is stable, in the sense that neither player has an incentive to change their strategy.
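The two conditions above translate directly into a brute-force search over the grid. The sketch below (illustrative; the function name is made up) finds every pure-strategy Nash equilibrium of the Prisoner’s Dilemma encoding used earlier.

```python
# Illustrative sketch: (a, b) is a pure Nash equilibrium if neither player can
# do strictly better by unilaterally switching to a different action.

ACTIONS_A = ["rat out B", "say nothing"]
ACTIONS_B = ["rat out A", "say nothing"]
O = {
    ("rat out B",   "rat out A"):   (-7,  -7),
    ("say nothing", "rat out A"):   (-10, -1),
    ("rat out B",   "say nothing"): (-1,  -10),
    ("say nothing", "say nothing"): (-3,  -3),
}

def pure_nash_equilibria(O, actions_A, actions_B):
    equilibria = []
    for a in actions_A:
        for b in actions_B:
            a_happy = all(O[(a, b)][0] >= O[(a2, b)][0] for a2 in actions_A)
            b_happy = all(O[(a, b)][1] >= O[(a, b2)][1] for b2 in actions_B)
            if a_happy and b_happy:
                equilibria.append((a, b))
    return equilibria

print(pure_nash_equilibria(O, ACTIONS_A, ACTIONS_B))
# [('rat out B', 'rat out A')]
```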

Quiz: (Nash) Equilibrium Definition: An outcome O(a, b) is called a Nash Equilibrium if
1. ∀ a’ ≠ a ∊ Actions(A): O_A(a, b) >= O_A(a’, b), and
2. ∀ b’ ≠ b ∊ Actions(B): O_B(a, b) >= O_B(a, b’).
Which outcome(s) is (are) a Nash equilibrium? (Payoff table as above.)

Answer: (Nash) Equilibrium Definition: An outcome O(a, b) is called a Nash Equilibrium if
1. ∀ a’ ≠ a ∊ Actions(A): O_A(a, b) >= O_A(a’, b), and
2. ∀ b’ ≠ b ∊ Actions(B): O_B(a, b) >= O_B(a, b’).
Which outcome(s) is (are) a Nash equilibrium? The only Nash equilibrium is (A rats out B, B rats out A), the outcome where both students are suspended (A: -7, B: -7).

Answer: (Nash) Equilibrium One reason that the prisoner’s dilemma is famous: the only outcome that is a Nash equilibrium is also the only outcome that is NOT a Pareto optimum. This is not true in all games; it depends on the rewards! For the prisoner’s dilemma, it means that the only stable solution is probably the worst possible solution for the players. (It’s sort of a depressing example.)

Quiz: The game of Chicken (a.k.a. “Hawk-Dove”, “snow-drift”) What are the dominant strategies, Pareto optima, and Nash equilibria for the Chicken game? (Columns are Player 1’s actions, rows are Player 2’s.)

                    P1: Swerve                        P1: Straight
  P2: Swerve        Neither player wins; a tie.       Player 1 wins.
                    P1: 0, P2: 0                      P1: +1, P2: -1
  P2: Straight      Player 2 wins.                    The players crash; a catastrophe!
                    P1: -1, P2: +1                    P1: -10, P2: -10

Answer: The game of Chicken (a.k.a. “Hawk-Dove”, “snow-drift”) What are the dominant strategies, Pareto optima, and Nash equilibria for the Chicken game? Neither player has a dominant strategy. The Pareto optima are the three outcomes in which the players do not crash. The Nash equilibria are the two outcomes in which one player swerves and the other goes straight.

Quiz: Coordination Games What are the dominant strategies, Pareto optima, and Nash equilibria for each of the games below? (In each grid, columns are Player 1’s actions, rows are Player 2’s, and payoffs are listed as P1, P2.)

Drive on which side of the road?
                  P1: Left        P1: Right
  P2: Left        10, 10          -10, -10
  P2: Right       -10, -10        10, 10

“Battle of the Sexes” game
                  P1: Party       P1: Home
  P2: Party       10, 5           0, 0
  P2: Home        0, 0            5, 10

Pure coordination game
                  P1: Party       P1: Home
  P2: Party       10, 10          0, 0
  P2: Home        0, 0            5, 5

“Stag Hunt” game
                  P1: Stag        P1: Hare
  P2: Stag        10, 10          7, 0
  P2: Hare        0, 7            7, 7

Answer: Coordination Games What are the dominant strategies, Pareto optima, and Nash equilibria for each of the games above? In none of these games does either player have a dominant strategy. Each game has two pure-strategy Nash equilibria, the two outcomes in which the players coordinate: (Left, Left) and (Right, Right); (Party, Party) and (Home, Home) in both party games; (Stag, Stag) and (Hare, Hare). The Pareto optima are: both (Left, Left) and (Right, Right) in the driving game; both (Party, Party) and (Home, Home) in the Battle of the Sexes; only (Party, Party) in the pure coordination game; and only (Stag, Stag) in the Stag Hunt.

Humans don’t necessarily play Nash equilibria The “Guess 2/3 of the Average” Game is famous because in experiments with people, they often don’t play Nash equilibrium strategies. (Other games like this are the centipede game and the prisoner’s dilemma.) Thus Nash equilibria aren’t perfect models of human behavior. Let’s try it. Here are the rules to “Guess 2/3 of the Average”:
1. Everyone guesses a number between 0 and 100 (integer or real). Once everyone guesses, we’ll compute the average.
2. The set of people who come closest to 2/3 of the average are the winners (payoff 1); everyone else loses (payoff 0).
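As a rough sketch (not from the slides) of how one round could be scored; the player names and guesses below are made up for illustration.

```python
# Illustrative sketch: score one round of "Guess 2/3 of the Average".

guesses = {"Ann": 50, "Bo": 33, "Cat": 22, "Dee": 0}   # hypothetical guesses

target = (2 / 3) * (sum(guesses.values()) / len(guesses))
closest = min(abs(g - target) for g in guesses.values())
winners = [name for name, g in guesses.items() if abs(g - target) == closest]

print(round(target, 2))  # 17.5 for these guesses
print(winners)           # everyone tied for closest gets payoff 1; all others get 0
```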

Nash equilibrium for “Guess 2/3 of the Average” If everyone guesses 100, then 2/3 of the average is 66.7, so 66.7 is the biggest possible result, so everyone has an incentive to guess 66.7 or less, so anything more than 66.7 can’t be a Nash equilibrium. If everyone guesses 66.7, then 2/3 of the average will be around 44.4, so everyone has an incentive to guess lower, so guessing more than 44.4 can’t be a Nash equilibrium. … If everyone guesses 0, then we have a Nash equilibrium (they’d all guess right and win, and there’s no incentive for anyone to change their guess assuming everyone else remains at 0).
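The shrinking-upper-bound argument above can be illustrated numerically. This tiny sketch (not a model of real play) just iterates the “take 2/3” step and shows that the only stable point is 0.

```python
# Illustrative sketch: if no rational player guesses above `upper`, then 2/3 of
# the average is at most (2/3) * upper, so the rational upper bound keeps
# shrinking toward 0, the unique Nash equilibrium.

upper = 100.0
for step in range(40):
    upper = (2 / 3) * upper

print(upper)  # about 1e-5: the bound collapses toward the equilibrium guess of 0
```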

Constant-Sum Games The payoff matrix for Rock-Paper-Scissors (columns are Player 1’s actions, rows are Player 2’s; payoffs are listed as P1, P2):

                  P1: Rock      P1: Paper     P1: Scissors
  P2: Rock        0, 0          +1, -1        -1, +1
  P2: Paper       -1, +1        0, 0          +1, -1
  P2: Scissors    +1, -1        -1, +1        0, 0

Notice: if you add up the reward for P1 and P2 in each square, you get 0. The fact that all squares have the same sum makes this a “constant-sum” game. This game is in fact a “zero-sum” game.

Constant-Sum Games Actually, any constant-sum game can be converted to an equivalent zero-sum game by subtracting a constant from every payoff. They are “equivalent” in that this transformation preserves equilibria, Pareto optima, and dominant strategies.

Constant-Sum Games In constant-sum games, any gain for one player is offset by a loss for the other player. So these are hyper-competitive games.

Quiz: Pure Strategy Nash Equilibria Can you find a Nash equilibrium for Rock-Paper-Scissors? (Payoff table as above.)

Answer: Pure Strategy Nash Equilibria Can you find a Nash equilibrium for Rock-Paper-Scissors? No “pure” strategy for either player can lead to a Nash equilibrium for this game! (Our definition of “strategy” is equivalent to “pure strategy”. We’ll talk about mixed strategies next.)

Definition: Mixed Strategy A mixed strategy for player A is a probability distribution over the set of available actions, Actions(A). For instance, for Rock-Paper-Scissors, the distribution P(Rock) = 1/3, P(Paper) = 1/3, P(Scissors) = 1/3 is a mixed strategy. A pure strategy is also a mixed strategy, but with a probability distribution that places all the probability on one action. E.g.: P(Rock) = 0, P(Paper) = 1, P(Scissors) = 0.
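One way to picture “playing” a mixed strategy is to sample an action from the distribution each time the game is played. A small illustrative sketch (not from the slides):

```python
# Illustrative sketch: sample an action from a mixed strategy.
import random

mixed_strategy = {"Rock": 1/3, "Paper": 1/3, "Scissors": 1/3}

def play(strategy):
    actions = list(strategy.keys())
    weights = list(strategy.values())
    return random.choices(actions, weights=weights, k=1)[0]

print(play(mixed_strategy))  # e.g. "Paper"
print(play({"Rock": 0.0, "Paper": 1.0, "Scissors": 0.0}))  # a pure strategy always plays "Paper"
```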

Definition: Mixed Strategy Nash Equilibrium A mixed strategy profile for players A and B is a pair of mixed strategies, distribution P_A for player A and distribution P_B for player B. A mixed strategy Nash equilibrium is a mixed strategy profile (P_A, P_B) such that neither player would gain by choosing a different mixed strategy, assuming the other player’s mixed strategy stays the same.

Quiz: Mixed Strategy Nash Equilibria Can you find a mixed-strategy Nash equilibrium for Rock Paper Scissors? (This can be hard in general, but see if you can guess it for this game.)

Answer: Mixed Strategy Nash Equilibria The mixed strategy profile where each player has 1/3 probability for each action is a Nash equilibrium. (It’s the only Nash equilibrium for this game.)
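This claim is easy to check numerically: against the uniform 1/3–1/3–1/3 mix, every pure response gives expected payoff 0, so neither player gains by deviating (and since the game is zero-sum, the same check covers both players). A small illustrative sketch:

```python
# Illustrative sketch: verify that against the uniform mix, every pure response
# is worth 0 in expectation, so no profitable deviation exists.

ACTIONS = ["Rock", "Paper", "Scissors"]

# Payoff to Player 1; Player 2's payoff is the negative (zero-sum game).
P1_PAYOFF = {
    ("Rock", "Rock"): 0,      ("Rock", "Paper"): -1,     ("Rock", "Scissors"): +1,
    ("Paper", "Rock"): +1,    ("Paper", "Paper"): 0,     ("Paper", "Scissors"): -1,
    ("Scissors", "Rock"): -1, ("Scissors", "Paper"): +1, ("Scissors", "Scissors"): 0,
}

uniform_mix = {a: 1 / 3 for a in ACTIONS}

for my_action in ACTIONS:
    expected = sum(prob * P1_PAYOFF[(my_action, their_action)]
                   for their_action, prob in uniform_mix.items())
    print(my_action, expected)  # 0.0 for every pure response
```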

Nash’s Theorem Theorem (Nash, 1951): Every finite game (finite number of players, finite number of pure strategies) has at least one mixed-strategy Nash equilibrium. Nash, John (1951), “Non-Cooperative Games,” The Annals of Mathematics 54(2). (John Nash did not call them “Nash equilibria”; that name came later.) He shared the 1994 Nobel Memorial Prize in Economic Sciences with game theorists Reinhard Selten and John Harsanyi for his work on Nash equilibria. He suffered from schizophrenia in the 1950s and 1960s, as depicted in the 2001 film “A Beautiful Mind”. He nevertheless recovered enough to return to academia and continue his research.

More on Constant-Sum Games Minimax Theorem (John von Neumann, 1928): For every two-person, zero-sum game with finitely many pure strategies, there exists a mixed strategy for each player and a value V such that:
- Given player 2’s strategy, the best possible payoff for player 1 is V.
- Given player 1’s strategy, the best possible payoff for player 2 is –V.
The existence of such strategies is a special case of Nash’s theorem, and historically a precursor to it. This basically says that player 1 can guarantee himself a payoff of at least V, and player 2 can guarantee himself a payoff of at least –V. If both players play optimally, that’s exactly what they will get. It’s called “minimax” because the players get this value by pursuing a strategy that tries to minimize the maximum payoff of the other player. We’ll come back to this. Definition: The value V is called the value of the game. E.g.: the value of Rock-Paper-Scissors is 0; the best that P1 can hope to achieve, assuming P2 plays optimally (1/3 probability of each action), is a payoff of 0.
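For a small zero-sum game, the value V and an optimal mixed strategy can be computed with linear programming. The sketch below is illustrative (it assumes NumPy and SciPy are available; the variable names are made up) and applies the idea to Rock-Paper-Scissors: choose Player 1’s mix to maximize the worst-case expected payoff over Player 2’s pure responses.

```python
# Illustrative sketch: compute the value of a two-player zero-sum game by LP.
import numpy as np
from scipy.optimize import linprog

# Payoff to Player 1 in Rock-Paper-Scissors (rows = P1's action, cols = P2's action).
A = np.array([[ 0, -1,  1],   # Rock
              [ 1,  0, -1],   # Paper
              [-1,  1,  0]])  # Scissors

n_rows, n_cols = A.shape
# Decision variables: x_1..x_n (P1's mixed strategy) and v (the guaranteed value).
c = np.zeros(n_rows + 1)
c[-1] = -1.0                                    # maximize v  <=>  minimize -v
A_ub = np.hstack([-A.T, np.ones((n_cols, 1))])  # for each column j: v - sum_i x_i * A[i, j] <= 0
b_ub = np.zeros(n_cols)
A_eq = np.array([[1.0] * n_rows + [0.0]])       # the probabilities x_i sum to 1
b_eq = np.array([1.0])
bounds = [(0, 1)] * n_rows + [(None, None)]     # probabilities in [0, 1], v unrestricted

result = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
mix, value = result.x[:-1], result.x[-1]
print(mix)    # roughly [1/3, 1/3, 1/3]
print(value)  # the value of the game: 0 for Rock-Paper-Scissors
```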