Evolutionary Games

The solution concepts that we have discussed in some detail include:
- strategically dominant solutions
- equilibrium solutions
- Pareto optimal solutions
- best response solutions
- mixed strategy solutions

We now turn our attention to another kind of equilibrium-based solution: one that is produced by some form of learning or adaptation process. We will focus on the kinds of things that can be learned in a population of learning agents. We will have to be careful, because evolution can be affected by a huge number of factors: mating, mutation, the environment, catastrophes, other agents in the population, and so on. We will restrict attention to just a couple of these factors.

Reaching an equilibrium. The main requirement for reaching an equilibrium in learning is that the learning algorithms stop changing. This type of equilibrium can be very weak, as when one learning agent happens to select parameter values that cause another learning agent to stop adapting, and vice versa. Or both agents may simply stop adapting and "freeze" their solutions, even though those solutions may not be good ones. This type of equilibrium may also be weak because even the smallest perturbation can cause the system to adapt toward another solution. A stronger notion of equilibrium is a learned solution that is not easily changed by perturbing the system. We call such an equilibrium a stable solution.

Finally, not every learning process has an equilibrium. Since only certain types of learning processes and games produce these equilibria, the notion of a learning-based equilibrium is not as universal as the notion of a Nash equilibrium.

In evolutionary games, the two main factors that contribute to what is learned are:
1. The types of interactions that occur between the agents in a population.
2. The rules that are applied to determine which strategies within the population are fit, and therefore likely to be learned by the population.

Let's begin with an example. Suppose that we have two large, separate groups of agents (males and females) who will be playing the battle of the sexes game. Suppose that each of these two groups has a mix of agents that either always play cooperate (vote for what the other wants) or always play defect (vote for what they themselves want). One agent from each group, one male and one female, is selected at random; they each make their choice, and they receive the resulting reward.

Battle of the Sexes (payoffs listed as row player, column player):

             Defect    Coop
    Defect   1,1       3,2
    Coop     2,3       0,0

In these images, the x-axis represents the number of rounds that the game was played, and the y-axis represents the percentage of the female group (red circles) and of the male group (green squares) that play always cooperate. Note that the two graphs represent the two most common outcomes: all the females play always cooperate while all the males play always defect (top graph), or all the females play always defect while all the males play always cooperate (bottom graph). This should make some intuitive sense. If the two groups play a lot, they should learn to settle on one of the two Pareto optimal, Nash equilibrium solutions, but which solution is chosen depends on the initial make-up of the groups. For these simulations, the initial population was very close to 50/50, but with a small random perturbation towards either always defect or always cooperate for each group.
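As a rough sketch of this setup (the population size, pairing count, and update rule are illustrative assumptions, not the exact simulation behind the graphs; we arbitrarily treat the female as the row player), two populations of pure-strategy agents are randomly paired each round and then reweighted by the share of utility each strategy earned:

```python
import random

# Payoffs from the Battle of the Sexes table above:
# key = (female move, male move), value = (female payoff, male payoff)
PAYOFF = {('D', 'D'): (1, 1), ('D', 'C'): (3, 2),
          ('C', 'D'): (2, 3), ('C', 'C'): (0, 0)}

def play_round(females, males, pairs=2000):
    """Pick one female and one male at random per encounter; tally payoffs."""
    f_util = {'C': 0.0, 'D': 0.0}
    m_util = {'C': 0.0, 'D': 0.0}
    for _ in range(pairs):
        f, m = random.choice(females), random.choice(males)
        uf, um = PAYOFF[(f, m)]
        f_util[f] += uf
        m_util[m] += um
    return f_util, m_util

def reweight(pop, util):
    """Each strategy's share of the population becomes its share of the
    total utility earned in the last round (replicator-style update)."""
    total = util['C'] + util['D']
    if total == 0:
        return pop                       # no signal this round; keep as-is
    n_coop = round(len(pop) * util['C'] / total)
    return ['C'] * n_coop + ['D'] * (len(pop) - n_coop)

N = 100
start = N // 2 + random.choice([-2, 2])   # near 50/50 with a small nudge
females = ['C'] * start + ['D'] * (N - start)
males = ['C'] * (N - start) + ['D'] * start

for rnd in range(60):
    f_util, m_util = play_round(females, males)
    females, males = reweight(females, f_util), reweight(males, m_util)

print('female %coop:', 100 * females.count('C') / N)
print('male   %coop:', 100 * males.count('C') / N)
```

Running this repeatedly typically drives the two groups to opposite corners, with the small initial nudge determining which of the two equilibria wins out.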

Relative Fitness. When we look at the strategies: if 1/3 of the agents are playing strategy A and earning 1/3 of the total utility, they are getting exactly what they would expect, so they shouldn't change. HOWEVER, if 1/3 of the agents are earning 1/2 of the total utility of all players, they are playing better than the others, and we would do better with MORE agents like these over-achievers. But how many more? The simple approach is to reset the population so that the number of agents of each type exactly matches the percentage of total utility that type earned in the last round. When we are happy with the division (no under- or over-achieving group), we are done learning. A minimal sketch of this rule follows.
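Here is that rule in a few lines (the strategy names and utilities are made up for illustration):

```python
def next_shares(utility_by_strategy):
    """Map each strategy to its share of total utility; these shares
    become the strategy proportions for the next round."""
    total = sum(utility_by_strategy.values())
    return {s: u / total for s, u in utility_by_strategy.items()}

# A group earning half the utility grows; one earning exactly its
# current share holds steady.
print(next_shares({'A': 0.5, 'B': 1.0 / 3, 'C': 1.0 / 6}))
# -> roughly {'A': 0.5, 'B': 0.333, 'C': 0.167}
```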

Imitator Dynamics. Replicator dynamics with random pairings of agents is not the only model of evolution, so it is not the only learning model with some claim to justification. We will explore a different technique for selecting the proportion of strategies that evolve from one generation to the next, but first we need to explore other models for selecting which agents interact with each other.

Playing with Neighbors. In the previous section, agents were randomly paired with other agents from the group. From an evolutionary perspective, it sometimes makes more sense to assume that agents are paired with their neighbors rather than with any randomly chosen agent. This pairing with neighbors can be implemented in two ways:
1. Agents have some way to recognize one another. If an agent is randomly paired with another agent that it does not like, it can ask to be reassigned. The reassignment is random, but the agent at least gets one chance to reject an undesirable partner, and therefore gets more chances to interact with its friends.
2. Agents are physically arranged in a group. For example, agents may be arranged on a grid and restricted to interacting with their immediate neighbors, defined either as the agents to the N, S, E, and W, or as the agents to the N, NE, E, SE, S, SW, W, and NW. As another example, agents may be arranged around the perimeter of a circle, able to interact only with the agents to their left and right. A sketch of the grid arrangement follows.
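A small sketch of the grid arrangement (whether the grid wraps around at the edges is an assumption here, chosen to match the imitator-dynamics description below):

```python
# Von Neumann neighborhood: N, S, E, W.  The Moore neighborhood adds the
# four diagonals (NE, SE, SW, NW).
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]
MOORE = VON_NEUMANN + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def neighbors(row, col, rows, cols, offsets=MOORE):
    """Grid neighbors of (row, col), wrapping around the board edges."""
    return [((row + dr) % rows, (col + dc) % cols) for dr, dc in offsets]

# Example: the 8 Moore neighbors of the top-left cell of a 5x5 grid.
print(neighbors(0, 0, 5, 5))
```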

Standard evolutionary game (random interactions) -> all Defect. Modification, spatial games: interactions are no longer random, but with spatial neighbors. Scores are summed, and the player with the highest score among the nine shaded cells (the cell and its eight neighbors) takes the square (territory, food, mates) in the next generation. Some degree of cooperation evolves!

Imitator Dynamics. When agents can only play with their neighbors, we can introduce a different way (different from replicator dynamics) of selecting which strategies propagate to the next generation. One way to do this is for an agent to imitate its most successful neighbor. The algorithm goes something like this:
1. Interact with all of my neighbors (wrapping around the board as needed), and let all of my neighbors interact with their neighbors.
2. After the interactions are complete, identify the most successful strategy among my neighbors, unless my current strategy beat all of my neighbors (in which case I stick with my own strategy).
3. Change my strategy to the most successful neighbor's strategy -- imitate it -- on the next round.
Imitator dynamics can produce vastly different results than replicator dynamics, as the sketch below suggests.
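A minimal sketch of one generation of this algorithm (the payoff function is a placeholder to be supplied; a prisoner's dilemma payoff, for instance, would exhibit the spatial-game behavior described above):

```python
def imitate_step(grid, payoff):
    """One generation of imitator dynamics on a 2D grid of strategies.
    payoff(a, b) is strategy a's score against strategy b.  Each cell
    plays all 8 Moore neighbors (wrapping around the edges), then copies
    its best-scoring neighbor's strategy unless its own score beat them all."""
    rows, cols = len(grid), len(grid[0])
    moore = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
             (0, 1), (1, -1), (1, 0), (1, 1)]

    # Every cell interacts with all of its neighbors; sum the scores.
    score = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in moore:
                nr, nc = (r + dr) % rows, (c + dc) % cols
                score[r][c] += payoff(grid[r][c], grid[nr][nc])

    # Each cell imitates its most successful neighbor, but only if that
    # neighbor strictly beat the cell's own score.
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            best_r, best_c = r, c
            for dr, dc in moore:
                nr, nc = (r + dr) % rows, (c + dc) % cols
                if score[nr][nc] > score[best_r][best_c]:
                    best_r, best_c = nr, nc
            new[r][c] = grid[best_r][best_c]
    return new
```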

Battle of the Sexes. Suppose we have 12 agents and four strategies (as described in the homework). Suppose, initially, that there are equal numbers of each type of agent. If the strategies are equally good, we would expect each type of agent to do equally well.

Relative fit. We don't need to worry about computing expected utility, because we will measure actual utility. K times, we randomly select two players and have them compete; we use gamma to decide how many times the interaction repeats (since tit-for-tat strategies require repeated play with the same agent). We then compute the average utility each agent earned per single interaction, across all of the interactions it had. We don't want to be biased by how long an interaction continues or by how many times a player was selected to play; thus, we work with average utility earned. We pick K to be a large number so that each player gets to play many times and the average is representative. This matters because a score of 2.2 averaged over 10 games is not as certain as the same score averaged over many more games. A sketch of this sampling procedure follows.
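Here is one way to sketch that procedure (the values of K and gamma and the play function are illustrative assumptions; gamma is interpreted as the probability of repeating the current pairing one more time):

```python
import random

def estimate_average_utility(agents, play, K=10000, gamma=0.9):
    """Randomly pair agents K times.  Each pairing repeats with
    probability gamma (a geometric number of repetitions, so tit-for-tat
    gets repeated play with the same partner).  Returns each agent's
    average utility per single interaction."""
    total = {a: 0.0 for a in agents}
    count = {a: 0 for a in agents}
    for _ in range(K):
        a, b = random.sample(agents, 2)      # one random pairing
        while True:
            ua, ub = play(a, b)              # one interaction between a and b
            total[a] += ua; count[a] += 1
            total[b] += ub; count[b] += 1
            if random.random() > gamma:      # stop with probability 1 - gamma
                break
    return {a: total[a] / count[a] for a in agents if count[a]}
```

Averaging per interaction, rather than summing, is what removes the bias from interaction length and from how often a player happened to be selected.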

Redistributing. Suppose after the first round we see the following average utilities:

    Agent   Strategy   Utility
    1       A          2
    2       B          2.1
    3       C          1.7
    4       D          3
    5       A          2
    6       B          1
    7       C          0.4
    8       D          1.5
    9       A          3
    10      B          1.2
    11      C          1.6
    12      D          1.8

To find relative fitness, we add up the total utility earned by all agents of the same type:

    Strategy   Total utility   Percent of total
    A          7.0             33%
    B          4.3             20%
    C          3.7             17%
    D          6.3             30%

The total for all agents is 21.3. Notice that agents of strategy A should be about 33% of the agents in the new round (up from 25% originally), while agents of type C should be only about 17% in the next round. The arithmetic is checked in the sketch below.
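The arithmetic, checked in a few lines (numbers taken from the table above):

```python
utilities = {                      # per-agent average utilities, by strategy
    'A': [2, 2, 3], 'B': [2.1, 1, 1.2],
    'C': [1.7, 0.4, 1.6], 'D': [3, 1.5, 1.8],
}
totals = {s: round(sum(v), 1) for s, v in utilities.items()}
grand = round(sum(totals.values()), 1)               # 21.3
shares = {s: t / grand for s, t in totals.items()}   # fraction of total utility
counts = {s: round(12 * p) for s, p in shares.items()}

print(totals)                                        # A: 7, B: 4.3, C: 3.7, D: 6.3
print({s: f'{p:.0%}' for s, p in shares.items()})    # A: 33%, B: 20%, C: 17%, D: 30%
print(counts)                                        # A: 4, B: 2, C: 2, D: 4
```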

So in the next round, we adjust the numbers of each type of agent to match those percentages:

    Strategy   Percent   Number of agents
    A          33%       4
    B          20%       2
    C          17%       2
    D          30%       4

After the new round, we might see:

    Agent   Strategy   Utility
    1       A          2
    2       B          2.1
    3       C          2
    4       D          3
    5       A          2
    6       B          1
    7       C          0.4
    8       D          1.5
    9       A          2.9
    10      D          1.2
    11      A          2.1
    12      D          2

As we continue, what we want to show is how the percentage of each type of agent changes over time.

Over time, the percentages could vary as shown below. [Chart: percentage of the population over time, one line per strategy A, B, C, D.]

Using Excel's Chart Wizard, we can visualize the results.
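For readers working outside Excel, a minimal matplotlib sketch (with made-up trajectory data; round 1 matches the worked example above) produces the same kind of chart:

```python
import matplotlib.pyplot as plt

# Illustrative made-up trajectories: percent of the population per strategy.
rounds = range(6)
history = {'A': [25, 33, 38, 45, 52, 58], 'B': [25, 20, 17, 14, 10, 8],
           'C': [25, 17, 13, 9, 6, 4], 'D': [25, 30, 32, 32, 32, 30]}

for strategy, percents in history.items():
    plt.plot(rounds, percents, marker='o', label=strategy)
plt.xlabel('Round')
plt.ylabel('Percent of population')
plt.legend()
plt.show()
```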