Evaluation Through Conflict Martin Zinkevich Yahoo! Inc.


Who was I
Worked with U Alberta Computer Poker Research Group
– Designed Counterfactual Regret Algorithm
– Theory behind DIVAT
Worked on AAAI Computer Poker Competition
– 2006 as lead programmer, 2007 as chair
Work used in Man Vs Machine

Who am I
Run the Lemonade Stand Game Competition
Work with Yahoo Anti-Abuse Team

AAAI Computer Poker Competition
5 years running
Now the ANNUAL Computer Poker Competition
Latest: 11 universities et al.

Competitions: Science vs Entertainment

AAAI Computer Poker Competition
May The Best Program Win!
And Win Again IF WE PLAYED AGAIN!

Head to Head VS for 1000 hands

All Combinations
  --       7,-7    10,-10
 -7,7       --      5,-5
-10,10    -5,5       --

OK, But Who Won?
Online: Maximize total winnings
Equilibrium: Maximize number of people I can win money from (or don’t lose against)
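The two notions of winning can be compared on the head-to-head numbers from the All Combinations slide. The sketch below is only an illustration: the player labels A, B and C are hypothetical (the slide showed logos that did not survive in the transcript), and the function names are made up for this example.

```python
# Head-to-head results read off the "All Combinations" slide.
# winnings[p][q] is what player p won against player q.
# The labels "A", "B", "C" are hypothetical placeholders.
winnings = {
    "A": {"B": 7, "C": 10},
    "B": {"A": -7, "C": 5},
    "C": {"A": -10, "B": -5},
}


def total_winnings(results):
    """'Online' criterion: sum of winnings over all opponents."""
    return {player: sum(vs.values()) for player, vs in results.items()}


def opponents_not_lost_to(results):
    """'Equilibrium' criterion: number of opponents beaten or tied."""
    return {
        player: sum(1 for won in vs.values() if won >= 0)
        for player, vs in results.items()
    }


online = total_winnings(winnings)
equilibrium = opponents_not_lost_to(winnings)
```

With these particular numbers both criteria rank the players in the same order; they can disagree when one program wins large amounts from weak opponents but loses small amounts to strong ones.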

Why a New Competition?
Computing Equilibria ✓
Choosing Equilibria ?

Bach or Stravinsky
2,1   0,0
0,0   1,2
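Why choosing an equilibrium is a problem can be seen by enumerating the pure-strategy equilibria of this game. The sketch below assumes the standard Bach or Stravinsky payoffs (2,1 when both pick Bach, 1,2 when both pick Stravinsky, 0,0 otherwise); the action labels and function name are chosen for this example.

```python
from itertools import product

ACTIONS = ("Bach", "Stravinsky")

# PAYOFFS[(row, col)] = (row player's payoff, column player's payoff).
# Assumed to be the standard Bach or Stravinsky matrix.
PAYOFFS = {
    ("Bach", "Bach"): (2, 1),
    ("Bach", "Stravinsky"): (0, 0),
    ("Stravinsky", "Bach"): (0, 0),
    ("Stravinsky", "Stravinsky"): (1, 2),
}


def pure_nash_equilibria(payoffs, actions):
    """Return action pairs where neither player gains by deviating alone."""
    equilibria = []
    for row, col in product(actions, repeat=2):
        row_payoff, col_payoff = payoffs[(row, col)]
        # Row player cannot do better by switching rows.
        row_ok = all(payoffs[(r, col)][0] <= row_payoff for r in actions)
        # Column player cannot do better by switching columns.
        col_ok = all(payoffs[(row, c)][1] <= col_payoff for c in actions)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


equilibria = pure_nash_equilibria(PAYOFFS, ACTIONS)
```

The check covers pure strategies only. It finds two equilibria, each favoring a different player, so computing equilibria does not by itself tell the players which one to play.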

Big Question: How Do (or Would) People Get to Nash Equilibria?

Solvable Games $

Unsolvable Games ∞ $ ?

An Old Idea
Think about learning in the presence of other intelligent agents.
Prove cool stuff about your learning algorithm given:
– constraints about the adversary
– constraints about the game

Solving the Unsolvable
In current competitions, people are often applying techniques that are effective in solvable games, even when the game is not solvable.
In what competitions is it useless to approximate the game as solvable?

Axelrod’s Iterated Prisoner’s Dilemma
A competition between many competitors.
One entry: tit-for-tat (Anatol Rapoport)
– Nice (initially)
– Retaliating
– Forgiving
– Non-envious
Learned that cooperation has value, but:
– Cooperate with whom?
– How do we cooperate?
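A minimal sketch of tit-for-tat in an iterated prisoner's dilemma follows. The payoff values (5 for temptation, 3 for mutual cooperation, 1 for mutual defection, 0 for the sucker) are commonly used ones and are an assumption here, since the slide does not state them; the function names are also made up for this example.

```python
# Assumed payoffs, not taken from the slide:
# PAYOFF[(my_move, their_move)] = (my_payoff, their_payoff)
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(my_history, their_history):
    """Cooperate first (nice), then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]


def always_defect(my_history, their_history):
    """Baseline opponent that never cooperates."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the repeated game and return both total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b
```

Under these assumed payoffs, tit-for-tat never outscores its opponent in a single match (the non-envious property): it scores less than a pure defector head to head, but two tit-for-tat players both do well by cooperating throughout.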

The Lemonade Stand Game

What Is The Lemonade Stand Game?
Every round for 100 rounds:
– each person selects an action privately
– then, the actions are revealed
The score of a player is the distance clockwise to the next player plus the distance counterclockwise.
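The scoring rule can be sketched in a few lines. Two things below are assumptions rather than statements from the slide: the circle has 12 positions, and ties are resolved by splitting (two players on the same spot get 6 each and the third gets 12; all three on the same spot get 8 each). The function name is made up for this example.

```python
from collections import Counter


def lemonade_scores(positions, n=12):
    """Score one round for three players on a circle of n spots.

    positions: the three chosen spots, each in range(n).
    Assumes n = 12 and split scoring for ties (see the lead-in).
    """
    counts = Counter(positions)
    if len(counts) == 1:
        # Everyone on the same spot: split the total evenly.
        return [2 * n / 3] * 3
    if len(counts) == 2:
        # Two players share a spot and split; the loner gets the rest.
        return [n / 2 if counts[p] == 2 else n for p in positions]
    scores = []
    for i, p in enumerate(positions):
        others = [q for j, q in enumerate(positions) if j != i]
        clockwise = min((q - p) % n for q in others)
        counterclockwise = min((p - q) % n for q in others)
        scores.append(clockwise + counterclockwise)
    return scores
```

For example, `lemonade_scores((0, 6, 3))` shows a sandwich: the players at 0 and 6 sit on either side of the player at 3, who ends up with the lowest score. Under these assumptions every round's scores add up to 24, so the average score per player is 8.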

Key Observations
A constant-sum game between 3 players.
– For every gain, someone has to lose.
Possibilities For Cooperation
– Opposite sides of the circle, “sandwiching”
Not a “Solvable Game” (Nash, 1951)
– Playing equilibrium strategies is not advisable
Easy To Set “Table Image”
– The constant strategy often evokes cooperative behavior
Existing Techniques Fail
– Experts algorithms lose to constant strategy
Strategy #1: Play Constant
Strategy #2: Play Opposite
Strategy #3: Sandwich

Competition Structure
Every set of three players played 100 rounds 180 times (1.5 million rounds total)
Highest Total Score Wins
Mean, Standard Error can be calculated
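The round count can be checked with a short calculation. This sketch assumes that each of the 9 teams named on the Competitors slide counts as one entry, so that "every set of three players" means every 3-entry subset of 9; the function names are made up for this example, and the standard error shown is the usual sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics


def total_rounds(entries=9, per_match=3, repeats=180, rounds=100):
    """Rounds played when every 3-entry subset plays `repeats` matches.

    entries=9 is an assumption based on the Competitors slide.
    """
    return math.comb(entries, per_match) * repeats * rounds


def mean_and_standard_error(samples):
    """Mean of per-match scores and the standard error of that mean."""
    mean = statistics.fmean(samples)
    standard_error = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, standard_error
```

With 9 entries there are 84 distinct triples, and 84 triples times 180 matches times 100 rounds gives 1,512,000 rounds, which matches the slide's figure of about 1.5 million.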

Competitors
28 players, 9 teams
– University of Southampton/Imperial College London (Soton)
– Yahoo! Inc. (Pujara)
– Rutgers University (RL3)
– Brown University (Brown)
– Carnegie Mellon (2 teams: Waugh, ACTR)
– University of Michigan (FrozenPontiac)
– Princeton University (Schapire)
– (Greg Kuhlmann)

Competition Results
[Chart. Axes: Competitor, Score Per Round]

Results
[Chart. Axes: Competitor, Score Per Round-8. Includes Modified Constant and Uniformly Random]

Restricting to Top 6
[Chart. Axes: Competitor, Score Per Round-8]

Restricting to Top 4

Teach Simply! EQUILIBRIUM FREE

Learn

The High Level
Phenomenal Intelligence: the observed behavior used by a set of people at a point in time for some task.

Lofty Goals
Phenomenal Intelligence: the observed behavior used by a set of people at a point in time for some task.
– behavior: a fully specified strategy.
– used: actually leveraged

Practical Concessions
Phenomenal Intelligence: the observed behavior used by a set of people at a point in time for some task.
– Not any intelligent agent
– Not any time (people change)
– Not any task (context matters)

Thank You