An Introduction to Game Theory


An Introduction to Game Theory Presented as an undergraduate class in Multimedia Mathematics Paul Trafford paul.trafford@stx.oxon.org 6 July 2011 This presentation was originally delivered to 4th Year Management undergraduates at Gakushuin University, Tokyo.

PART A: Basic Concepts

Let’s Play a Game! Description: Bank has up to £1,000 to give away to the person or persons who choose the highest number. Players: Each individual student or group. Objective: To win as much as possible.  Rules. No communication between the players Choose a number N >= 1 and write it down on a piece of paper along with student/group name. The student(s) who chooses max. value of N wins total of £1,000/N Idea of this game in slides 2 and 3 is to indicate the distinction between non-cooperative and co-operative games, and how the latter can radically improve the returns.

Let’s Play the Game Again! Description: Bank has up to £1,000 to give away to the person or persons who choose the highest number. Players: Each individual student or group. Objective: To win as much as possible. Rules. Communication allowed between the players. Choose a number N >= 1 and write it down on a piece of paper along with student/group name. The student(s) who chooses max. value of N wins total of £1,000/N. A. Everyone can agree to write down 1, and they would then share the full £1,000 … This is an example of a cartel. But it is unstable: if someone changes their mind at the last minute … ? Consider if this game were repeated in real life … issues of trust – short term gains, long term losses.

What is Game Theory? Definition of Game Theory The analysis of competitive situations (or situations of conflict) using mathematical models Essential Terminology The way a game is played depends on strategy – a plan of action before the game begins. A solution is the adoption of a strategy that yields a particular outcome. Compare “solving” environmental problems with “solving” an equation. Another definition is in terms of situations where there is conflict, but this emphasizes oppositional tendencies as it is derived from Latin, literally meaning “strike” or “hit” together, as in war, but actually competitions may not have such opposition (as we will discuss in topic on cooperation).

Characteristics of Game Theory What is it about? Fundamentally about the study of decision-making Investigations are concerned more with choices and strategies than ‘best’ solutions. It seeks to answer the questions: What strategies are there? What kinds of solutions are there? Examples: Chess, Go, economic markets, politics, elections, family relationships, etc.

History (1) The study of games is many centuries old. More systematic developments in Game Theory took place in the first half of the 20th Century. Main Founders John Von Neumann (mathematician) Oskar Morgenstern (economist) Image sources: Los Alamos National Laboratory, http://www.lanl.gov/history/atomicbomb/images/NeumannL.GIF and American Mathematical Society, http://www.ams.org/samplings/feature-column/fcarc-rationality

History (2) Main publication: von Neumann & Morgenstern: Theory of Games and Economic Behaviour. Princeton University Press, 1944. Goal: Application of mathematical methods to broadly analyse games A new scientific approach to the study of economics. Applications: Aided by computers, theory has been broadly applied in large-scale operations such as international trade. (Philosophical) Assumptions: A certain predictability concerning human rationality…? A somewhat narrow definition of rationality?

Game Theory is inter-disciplinary Economics Mathematics Psychology Game Theory

What makes a Game? Elements in a Game One or more players – participants, each may be an individual, a group or organisation, a machine, and so on. One or more moves (or choices) – where a move is an action carried out during the game, including chance moves (when “nature plays a hand”) as in the toss of a coin. A set of outcomes – where an outcome is the result of the completion of one or more moves [e.g. game of chess may end in checkmate or a draw] Payoff – an amount received for a given outcome. Finally, a set of rules which specify the conditions for the players, moves, outcomes and payoffs.

Strategy How should one play the game? Definition: A strategy is a plan of action by which a player has a decision rule to determine their set of moves for every possible situation in a game. A strategy is said to be pure if at every stage in the game it specifies a particular move with complete certainty. A strategy is said to be mixed if it applies some randomisation to at least one of the moves. For each game, there are typically multiple pure strategies. Note that the randomisation is a set of fixed probabilities, where the sum of the probabilities is 1. Strategy depends on the objective.

Strategy: Travel Example In this ‘game’ a ‘player’ is a commuter who is returning home from work – their objective is to return home as soon as possible. They can choose between train, bus and subway. The first choice is ‘catch the train’, the second choice is ‘catch the bus’ and so on. • A commuter who always chooses to catch the train is following a ‘pure’ strategy. • A commuter who sometimes picks the train and sometimes the bus is following a ‘mixed’ strategy. Question: is this a one player game? Consider the traffic, the weather, … Comment: in practice, for complex games, it is not possible to determine a complete strategy. Photo credit: Nyao148 : Mejiro railway station http://en.wikipedia.org/wiki/File:Mejiro-Sta.JPG

Types of Games (1): Co-operative vs. Non co-operative Games Our first game (slide 2): non-cooperative Our second game (slide 3): cooperative Cooperation generally may lead to higher payoffs. Further Examples: Countries cooperate on trade (reduced tariffs) leading to boost in exports Two leading national social networking sites share technical knowledge and keep out an overseas competitor. Cartel: formation of monopoly by multiple organisations. For repeated games, the level of cooperation may change and payoffs fall!

Types of Games (2): Perfect vs. Imperfect Information A game is said to have perfect information if all the moves made so far in the game are known to the players when they make their move. Otherwise, the game has imperfect information. A large class of games of imperfect information are simultaneous games: games in which all players make their moves at the same time without knowing what the others will play. (The decisions may be made beforehand, but are not communicated.) A game is said to be deterministic if there are no chance moves. Otherwise, the game is non-deterministic.

A selection of games Go, Bridge, Ludo, Draughts, Scissors-Paper-Stone (jan-ken), Chess Monopoly, Noughts and Crosses (Tic-tac-toe), Scrabble Photo credits: Morten Johannes Ervik [Go], Jose Daniel Martinez [Chess], William Hartz (Scrabble), David ten Have (Ludo), WikiJET (Janken), Cyron Ray Macey (Tic Tac Toe), Dayland Shannon (Monopoly), Denise Griffin (Bridge), Steve Snodgrass (Draughts)

Imperfect Information How to classify? There are a number of [orthogonal] criteria that may be used as the basis for classifying games. A common one uses two: im/perfect information and chance/not chance. Perfect Information Imperfect Information Non-deterministic (Chance moves) ? Deterministic (No chance moves)

Classification of games: Perfect Information Imperfect Information Chance Moves No chance moves For each game, can ask class to choose which box. Photo credits: Morten Johannes Ervik [Go], Jose Daniel Martinez [Chess], William Hartz (Scrabble), David ten Have (Ludo), WikiJET (Janken), Cyron Ray Macey (Tic Tac Toe), Dayland Shannon (Monopoly), Denise Griffin (Bridge), Steve Snodgrass (Draughts)

Zero vs. Non-Zero-Sum Games One of the most important classifications. A game is said to be zero-sum if wealth is neither created nor destroyed among the players (zero-sum: the total wealth is a constant). A game is said to be non-zero-sum if wealth may be created or destroyed among the players (i.e. the total wealth can increase or decrease). All examples above are zero-sum because they are competitive leisure games. However, most real-life situations are non-zero-sum (as indicated, for example, by how economies can grow).

PART B: Zero-Sum Games and Extended Form

1- Person Game: Tomato Plants (1) There are many 1 person games – including popular card games called ‘Patience’. They are instructive in decision-making. Example: Growing tomato plants…! Photo credit: Manjith Kainickara http://www.fotopedia.com/items/flickr-1061718736

1- Person Game: Tomato Plants (2) Objective: Grow a healthy tomato plant! Rules. One must make at least one move – plant a seed. Afterwards, one can make any number of moves:
Player’s Moves: Water plant; Add fertiliser; Communicate with plant; Place in sunlight; Shelter plant.
Chance Moves: It rains; It is stormy (heavy rain and wind); It is sunny; There is frost.
Photo credit: Manjith Kainickara http://www.fotopedia.com/items/flickr-1061718736

1- Person Game: Tomato Plants (3)
Outcomes: Plant doesn’t grow; Plant grows, but has no fruit; Plant grows, but has sour fruit.
Payoffs: Big ripe tomatoes; Small ripe tomatoes; No tomatoes; … etc.
How to Model? Photo credit: Manjith Kainickara http://www.fotopedia.com/items/flickr-1061718736

2- Person Game: Simple Nim (Also called the ‘subtraction game’) Rules Two players take turns removing objects from a single heap or pile of objects. On each turn, a player must remove exactly one or two objects. The winner is the one who takes the last object Demonstration: http://education.jlab.org/nim/index.html

Simplified Nim: winning strategy: proof Lemma: Suppose that Players A and B are playing the Nim subtraction game where at each move a player can remove between 1 and c counters, then a player has a winning strategy if they can play a move that leaves k(c+1) counters. Proof We prove this for Player A (1) Base Case (k=1): Suppose A leaves c+1 counters, then B has to choose to remove x:1≤x≤c. This implies that there are y = c+1-x left, where 1 ≤ y ≤ c. Then A chooses y and wins.

Simplified Nim: proof (2) (2) Inductive step: Assume the statement is true for k=n (n≥1), i.e. if Player A leaves n(c+1) counters, then Player A wins. Suppose A leaves (n+1)(c+1) counters, i.e. nc+n+c+1. If B chooses x:1≤x≤c, this leaves nc+n+c+1-x. Then A chooses c+1-x, leaving n(c+1), and so A wins by the inductive assumption. (3) Completion of proof by induction: Thus if the case k=n is true, then so is the case k=n+1. We have shown that the base case k=1 is true, so the statement is true for k=2,3,… and so on. The Lemma is thus proved by induction for all values of k.

Simple Nim: Another go? Rules Two players take turns removing objects from a single heap or pile of objects. On each turn, a player must remove exactly one or two objects. The winner is the one who takes the last object Strategy Leave a multiple of 3. Demonstration: http://education.jlab.org/nim/index.html
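The "leave a multiple of 3" rule (more generally, a multiple of c+1) can be sketched in Python as follows. This is only an illustrative sketch under the rules above (last object wins, 1 to c objects per turn); the function names `winning_move` and `first_player_wins` are hypothetical, and the second function is a brute-force check of the lemma rather than part of the strategy itself.

```python
from functools import lru_cache


def winning_move(counters, c=2):
    """Return how many counters to take so that a multiple of c+1 is left.

    Returns None when the heap is already a multiple of c+1, in which case
    no legal move (taking 1..c counters) can leave a multiple of c+1.
    """
    remainder = counters % (c + 1)
    return remainder if remainder != 0 else None


@lru_cache(maxsize=None)
def first_player_wins(counters, c=2):
    """Brute-force check: can the player about to move force a win?"""
    if counters == 0:
        return False  # the opponent has just taken the last object
    return any(
        not first_player_wins(counters - take, c)
        for take in range(1, min(c, counters) + 1)
    )


if __name__ == "__main__":
    for heap in range(1, 10):
        print(heap, winning_move(heap), first_player_wins(heap))
```

For every heap size, the brute-force search should agree with the lemma: the player to move can force a win exactly when `winning_move` returns a number.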

2- Person Game: Traditional Nim (General form) Rules Two players take turns removing objects from distinct heaps or piles of objects. On each turn, a player must remove at least one object, and may remove any number of objects provided they all come from the same heap. Strategy: “To find out which move to make, let X be the Nim-sum of all the heap sizes. Take the Nim-sum of each of the heap sizes with X, and find a heap whose size decreases. The winning strategy is to play in such a heap, reducing that heap to the Nim-sum of its original size with X.” - Wikipedia entry 6/2011. The “Nim-sum” (⊕) is the exclusive OR (XOR) of the heap sizes, applied successively. Robtex http://www.robtex.com/frames.htm#http://www.robtex.com/robban/nim1.htm Count the matches left to right and click on the next one to remove that and all the rest to the right.
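The quoted Nim-sum rule can be sketched in a few lines of Python. The sketch assumes, as in the simple game earlier, that the player who takes the last object wins; the function name `nim_move` is illustrative only.

```python
from functools import reduce
from operator import xor


def nim_move(heaps):
    """Return (heap_index, new_size) for a winning move, or None.

    X is the Nim-sum (bitwise XOR) of all heap sizes. If X is 0 there is
    no move that restores a zero Nim-sum, so None is returned.
    """
    x = reduce(xor, heaps, 0)
    if x == 0:
        return None
    for index, size in enumerate(heaps):
        target = size ^ x  # Nim-sum of this heap's size with X
        if target < size:  # the heap size decreases, so the move is legal
            return index, target
    return None  # not reached when x != 0


if __name__ == "__main__":
    print(nim_move([3, 4, 5]))  # (0, 1): reduce the first heap from 3 to 1
    print(nim_move([2, 2]))     # None: the Nim-sum is already 0
```

After the suggested move the Nim-sum of the heaps is 0, which is the position the strategy aims to hand to the opponent each turn.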

Games in Extensive Form: Modelling by Trees We may model the set of states in a game by using a tree with nodes and edges – called extensive form. Gambit is a set of software tools for doing computation on finite, non-cooperative games. It provides tree representations. Project founded in the mid-1980s by Richard McKelvey at the California Institute of Technology, USA. [ Gambit Web site: http://www.gambit-project.org/ ]

Gambit Example: Tree for Nim (2,2) We may model the set of states in a game by using a tree with nodes and edges. E.g. (2,2) game: Demonstration.

PART C: Zero-Sum Games in Normal Form

Introducing 2 person games in Normal Form We represent the players by Player A and Player B (or simply A and B) and denote the moves they can make as A1, A2, …, An and B1, B2, …, Bm respectively. These moves are made simultaneously, so these are games of imperfect information. We represent the game in normal form, i.e. using payoff matrices, where the value of each cell (i,j) is the payoff corresponding to the moves Ai and Bj respectively.

Normal Form: example of 2*2 game In the following example, we treat the special case where each player has 2 moves. (Note the payoffs are the values that will be given to Player A.) Each row or column of payoffs is called an imputation. Player A has two moves: A1 and A2. Player B has two moves: B1 and B2. The payoff for a game is given by the intersection. Thus if the moves are respectively A1 and B2, then the payoff is zero.
        B1    B2
  A1     2     0
  A2     4    -2
As this is a zero-sum game, it means whenever there is a value > 0 for Player A, there is a negative value for Player B and conversely.

Solutions of 2 person games A solution is expressed as a set of strategies for all players that yields a particular payoff, generally the optimal payoff for both players. This payoff is called the value of the game. Suppose, for example, each player adopts the strategy of choosing the move whose imputation contains the cell with the maximum payoff for that player.
        B1    B2
  A1     2     0
  A2     4    -2
Here, player A picks A2 as it contains a ‘4’, whereas player B selects B2 as it contains -2. This would yield 2 for player B. However, this is not a solution as it is not optimal for player A – they could always do better by playing A1. So the value of the game is > -2.

The Concept of Equilibrium (Pure Strategies) 1/2 So what strategies may yield optimal payoffs for both? Key concept: In an equilibrium, each player of the game has adopted a strategy that cannot improve his outcome, given the others' strategy. The method for this is: Player A considers each imputation and what is the least payoff value that may be gained by choosing that imputation. Similarly, Player B considers each imputation and what is the greatest payoff value that may be gained by choosing that imputation.

The Concept of Equilibrium (Pure Strategies) 2/2 Formally, this is the maximin criterion given by
$v_L = \max_i \min_j e_{ij}$, $v_M = \min_j \max_i e_{ij}$
(where $e_{ij}$ denotes the payoff in cell (i,j)). Example:
        B1    B2    vL
  A1     1     4     1
  A2     3     2     2*
  vM     3*    4
So, player A can expect to gain at least vL = 2. Player B can expect to lose at most vM = 3.

The Concept of Equilibrium (Pure Strategies): Saddle Points In the case that the value of the game is vL = vM, a saddle point is any cell whose payoff is this value. It is the simplest form of equilibrium. Example:
        B1    B2    vL
  A1     2     0*    0
  A2     4    -2    -2
  vM     4     0
Playing A1 => payoff of at least 0. Playing B2 => payoff of no more than 0. There is a unique saddle point – cell (A1,B2). If either player deviates from this, then they will do worse. Here, vL = vM = 0.
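The maximin and minimax values, and the test for a saddle point, can be sketched in Python as below. This is an illustrative sketch only: the function names are hypothetical, payoffs are given from Player A's point of view, and a saddle point is taken here (as an assumption, slightly stricter than the wording above) to be a cell that is the minimum of its row and the maximum of its column.

```python
def security_levels(payoffs):
    """Return (v_L, v_M) for a payoff matrix given from Player A's side.

    v_L = max over rows of the row minimum (A's guaranteed gain).
    v_M = min over columns of the column maximum (B's guaranteed cap on loss).
    """
    row_minima = [min(row) for row in payoffs]
    column_maxima = [max(column) for column in zip(*payoffs)]
    return max(row_minima), min(column_maxima)


def saddle_points(payoffs):
    """Return all (row, column) index pairs that are saddle points."""
    v_lower, v_upper = security_levels(payoffs)
    if v_lower != v_upper:
        return []
    return [
        (i, j)
        for i, row in enumerate(payoffs)
        for j, value in enumerate(row)
        if value == min(row) and value == max(r[j] for r in payoffs)
    ]


if __name__ == "__main__":
    print(security_levels([[1, 4], [3, 2]]))   # (2, 3): no saddle point
    print(saddle_points([[2, 0], [4, -2]]))    # [(0, 1)], i.e. cell (A1, B2)
```

Indices are zero-based, so `(0, 1)` corresponds to the cell (A1, B2).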

When there is no Saddle Point Consider again the following payoff matrix:
        B1    B2
  A1     1     4
  A2     3     2
We have seen above that the value of the game lies between 2 and 3. But, if player A always plays A2, then B can always play B2 and the payoff is 2, whereas if player A always plays A1, then B can always play B1, yielding 1, which is less than 2! Can player A gain more than 2…? Yes, because the game is of imperfect information – players don’t know each other’s move, but this means that we should not be predictable.

Simplification using Dominance For larger matrices, we may often simplify. The main technique for simplification is to compare pairs of columns, C and C’, say, and delete those columns where the payoff in C is always greater than that in C’ or vice versa. In this case we say C dominates C’. (Similarly for rows.) Remember that the payoffs are given for player A and signs must be reversed when evaluated for Player B. B1 B2 B3 B4 B5 A1 4 5 6 1 A2 3 Thus, B4 dominates B1, B3 and B5, yielding:
        B2    B4
  A1     5     1
  A2     3     3*
This matrix yields a saddle point corresponding to the moves A2 and B4, with value of the game = 3.

Simplification using Dominance: Demonstration in Gambit notes: (i) right click col/row label deletes that col/row – to add rows, click on table icon next to avatar (ii) to resize columns and rows, drag towards right of cell

Mixed Strategies: Expectation 1/2 Scenario: Game is played repeatedly. In this case choosing the same pure strategy is not always optimal, so we can vary these pure strategies. To determine how we vary the strategies, we can apply probability theory. Key concept is Expectation := the product of the probability of the occurrence of an event and the value associated with the occurrence of a given event. A player can use a mixed strategy – this is more than one pure strategy, where each pure strategy is played randomly according to a fixed probability yielding an expected payoff.

Mixed Strategies: Expectation 2/2 We then can determine the expected value of a game. Formally, as before, we denote the moves available to Players A and B as A1, A2, …, An and B1, B2, …, Bm respectively. Suppose the moves in A’s mixed strategy are played with probabilities x = (x1, x2, …, xn); and for B, y = (y1, y2, …, ym). Suppose the payoffs are given by $e_{ij}$. Then the game’s expected value for A is E(x), where
$E(\mathbf{x}) = \sum_{i,j} x_i y_j e_{ij}$
The same expectation applies to Player B, as the amount that B expects to pay to A.

Mixed Strategies: Expectation: Examples The moves available to Player A are A1, A2 and to Player B: B1, B2. Suppose the moves in A’s mixed strategy are played with probabilities x = (x1, x2); and for B, y = (y1, y2). Suppose the payoffs are given by $e_{ij}$:
        B1    B2
  A1     1     4
  A2     3     2
Then the game’s expected value is E(x) = x1y1*1 + x1y2*4 + x2y1*3 + x2y2*2. So what should the values be for x and y … ?
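As a small illustration, the expectation formula can be evaluated directly in Python. This is a sketch only; the function name `expected_payoff` and the sample probabilities are assumptions chosen for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def expected_payoff(x, y, payoffs):
    """Expected payoff to Player A: the sum over all cells of x_i * y_j * e_ij."""
    return sum(
        x[i] * y[j] * payoffs[i][j]
        for i in range(len(x))
        for j in range(len(y))
    )


M = [[1, 4], [3, 2]]
half = Fraction(1, 2)

if __name__ == "__main__":
    # Both players mix evenly between their two moves.
    print(expected_payoff([half, half], [half, half], M))  # 5/2
    # A pure strategy is the special case of probabilities 0 and 1.
    print(expected_payoff([0, 1], [0, 1], M))  # 2
```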

Minimax: The Concept of Equilibrium for Mixed Strategies Minimax is one of the key theories developed by Von Neumann and Morgenstern, who originally defined it only for zero-sum games. Rationale: whatever the other player does, this return is assured on average. Theorem. In a two-person zero-sum game where player A has n strategies and player B has m strategies (where n and m are finite), the minimax value of the game, v, is given by:
$v = \max_{\mathbf{x} \in X} \min_{\mathbf{y} \in Y} e(\mathbf{x},\mathbf{y}) = \min_{\mathbf{y} \in Y} \max_{\mathbf{x} \in X} e(\mathbf{x},\mathbf{y})$
(The saddle point is a special case where xi = 1 for some i, yj = 1 for some j.) Thus the solution is to play moves in fixed proportion x and the value can be determined by simply considering the expectation against any single move.

Minimax: Determination of the Mixed Strategies Determination of x and y: (1) Determine if there are any saddle points. If found then we have the solution and can stop here. (2) Remove all dominated imputations (rows/columns), leaving a payoff matrix M. (3) For Players A and B, solve $M^T \mathbf{x} = \mathbf{v}$ and $M \mathbf{y} = \mathbf{v}$ respectively, where $\mathbf{v}$ is a vector in which each entry is v, the value of the game, and the probabilities in each of x and y sum to 1. ($M^T$ is the transpose of M.)

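Minimax Example (1/4)
        B1    B2
  A1     1     4
  A2     3     2
$M = \begin{pmatrix} 1 & 4 \\ 3 & 2 \end{pmatrix}$, $M^T = \begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix}$
There are no saddle points, and no cases of dominance.
Player A: $\begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} v \\ v \end{pmatrix}$
Player B: $\begin{pmatrix} 1 & 4 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} v \\ v \end{pmatrix}$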

Minimax Example (2/4): Player A’s mixed strategy Let x := the probability Player A plays A1. Let y := the probability Player A plays A2. Then x+y=1, and
(1) 1*x + 3*y = v
(2) 4*x + 2*y = v
Therefore, from (1), x=v-3y. Substitute in (2) to give: 4(v-3y)+2y=v. Therefore, 3v=10y. Hence, 3x = 3(v-3y) = 10y-9y = y. Since x+y=1, it follows that x=0.25, y=0.75 and v=2.5.
        B1    B2
  A1     1     4
  A2     3     2

Minimax: Example (3/4): Player B’s mixed strategy Hence, for player B: Let x’ := the probability Player B plays B1. Let y’ := the probability Player B plays B2. Then x’+y’ = 1, and
(1) 1*x’ + 4*y’ = 2.5
(2) 3*x’ + 2*y’ = 2.5
From (1), x’ = 2.5-4y’. Substituting in (2): 3(2.5-4y’)+2y’ = 2.5. Hence, 5 = 10y’ => y’ = 0.5. Therefore x’ = 0.5.
        B1    B2
  A1     1     4
  A2     3     2
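The two calculations above can be checked with a short Python sketch. It assumes a 2*2 zero-sum game with no saddle point, so that each player mixes to make the opponent indifferent between their two moves; the function name `solve_2x2` and the symbols p and q (the probabilities of A1 and B1) are illustrative and not taken from the slides.

```python
from fractions import Fraction


def solve_2x2(payoffs):
    """Solve a 2*2 zero-sum game without a saddle point.

    For M = [[a, b], [c, d]] (payoffs to Player A), equalising A's expected
    payoff against B1 and B2 gives p, the probability of A1; equalising B's
    expected loss against A1 and A2 gives q, the probability of B1.
    Returns (p, q, v) as exact fractions.
    """
    (a, b), (c, d) = payoffs
    denominator = a - b - c + d
    if denominator == 0:
        raise ValueError("equalising equations are degenerate for this matrix")
    p = Fraction(d - c, denominator)
    q = Fraction(d - b, denominator)
    v = Fraction(a * d - b * c, denominator)
    if not (0 <= p <= 1 and 0 <= q <= 1):
        raise ValueError("no fully mixed solution; check for a saddle point")
    return p, q, v


if __name__ == "__main__":
    p, q, v = solve_2x2([[1, 4], [3, 2]])
    print(p, q, v)  # 1/4 1/2 5/2
```

This should reproduce x = 0.25 for Player A, x’ = 0.5 for Player B and v = 2.5.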

Minimax Example (4/4) – use of Gambit Gambit provides modelling of games in normal form – Gambit calls them “strategic games”. In the screenshot, each cell has a pair of payoffs: the first is what Player A receives, the second is what Player B receives. (Gambit is designed for non-zero-sum games – see later sections.) It can compute the expected value and the corresponding equilibrium mixed strategies of the two players. From the menu select Tools -> Equilibrium and then ‘Compute all Nash equilibria’ ‘with Gambit’s recommended method’.

Minimax Limitations Whilst the Minimax theorem provides a solution, it’s macro- oriented, i.e. not sensitive to individual variations. Thus It ensures an average payoff Assumes repeated play and is a result that is more reliable the more times played In practice, it takes no account of the strategy of the opponent – even if they keep playing the same pure strategy, the expected return is no more, no less… The optimisation reflects a collective philosophy that markets find their natural level.

PART D: Non-Zero-Sum Games

An Overview of Non-Zero-Sum Games [Recap] A game is said to be non-zero-sum if wealth may be created or destroyed among the players (i.e. the total wealth can increase or decrease). In general, unlike for zero-sum games, in non-zero-sum games, wealth can be mutually created through cooperation. Cooperation may be achieved whether or not there is direct communication. Where there is no communication, information is necessarily imperfect. Where there is communication, there may be bargaining.

Analysis of Non-Zero-Sum Games Methods of mathematical logic, such as the use of induction, are effective for determining strategies in zero-sum games with perfect information. However they are less so for games of imperfect information, and are often not applicable to non-zero-sum games. IF some assumptions are made THEN some mathematical techniques may be effectively applied. Prerequisites: Understand the environment, understand the individual and collective psychology. (Thus we are moving from the domain of pure mathematics to embrace social sciences, particularly psychology and economics.)

Utility Payoffs are given as utility – the perceived worth of something Utility is a key concept and is determined by social and psychological factors. They depend upon personal preferences The same material payoff may have different utility (In economics, personal preference is often reckoned in terms of ranking a selection of consumer offerings. [Economic] agents are said to be “rational” if this ranking system is complete.)

Utility – Example (Exercise) Which would you choose? (Game is only played once!)
(1) 10 million Yen with 100% chance
(2) 100 million Yen with 20% chance

Utility – Example (Analysis) Expected return option (1) = 10 million yen, Expected return option (2) = 20 million yen, But option (1) has already great utility – utility curve may be logarithmic Here, if you have many friends playing or many attempts, then you should go for option 2. This is similar to philosophy of ‘penny shares’ – small investment, unlikely to succeed, but if it succeeds then it could be very successful.
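The contrast between expected return and expected utility can be illustrated with a short Python sketch. The logarithmic utility follows the suggestion above that the utility curve may be logarithmic; the starting wealth figures (1 million and 1,000 million yen) are purely hypothetical assumptions chosen for illustration, as are the function names.

```python
import math

# Each option is a list of (probability, prize in millions of yen).
OPTION_1 = [(1.0, 10)]
OPTION_2 = [(0.2, 100), (0.8, 0)]


def expected_return(option):
    """Probability-weighted average prize."""
    return sum(p * prize for p, prize in option)


def expected_log_utility(option, wealth):
    """Expected utility with u(w) = log(w), starting from `wealth` (assumed)."""
    return sum(p * math.log(wealth + prize) for p, prize in option)


if __name__ == "__main__":
    print(expected_return(OPTION_1), expected_return(OPTION_2))
    for wealth in (1, 1000):  # hypothetical starting wealth, in millions of yen
        u1 = expected_log_utility(OPTION_1, wealth)
        u2 = expected_log_utility(OPTION_2, wealth)
        print(wealth, "prefers option", 1 if u1 > u2 else 2)
```

Under these assumptions, a player with little starting wealth prefers the certain 10 million yen even though option (2) has the higher expected return, whereas a much wealthier player prefers option (2).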

Analytical Approaches to Non-Zero-Sum Games As before, the mathematical approaches use linear algebra, matrices, and probability theory. Hence the basic Concepts in Non-Zero-Sum Games: One-off vs Repeated games Payoff matrix Expectation Strategies – pure and mixed However, the generation of appropriate models requires Social Science tools that take account of the psychology of human behaviour, individual and collective; the analysis of markets, negotiation and bargaining.

Introducing The Prisoner’s Dilemma Description: Two men are suspected of committing a bank robbery together and are arrested by the police. They are placed in separate cells, so cannot communicate. Each suspect may either confess or remain silent. They know the consequences of their actions. Suppose we call them Player A and Player B: If A confesses, but B remains silent, then A “turns Queen’s Evidence” [UK] and goes free, whilst B goes to prison for 10 years (and vice versa). If both A and B confess, then they each go to prison for 5 years. If both A and B remain silent, then they each go to prison for 1 year for carrying concealed weapons. This is a famous problem that was originally formulated by A.W. Tucker.

The Prisoner’s Dilemma: Payoff Matrix Non-zero-sum games of normal form may be represented by a payoff matrix, where each cell is an n-tuple, a set of payoffs, one for each player. Thus for the Prisoner’s Dilemma, a 2-person game, we have pairs of payoffs. If A1 denotes ‘Player A remains silent’, A2 denotes ‘Player A confesses’ (similarly for B), then we can represent the problem by the following matrix:
          B1          B2
  A1   (-1,-1)    (-10,0)
  A2   (0,-10)    (-5,-5)

The Prisoner’s Dilemma: Strategy Player A reasons as follows: If Player B chooses B1, then I am better off choosing A2 (because 0 > -1). If Player B chooses B2, then I am better off choosing A2 (because -5 > -10). Similarly, for player B. Hence A2,B2 are selected. In fact, this reflects accepted theory: John Nash extended the minimax result of zero-sum games to non-zero-sum games. Informally, a pair of mixed strategies is in Nash equilibrium if any unilateral (one-sided) deviation by either player would yield that player a payoff no greater than the one the pair gives them. (A2,B2) are in equilibrium. Payoffs are (-5,-5) <demo: prisoners2by2.gmb>

Mixed Strategies for Non-zero-sum Games: Nash Equilibrium As mentioned above, in John Nash’s theory a pair of mixed strategies is in equilibrium if any unilateral (one-sided) deviation by either player would yield that player a payoff that was no more than the value of the pair. Formally, Definition. A pair of strategies, x*∈X, y*∈Y is an equilibrium pair for a non-zero-sum game if for any x∈X and y∈Y, eA(x,y*) ≤ eA(x*,y*) and eB(x*,y) ≤ eB(x*,y*), where eA is player A’s payoff and eB is player B’s payoff. Theorem. Any two-person game (zero-sum or non-zero-sum) with a finite number of pure strategies has at least one equilibrium pair. (Such a pair is called a Nash Equilibrium pair. Determining the solution is not trivial.)
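For pure strategies, the equilibrium condition in the definition can be checked cell by cell. The following Python sketch does this for a two-player game given as a matrix of payoff pairs; it is illustrative only, the function name `pure_nash_equilibria` is hypothetical, and it does not search for mixed equilibria.

```python
def pure_nash_equilibria(payoffs):
    """Return all (row, column) cells that are pure-strategy Nash equilibria.

    payoffs[i][j] is a pair (payoff to A, payoff to B). A cell qualifies if
    A cannot gain by switching row and B cannot gain by switching column.
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            a_payoff, b_payoff = payoffs[i][j]
            a_cannot_improve = all(payoffs[k][j][0] <= a_payoff for k in range(n_rows))
            b_cannot_improve = all(payoffs[i][l][1] <= b_payoff for l in range(n_cols))
            if a_cannot_improve and b_cannot_improve:
                equilibria.append((i, j))
    return equilibria


# Prisoner's Dilemma: index 0 = remain silent, index 1 = confess.
PRISONERS_DILEMMA = [[(-1, -1), (-10, 0)], [(0, -10), (-5, -5)]]

if __name__ == "__main__":
    print(pure_nash_equilibria(PRISONERS_DILEMMA))  # [(1, 1)], i.e. (A2, B2)
```

Indices are zero-based, so `(1, 1)` is the cell (A2, B2) in which both players confess.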

The Prisoner’s Dilemma: Paradox Paradox: both players confess and spend 5 years in prison, whereas if they had remained silent they would have spent 1 year each in prison! Diagnosis: the unilateral view is not optimal. A bilateral (two-sided) view – involving cooperation – would suggest the other move for both players. This is covered by the notion of strategies being Pareto optimal: there is no other pair of strategies in which both players are at least as well off and at least one is strictly better off.

The Repeated Prisoner’s Dilemma: Web demos There are many online versions of the Prisoner’s Dilemma. See e.g. Lessons from the Prisoner’s Dilemma: An interactive tutorial by Martin Poulter, April 2003, Economics Network http://www.economicsnetwork.ac.uk/archive/poulter/pd.htm

The Prisoner’s Dilemma: Applications (1) What is it useful for? Usefulness is usually determined by consideration of repeated games… Lessons for the military: consider the safety of the citizens of two rival powers. Which is safer: if they both disarm (cooperative strategy), or if they are both heavily armed? Marketing strategies: if two rival companies both offer small discounts then they may receive many customers and retain a good market share. What if they offer huge discounts?

The Prisoner’s Dilemma: Applications (2) “In economics as in other realms of the prisoner's dilemma, success requires a willingness not to measure oneself against any one opponent. ''You do tend to compare yourself to other people,'' Dr. Hauser said. ''However, it turns out that if I do that I'm hurting myself very badly.'' Biological Applications” “In real life, that is, does cooperation depend on an internal sense of morality? Or does it depend on the complicated dynamics of environments where people challenge each other, betray each other and trust each other over and over again?” NY Times, PRISONER'S DILEMMA HAS UNEXPECTED APPLICATIONS By JAMES GLEICK Published: June 17, 1986 Comments: It’s the difference between considering “what’s best for me, regardless” and “what’s best for everyone”. In practice, there may be a great difference in behaviour between playing this game once vs. many times.

The Battle of the Sexes Suppose that a newlywed couple are both planning an outing at the weekend. They haven’t yet decided what to do. The husband would like to watch football, whereas the wife would like to go to a concert, but they would both prefer to be in the company of their spouse rather than go their separate ways. Suppose option 1 is football and option 2 is concert. Then the payoff matrix may look like this (husband’s payoff first):
          W1       W2
  H1   (4,1)    (0,0)
  H2   (0,0)    (1,4)

The Battle of the Sexes: Equilibria (Gambit) Gambit can calculate the equilibria and gives 3 of them. Two of the three indicate cooperation.

The Battle of the Sexes: Modelling in Maxima Maxima can be used to plot regions. Suppose the husband chooses to play option H1 with probability x. Therefore he plays option H2 with probability 1-x. Similarly, the wife plays option W1 with probability y and option W2 with probability 1-y. We then can define expectation for each player as functions E1 and E2 respectively in variables x and y:
E1 := 4*x*y + 0*x*(1-y) + 0*(1-x)*y + 1*(1-x)*(1-y)
E2 := 1*x*y + 0*x*(1-y) + 0*(1-x)*y + 4*(1-x)*(1-y)
Hence E1 = 5xy - x - y + 1 and E2 = 5xy - 4x - 4y + 4.
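The two expectations can also be evaluated outside Maxima. The Python sketch below builds them from the payoff matrix (payoffs (4,1) and (1,4) when the couple coincide, and (0,0) otherwise) and computes the fully mixed equilibrium by making each player indifferent between their two options. It is an illustrative sketch with hypothetical names; the simplified forms it agrees with are E1 = 5xy - x - y + 1 and E2 = 5xy - 4x - 4y + 4.

```python
from fractions import Fraction

# Payoffs indexed [husband's option][wife's option]; option 0 = football.
HUSBAND = [[4, 0], [0, 1]]
WIFE = [[1, 0], [0, 4]]


def expectations(x, y):
    """Return (E1, E2) when H1 is played with probability x and W1 with y."""
    weights = [[x * y, x * (1 - y)], [(1 - x) * y, (1 - x) * (1 - y)]]
    e1 = sum(weights[i][j] * HUSBAND[i][j] for i in range(2) for j in range(2))
    e2 = sum(weights[i][j] * WIFE[i][j] for i in range(2) for j in range(2))
    return e1, e2


def mixed_equilibrium():
    """Return (x, y) making each player indifferent between their options."""
    # x is chosen so that the wife's expected payoff is the same for W1 and W2.
    x = Fraction(WIFE[1][1] - WIFE[1][0],
                 WIFE[0][0] - WIFE[0][1] - WIFE[1][0] + WIFE[1][1])
    # y is chosen so that the husband's expected payoff is the same for H1 and H2.
    y = Fraction(HUSBAND[1][1] - HUSBAND[0][1],
                 HUSBAND[0][0] - HUSBAND[0][1] - HUSBAND[1][0] + HUSBAND[1][1])
    return x, y


if __name__ == "__main__":
    x, y = mixed_equilibrium()
    print(x, y, expectations(x, y))
```

In the mixed equilibrium both expectations are below 1, so each player does worse than in either of the two coordinated outcomes (4,1) and (1,4).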

The Battle of the Sexes: Cooperation: Maxima Graphs 1/2 We can carry out a parametric plot that shows the expectations along the x-axis and y-axis respectively. Thus this is actually a 2D plot in two parameters (x,y). However, Maxima only allows one parameter for 2D plots. Thus we need to use a 3D plot, and simply set z to be a constant:
E1(x,y) := 5xy - x - y + 1 (think of this as the x-axis)
E2(x,y) := 5xy - 4x - 4y + 4 (think of this as the y-axis)
z := 0 (any value will be fine)
0 <= x <= 1, 0 <= y <= 1

The Battle of the Sexes: Cooperation: Maxima Graphs 2/2 The resulting graph looks like:

The Battle of the Sexes: Cooperation: Maxima Graphs: convex closure 1/2 Here, complete cooperation means that always the payoffs are either (1,4) or (4,1). There may be some decision rule that randomises this, e.g. a toss of a coin. In this case, we simply denote the probability of the first option by x (0 <= x <=1): E1:= 4x + 1-x = 1+3x E2:= x + 4(1-x) = 4-3x We can plot this on the same graph and compare…

The Battle of the Sexes: Cooperation: Maxima Graphs: convex closure 2/2 The result is simply a straight line joining the points (1,4) and (4,1). The expected values lie between 1 and 4 for both players. It pays to cooperate!

Conclusions Games occur in many life situations Mathematical analysis requires understanding of the context and rules Games played repeatedly yield different responses from games played only once. Human psychology often yields unexpected behaviour.