Poker for Fun and Profit (and intellectual challenge) Robert Holte Computing Science Dept. University of Alberta.

Presentation transcript:

Poker

World Series of Poker

Poker Research Group - core
Darse Billings (Ph.D.)
Aaron Davidson (M.Sc.), Poki
Neil Burch (P/A), PsOpti
Terence Schauenberg (M.Sc.), Adapti
Advisors: J. Schaeffer, D. Szafron

Poker Research Group – new arrivals
Bret Hoehn (M.Sc.)
Finnegan Southey (postdoc)
Michael Bowling
Dale Schuurmans
Rich Sutton
Robert Holte

Our Goal

PsOpti2 vs. “theCount”

Play Us Online

Poki’s Poker Academy

Poker Variants
Many different variants of poker
Texas Hold’em is the most skill-testing
No-Limit Texas Hold’em is used to determine the world champion
Our research: Limit Texas Hold’em
Current focus: 2-player (heads up)

[Figure: game tree of 2-player, limit Texas Hold’em, O(10^18). Stages: Initial deal (2 private cards to each player, 1,624,350 possibilities), bet sequence; Flop (3 community cards), bet sequence; Turn (1 community card), bet sequence; River (1 community card), bet sequence.]
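The count of 1,624,350 initial deals can be checked with a short calculation. This is a minimal sketch, not taken from the slides; the function names are illustrative, and the flop count is included only as a related combinatorial fact.

```python
from math import comb

def initial_deals() -> int:
    # 2 private cards for the first player, then 2 of the remaining 50 for the second.
    return comb(52, 2) * comb(50, 2)

def flops() -> int:
    # 3 community cards drawn from the 48 cards neither player holds.
    return comb(48, 3)

print(initial_deals())  # 1624350
print(flops())          # 17296
```

The product 1326 × 1225 reproduces the figure quoted for the initial chance node.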

Research Issues
1. Chance events
2. Imperfect information
3. Sheer size of the game tree
4. Opponent modelling is crucial
5. How best to use domain knowledge?
6. Experimental method
Variants have even more challenges:
– More than 2 players (up to 10)
– “No limit” (bid any amount)

Issues: Chance Events
Utility of outcomes
– currently just reason about expected payoff
– short-term vs. long-term
High variance
– was the outcome due to luck or skill?
– experiment design
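To make the luck-versus-skill question concrete, here is a rough sketch of how many hands are needed before an observed win rate stands out from noise. It assumes independent hands and a normal approximation; the win rate and standard deviation in the usage line are hypothetical numbers, not results from the slides.

```python
import math

def hands_needed(win_rate: float, std_dev: float, z: float = 1.96) -> int:
    """Hands required for the mean win rate to sit z standard errors above zero.

    win_rate and std_dev are per-hand values in the same units (e.g. small bets).
    """
    if win_rate <= 0:
        raise ValueError("win_rate must be positive")
    # Standard error after n hands is std_dev / sqrt(n); solve z * SE <= win_rate for n.
    return math.ceil((z * std_dev / win_rate) ** 2)

# Hypothetical figures: a 0.05 small-bet edge with a 6 small-bet standard deviation.
print(hands_needed(0.05, 6.0))
```

Because the required sample grows with the square of the noise-to-edge ratio, small edges take very long matches to confirm, which is why experiment design matters.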

Issues: Imperfect Information
Probabilistic strategies are essential
Cannot construct your strategy in a bottom-up manner, as is done with perfect information games
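A toy illustration of why probabilistic strategies are essential, using rock-paper-scissors as a stand-in (it is not poker, and it is not from the slides): any deterministic choice can be exploited by an opponent who knows it, while the uniform mix cannot.

```python
# Payoff to the row player; rows and columns are rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def worst_case_value(strategy):
    """Expected payoff of a row mixed strategy against the column player's best response."""
    column_values = [
        sum(strategy[r] * PAYOFF[r][c] for r in range(3))
        for c in range(3)
    ]
    # The opponent picks the column that is worst for the row player.
    return min(column_values)

always_rock = [1.0, 0.0, 0.0]
uniform = [1 / 3, 1 / 3, 1 / 3]

print(worst_case_value(always_rock))  # -1.0
print(worst_case_value(uniform))
```

The uniform mix guarantees an expected payoff of zero no matter what the opponent does, whereas the pure strategy loses every round to a best response.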

Issues: Size of the game
2-player, Limit, Texas Hold’em game tree has about 10^18 states
Linear Programming can solve games with 10^8 states

Issues: Opponent Modelling
Nash equilibrium not good enough
– Static
– Defensive
Even the best humans have weaknesses that should be exploited
How to learn very quickly, with very noisy information?
– Exploitation vs. exploration
How not to be exploited yourself?

Issues: Using Expert Knowledge
We are fortunate to have unlimited access to a poker-playing expert (Darse)
How best to use his knowledge?
– Expert system (explicitly encoded knowledge) was not effective
– Used his knowledge to devise abstractions that reduced the game size with minimal impact on strategic aspects of the game
– Use him to evaluate the system

Experimental Method
High variance
’Bot play is not the same as human play
Very limited access to expert humans other than our own expert

Coping with very large games
[Figure: full game tree T (too big to solve) → abstraction (lossy) → abstract game tree T* → solve (LP) → strategy for T* → reverse mapping → strategy for T]

Abstraction
Texas Hold'em 2-player game tree is too big for current LP solvers (1,179,000,604,565,715,751)
Many ways of doing the abstractions
– We require coarse-grained abstractions
– Avoiding a severe loss of accuracy
Abstract to a set of smaller problems: about 10^8 states, about 10^6 equations and unknowns

Alternate Game Structures
Truncation of betting rounds
Bypassing betting rounds
Models with 3 rounds, 2 rounds, or 1 round
Many-to-one mapping of game-tree nodes to single nodes in the abstract game tree
– How you do the mapping determines the overall accuracy (few good and many bad mappings)
– This is the limiting factor of the method

[Figure: Texas Hold'em game tree, O(10^18), alongside the 3-round model (expected value leaf nodes)]

[Figure: Texas Hold'em game tree, O(10^18), alongside the 3-round postflop model (single flop) and the 1-round preflop model]

Abstractions
Board: Q – 7 – 2
Compare: 1. A – 3   2. A – 4   3. A – K
– Suit isomorphism (≤ 24X) (exact)
– Rank near-equivalence (small error)
Bucketing
Hands are mapped to a small set of buckets depending on
– Current hand strength
– Potential for improvement in hand strength
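Suit isomorphism is easiest to see before any community cards are dealt. The sketch below is an illustration, not the group's code: it collapses the 1,326 two-card starting hands into suit-independent classes. The card encoding and the name canonical_hole_cards are assumptions made for the example. After the flop, the relation between hole-card suits and board suits also matters, so this preflop reduction does not carry over unchanged.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]

def canonical_hole_cards(card_a: str, card_b: str) -> str:
    """Map two hole cards to a suit-independent class such as 'AKs', 'AKo' or 'QQ'."""
    hi, lo = sorted((card_a, card_b), key=lambda c: RANKS.index(c[0]), reverse=True)
    if hi[0] == lo[0]:
        return hi[0] + lo[0]  # pocket pair: suits are irrelevant
    return hi[0] + lo[0] + ("s" if hi[1] == lo[1] else "o")

classes = {canonical_hole_cards(a, b) for a, b in combinations(DECK, 2)}
print(len(DECK) * (len(DECK) - 1) // 2, len(classes))  # 1326 169
```

The mapping is exact in the sense that hands in the same class have identical preflop value; the reduction here is roughly 7.8X, within the 24X upper bound that the 4! suit permutations allow.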

Bucketing
Reduce branching factor at chance nodes
Partition hands into six classes per player
Overlaying strategically similar sub-trees
[Figure: original bucketing (bucket pairs 1,1  1,2  1,3 … 6,6) linked by transition probabilities to next-round bucketing (1,1  1,2  1,3 … 6,6)]
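A minimal sketch of the bucketing idea, under stated assumptions: hand strength is a number in [0, 1], buckets are equal-width, and only one player's bucket is tracked. The slides say buckets also depend on potential for improvement and use pairs of buckets (one per player), so this is a simplification. The names bucket_of and transition_probabilities are hypothetical.

```python
from collections import Counter

def bucket_of(hand_strength: float, n_buckets: int = 6) -> int:
    """Return a bucket index in 1..n_buckets for a hand strength in [0, 1]."""
    if not 0.0 <= hand_strength <= 1.0:
        raise ValueError("hand_strength must lie in [0, 1]")
    # Equal-width buckets; a strength of exactly 1.0 falls in the top bucket.
    return min(int(hand_strength * n_buckets), n_buckets - 1) + 1

def transition_probabilities(pairs, n_buckets: int = 6):
    """Estimate P(next bucket | current bucket) from sampled (current, next) pairs."""
    counts = Counter(pairs)
    table = {}
    for cur in range(1, n_buckets + 1):
        total = sum(counts[(cur, nxt)] for nxt in range(1, n_buckets + 1))
        if total:  # skip buckets that never occurred in the sample
            table[cur] = {
                nxt: counts[(cur, nxt)] / total for nxt in range(1, n_buckets + 1)
            }
    return table

samples = [(1, 1), (1, 2), (1, 2), (2, 2)]
print(bucket_of(0.5), transition_probabilities(samples)[1][2])
```

Replacing individual card deals with bucket transitions is what shrinks the branching factor at each chance node.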

[Figure: Texas Hold'em game tree, O(10^18), alongside the abstract postflop model, O(10^7), and the abstract preflop model, O(10^7); chance nodes in the abstract models are labelled with 36 bucket pairs]

Reverse Mapping
Bucket splitting
– LP solution gives a strategy (recipe)
– Each partition class split strong / weak
– Split the randomized mixed strategy
– {0, 0.2, 0.8} => {0, 0, 1.0} & {0, 0.4, 0.6}
Better hand selection (with some risk)
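The split in the example keeps the bucket's overall action frequencies unchanged, which a few lines can confirm. This sketch assumes the strong and weak halves are equally likely; the slide does not state that weighting, nor does it name the three actions.

```python
def recombine(strong, weak, strong_weight: float = 0.5):
    """Average the strong-half and weak-half strategies back into one mixed strategy."""
    return [
        strong_weight * s + (1.0 - strong_weight) * w
        for s, w in zip(strong, weak)
    ]

original = [0.0, 0.2, 0.8]
strong_half = [0.0, 0.0, 1.0]  # stronger hands always take the third action
weak_half = [0.0, 0.4, 0.6]    # weaker hands absorb all of the second action

print(recombine(strong_half, weak_half))
```

Under the equal-weight assumption, the recombined frequencies match the original strategy up to floating-point rounding, while the most aggressive action is reserved for the better hands.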

Putting It All Together – PsOpti1
[Figure: rounds Preflop, Flop, Turn, River; bets 2, 4, 6, 8; labels: Selby preflop model, Post]

Putting It All Together – PsOpti2
[Figure: rounds Preflop, Flop, Turn, River; labels: Bets, + model, 3-round preflop model, Post]

Conclusions
Game Theory can be applied to large problems and practical systems
Nash Equilibrium (minimax) is too defensive and does not exploit the opponent’s weaknesses
Current work involves opponent modelling
– Preliminary results are very promising
We hope to beat the best poker players in the world in the near future