Poker for Fun and Profit (and intellectual challenge) Robert Holte Computing Science Dept. University of Alberta.

Poker for Fun and Profit (and intellectual challenge) Robert Holte Computing Science Dept. University of Alberta

World Series of Poker

Poker Research Group - core Darse Billings (Ph.D.) Aaron Davidson M.Sc., Poki Neil Burch P/A, PsOpti Terence Schauenberg (M.Sc.), Adapti Advisors: J Schaeffer, D Szafron

Poker Research Group – new arrivals Bret Hoehn (M.Sc.) Finnegan Southey (postdoc) Michael Bowling Dale Schuurmans Rich Sutton Robert Holte

Our Goal

PsOpti2 vs. “theCount”

Play Us Online http://games.cs.ualberta.ca/poker/

Poki’s Poker Academy http://poki-poker.com

Poker Variants Many different variants of poker Texas Hold’em the most skill-testing No-Limit Texas Hold’em used to determine the world champion Our research: Limit Texas Hold’em Current focus: 2-player (heads up)

Bet Sequence Initial Flop Bet Sequence Turn Bet Sequence River 1,624,350 9 of 19 45 9 of 19 44 17,296 19 Bet Sequence O(10 18 ) 2-player, limit, Texas Hold’em 2 private cards to each player 3 community cards 1 community card

Research Issues 1. Chance events 2. Imperfect Information 3. Sheer size of the game tree 4. Opponent modelling is crucial 5. How best to use domain knowledge ? 6. Experimental method Variants have even more challenges: – More than 2 players (up to 10) – “No limit” (bid any amount)

Issues: Chance Events Utility of outcomes – currently just reason about expected payoff – short-term vs. long-term High variance – was the outcome due to luck or skill ? – experiment design

Issues: Imperfect Information Probabilistic strategies are essential Cannot construct your strategy in a bottom-up manner, as is done with perfect information games

Issues: Size of the game 2-player, Limit, Texas Hold’em game tree has about 10 18 states Linear Programming can solve games with 10 8 states

Issues: Opponent Modelling Nash equilibrium not good enough – Static – Defensive Even the best humans have weaknesses that should be exploited How to learn very quickly, with very noisy information ? – Expoitation vs. exploration How not to be exploited yourself ?

Issues: Using Expert Knowledge We are fortunate to have unlimited access to a poker-playing expert (Darse) How best to use his knowledge ? – Expert system (explicitly encoded knowledge) was not effective – Used his knowledge to devise abstractions that reduced the game size with minimal impact on strategic aspects of the game – Use him to evaluate the system

Experimental Method High variance ‘bot play not the same as human play Very limited access to expert humans other than our own expert

Coping with very large games Full game tree T Strategy For T Strategy For T* Abstract game tree T* abstraction Solve (LP) (reverse mapping) (lossy) too big to solve

Abstraction Texas Hold'em 2-player game tree is too big for current LP –solvers (1,179,000,604,565,715,751) Many ways of doing the abstractions – We require coarse-grained abstractions – Avoiding a severe loss of accuracy Abstract to a set of smaller problems  10 8 states,  10 6 equations and unknowns

Alternate Game Structures Truncation of betting rounds Bypassing betting rounds Models with 3 rounds, 2 rounds, or 1 round Many-to-one mapping of game-tree nodes to single nodes in the abstract game tree – How you do the mapping determines the overall accuracy (few good and many bad mappings) – This is the limiting factor of the method

Bet Sequence Initial Flop Bet Sequence Turn Bet Sequence River 1,624,350 9 of 19 45 9 of 19 44 17,296 19 Bet Sequence Texas Hold'em O(10 18 ) 3-round Model (expected value leaf nodes)

Bet Sequence Initial Flop Bet Sequence Turn Bet Sequence River 1,624,350 9 of 19 45 9 of 19 44 17,296 19 Bet Sequence Texas Hold'em O(10 18 ) 3-round Postflop Model (single flop) 1-round Preflop Model

Abstractions BoardQ – 7  – 2 Compare 1.A –3 2.A –4 3.A –K – Suit isomorphism (  24X) (exact) – Rank near-equivalence (small error) Bucketing Hands are mapped to a small set of buckets depending on Current hand strength Potential for improvement in hand strength

Bucketing Reduce branching factor at chance nodes Partition hands into six classes per player Overlaying strategically similar sub-trees 1,1 1,21,36,6 1,1 1,21,3.… Original Bucketing Next Round Bucketing Transition Probabilities …. 6,6

Bet Sequence Initial Flop Bet Sequence Turn Bet Sequence River 1,624,350 9 of 19 45 9 of 19 44 17,296 w 2 (36) 7 of 15 19 Bet Sequence 15 x 2 (36) z 2 (36) y 2 (36) Texas Hold'em O(10 18 ) Abstract Postflop Model O(10 7 ) Abstract Preflop Model O(10 7 )

Reverse Mapping Bucket splitting – LP solution gives a strategy (recipe) – Each partition class split strong / weak – Split the randomized mixed strategy – {0, 0.2, 0.8} => {0, 0, 1.0} & {0, 0.4, 0.6} Better hand selection (with some risk)

Putting It All Together – PsOpti1 Bets 2468 Preflop Flop Turn River Selby preflop model Post

Putting It All Together – PsOpti2 Preflop Flop Turn River Bets + model 3-round preflop model Post 2446688

Conclusions Game Theory can be applied to large problems and practical systems Nash Equilibrium (minimax) too defensive, does not exploit the opponent’s weaknesses Current work involves opponent modelling – Preliminary results are very promising We hope to beat the best poker players in the world in the near future

Poker for Fun and Profit (and intellectual challenge) Robert Holte Computing Science Dept. University of Alberta.

Similar presentations

Presentation on theme: "Poker for Fun and Profit (and intellectual challenge) Robert Holte Computing Science Dept. University of Alberta."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Poker for Fun and Profit (and intellectual challenge) Robert Holte Computing Science Dept. University of Alberta.

Similar presentations

Presentation on theme: "Poker for Fun and Profit (and intellectual challenge) Robert Holte Computing Science Dept. University of Alberta."— Presentation transcript:

Similar presentations

About project

Feedback