The Pentium Goes to Vegas: Training a Neural Network to Play BlackJack
Paul Ruvolo and Christine Spritke

Goals
Investigate result-based learning
Develop a strategy for a highly random game
Train the network to play effectively without explicitly teaching it the rules of the game

Strategy
Simplify the game so the only choices are HIT or STAY
Feedforward 3-layer backpropagation network
– Input units encode information about the player's hand and the dealer's up card
– 2 output units, one for HIT and one for STAY
– 1 hidden layer
Measure performance with Efficiency
– Efficiency = (win % * 2) + (tie %), i.e. the return on a dollar (see the sketch below)
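Read this way, efficiency is the expected return on a one-dollar bet: a win gives back the dollar plus a dollar of winnings, a tie gives back the dollar, a loss gives back nothing. A minimal sketch of the calculation (function and variable names are ours, not from the slides):

```python
def efficiency(win_pct, tie_pct):
    """Expected return on a $1 bet, in percent.

    A win returns $2 (stake plus winnings), a tie returns $1,
    a loss returns $0, so efficiency = 2 * win% + tie%.
    """
    return 2.0 * win_pct + tie_pct

# Example: a player that wins 41% of hands and ties 10%
# gets back about 92 cents per dollar wagered.
print(efficiency(41.0, 10.0))  # 92.0
```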

Background

To form a basis for comparison, we measured efficiency for a player using:
– Random guessing: Efficiency = 60.3%
– The dealer's algorithm (hit when below 17, otherwise stay): Efficiency = 92.2% (this baseline is sketched below)
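The dealer baseline is easy to simulate. The sketch below assumes an infinite-deck draw and ignores splits and doubles, consistent with the HIT/STAY simplification; the slides do not show the authors' simulator code.

```python
import random

# 2-9, the four ten-valued ranks, and the ace (counted as 11 until it must be demoted)
CARD_VALUES = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11]

def draw():
    """Draw a card value from an (assumed) infinite shoe."""
    return random.choice(CARD_VALUES)

def best_sum(cards):
    """Best blackjack total, demoting aces from 11 to 1 as needed."""
    total, aces = sum(cards), cards.count(11)
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total

def dealer_algorithm(cards):
    """The dealer baseline: hit when below 17, otherwise stay."""
    return "HIT" if best_sum(cards) < 17 else "STAY"
```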

PHASE I – Input: Specific Cards Showing

PHASE I – Network Setup
104 input units (one possible encoding is sketched below)
– 52 units for the possible cards in the player's hand
– 52 units for the possible dealer's up card
20 hidden units
2 output units
– HIT and STAY
Learning rate = 0.3; momentum = 0.3
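One way to realize the 104-unit input layer is a one-hot encoding over the 52 cards, once for the player's hand and once for the dealer's up card. This is our reading of the slide, not code from the project:

```python
RANKS = "A23456789TJQK"
SUITS = "CDHS"
CARDS = [r + s for s in SUITS for r in RANKS]        # 52 labels like "5D", "KH"
INDEX = {card: i for i, card in enumerate(CARDS)}

def encode_phase1(player_cards, dealer_up_card):
    """104 inputs: units 0-51 flag the cards in the player's hand,
    units 52-103 flag the dealer's up card."""
    x = [0.0] * 104
    for card in player_cards:
        x[INDEX[card]] = 1.0
    x[52 + INDEX[dealer_up_card]] = 1.0
    return x

# Player holds the 5 of diamonds and 7 of clubs; dealer shows the king of hearts.
x = encode_phase1(["5D", "7C"], "KH")
```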

PHASE I – Network Setup
Target High = 0.9, Target Mid = 0.5, Target Low = 0.1
If hitting and staying yield the same result
– HIT = STAY = Target Mid
If hitting produces a win while staying produces a loss
– HIT = Target High
– STAY = Target Low
And vice versa if staying wins while hitting loses (see the target-assignment sketch below)
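In other words, the training signal rewards whichever action led to the better outcome for the hand. A hedged sketch of that bookkeeping, with outcome codes of our own choosing:

```python
TARGET_HIGH, TARGET_MID, TARGET_LOW = 0.9, 0.5, 0.1

def targets(hit_outcome, stay_outcome):
    """Return (hit_target, stay_target) given the result of each action,
    with outcomes ordered loss < tie < win (e.g. 0, 1, 2)."""
    if hit_outcome == stay_outcome:
        return TARGET_MID, TARGET_MID
    if hit_outcome > stay_outcome:          # hitting did better
        return TARGET_HIGH, TARGET_LOW
    return TARGET_LOW, TARGET_HIGH          # staying did better
```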

PHASE I – Results Efficiency peaks at about 88% but never settles

PHASE I – Modifications
Tried multiple variations on the initial network
– Hidden units ranging from 1 to 20
– Learning rate and momentum adjustments, including an aging algorithm for the learning rate (an illustrative decay schedule follows)
– 20 input units: 10 possible values for the player's cards and 10 possible values for the dealer's up card
No significant changes in performance
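The slides mention an "aging algorithm" for the learning rate without giving details; a common interpretation is simply decaying the rate as training progresses. The schedule below illustrates that idea and is not the authors' formula:

```python
def aged_learning_rate(initial_rate, epoch, half_life=500):
    """Decay the learning rate so it halves every `half_life` epochs."""
    return initial_rate * 0.5 ** (epoch / half_life)

# Starts at 0.3 and shrinks as training goes on.
lr = aged_learning_rate(0.3, epoch=1000)   # 0.075
```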

PHASE I – Analysis
The network hits on a hand summing to 21
Analyzed why the network cannot improve, or even learn the dealer's algorithm

PHASE II – Input: "Best" Sum of Current Hand

PHASE II – Strategy
4 types of input representations
– No dealer card, no ace differentiation
– No dealer card, with ace differentiation
– Include dealer card, no ace differentiation
– Include dealer card, with ace differentiation
All use 2 output units and 4 hidden units

PHASE II – No dealer, no aces
18 input units (encoding sketched below)
– Represent all possible hand values when making a decision (ranging from 4 to 21)
Results:
– Develops the dealer's algorithm: hits on sum < 17, stays on sum > 16
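The 18-unit representation is simply a one-hot over the possible "best" hand sums from 4 to 21. A minimal sketch (ours, under that assumption):

```python
def encode_hand_sum(best_sum_value):
    """18 inputs, one per possible hand total from 4 to 21."""
    x = [0.0] * 18
    x[best_sum_value - 4] = 1.0
    return x

# A total of 16 turns on unit 12; per the results, the trained net learns to HIT here.
x = encode_hand_sum(16)
```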

PHASE II – No dealer, aces

PHASE II – Dealer, no aces
28 input units
– 18 possible player hand values
– 10 possible values for the dealer's up card
Results:
– High efficiency
– Good at accounting for the dealer's card in boundary cases

PHASE II – Dealer, no aces
The network is more likely to stay when the dealer has a bust card

PHASE II – Dealer, aces
38 input units (encoding sketched below)
– 28 units for the player's hand: 18 possible hard hand values and 10 possible soft hand values
– 10 units for the dealer's up card
Results:
– Good at adjusting strategy for hard vs. soft hands
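The richest representation separates hard and soft totals and adds the dealer's up card: 18 hard-total units (4-21), 10 soft-total units (soft 12 through soft 21), and 10 dealer-card units. The index layout below is our guess at one consistent arrangement, not the authors' code:

```python
def encode_phase2_full(total, is_soft, dealer_value):
    """38 inputs: 0-17 hard totals 4-21, 18-27 soft totals 12-21,
    28-37 dealer up-card values 2-11 (ace counted as 11)."""
    x = [0.0] * 38
    if is_soft:
        x[18 + (total - 12)] = 1.0
    else:
        x[total - 4] = 1.0
    x[28 + (dealer_value - 2)] = 1.0
    return x

# Soft 17 (ace + 6) against a dealer 10: per the results, the trained network hits.
x = encode_phase2_full(17, is_soft=True, dealer_value=10)
```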

PHASE II – Dealer, aces
The network always hits a soft 17 and stays on a hard 17

Conclusion
Neural networks are not magical!
They require the teacher to eliminate duplicate patterns
– 5 of diamonds + 7 of clubs is equivalent to 8 of hearts + 4 of spades
Result-based training is inherently more difficult
2 hidden layers might help
– We're not optimistic!