Learning to Play the Game of GO Lei Li Computer Science Department May 3, 2007.

1 Learning to Play the Game of GO Lei Li Computer Science Department May 3, 2007

2 Outline Part I: Computer GO, a first taste Part II: Learning to Predict Moves

3 Part I Computer GO, a first taste

4 The Game of GO 19 by 19 grid Two players (black and white) Place stones at the intersections of lines Objective: maximize surrounded territory, i.e. control a larger part of the board than your opponent

5 Why is GO special? Very simple rules Global winning condition No two games are the same Handicap system Current best program: Handtalk, unable to beat an experienced amateur
Computer vs. human (H), by game:
Checkers: Chinook > H
Othello: Logistello > H
Chess: Deep Blue >= H
Go: Handtalk << H

6 Complexity results Polynomial-space hard [Lichtenstein & Sipser 80] Exponential-time complete [Robson 83]

7 Major approaches Tree Search based (minimax, alpha-beta) –Handtalk, GO++, GNU GO Monte-Carlo methods –Select best from random play Learning based –Neural network [Enzenberger 96] –SVM –Graphical model [Stern et al 06]
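The Monte-Carlo idea on this slide ("select best from random play") can be illustrated on a game much smaller than Go. The sketch below is an assumption-laden toy, not any Go program's method: it uses a single-heap Nim variant (take 1 to 3 stones, whoever takes the last stone wins), and the names `random_playout` and `monte_carlo_move` are hypothetical.

```python
import random


def random_playout(stones, to_move, rng):
    """Play uniformly random moves until the heap is empty; return the winner (0 or 1)."""
    while True:
        take = rng.randint(1, min(3, stones))
        stones -= take
        if stones == 0:
            return to_move  # the player who takes the last stone wins
        to_move = 1 - to_move


def monte_carlo_move(stones, player, n_playouts=2000, seed=0):
    """Score each legal move by its win rate over random playouts and pick the best."""
    rng = random.Random(seed)
    win_rate = {}
    for take in range(1, min(3, stones) + 1):
        remaining = stones - take
        if remaining == 0:
            win_rate[take] = 1.0  # taking the last stone wins immediately
            continue
        wins = sum(
            random_playout(remaining, 1 - player, rng) == player
            for _ in range(n_playouts)
        )
        win_rate[take] = wins / n_playouts
    best = max(win_rate, key=win_rate.get)
    return best, win_rate


if __name__ == "__main__":
    best, rates = monte_carlo_move(stones=3, player=0)
    print(best, rates)
```

In Go the same loop would run over legal board moves and score each playout by final territory, which is far more expensive per playout.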

8 Search in Computer GO Tree search –Pattern matching –Heuristics, expert rules –Local search –Early stop –Alpha-beta pruning (which is successful for chess) –High-level abstract strategies
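For reference, alpha-beta pruning as named on this slide can be sketched on a toy game tree. The tree encoding (nested lists with numeric leaves) and the optional `visited` list are assumptions made for illustration; a Go engine would generate children from board positions and score leaves with an evaluation function.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of a nested-list tree, skipping branches that cannot matter."""
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)  # record which leaves were actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing parent will never allow this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing parent already has a better option
    return value


if __name__ == "__main__":
    tree = [[3, 5], [2, 9], [0, 1]]  # made-up leaf scores
    seen = []
    print(alphabeta(tree, visited=seen), seen)
```

On this toy tree only four of the six leaves are evaluated; with hundreds of legal moves per position, as in Go, pruning alone does not make the search tractable.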

9 Major challenges Too many possible moves (361) How to evaluate a move (subtlety) Implicit control vs explicit control Connectivity viewpoint Local state vs global state

11 Local vs Global

13 Part II Learning to Predict Moves

14 Move Prediction in the Learning Setting Given: –a database of professional games Goal: –learn the distribution of a move given the current board state –rank the moves Assumption: –Experts always make the best move

15 Learning features State explosion with full board –Full board state: configuration (c) –2^361 possibilities Local State (t): –Local pattern (within a region of size 64) –Plus 8 extra features on liveness (situation)
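To give a sense of scale for the state explosion, the short sketch below counts decimal digits. The 2^361 figure is the one from the slide; the 3^361 figure is an added comparison, assuming each of the 361 intersections is independently empty, black, or white and ignoring legality.

```python
POINTS = 19 * 19  # number of intersections on a full board

two_state = 2 ** POINTS    # count used on the slide
three_state = 3 ** POINTS  # assumption: empty / black / white per point, legality ignored

# Python integers are arbitrary precision, so the digit counts are exact.
print("intersections:", POINTS)
print("digits in 2^361:", len(str(two_state)))
print("digits in 3^361:", len(str(three_state)))
```

Either count is far beyond what a lookup table over full-board states could hold, which motivates the local pattern features.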

16 Local pattern region

17 Local Liveness Features Liberties of new chain: 1, 2, 3, >3 Liberties of opponent: 1, 2, 3, >3 Is there an active Ko? Is new chain captured immediately? Distance to board edge: 5
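The liberty features can be computed with a flood fill over a chain of connected stones. The following is a minimal sketch under stated assumptions: the board is a square list of strings with `.` for empty, `X` for black and `O` for white, and the helper names `chain_liberties` and `liberty_bucket` are hypothetical. The bucketing into 1, 2, 3, >3 follows the slide.

```python
def chain_liberties(board, row, col):
    """Return (chain, liberties) for the stone at (row, col) as sets of coordinates."""
    color = board[row][col]
    if color == ".":
        raise ValueError("no stone at the given point")
    size = len(board)
    chain = {(row, col)}
    liberties = set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < size and 0 <= nc < size):
                continue  # off the board
            if board[nr][nc] == ".":
                liberties.add((nr, nc))  # empty neighbour is a liberty
            elif board[nr][nc] == color and (nr, nc) not in chain:
                chain.add((nr, nc))  # same-colour neighbour joins the chain
                stack.append((nr, nc))
    return chain, liberties


def liberty_bucket(count):
    """Map a liberty count onto the slide's categories: 1, 2, 3, >3."""
    return str(count) if count <= 3 else ">3"


if __name__ == "__main__":
    board = [
        ".....",
        ".XX..",
        ".XO..",
        ".....",
        ".....",
    ]
    chain, libs = chain_liberties(board, 1, 1)
    print(len(chain), len(libs), liberty_bucket(len(libs)))
```

For the "new chain" feature, the move would first be placed on a copy of the board and the function called on the new stone.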

18 Move Distribution Model Move v, given configuration c –u as a prior value of pattern, u ~ Normal(μ, σ) –Latent value of move, x|u ~ Normal(u, β) –Pick the move with the largest latent value Learning posterior by sum-product algorithm
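The generative story on this slide (draw a pattern value u, draw a latent move value x around it, play the move with the largest x) can be simulated directly. The sketch below estimates move probabilities by sampling; it is not the sum-product inference the slide refers to. It assumes σ and β are standard deviations, and the function name `move_probabilities` and the example numbers are made up.

```python
import random


def move_probabilities(prior_means, prior_std, beta, n_samples=20000, seed=0):
    """Monte-Carlo estimate of P(move i has the largest latent value)."""
    rng = random.Random(seed)
    counts = [0] * len(prior_means)
    for _ in range(n_samples):
        latent = []
        for mu in prior_means:
            u = rng.gauss(mu, prior_std)  # pattern value, u ~ Normal(mu, sigma)
            x = rng.gauss(u, beta)        # latent move value, x | u ~ Normal(u, beta)
            latent.append(x)
        winner = max(range(len(latent)), key=latent.__getitem__)
        counts[winner] += 1
    return [c / n_samples for c in counts]


if __name__ == "__main__":
    # Three candidate moves whose patterns have made-up prior means.
    print(move_probabilities([1.0, 0.0, -1.0], prior_std=0.5, beta=0.5))
```

Sorting candidate moves by these estimated probabilities gives the ranking used for move prediction.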

19 Results Real data: –181,000 game records –600 million patterns (after pruning) 34% of expert moves ranked first –86% in top 20 –can be used to score or rank during search
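Figures such as "ranked first" and "in top 20" are top-k accuracies: the fraction of positions where the expert's move appears among the model's k highest-ranked moves. A minimal sketch follows; the function name `top_k_accuracy` and the tiny data set are invented for illustration and are unrelated to the reported results.

```python
def top_k_accuracy(ranked_moves, expert_moves, k):
    """Fraction of positions whose expert move is among the first k ranked moves."""
    hits = sum(
        expert in ranking[:k]
        for ranking, expert in zip(ranked_moves, expert_moves)
    )
    return hits / len(expert_moves)


if __name__ == "__main__":
    # Made-up rankings (best first) for four positions, with the expert's actual move.
    rankings = [
        ["D4", "Q16", "C3"],
        ["Q4", "D16", "K10"],
        ["C3", "D4", "Q16"],
        ["K10", "Q4", "D4"],
    ]
    experts = ["D4", "D16", "Q16", "R5"]
    for k in (1, 2, 3):
        print(k, top_k_accuracy(rankings, experts, k))
```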

20 Test in the real game Opening: rather good Weaker in later stages –missing pattern details –global state needed

21 Some ideas for discussion Iterative Search and Learning –Learning for move ranking/prediction –Use ranking to score in search tree –Search result as new data for learning Learn local region/global strategy –learn abstract strategy (e.g. fight, defense) –Group move sequence together?

22 Other questions Can a computer learn from non-expert humans? Can a computer learn to play by playing against itself?

23 References
Graepel et al., Learning on Graphs in the Game of Go, 2001
Stern et al., Modeling Uncertainty in the Game of Go, 2004
Stern et al., Bayesian Pattern Ranking for Move Prediction in the Game of Go, 2006
Bouzy et al., Computer Go: An AI Oriented Survey, 2001

24 Thanks!

25 Combinatorial game theory

26 Some key features Life and death Number of eyes Liberties

