Learning Shape in Computer Go
David Silver

A brief introduction to Go
- Black and white take turns to place down stones
- Once played, a stone cannot move
- The aim is to surround the most territory
- Usually played on a 19x19 board

Capturing
- The open lines radiating from a stone, each leading to an adjacent empty point, are called liberties
- If a connected group of stones has all of its liberties removed then it is captured
- Captured stones are removed from the board
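The capture rule above can be sketched as a flood fill over a connected group. This is a minimal illustration, not code from the talk: the board encoding (`.`, `X`, `O`) and the helper names `neighbours`, `group_and_liberties` and `remove_captured` are all assumptions made for the example.

```python
from typing import Iterator, List, Set, Tuple

Point = Tuple[int, int]
EMPTY, BLACK, WHITE = ".", "X", "O"  # hypothetical cell encoding


def neighbours(p: Point, size: int) -> Iterator[Point]:
    """Yield the on-board points orthogonally adjacent to p."""
    r, c = p
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < size and 0 <= nc < size:
            yield nr, nc


def group_and_liberties(board: List[List[str]], start: Point) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group containing `start`; return (stones, liberties)."""
    colour = board[start[0]][start[1]]
    group, liberties, stack = {start}, set(), [start]
    while stack:
        for n in neighbours(stack.pop(), len(board)):
            cell = board[n[0]][n[1]]
            if cell == EMPTY:
                liberties.add(n)          # adjacent empty point
            elif cell == colour and n not in group:
                group.add(n)              # same-coloured stone joins the group
                stack.append(n)
    return group, liberties


def remove_captured(board: List[List[str]], colour: str) -> int:
    """Remove every group of `colour` with no liberties; return stones removed."""
    size, removed, seen = len(board), 0, set()
    for r in range(size):
        for c in range(size):
            if board[r][c] == colour and (r, c) not in seen:
                group, liberties = group_and_liberties(board, (r, c))
                seen |= group
                if not liberties:
                    for gr, gc in group:
                        board[gr][gc] = EMPTY
                    removed += len(group)
    return removed
```

In Atari Go, a player would call something like `remove_captured` on the opponent's colour after each move, and the game ends as soon as it returns a non-zero count.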

Atari Go (Capture Go)
- Atari Go is a simplified version of Go
- The winner is the first player to capture
- Often used to teach Go to beginners
- Circumvents several tricky issues:
  - The game only finishing by agreement
  - Ko (local repetitions of position)
  - Seki (local stalemates)

Computer Go
- Computer Go programs are very weak:
  - Search space is too large for brute force techniques
  - No good evaluation functions
- Human intuition (shape knowledge) has proven difficult to capture
- Why not learn shape knowledge?
- And use it to learn an evaluation function?

Local shape
- Local shape describes a pattern of stones
- It is used extensively by current Computer Go programs (pattern databases)
- Inputting local shape by hand takes many years of hard labour
- We would like to:
  - Learn local shapes by trial and error
  - Assign a value for the goodness of a shape
  - Just how good is a particular shape?

Enumerating local shapes
- In these experiments all possible local shapes are used as features
- Up to a small maximum size (e.g. 2x2)
- A local shape is defined to be:
  - A particular configuration of stones
  - At a canonical position on the board
- Local shapes are used as binary features by the learning algorithm
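One way to picture this enumeration is to give every (window position, stone configuration) pair its own feature index. The sketch below is an assumption about how such an indexing could work, not the scheme used in the experiments. The integer cell encoding and the names `num_features` and `active_features` are hypothetical, and the all-empty window is counted as a shape here purely for simplicity.

```python
from typing import List

EMPTY, BLACK, WHITE = 0, 1, 2  # hypothetical cell encoding


def num_features(board_size: int, k: int) -> int:
    """Number of k x k local-shape features: window positions times 3^(k*k) configurations."""
    positions = (board_size - k + 1) ** 2
    return positions * 3 ** (k * k)


def active_features(board: List[List[int]], k: int) -> List[int]:
    """Return the index of the one active k x k shape at each window position."""
    span = len(board) - k + 1
    per_position = 3 ** (k * k)
    active = []
    for r in range(span):
        for c in range(span):
            code = 0
            for dr in range(k):
                for dc in range(k):
                    # Read the window as a base-3 number, row by row.
                    code = code * 3 + board[r + dr][c + dc]
            active.append((r * span + c) * per_position + code)
    return active
```

Exactly one feature per window position is active in any board state, which is what makes the features binary. Covering all sizes up to the maximum would mean calling `active_features` once per size and offsetting the indices. The count grows as 3^(k*k) per position, so 3x3 shapes already need 19,683 configurations at every position.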

Invariances
- Each canonical local shape can be:
  - Rotated
  - Reflected
  - Inverted
- So each position may cause updates to multiple instances of each feature
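The three transformations can be sketched on a small shape stored as a tuple of rows. This is an illustration under assumptions: "inverted" is taken to mean swapping black and white, picking the smallest variant as the canonical one is a convenient convention rather than anything stated in the talk, and the names `rotate`, `reflect`, `invert`, `variants` and `canonical` are hypothetical.

```python
from typing import Set, Tuple

Shape = Tuple[Tuple[int, ...], ...]
EMPTY, BLACK, WHITE = 0, 1, 2  # hypothetical cell encoding


def rotate(shape: Shape) -> Shape:
    """Rotate the shape by 90 degrees clockwise."""
    return tuple(zip(*shape[::-1]))


def reflect(shape: Shape) -> Shape:
    """Mirror the shape left to right."""
    return tuple(row[::-1] for row in shape)


def invert(shape: Shape) -> Shape:
    """Swap black and white stones (assumed meaning of 'inverted')."""
    swap = {EMPTY: EMPTY, BLACK: WHITE, WHITE: BLACK}
    return tuple(tuple(swap[v] for v in row) for row in shape)


def variants(shape: Shape) -> Set[Shape]:
    """All distinct images of the shape under rotation, reflection and inversion."""
    out = set()
    s = tuple(tuple(row) for row in shape)
    for _ in range(4):
        s = rotate(s)
        for t in (s, reflect(s)):
            out.add(t)
            out.add(invert(t))
    return out


def canonical(shape: Shape) -> Shape:
    """Pick one representative per equivalence class (here: the smallest variant)."""
    return min(variants(shape))
```

A shape has at most 16 variants (4 rotations, times 2 reflections, times 2 colourings), and symmetric shapes have fewer. How the value of a colour-inverted shape relates to the original (for example, by a sign flip) is not stated in the slides and is left open here.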

Algorithm
- Value function is learnt for afterstates
- Move selection is done by 1-ply greedy search (ε = 0) over value function:
  - Active local shapes are identified
  - Linear combination is taken
  - Sigmoid squashing function is applied
- Backups are performed using TD(0)
- Reward of +1 for winning, 0 for losing
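The steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the talk's implementation: the step size `alpha`, the inclusion of the sigmoid derivative in the update, and the names `value`, `td0_update` and `greedy_move` are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, TypeVar

Move = TypeVar("Move")


def value(weights: List[float], active: Sequence[int]) -> float:
    """Sigmoid of the summed weights of the active local-shape features."""
    total = sum(weights[i] for i in active)
    return 1.0 / (1.0 + math.exp(-total))


def td0_update(weights: List[float], active: Sequence[int],
               target: float, alpha: float = 0.1) -> float:
    """Move V(afterstate) towards `target`; return the TD error.

    `target` is the value of the next afterstate, or the final reward
    (+1 for a win, 0 for a loss) when the game has ended.
    """
    v = value(weights, active)
    delta = target - v
    grad = v * (1.0 - v)  # derivative of the sigmoid (assumed to be included)
    for i in active:
        weights[i] += alpha * delta * grad
    return delta


def greedy_move(moves: Sequence[Move],
                afterstate_features: Callable[[Move], Sequence[int]],
                weights: List[float]) -> Move:
    """1-ply greedy selection: choose the move whose afterstate has the highest value."""
    return max(moves, key=lambda m: value(weights, afterstate_features(m)))
```

Because the reward is 1 for a win and 0 for a loss, the sigmoid output can be read as an estimated probability of winning from the afterstate.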

Value function approximation
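The figure for this slide did not survive in the transcript. As a hedged reconstruction from the Algorithm slide, the approximation can be written as follows, where the symbols θ (feature weights), φ (binary local-shape features) and α (step size) are notation assumed here rather than taken from the talk:

```latex
V(s) = \sigma\Big(\sum_{i} \theta_i \, \phi_i(s)\Big),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}

\Delta\theta_i = \alpha \, \big(V(s_{t+1}) - V(s_t)\big) \, V(s_t)\,\big(1 - V(s_t)\big) \, \phi_i(s_t)
```

At the end of the game, V(s_{t+1}) is replaced by the reward: 1 for a win and 0 for a loss. Whether the original update included the sigmoid-derivative factor V(1 - V) is not stated in the slides.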

Training procedure
- The challenge:
  - Learn to beat the average liberty player
- So learning algorithm was trained specifically against the average liberty player
- The problem: learning is very slow, since the agent almost never wins any games by chance
- The solution: mix in a proportion of random moves until the agent wins 50% of all games
- Reduce the proportion of randomness as the agent learns to win more games
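The schedule described above could look roughly like the sketch below. It rests on a loud assumption: the slides do not say whose moves are randomised, and this example takes it to be the opponent's, so that the opponent is weakened until the agent wins about half its games. The class name `RandomnessSchedule`, the window size, the step size and the helper `opponent_move` are all hypothetical.

```python
import random
from collections import deque
from typing import Callable, Sequence, TypeVar

Move = TypeVar("Move")


class RandomnessSchedule:
    """Adjust the proportion of random moves to hold the agent's win rate near a target."""

    def __init__(self, start: float = 1.0, step: float = 0.01,
                 target: float = 0.5, window: int = 100) -> None:
        self.proportion = start      # current fraction of random moves
        self.step = step             # assumed adjustment size per game
        self.target = target         # desired win rate for the agent
        self.results = deque(maxlen=window)

    def record(self, agent_won: bool) -> float:
        """Record one game result and return the updated proportion."""
        self.results.append(1 if agent_won else 0)
        if len(self.results) == self.results.maxlen:
            rate = sum(self.results) / len(self.results)
            if rate > self.target:    # agent is winning: less randomness
                self.proportion = max(0.0, self.proportion - self.step)
            elif rate < self.target:  # agent is losing: more randomness
                self.proportion = min(1.0, self.proportion + self.step)
        return self.proportion


def opponent_move(legal_moves: Sequence[Move],
                  policy: Callable[[Sequence[Move]], Move],
                  proportion: float) -> Move:
    """Play a uniformly random legal move with probability `proportion`, else follow `policy`."""
    if random.random() < proportion:
        return random.choice(list(legal_moves))
    return policy(legal_moves)
```

As the agent improves, the proportion falls towards zero, at which point it is playing the unmodified average liberty player.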

Results for different shape sizes

Results for different board sizes

Shapes learned (1x1)

Conclusions
- Local shape information is sufficient to beat a naïve rule-based player
- Significant shapes can be learned
- The ‘goodness’ of shapes can be learned
- A linear threshold unit can provide a reasonable evaluation function
- Enumerating all local shapes reaches a natural limit at 3x3
- Training methodology is crucial

Future work
- Learn shapes selectively rather than enumerating all possible shapes
- Learn shapes to answer specific questions:
  - Can black B4 be captured?
  - Can white connect A2 to D5?
- Learn non-local shape:
  - Use connectivity relationships
  - Build hierarchies of shapes
