Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation. Jinzhuo Wang, Wenmin Wang, Ronggang Wang, Wen Gao.

Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation
Jinzhuo Wang, Wenmin Wang, Ronggang Wang, Wen Gao
School of Electronics and Computer Engineering, Peking University
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)

Outline
- Introduction of Computer Go
- Motivation
  - Monte Carlo Tree Search
  - How Experts Play
  - Standard CNNs for Computer Go
- The Proposed System
- Evaluation
- Discussion

Introduction of Computer Go. Evolution of solutions: from human knowledge + rules to rules only.

Introduction of Computer Go. The key problem is move prediction. How can the move prediction problem be cast as a machine learning problem?

Introduction of Computer Go. Casting move prediction as a machine learning problem:
- Define a {x, y} data set (x: board position, y: the next move played by the expert)
- Establish a mapping function from x to y
- Learn the mapping from data
A match with n moves provides n-1 training examples.
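To make the {x, y} setup concrete, here is a minimal sketch of how one game record could be turned into supervised training pairs. The three-plane encoding, the helper name moves_to_training_pairs, and the omission of capture handling are illustrative assumptions rather than the paper's exact feature set; note that this sketch yields one pair per move, whereas the slide counts n-1 pairs for an n-move game.

```python
import numpy as np

BOARD_SIZE = 19

def moves_to_training_pairs(moves):
    """Turn one game record into supervised (x, y) pairs.

    moves: list of (color, row, col) tuples in play order, color in {'B', 'W'}.
    x: a 3x19x19 tensor (player-to-move stones, opponent stones, empty points).
    y: the flat index (row * 19 + col) of the move actually played.
    """
    board = np.zeros((BOARD_SIZE, BOARD_SIZE), dtype=np.int8)  # 0 empty, 1 black, -1 white
    pairs = []
    for color, row, col in moves:
        player = 1 if color == 'B' else -1
        x = np.stack([
            (board == player).astype(np.float32),   # stones of the player to move
            (board == -player).astype(np.float32),  # opponent stones
            (board == 0).astype(np.float32),        # empty points
        ])
        y = row * BOARD_SIZE + col
        pairs.append((x, y))
        board[row, col] = player  # place the stone (capture handling omitted in this sketch)
    return pairs
```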

Introduction of Computer Go. Win rate against open-source engines.
[1] Gelly, S., and Silver, D. 2011. Monte-Carlo tree search and rapid action value estimation in computer Go. Artificial Intelligence 175(11):1856–1875.
[2] Enzenberger, M.; Müller, M.; Arneson, B.; and Segal, R. 2010. Fuego: an open-source framework for board games and Go engine based on Monte Carlo tree search. IEEE Transactions on Computational Intelligence and AI in Games 2(4):259–270.
[3] Baudiš, P., and Gailly, J.-l. 2011. Pachi: State of the art open source Go program. In Advances in Computer Games, 24–38.

Motivation. Most existing algorithms are based on Monte Carlo Tree Search (MCTS).
Browne, C. B.; Powley, E.; Whitehouse, D.; Lucas, S. M.; Cowling, P. I.; Rohlfshagen, P.; Tavener, S.; Perez, D.; Samothrakis, S.; and Colton, S. 2012. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games 4(1):1–43.
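As background for readers unfamiliar with MCTS, the sketch below shows the UCT selection rule (UCB1 applied to tree nodes) that such engines typically use to trade off exploitation and exploration. It is a generic illustration, not code from the cited survey or from the proposed system, and the dictionary-based node representation is an assumption.

```python
import math

def uct_select(children, exploration=1.4):
    """Pick the child maximizing the UCB1 score Q + c * sqrt(ln(N) / n).

    children: list of dicts with 'visits' and 'value_sum' fields.
    """
    total_visits = sum(c['visits'] for c in children)
    best, best_score = None, float('-inf')
    for child in children:
        if child['visits'] == 0:
            return child  # always try unvisited moves first
        q = child['value_sum'] / child['visits']  # exploitation: mean simulation outcome
        u = exploration * math.sqrt(math.log(total_visits) / child['visits'])  # exploration bonus
        if q + u > best_score:
            best, best_score = child, q + u
    return best
```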

Motivation. Limitations of MCTS:
- MCTS fails in many situations, e.g. simultaneous fights, including the two-safe-group problem and coexistence in seki
- MCTS considers a large proportion of meaningless moves; this is especially severe in the opening
Huang, S.-C., and Müller, M. 2013. Investigating the limits of Monte-Carlo tree search methods in computer Go. In Computers and Games, 39–48.

Motivation. How experts play:
- Experts usually choose the next move with long-term considerations
- They rarely use naive brute-force search like MCTS
Sutskever, I., and Nair, V. 2008. Mimicking Go experts with convolutional neural networks. In International Conference on Artificial Neural Networks, 101–110.

Motivation. Limitations of standard CNNs for computer Go:
- They often need more than 10 layers to achieve competitive results
- Compared with popular image-based tasks, the Go board is much smaller (19×19)
- Contexts are important, especially for local fights
[1] Clark, C., and Storkey, A. 2015. Training deep convolutional neural networks to play Go. In ICML, 1766–1774.
[2] Maddison, C. J.; Huang, A.; Sutskever, I.; and Silver, D. 2015. Move evaluation in Go using deep convolutional neural networks. In ICLR.
[3] Tian, Y., and Zhu, Y. 2016. Better computer Go player with neural network and long-term prediction. In ICLR.

The Proposed System. The system mainly contains two parts:
- A deep alternative neural network (DANN) for move prediction
- A long-term evaluation (LTE) for comprehensive consideration

The Proposed System. Deep alternative neural network for move prediction.
Jinzhuo Wang, Wenmin Wang, Xiongtao Chen, Ronggang Wang, Wen Gao. 2016. Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition. In NIPS, 811–819.

The Proposed System. Deep alternative neural network for move prediction. Advantages over standard CNNs [1][2][3]:
- Context collection (1) in early layers and (2) with rich hierarchies
- Increasing depth while adding only a constant number of extra parameters
- Multiple paths in the feed-forward propagation
[1] Clark, C., and Storkey, A. 2015. Training deep convolutional neural networks to play Go. In ICML, 1766–1774.
[2] Maddison, C. J.; Huang, A.; Sutskever, I.; and Silver, D. 2015. Move evaluation in Go using deep convolutional neural networks. In ICLR.
[3] Tian, Y., and Zhu, Y. 2016. Better computer Go player with neural network and long-term prediction. In ICLR.
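A minimal sketch of what one such layer could look like, assuming (following the description in the cited NIPS 2016 paper) that an alternative layer pairs an ordinary convolution with a recurrent convolution whose weights are reused across several unfolded steps, so depth and receptive field grow at roughly constant parameter cost. The class name and channel sizes are illustrative, not the authors' released code.

```python
import torch.nn as nn
import torch.nn.functional as F

class AlternativeLayer(nn.Module):
    """One conv + recurrent-conv pair (illustrative sketch).

    The feed-forward conv extracts features once; the recurrent conv is then
    applied `steps` times with shared weights, so the extra depth (and the
    larger receptive field it brings) costs only one additional conv's parameters.
    """
    def __init__(self, in_channels, channels, steps=3):
        super().__init__()
        self.feed_forward = nn.Conv2d(in_channels, channels, 3, padding=1)
        self.recurrent = nn.Conv2d(channels, channels, 3, padding=1)
        self.steps = steps

    def forward(self, x):
        u = F.relu(self.feed_forward(x))     # feed-forward input to the layer
        h = u
        for _ in range(self.steps):          # unfolded recurrent convolution
            h = F.relu(u + self.recurrent(h))
        return h
```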

The Proposed System. Long-term evaluation for comprehensive consideration.
Mnih, V.; Heess, N.; and Graves, A. 2014. Recurrent models of visual attention. In NIPS, 2204–2212.
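The transcript does not spell out the evaluation model, so the following is only an illustrative sketch of one way a long-term evaluation could be realized: each candidate move is expanded into a short sequence of simulated future positions, a small convolutional encoder summarizes every position, and a GRU aggregates the sequence into a single score. All class names, layer sizes, and the choice of GRU are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LongTermEvaluator(nn.Module):
    """Score a k-step lookahead of board positions with one value (sketch)."""
    def __init__(self, planes=3, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(                 # per-position feature extractor
            nn.Conv2d(planes, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.rnn = nn.GRU(32, hidden, batch_first=True)
        self.value = nn.Linear(hidden, 1)

    def forward(self, positions):                     # positions: (batch, k, planes, 19, 19)
        b, k = positions.shape[:2]
        feats = self.encoder(positions.flatten(0, 1)).view(b, k, -1)
        _, h = self.rnn(feats)                        # summarize the k-step sequence
        return torch.tanh(self.value(h[-1]))          # long-term score in (-1, 1)
```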

The Proposed System. Final decision for the next move. The final score S of each candidate move is defined using the criteria of both DANN and the long-term evaluation, where p is the probability produced by the softmax layer of DANN. The action with the highest score S is selected as the system's final choice.
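The exact combination formula for S is not reproduced in this transcript, so the following is only a hedged sketch of the decision step, assuming a simple weighted sum of the DANN softmax probability p and the long-term evaluation with a hypothetical mixing weight lam.

```python
def select_move(candidates, dann_probs, long_term_scores, lam=0.5):
    """Pick the candidate with the highest combined score S (illustrative).

    dann_probs:        softmax probability p(a) from DANN for each candidate move a.
    long_term_scores:  long-term evaluation of the same candidates.
    lam:               hypothetical mixing weight; the paper's actual rule may differ.
    """
    def score(a):
        return lam * dann_probs[a] + (1.0 - lam) * long_term_scores[a]
    return max(candidates, key=score)
```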

Evaluation. The evaluation metric contains two parts:
- Top-n accuracy of next-move prediction
- Win rate against open-source engines
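Top-n accuracy counts a position as correct when the expert's actual move appears among the model's n most probable moves. A minimal sketch of the metric; the array shapes and helper name are assumptions.

```python
import numpy as np

def top_n_accuracy(probabilities, true_moves, n=1):
    """Fraction of positions whose expert move is among the n most probable moves.

    probabilities: (num_positions, 361) array of predicted move probabilities.
    true_moves:    (num_positions,) array of expert move indices in [0, 361).
    """
    top_n = np.argsort(-probabilities, axis=1)[:, :n]     # n best moves per position
    hits = (top_n == true_moves[:, None]).any(axis=1)     # expert move among them?
    return hits.mean()
```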

Evaluation. Setup and datasets:
- GoGoD (82k professional game records)
- PGD (253k professional game records, proposed in this work)
- KGS is not used (amateur games)

Evaluation. Model investigation (move prediction):
- Ablation experiments on DANN
- Comparison with standard DCNNs

Evaluation. Model investigation (move prediction): impact of long-term evaluation.

Top-1 accuracy (%)    GoGoD   PGD
DANN + LTE            53.6    51.4
DANN + MCTS           48.9    49.3
DCNN [1] + LTE        51.2    53.5
DCNN [1] + MCTS       45.7    46.2

[1] Clark, C., and Storkey, A. 2015. Training deep convolutional neural networks to play Go. In ICML, 1766–1774.

Evaluation. Win rate against open-source engines and comparison with prior work.
[1] Clark, C., and Storkey, A. 2015. Training deep convolutional neural networks to play Go. In ICML, 1766–1774.
[2] Maddison, C. J.; Huang, A.; Sutskever, I.; and Silver, D. 2015. Move evaluation in Go using deep convolutional neural networks. In ICLR.
[3] Tian, Y., and Zhu, Y. 2016. Better computer Go player with neural network and long-term prediction. In ICLR.

Discussion. While this work was in progress, AlphaGo had beaten Fan Hui 5-0 but had not yet played Lee Sedol. At that time, we believed human experts could provide guidance. Lee Sedol and Ke Jie have since been defeated… We need to rethink what human experts can contribute when writing Go programs. I still believe human experts can contribute something, somehow.

Thank you!