Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation. Jinzhuo Wang, Wenmin Wang, Ronggang Wang, Wen Gao.

Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation
Jinzhuo Wang, Wenmin Wang, Ronggang Wang, Wen Gao
School of Electronics and Computer Engineering, Peking University
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)

Outline
- Introduction of Computer Go
- Motivation
  - Monte Carlo Tree Search
  - How Experts Play
  - Standard CNNs for Computer Go
- The Proposed System
- Evaluation
- Discussion

Introduction of Computer Go. Evolution of solutions: from human knowledge + rules to rules only.

Introduction of Computer Go. The key problem is move prediction. How can the move prediction problem be cast as a machine learning problem?

Introduction of Computer Go. Casting move prediction as a machine learning problem:
- Define a {x, y} data set (x: board position, y: the next move played by the expert)
- Establish a mapping function from x to y
- Learn the mapping from data
A match with n moves provides n-1 training examples.
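To make the {x, y} setup concrete, here is a minimal sketch of how one game record could be turned into supervised training pairs. The three-plane encoding, the helper name moves_to_training_pairs, and the omission of capture handling are illustrative assumptions rather than the paper's exact feature set; note that this sketch yields one pair per move, whereas the slide counts n-1 pairs for an n-move game.

```python
import numpy as np

BOARD_SIZE = 19

def moves_to_training_pairs(moves):
    """Turn one game record into supervised (x, y) pairs.

    moves: list of (color, row, col) tuples in play order, color in {'B', 'W'}.
    x: a 3x19x19 tensor (player-to-move stones, opponent stones, empty points).
    y: the flat index (row * 19 + col) of the move actually played.
    """
    board = np.zeros((BOARD_SIZE, BOARD_SIZE), dtype=np.int8)  # 0 empty, 1 black, -1 white
    pairs = []
    for color, row, col in moves:
        player = 1 if color == 'B' else -1
        x = np.stack([
            (board == player).astype(np.float32),   # stones of the player to move
            (board == -player).astype(np.float32),  # opponent stones
            (board == 0).astype(np.float32),        # empty points
        ])
        y = row * BOARD_SIZE + col
        pairs.append((x, y))
        board[row, col] = player  # place the stone (capture handling omitted in this sketch)
    return pairs
```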

Introduction of Computer Go. Win rate against open-source engines.
[1] Gelly, S., and Silver, D. 2011. Monte-Carlo tree search and rapid action value estimation in computer Go. Artificial Intelligence 175(11):1856–1875.
[2] Enzenberger, M.; Müller, M.; Arneson, B.; and Segal, R. 2010. Fuego: an open-source framework for board games and Go engine based on Monte Carlo tree search. IEEE Transactions on Computational Intelligence and AI in Games 2(4):259–270.
[3] Baudiš, P., and Gailly, J.-l. 2011. Pachi: State of the art open source Go program. In Advances in Computer Games, 24–38.

Motivation. Most existing algorithms are based on Monte Carlo Tree Search (MCTS).
Browne, C. B.; Powley, E.; Whitehouse, D.; Lucas, S. M.; Cowling, P. I.; Rohlfshagen, P.; Tavener, S.; Perez, D.; Samothrakis, S.; and Colton, S. 2012. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games 4(1):1–43.
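As background for readers unfamiliar with MCTS, the sketch below shows the UCT selection rule (UCB1 applied to tree nodes) that such engines typically use to trade off exploitation and exploration. It is a generic illustration, not code from the cited survey or from the proposed system, and the dictionary-based node representation is an assumption.

```python
import math

def uct_select(children, exploration=1.4):
    """Pick the child maximizing the UCB1 score Q + c * sqrt(ln(N) / n).

    children: list of dicts with 'visits' and 'value_sum' fields.
    """
    total_visits = sum(c['visits'] for c in children)
    best, best_score = None, float('-inf')
    for child in children:
        if child['visits'] == 0:
            return child  # always try unvisited moves first
        q = child['value_sum'] / child['visits']  # exploitation: mean simulation outcome
        u = exploration * math.sqrt(math.log(total_visits) / child['visits'])  # exploration bonus
        if q + u > best_score:
            best, best_score = child, q + u
    return best
```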

Motivation. Limitations of MCTS:
- MCTS fails in many situations, e.g. simultaneous fights, including the two-safe-group problem and coexistence in seki
- MCTS considers a large proportion of meaningless moves; this is especially severe in the opening
Huang, S.-C., and Müller, M. 2013. Investigating the limits of Monte-Carlo tree search methods in computer Go. In Computers and Games, 39–48.

Motivation. How experts play:
- Experts usually choose the next move with long-term considerations
- They rarely use naive brute-force search like MCTS
Sutskever, I., and Nair, V. 2008. Mimicking Go experts with convolutional neural networks. In International Conference on Artificial Neural Networks, 101–110.

Motivation. Limitations of standard CNNs for computer Go:
- They often need more than 10 layers to achieve competitive results
- Compared with popular image-based tasks, the Go board is much smaller (19×19)
- Contexts are important, especially for local fights
[1] Clark, C., and Storkey, A. 2015. Training deep convolutional neural networks to play Go. In ICML, 1766–1774.
[2] Maddison, C. J.; Huang, A.; Sutskever, I.; and Silver, D. 2015. Move evaluation in Go using deep convolutional neural networks. In ICLR.
[3] Tian, Y., and Zhu, Y. 2016. Better computer Go player with neural network and long-term prediction. In ICLR.

The Proposed System. The system mainly contains two parts:
- A deep alternative neural network (DANN) for move prediction
- A long-term evaluation (LTE) for comprehensive consideration

The Proposed System. Deep alternative neural network for move prediction.
Jinzhuo Wang, Wenmin Wang, Xiongtao Chen, Ronggang Wang, Wen Gao. 2016. Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition. In NIPS, 811–819.

The Proposed System. Deep alternative neural network for move prediction. Advantages over standard CNNs [1][2][3]:
- Context collection (1) in early layers and (2) with rich hierarchies
- Increasing depth while adding only a constant number of extra parameters
- Multiple paths in the feed-forward propagation
[1] Clark, C., and Storkey, A. 2015. Training deep convolutional neural networks to play Go. In ICML, 1766–1774.
[2] Maddison, C. J.; Huang, A.; Sutskever, I.; and Silver, D. 2015. Move evaluation in Go using deep convolutional neural networks. In ICLR.
[3] Tian, Y., and Zhu, Y. 2016. Better computer Go player with neural network and long-term prediction. In ICLR.
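A minimal sketch of what one such layer could look like, assuming (following the description in the cited NIPS 2016 paper) that an alternative layer pairs an ordinary convolution with a recurrent convolution whose weights are reused across several unfolded steps, so depth and receptive field grow at roughly constant parameter cost. The class name and channel sizes are illustrative, not the authors' released code.

```python
import torch.nn as nn
import torch.nn.functional as F

class AlternativeLayer(nn.Module):
    """One conv + recurrent-conv pair (illustrative sketch).

    The feed-forward conv extracts features once; the recurrent conv is then
    applied `steps` times with shared weights, so the extra depth (and the
    larger receptive field it brings) costs only one additional conv's parameters.
    """
    def __init__(self, in_channels, channels, steps=3):
        super().__init__()
        self.feed_forward = nn.Conv2d(in_channels, channels, 3, padding=1)
        self.recurrent = nn.Conv2d(channels, channels, 3, padding=1)
        self.steps = steps

    def forward(self, x):
        u = F.relu(self.feed_forward(x))     # feed-forward input to the layer
        h = u
        for _ in range(self.steps):          # unfolded recurrent convolution
            h = F.relu(u + self.recurrent(h))
        return h
```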

The Proposed System. Long-term evaluation for comprehensive consideration.
Mnih, V.; Heess, N.; and Graves, A. 2014. Recurrent models of visual attention. In NIPS, 2204–2212.
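The transcript does not spell out the evaluation model, so the following is only an illustrative sketch of one way a long-term evaluation could be realized: each candidate move is expanded into a short sequence of simulated future positions, a small convolutional encoder summarizes every position, and a GRU aggregates the sequence into a single score. All class names, layer sizes, and the choice of GRU are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LongTermEvaluator(nn.Module):
    """Score a k-step lookahead of board positions with one value (sketch)."""
    def __init__(self, planes=3, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(                 # per-position feature extractor
            nn.Conv2d(planes, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.rnn = nn.GRU(32, hidden, batch_first=True)
        self.value = nn.Linear(hidden, 1)

    def forward(self, positions):                     # positions: (batch, k, planes, 19, 19)
        b, k = positions.shape[:2]
        feats = self.encoder(positions.flatten(0, 1)).view(b, k, -1)
        _, h = self.rnn(feats)                        # summarize the k-step sequence
        return torch.tanh(self.value(h[-1]))          # long-term score in (-1, 1)
```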

The Proposed System. Final decision for the next move. The final score S of each candidate move is defined using the criteria of both DANN and the long-term evaluation, where p is the probability produced by the softmax layer of DANN. The action with the highest score S is selected as the system's final choice.
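The exact combination formula for S is not reproduced in this transcript, so the following is only a hedged sketch of the decision step, assuming a simple weighted sum of the DANN softmax probability p and the long-term evaluation with a hypothetical mixing weight lam.

```python
def select_move(candidates, dann_probs, long_term_scores, lam=0.5):
    """Pick the candidate with the highest combined score S (illustrative).

    dann_probs:        softmax probability p(a) from DANN for each candidate move a.
    long_term_scores:  long-term evaluation of the same candidates.
    lam:               hypothetical mixing weight; the paper's actual rule may differ.
    """
    def score(a):
        return lam * dann_probs[a] + (1.0 - lam) * long_term_scores[a]
    return max(candidates, key=score)
```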

Evaluation. The evaluation metric contains two parts:
- Top-n accuracy of next-move prediction
- Win rate against open-source engines
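Top-n accuracy counts a position as correct when the expert's actual move appears among the model's n most probable moves. A minimal sketch of the metric; the array shapes and helper name are assumptions.

```python
import numpy as np

def top_n_accuracy(probabilities, true_moves, n=1):
    """Fraction of positions whose expert move is among the n most probable moves.

    probabilities: (num_positions, 361) array of predicted move probabilities.
    true_moves:    (num_positions,) array of expert move indices in [0, 361).
    """
    top_n = np.argsort(-probabilities, axis=1)[:, :n]     # n best moves per position
    hits = (top_n == true_moves[:, None]).any(axis=1)     # expert move among them?
    return hits.mean()
```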

Evaluation. Setup and datasets:
- GoGoD (82k professional game records)
- PGD (253k professional game records, proposed in this work)
- KGS is not used (amateur games)

Evaluation. Model investigation (move prediction):
- Ablation experiments on DANN
- Comparison with standard DCNNs

Evaluation. Model investigation (move prediction): impact of long-term evaluation.

Top-1 accuracy (%)    GoGoD   PGD
DANN + LTE            53.6    51.4
DANN + MCTS           48.9    49.3
DCNN [1] + LTE        51.2    53.5
DCNN [1] + MCTS       45.7    46.2

[1] Clark, C., and Storkey, A. 2015. Training deep convolutional neural networks to play Go. In ICML, 1766–1774.

Evaluation. Win rate against open-source engines and comparison with prior work.
[1] Clark, C., and Storkey, A. 2015. Training deep convolutional neural networks to play Go. In ICML, 1766–1774.
[2] Maddison, C. J.; Huang, A.; Sutskever, I.; and Silver, D. 2015. Move evaluation in Go using deep convolutional neural networks. In ICLR.
[3] Tian, Y., and Zhu, Y. 2016. Better computer Go player with neural network and long-term prediction. In ICLR.

Discussion. While this work was in progress, AlphaGo had beaten Fan Hui 5-0 but had not yet played Lee Sedol. At that time, we believed human experts could provide guidance. Lee Sedol and Ke Jie have since been defeated… We need to rethink what human experts can contribute when writing Go programs. I still believe human experts can contribute something, somehow.

Thank you!