By James Mannion Computer Systems Lab Period 3

The Implementation of Artificial Intelligence and Temporal Difference Learning Algorithms in a Computerized Chess Programme
By James Mannion, Computer Systems Lab 08-09, Period 3

Abstract
Searching through large sets of data
Complex, vast domains
Heuristic searches
Chess
Evaluation function
Machine learning

Introduction
Simple domains, simple heuristics
The domain of chess
Deep Blue – brute force
Looking at 30^6 moves before making the first
Supercomputer
Too many calculations
Not efficient

Introduction (cont’d)
Minimax search
Alpha-beta pruning
Only look 2-3 moves into the future
Estimate strength of position
Evaluation function
Can improve heuristic by learning
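A depth-limited minimax search with alpha-beta pruning can be sketched as follows. This is a generic illustration, not the project's code: the `children` and `evaluate` callables and the toy tree are assumptions made so the example runs without a chess engine.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Depth-limited minimax with alpha-beta pruning.

    children(node) returns successor positions; evaluate(node) returns a
    static estimate from the maximizing player's point of view.
    Both are hypothetical hooks supplied by the caller.
    """
    successors = children(node)
    if depth == 0 or not successors:
        return evaluate(node)
    if maximizing:
        best = -math.inf
        for child in successors:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, children, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizer already has a better option
                break
        return best
    best = math.inf
    for child in successors:
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, children, evaluate))
        beta = min(beta, best)
        if alpha >= beta:  # the maximizer already has a better option
            break
    return best

# Toy 2-ply tree (made up for illustration): leaves carry static values.
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
LEAF_VALUES = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}
visited = []  # records which leaves were actually evaluated

def toy_children(node):
    return TREE.get(node, [])

def toy_evaluate(node):
    visited.append(node)
    return LEAF_VALUES[node]

best_value = alphabeta("root", 2, -math.inf, math.inf, True,
                       toy_children, toy_evaluate)
```

In this toy tree the leaf `b2` is never evaluated: once `b1` shows that branch `b` is worth at most 2, it cannot beat the 3 already guaranteed by branch `a`.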

Introduction (cont’d)
Seems simple, but can become quite complex.
Chess masters spend careers learning how to “evaluate” moves
Purpose: can a computer learn a good evaluation function?

Background
Claude Shannon, 1950
Brute force would take too long
Discusses evaluation function
2-ply algorithm, but looks further into the future for moves that could lead to checkmate
Possibility of learning in distant future

Development
Python
Stage 1: Text-based chess game
Two humans input their moves
Illegal moves not allowed
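One way the Stage 1 input loop might look is sketched below. The slides do not say how moves are typed, so the coordinate format (`e2e4`), the `read_line` callable, and the `is_legal` rule checker are all assumptions introduced for illustration.

```python
import re

# Assumed input format: source square followed by target square, e.g. "e2e4".
MOVE_PATTERN = re.compile(r"^[a-h][1-8][a-h][1-8]$")

def parse_move(text):
    """Turn 'e2e4' into ((file, rank), (file, rank)), 0-based, or None if malformed."""
    text = text.strip().lower()
    if not MOVE_PATTERN.match(text):
        return None
    from_sq = (ord(text[0]) - ord("a"), int(text[1]) - 1)
    to_sq = (ord(text[2]) - ord("a"), int(text[3]) - 1)
    return from_sq, to_sq

def read_legal_move(read_line, is_legal):
    """Keep asking until the player enters a well-formed move that is_legal accepts.

    read_line() returns the next line typed by the player; is_legal(move) is a
    hypothetical rule checker that the real game would have to provide.
    """
    while True:
        move = parse_move(read_line())
        if move is not None and is_legal(move):
            return move
        print("Illegal move, try again.")
```

In an interactive game, `read_line` could be `lambda: input("White: ")`; passing it in as a parameter keeps the loop testable without a keyboard.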


Development (cont’d)
Stage 2: Introduce a computer player
2-3 ply
Evaluation function will start out such that choices are based on a simple piece-differential where each piece is weighted equally
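A minimal sketch of an equally weighted piece-differential is shown below. The board representation (eight strings, uppercase for white, lowercase for black, `.` for empty squares) is an assumption chosen for the example, not the project's data structure.

```python
def piece_differential(board):
    """Return (number of white pieces) - (number of black pieces).

    Every piece counts as 1, regardless of type. Assumed representation:
    uppercase letters are white pieces, lowercase are black, '.' is empty.
    """
    white = sum(1 for row in board for square in row if square.isupper())
    black = sum(1 for row in board for square in row if square.islower())
    return white - black

START = [
    "rnbqkbnr",
    "pppppppp",
    "........",
    "........",
    "........",
    "........",
    "PPPPPPPP",
    "RNBQKBNR",
]

# The same position with black's queen removed from d8.
BLACK_QUEEN_GONE = ["rnb.kbnr"] + START[1:]

start_score = piece_differential(START)
after_capture_score = piece_differential(BLACK_QUEEN_GONE)
```

With equal weights, losing a queen shifts the score by the same amount as losing a pawn, which is the limitation the learning stage is meant to address.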

Development (cont’d)
Stage 3: Learning
Temporal Difference Learning
Weight adjustment: $w_i \leftarrow w_i + a \left( \frac{n_{ic} - n_{ip}}{n_{ic}} \right)$
Heuristic function: $h = c_1 p_1 + c_2 p_2 + c_3 p_3 + c_4 p_4 + c_5 p_5$
Piece values: $p_i = \sum w_i - \sum b_i$, summed over $i$
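The formulas on this slide can be transcribed into code roughly as follows. The slide does not define n_ic, n_ip, or a, so the sketch simply takes them as inputs and applies the update literally. Reading the five terms as per-piece-type differences, and leaving a weight unchanged when n_ic is zero, are assumptions of this example.

```python
def heuristic(coefficients, differences):
    """h = c_1*p_1 + ... + c_5*p_5.

    differences[i] is assumed to be the white-minus-black count for
    piece type i; coefficients[i] is the weight for that type.
    """
    return sum(c * p for c, p in zip(coefficients, differences))

def adjust_weight(w, a, n_c, n_p):
    """Literal form of the slide's update: w <- w + a * (n_c - n_p) / n_c.

    n_c and n_p stand for the slide's n_ic and n_ip, whose meaning the
    slide does not spell out. Skipping the update when n_c == 0 is an
    assumption made here to avoid division by zero.
    """
    if n_c == 0:
        return w
    return w + a * (n_c - n_p) / n_c

# Equal starting weights, matching the Stage 2 piece-differential player.
coefficients = [1.0, 1.0, 1.0, 1.0, 1.0]
differences = [0, 0, 0, 0, 1]  # e.g. one side is up a single piece of type 5

h = heuristic(coefficients, differences)
new_weight = adjust_weight(1.0, 0.5, 4, 2)
```

Keeping the update as a pure function of its four inputs makes it easy to swap in whatever definitions of n_ic and n_ip the project actually uses.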

Testing
Learning vs No Learning
Two equal, piece-differential players pitted against each other.
One will have the ability to learn
Thousands of games
Win-loss differential tracked over the length of the test
By the end, the learner should be winning significantly more games.
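The bookkeeping for such a test could be sketched as below. The `play_game` callable is hypothetical: it stands in for one full game between the learner and the non-learner. The random stand-in used here produces placeholder outcomes only, not results from the project.

```python
import random

def run_match(play_game, num_games):
    """Play num_games games and return the running win-loss differential.

    play_game() is a hypothetical hook returning +1 if the learner wins,
    -1 if it loses, and 0 for a draw. history[k] is the differential
    after game k + 1.
    """
    differential = 0
    history = []
    for _ in range(num_games):
        differential += play_game()
        history.append(differential)
    return history

# Placeholder opponent pairing: outcomes drawn at random, for illustration only.
rng = random.Random(0)
history = run_match(lambda: rng.choice([1, 0, -1]), 1000)
```

Plotting `history` against the game number would give the win-loss curve described on the slide; a learner that improves should show an upward trend rather than the random walk this placeholder produces.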


References
Shannon, Claude. “Programming a Computer for Playing Chess.” 1950.
Beal, D.F., Smith, M.C. “Temporal Difference Learning for Heuristic Search and Game Playing.” 1999.
Moriarty, David E., Miikkulainen, Risto. “Discovering Complex Othello Strategies Through Evolutionary Neural Networks.”
Huang, Shiu-li, Lin, Fu-ren. “Using Temporal-Difference Learning for Multi-Agent Bargaining.” 2007.
Russell, Stuart, Norvig, Peter. Artificial Intelligence: A Modern Approach. Second Edition.
Asgharbeygi, Nima, Stracuzzi, David, Langley, Pat. “Relational Temporal Difference Learning.”
