Chapter 6: Game Search (Adversarial Search)

Characteristics of Game Search
- Exhaustive search is almost always impossible: the branching factor and the depth are both too large (e.g., Go, played on a 19 × 19 board, has up to roughly 361! move sequences; chess was only mastered by Deep Blue with massive specialized hardware).
- Static evaluation score: a heuristic estimate of board quality at a node.
- Maximizing player: hopes to win (me). Minimizing player: hopes I lose (the opponent).
- Game tree: a tree whose nodes are board configurations and whose branches are moves.
(Diagram: the original board state branching into a new board state for each legal move.)
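As a concrete (and purely illustrative) picture of that definition, here is a minimal game-tree node in Python; the Node fields and the legal_moves/apply_move helpers are assumptions made for the sketch, not part of the slides:

from dataclasses import dataclass, field

@dataclass
class Node:
    """One game-tree node: a board configuration plus the moves leading out of it."""
    board: tuple                                   # immutable board configuration
    to_move: str                                   # "MAX" (me) or "MIN" (opponent)
    children: list = field(default_factory=list)   # one child per legal move

def expand(node, legal_moves, apply_move):
    """Grow the tree one ply: each branch is a move, each child a new board state."""
    for move in legal_moves(node.board, node.to_move):
        child = Node(apply_move(node.board, move, node.to_move),
                     "MIN" if node.to_move == "MAX" else "MAX")
        node.children.append(child)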

Minimax Game Search
Idea: take the maximum score at a maximizing level (my turn); take the minimum score at a minimizing level (the opponent's turn).
(Diagram: leaves 2, 7, 1, 8 under two minimizing nodes; the opponent backs up 2 and 1, and the maximizing root picks 2: "this move guarantees the best.")
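Backing up those two levels by hand gives the same answer; a one-line check in Python using the leaf scores from the diagram:

# MIN backs up min(2, 7) = 2 and min(1, 8) = 1; MAX then picks the larger.
print(max(min(2, 7), min(1, 8)))   # -> 2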

Minimax Search Example
The maximizer A searches so as to maximize the evaluation-function value; A's actual choice must take the intentions of the minimizer B into account.
(Diagram: A's moves lead to B's nodes [c1] f=0.8, [c2] f=0.3, [c3] f=-0.2. At the minimizer-B level, c1 has leaves [c11] f=0.9, [c12] f=0.1, [c13] f=-0.6; c2 has [c21] f=0.1, [c22] f=-0.7; c3 has [c31] f=-0.1, [c32] f=-0.3.)
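Reading the diagram's numbers through minimax (a small check, using the tree as reconstructed in the placeholder above): B minimizes within each subtree, so A compares the backed-up minima rather than the raw f-values and picks c3:

# Leaves under each of B's nodes, as read from the diagram above.
choices = {"c1": [0.9, 0.1, -0.6], "c2": [0.1, -0.7], "c3": [-0.1, -0.3]}
backed_up = {name: min(vals) for name, vals in choices.items()}   # B minimizes
print(backed_up)                            # {'c1': -0.6, 'c2': -0.7, 'c3': -0.3}
print(max(backed_up, key=backed_up.get))    # -> 'c3', A's minimax choice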

Minimax Algorithm

function MINIMAX(state) returns an action
  inputs: state, current state in game
  v = MAX-VALUE(state)
  return the action corresponding with value v

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v = −∞
  for a, s in SUCCESSORS(state) do
    v = MAX(v, MIN-VALUE(s))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v = +∞
  for a, s in SUCCESSORS(state) do
    v = MIN(v, MAX-VALUE(s))
  return v
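A direct Python transcription of the value computation in this pseudocode (a minimal sketch; the nested-list tree encoding, with plain numbers as terminal utilities, is an assumption made for the example):

def minimax(node, maximizing):
    """Minimax value of a tree given as nested lists; leaves are utilities."""
    if not isinstance(node, list):      # TERMINAL-TEST
        return node                     # UTILITY
    values = (minimax(child, not maximizing) for child in node)
    return max(values) if maximizing else min(values)

# The example tree of the next slides: a MAX root over three MIN nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, maximizing=True))   # MIN backs up 3, 2, 2; MAX -> 3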

Minimax Example
(Diagram, built up over four slides: a MAX root over three MIN nodes whose leaves are 3, 12, 8 | 2, 4, 6 | 14, 5, 2; one node shape marks MAX nodes, the other MIN nodes. The MIN nodes back up the values 3, 2, and 2, and the MAX root backs up 3.)

Tic-Tac-Toe
Tic-tac-toe, also called noughts and crosses (in the British Commonwealth countries) and X's and O's (in the Republic of Ireland), is a pencil-and-paper game for two players, X and O, who take turns marking the spaces in a 3×3 grid. The X player usually goes first. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game. (The original slide shows an example game won by the first player, X.)
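A quick sketch of that win condition in code (the tuple board encoding is an assumption made for the example):

# Board: a tuple of 9 cells in row-major order, each "X", "O", or " ".
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),    # horizontal rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),    # vertical columns
         (0, 4, 8), (2, 4, 6)]               # the two diagonals

def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for i, j, k in LINES:
        if board[i] != " " and board[i] == board[j] == board[k]:
            return board[i]
    return None

print(winner(("X", "X", "X",
              "O", "O", " ",
              " ", " ", " ")))               # -> X (top row)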

Save time

Game tree (2-player)
How do we search this tree to find the optimal move?

Applying Minimax to Tic-Tac-Toe
The static heuristic evaluation function. A classic choice for tic-tac-toe is E(n) = (number of rows, columns, and diagonals still open for MAX) − (number still open for MIN).
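A sketch of that open-lines heuristic, reusing the board encoding from the winner-check example (the exact function on the original slide may differ; this is the classic textbook version):

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def open_lines(board, player):
    """Lines that contain no opposing mark, i.e., that `player` could still win."""
    other = "O" if player == "X" else "X"
    return sum(all(board[i] != other for i in line) for line in LINES)

def evaluate(board):
    """Static evaluation from X's point of view: X's open lines minus O's."""
    return open_lines(board, "X") - open_lines(board, "O")

board = (" ", " ", " ",
         " ", "X", " ",
         " ", " ", " ")
print(evaluate(board))   # center move: 8 lines open for X, 4 left for O -> 4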

α-β Pruning
Idea: reduce the search space! (Plain minimax blows up exponentially.)
The α-β principle: "If you have an idea that is surely bad, do not take time to see how truly bad it is."
(Diagram: a MAX root whose first MIN child backs up 2 from leaves 2 and 7; at the second MIN child the first leaf is 1, so that node can be worth at most 1 ≤ 2 and its remaining leaves are cut: an α-cut.)

Alpha-Beta Pruning
A game-search method that uses α, the minimum value already guaranteed at a maximizing node, and β, the maximum value still possible at a minimizing node. The search basically proceeds depth-first (DFS).
(Diagram: root [c0] α=0.2; children [c1] f=0.2 and [c2] f=-0.1; under c1, leaves [c11] f=0.2 and [c12] f=0.7; under c2, leaves [c21] f=-0.1, [c22], [c23]. Once c21's value -0.1 is backed up to c2, the remaining nodes (c22, c23) need not be searched: an α-cut.)

Tic-Tac-Toe Example with Alpha-Beta Pruning: Backup Values

The α-β Procedure
- α never decreases (initially −∞); β never increases (initially +∞).
- Search rules:
  1. α-cutoff: cut below any minimizing node whose β ≤ the α of a maximizing ancestor.
  2. β-cutoff: cut below any maximizing node whose α ≥ the β of a minimizing ancestor.

Example
(Diagram: a four-level max/min tree with leaf values 90, 89, 100, 99, 60, 59, 75, 74, showing both an α-cut and a β-cut.)

Alpha-Beta Pruning Algorithm

function ALPHA-BETA(state) returns an action
  inputs: state, current state in game
  v = MAX-VALUE(state, −∞, +∞)
  return the action corresponding with value v

function MAX-VALUE(state, α, β) returns a utility value
  inputs: state, current state in game
    α, the value of the best alternative for MAX along the path to state
    β, the value of the best alternative for MIN along the path to state
  if TERMINAL-TEST(state) then return UTILITY(state)
  v = −∞
  for a, s in SUCCESSORS(state) do
    v = MAX(v, MIN-VALUE(s, α, β))
    if v ≥ β then return v
    α = MAX(α, v)
  return v

Alpha-Beta Pruning Algorithm (continued)

function MIN-VALUE(state, α, β) returns a utility value
  inputs: state, current state in game
    α, the value of the best alternative for MAX along the path to state
    β, the value of the best alternative for MIN along the path to state
  if TERMINAL-TEST(state) then return UTILITY(state)
  v = +∞
  for a, s in SUCCESSORS(state) do
    v = MIN(v, MAX-VALUE(s, α, β))
    if v ≤ α then return v
    β = MIN(β, v)
  return v
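The same procedure as a runnable Python sketch, using the nested-list encoding from the earlier minimax sketch; on the example tree it returns 3 and never looks at the leaves 4 and 6 that the trace on the following slides marks as pruned:

import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Alpha-beta value of a nested-list tree; leaves are terminal utilities."""
    if not isinstance(node, list):              # TERMINAL-TEST
        return node                             # UTILITY
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta))
            if v >= beta:                       # β-cutoff
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta))
        if v <= alpha:                          # α-cutoff
            return v
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, maximizing=True))         # -> 3, with 4 and 6 pruned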

Alpha-Beta Pruning Example , , initial values =−  =+ , , passed to kids =−  =+ The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example =−  =+ =−  =3 MIN updates , based on kids 3 The nodes are “MAX nodes” The nodes are “MIN nodes” 25 25

Alpha-Beta Pruning Example [-, + ] [-, 3] 3 The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example [-, + ] [-, 3] =−  =3 MIN updates , based on kids. No change. 3 12 The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example [3, +] MAX updates , based on kids. [-, 3] 3 is returned as node value. 3 3 12 8 The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example [3, +] [-, 3] 3 12 8 The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example [3, +] , , passed to kids [-, 3] [3,+] =3  =+ 3 12 8 The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example [3, +] MIN updates , based on kids. [-, 3] [3, 2] =3  =2 X X  ≥ , so prune. 3 12 8 2 The nodes are “MAX nodes” The nodes are “MIN nodes” 31 31

Alpha-Beta Pruning Example
2 is returned as the second MIN node's value; MAX updates α based on its kids, but 2 < 3, so no change: the root stays [3, +∞].

Alpha-Beta Pruning Example [3,+] , , passed to kids [-, 3] [3, 2] =3  =+ X X 3 12 8 2 The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example [3, +] MIN updates , based on kids. [-, 3] [3, 2] [3, 14] =3  =14 X X 3 12 8 2 14 The nodes are “MAX nodes” The nodes are “MIN nodes” 34 34

Alpha-Beta Pruning Example [3, +] MIN updates , based on kids. [-, 3] [3, 2] [3, 5] =3  =5 X X 3 12 8 2 14 5 The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example [3, +] MIN updates , based on kids. [-, 3] [3, 2] [3, 2] =3  =2 X X 3 12 8 2 14 5 2 The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example [3, +] 2 is returned as node value. [-, 3] [3, 2] [3, 2] X X 3 12 8 2 14 5 2 The nodes are “MAX nodes” The nodes are “MIN nodes”

Alpha-Beta Pruning Example
MAX updates α based on its kids; no change, so the root's final value is 3.

Example: which nodes can be pruned?
(Diagram: a MAX root over two MIN nodes with leaves 3, 4, 1, 2 and 7, 8, 5, 6, searched left to right.)

Answer to Example: which nodes can be pruned?
Answer: NONE! The most favorable nodes for both players are explored last (i.e., in the diagram, they are on the right-hand side), so the α and β bounds never become tight enough to cut anything.

Second Example (the exact mirror image of the first example): which nodes can be pruned?
(Diagram: the same tree with the leaf order reversed: 6, 5, 8, 7 and 2, 1, 4, 3.)

Answer to Second Example (the exact mirror image of the first example): which nodes can be pruned?
Answer: LOTS! The most favorable nodes for both players are explored first (i.e., in the diagram, they are on the left-hand side): once the first MIN node returns 5, the very first leaf of the second MIN node shows that node is worth at most 2 < 5, and the rest of its subtree is cut.
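To see the effect of move ordering concretely, here is a small sketch that counts the leaves alpha-beta actually evaluates under both orderings (the instrumented `visited` list is an addition for illustration, and the nested-list trees encode the two diagrams under the two-MIN-node reading above):

import math

def alphabeta(node, maximizing, alpha, beta, visited):
    """Alpha-beta over nested lists, appending each evaluated leaf to `visited`."""
    if not isinstance(node, list):
        visited.append(node)
        return node
    v = -math.inf if maximizing else math.inf
    for child in node:
        w = alphabeta(child, not maximizing, alpha, beta, visited)
        if maximizing:
            v = max(v, w)
            if v >= beta:
                return v
            alpha = max(alpha, v)
        else:
            v = min(v, w)
            if v <= alpha:
                return v
            beta = min(beta, v)
    return v

for tree in ([[3, 4, 1, 2], [7, 8, 5, 6]],      # first example: worst ordering
             [[6, 5, 8, 7], [2, 1, 4, 3]]):     # mirror image: best ordering
    visited = []
    value = alphabeta(tree, True, -math.inf, math.inf, visited)
    print(value, visited)
# -> 5 [3, 4, 1, 2, 7, 8, 5, 6]   (nothing pruned)
# -> 5 [6, 5, 8, 7, 2]            (three leaves pruned)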

Iterative Deepening
Time limits mean the search is unlikely to reach the goal; we must approximate.

Iterative (Progressive) Deepening
In real games there is usually a time limit T on making a move. How do we take this into account? With alpha-beta we cannot use "partial" results with any confidence unless the full breadth of the tree has been searched at the current depth. We could be conservative and set a depth limit that guarantees finding a move in time < T; the disadvantage is that we may finish early, when we could have searched deeper. In practice, iterative deepening search (IDS) is used: IDS runs depth-first search with an increasing depth limit, and when the clock runs out we use the solution found at the previous depth limit.
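A minimal sketch of that anytime loop (`search_to_depth` is a hypothetical stand-in for a depth-limited alpha-beta search, and the wall-clock budget handling is likewise an illustrative assumption):

import time

def iterative_deepening(state, search_to_depth, time_limit):
    """Run deeper and deeper depth-limited searches until time runs out.

    `search_to_depth(state, depth)` is assumed to return the best move found
    by a depth-`depth` alpha-beta search. We always keep the result of the
    last *completed* depth, as the slide describes.
    """
    deadline = time.monotonic() + time_limit
    best_move, depth = None, 1
    while time.monotonic() < deadline:
        best_move = search_to_depth(state, depth)   # completed result for this depth
        depth += 1
    return best_move

A real engine would also poll the clock inside the search itself, so that one overly deep iteration cannot overshoot the deadline.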

Iterative Deepening

Iterative deepening search
(Diagram slides: the search tree expanded under depth limits l = 0, 1, 2, and 3 in turn.)

Heuristic Continuation: Fighting the Horizon Effect
A fixed depth limit can push an unavoidable bad outcome just over the search horizon, making a doomed position look fine to the static evaluator; heuristic continuation extends the search along unstable lines until the position is quiet.
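The common way to implement this idea is quiescence search: at the depth limit, instead of returning the static value immediately, keep expanding only "noisy" moves (e.g., captures) until the position settles. A sketch under assumed, game-specific helpers (`evaluate`, `noisy_moves`, and `apply_move` are hypothetical names):

def quiescence(state, alpha, beta, evaluate, noisy_moves, apply_move):
    """Extend search past the nominal depth limit until the position is quiet.

    Written in negamax style: `evaluate(state)` scores from the side to move.
    `noisy_moves(state)` yields only unstable moves (e.g., captures), and
    `apply_move(state, move)` returns the successor state. All three are
    assumed, game-specific helpers.
    """
    stand_pat = evaluate(state)        # score if we simply stop searching here
    if stand_pat >= beta:
        return stand_pat               # already enough for a cutoff
    alpha = max(alpha, stand_pat)
    for move in noisy_moves(state):
        score = -quiescence(apply_move(state, move), -beta, -alpha,
                            evaluate, noisy_moves, apply_move)
        if score >= beta:
            return score
        alpha = max(alpha, score)
    return alpha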