Chapter 6: Game Search (Adversarial Search)

Characteristics of Game Search
- Exhaustive search is almost impossible: the branching factor and the depth are both far too large. (Go is played on a 19×19 board; even chess was only mastered with special-purpose hardware such as Deep Blue.)
- Static evaluation score: a number measuring the quality of a board position.
- Maximizing player: tries to maximize the score, hoping to win (me). Minimizing player: tries to minimize my score (the opponent).
- A game tree is a tree whose nodes are board configurations and whose branches are moves: applying a move to the original board state yields a new board state.
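To make "too large" concrete, here is a minimal sketch that estimates game-tree sizes as b^d; the branching factors and game lengths below are commonly quoted rough figures, not exact values.

    # Rough game-tree size: about b**d nodes for branching factor b and depth d.
    # The (b, d) pairs are commonly quoted approximations, not exact values.
    games = {
        "tic-tac-toe": (4, 9),      # a handful of moves, at most 9 plies
        "chess":       (35, 100),   # ~35 legal moves, ~100 plies per game
        "go":          (250, 150),  # ~250 legal moves on the 19x19 board
    }

    for name, (b, d) in games.items():
        digits = len(str(b ** d))   # exact big-integer arithmetic, then count digits
        print(f"{name:12s} ~ {b}^{d}, a number with {digits} digits")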

Minimax Game Search
Idea: take the maximum score at a maximizing level (my turn); take the minimum score at a minimizing level (the opponent's turn).
[Figure: a tree with alternating maximizing and minimizing levels, labelled "me?" and "the opponent?".]
"This move guarantees the best achievable outcome against best play."
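A toy two-ply instance of this idea (the numbers here are made up for illustration, not taken from the slides' figure):

    # Two-ply minimax on a toy tree: MAX moves at the root, MIN at depth 1.
    # Each inner list holds the static evaluation scores of one MIN node's leaves.
    leaf_scores = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

    # MIN picks the worst score for MAX in each subtree ...
    min_values = [min(leaves) for leaves in leaf_scores]   # [3, 2, 2]
    # ... and MAX picks the best of those guaranteed outcomes.
    root_value = max(min_values)                           # 3
    best_move = min_values.index(root_value)               # move 0

    print(f"MIN values: {min_values}, root value: {root_value}, best move: {best_move}")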

Minimax Search Example
- The maximizer A searches to maximize the evaluation function value; looking only one level deep, A would choose c1 (f = 0.8) over c2 (f = 0.3) and c3 (f = -0.2).
- Taking the minimizer B's intentions into account changes the choice: c1 backs up min(0.9, 0.1, -0.6) = -0.6, c2 backs up min(0.1, -0.7) = -0.7, and c3 backs up min(-0.1, -0.3) = -0.3, so A's best move is actually c3.
[Figure: the two trees, before and after adding the minimizer (B) level.]

Minimax Algorithm

function MINIMAX(state) returns an action
    inputs: state, current state in game
    v = MAX-VALUE(state)
    return the action corresponding with value v

function MAX-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = -∞
    for a, s in SUCCESSORS(state) do
        v = MAX(v, MIN-VALUE(s))
    return v

function MIN-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = +∞
    for a, s in SUCCESSORS(state) do
        v = MIN(v, MAX-VALUE(s))
    return v
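A minimal Python rendering of this pseudocode, assuming the caller supplies terminal_test, utility, and successors callbacks for the particular game (the names are ours, not a fixed API):

    import math

    def minimax_decision(state, successors, terminal_test, utility):
        """Return the action leading to the successor with the highest minimax value."""
        return max(successors(state),
                   key=lambda pair: min_value(pair[1], successors, terminal_test, utility))[0]

    def max_value(state, successors, terminal_test, utility):
        if terminal_test(state):
            return utility(state)
        v = -math.inf
        for a, s in successors(state):   # successors yields (action, next_state) pairs
            v = max(v, min_value(s, successors, terminal_test, utility))
        return v

    def min_value(state, successors, terminal_test, utility):
        if terminal_test(state):
            return utility(state)
        v = math.inf
        for a, s in successors(state):
            v = min(v, max_value(s, successors, terminal_test, utility))
        return v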

Minimax Example
[Figure, shown in four steps: a game tree with alternating MAX and MIN levels. Leaf values are backed up one level at a time, each MIN node taking the minimum of its children and each MAX node the maximum, until the root's minimax value is fixed.]

Tic-Tac-Toe
Tic-tac-toe, also called noughts and crosses (in British Commonwealth countries) and X's and O's (in the Republic of Ireland), is a pencil-and-paper game for two players, X and O, who take turns marking the spaces in a 3×3 grid. The X player usually goes first. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.
[Figure: an example game won by the first player, X.]
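For experiments, the board can be encoded as a flat 9-character string; the win test then just checks the eight possible lines. A small sketch under that assumed encoding:

    # Board as a 9-character string, indices 0-8 row by row; ' ' means empty.
    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

    def winner(board):
        """Return 'X' or 'O' if a player has three in a row, else None."""
        for i, j, k in LINES:
            if board[i] != ' ' and board[i] == board[j] == board[k]:
                return board[i]
        return None

    print(winner("XXX"
                 "OO "
                 "   "))   # -> 'X' (top row)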

Save time

Game Tree (2-player)
How do we search this tree to find the optimal move?

Applying Minimax to Tic-Tac-Toe: the static heuristic evaluation function
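One classic static evaluation for tic-tac-toe scores a position as the number of complete lines still open to MAX minus the number still open to MIN. A sketch of that heuristic (the function name and board encoding are ours):

    # Static evaluation: f(board) = (lines still open for X) - (lines open for O).
    # A line is "open" for a player if the opponent has no mark on it.
    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

    def static_eval(board, me='X', opp='O'):
        open_for_me  = sum(all(board[i] != opp for i in line) for line in LINES)
        open_for_opp = sum(all(board[i] != me  for i in line) for line in LINES)
        return open_for_me - open_for_opp

    # With X in the centre of an otherwise empty board, all 8 lines remain
    # open for X but only 4 (those avoiding the centre) remain open for O.
    print(static_eval("    X    "))   # -> 8 - 4 = 4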

 -  Pruning Idea: 탐색 공간을 줄인다 ! (mini-max  지수적으로 폭발 )  -  principle: “if you have an idea that is surely bad, do not take time to see how truly bad it is.” 27 >=2 =2 27 >=2 =2 271 <=1  -cut Max Min

Alpha-Beta Pruning
- A game-search method that maintains the lowest value already guaranteed at a maximizing node (alpha, α) and the highest value still obtainable at a minimizing node (beta, β).
- The search basically proceeds depth-first.
[Figure: root c0 with α = 0.2; child c1 has backed-up value f = 0.2 (from leaves c11 = 0.2 and c12 = 0.7). Once c21's evaluation value -0.1 is backed up to the MIN node c2, the remaining children c22 and c23 need not be searched (α-cut), because the root can already do better with c1.]

Tic-Tac-Toe Example with Alpha-Beta Pruning: Backup Values

 -  Procedure  never decrease (initially - infinite) -∞  never increase (initially infinite) +∞ - Search rule: 1.  -cutoff ==> cut when below any minimizing node that have a  <=  (ancestor). 2,  -cutoff ==> cut when below any maximizing node that have a  >=  (ancestor).

Example
[Figure: a four-level max/min/max/min tree showing one α-cut and one β-cut.]

Alpha-Beta Pruning Algorithm

function ALPHA-BETA(state) returns an action
    inputs: state, current state in game
    v = MAX-VALUE(state, -∞, +∞)
    return the action corresponding with value v

function MAX-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = -∞
    for a, s in SUCCESSORS(state) do
        v = MAX(v, MIN-VALUE(s, α, β))
        if v >= β then return v
        α = MAX(α, v)
    return v

function MIN-VALUE(state, α, β) returns a utility value
    inputs: (as above)
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = +∞
    for a, s in SUCCESSORS(state) do
        v = MIN(v, MAX-VALUE(s, α, β))
        if v <= α then return v
        β = MIN(β, v)
    return v
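The same procedure in Python, mirroring the pseudocode above; as before, terminal_test, utility, and successors are assumed caller-supplied callbacks:

    import math

    def alpha_beta_decision(state, successors, terminal_test, utility):
        """Return the action whose successor has the best value under alpha-beta."""
        best_action, best_value = None, -math.inf
        alpha, beta = -math.inf, math.inf
        for a, s in successors(state):
            v = min_value(s, alpha, beta, successors, terminal_test, utility)
            if v > best_value:
                best_action, best_value = a, v
            alpha = max(alpha, best_value)   # tighten the window between root children
        return best_action

    def max_value(state, alpha, beta, successors, terminal_test, utility):
        if terminal_test(state):
            return utility(state)
        v = -math.inf
        for a, s in successors(state):
            v = max(v, min_value(s, alpha, beta, successors, terminal_test, utility))
            if v >= beta:                    # beta cutoff: MIN above never allows this
                return v
            alpha = max(alpha, v)
        return v

    def min_value(state, alpha, beta, successors, terminal_test, utility):
        if terminal_test(state):
            return utility(state)
        v = math.inf
        for a, s in successors(state):
            v = min(v, max_value(s, alpha, beta, successors, terminal_test, utility))
            if v <= alpha:                   # alpha cutoff: MAX above already has better
                return v
            beta = min(beta, v)
        return v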

Alpha-Beta Pruning Example (step by step)
The root is a MAX node with three MIN children; the leaf values, left to right, are 3, 12, 8 | 2, ?, ? | 14, 5, 2.
1. The root starts with α = -∞, β = +∞ and passes these bounds to its first MIN child.
2. The first MIN node sees leaves 3, 12, and 8; its β drops to 3 and stays there, so it returns 3. The root updates α = 3, making its interval [3, +∞].
3. The bounds α = 3, β = +∞ are passed to the second MIN node. Its first leaf is 2, so its β becomes 2. Now β ≤ α: whatever its remaining children hold, MAX will never choose this branch, so they are pruned. The node returns 2; the root's α is unchanged.
4. The bounds α = 3, β = +∞ are passed to the third MIN node. Its leaves 14, 5, and 2 drive β down to 14, then 5, then 2; it returns 2, and again the root's α is unchanged.
5. The root's minimax value is therefore 3, achieved by moving to the first child, and two leaves were never evaluated.
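The walkthrough can be checked mechanically. The compact, self-contained alpha-beta below works on nested lists; the two pruned leaves are replaced by 99 as arbitrary stand-ins (their real values are unknown), and correct pruning means they are never evaluated:

    import math

    visited = []

    def leaf_value(v):
        """Record that a leaf was actually evaluated, then return its score."""
        visited.append(v)
        return v

    def ab(node, alpha, beta, maximizing):
        if not isinstance(node, list):          # leaf: a static evaluation score
            return leaf_value(node)
        v = -math.inf if maximizing else math.inf
        for child in node:
            cv = ab(child, alpha, beta, not maximizing)
            if maximizing:
                v = max(v, cv)
                alpha = max(alpha, v)
            else:
                v = min(v, cv)
                beta = min(beta, v)
            if alpha >= beta:                   # cutoff: remaining siblings pruned
                break
        return v

    # The example tree: three MIN nodes under a MAX root.  The 99s sit where
    # the walkthrough prunes; if pruning works they are never evaluated.
    tree = [[3, 12, 8], [2, 99, 99], [14, 5, 2]]
    print(ab(tree, -math.inf, math.inf, True))  # -> 3
    print(visited)                              # -> [3, 12, 8, 2, 14, 5, 2] (no 99s)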

Example: which nodes can be pruned?
[Figure: a three-level Max-Min-Max tree.]
Answer: NONE! The most favorable nodes for both players are explored last (in the diagram, they are on the right-hand side), so no useful bounds exist when each subtree is entered.

Second example (the exact mirror image of the first): which nodes can be pruned?
[Figure: the same tree with the children reversed at every level.]
Answer: LOTS! The most favorable nodes for both players are explored first (on the left-hand side), so strong bounds are established early. This is why move ordering matters: with perfect ordering, alpha-beta examines only about O(b^(d/2)) nodes instead of O(b^d), roughly doubling the reachable search depth.
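The ordering effect can be measured by counting evaluated leaves on a toy tree and on its mirror image (the tree here is ours, not the one in the slides' figures):

    import math

    def ab_count(node, alpha, beta, maximizing, counter):
        """Alpha-beta over a nested-list tree that counts evaluated leaves."""
        if not isinstance(node, list):
            counter[0] += 1
            return node
        v = -math.inf if maximizing else math.inf
        for child in node:
            cv = ab_count(child, alpha, beta, not maximizing, counter)
            v = max(v, cv) if maximizing else min(v, cv)
            if maximizing:
                alpha = max(alpha, v)
            else:
                beta = min(beta, v)
            if alpha >= beta:
                break
        return v

    def mirror(node):
        """Reverse child order at every level, as in the mirrored example."""
        return [mirror(c) for c in reversed(node)] if isinstance(node, list) else node

    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
    for label, t in [("original", tree), ("mirrored", mirror(tree))]:
        n = [0]
        v = ab_count(t, -math.inf, math.inf, True, n)
        print(f"{label}: value={v}, leaves evaluated={n[0]}")
    # Both orders give value 3, but the original order evaluates 7 leaves
    # while the mirrored order evaluates all 9.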

Iterative Deepening
Time limits mean the search is unlikely to reach true goal (terminal) states, so we must approximate.

Iterative (Progressive) Deepening
In real games there is usually a time limit T on making a move. How do we take this into account?
- With alpha-beta we cannot use partial results with any confidence unless the full breadth of the tree has been searched at the current depth.
- We could therefore set a conservative depth limit that guarantees a move is found in time < T; the disadvantage is that we may finish early and waste time we could have spent searching deeper.
- In practice, iterative deepening search (IDS) is used:
  - IDS runs depth-first search with an increasing depth limit;
  - when the clock runs out, we use the move found at the last completed depth limit.
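A sketch of this time-limited scheme, assuming a depth_limited_search(state, depth, seconds) callback that raises TimeoutError when it cannot finish in the time given; the contract and names are illustrative, not a fixed API:

    import time

    def ids_decision(state, depth_limited_search, time_limit):
        """Iterative deepening: keep the move from the deepest *completed* search."""
        deadline = time.monotonic() + time_limit
        best_move = None
        depth = 1
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                # Assumed contract: the callback raises TimeoutError if it
                # cannot finish searching to `depth` within `remaining` seconds.
                move = depth_limited_search(state, depth, remaining)
            except TimeoutError:
                break                 # discard the partial result
            best_move = move          # only fully completed depths are trusted
            depth += 1
        return best_move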

Iterative Deepening
[Figure, four steps: depth-limited search of the same tree with limits l = 0, 1, 2, 3; each iteration restarts at the root and explores one level deeper than the previous one.]

Heuristic Continuation: fighting the horizon effect
A fixed depth limit can push an unavoidable bad event just beyond the search horizon, so a doomed position looks fine at the cutoff. The remedy is to continue the search past the limit at unstable positions until they become quiet.
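A common concrete form of this idea is quiescence search: at the depth limit, keep expanding only forcing moves (e.g., captures) until the position is quiet, and only then apply the static evaluation. A hedged sketch in negamax style, with evaluate, is_quiet, and forcing_successors as assumed game-specific callbacks:

    def quiescence(state, alpha, beta, evaluate, is_quiet, forcing_successors):
        """Search only forcing moves past the depth limit; evaluate quiet positions.

        Negamax convention: evaluate(state) scores from the side to move, and
        the sign flips at each recursion.  All three callbacks are assumed,
        game-specific helpers.
        """
        stand_pat = evaluate(state)            # score of doing nothing ("stand pat")
        if is_quiet(state):
            return stand_pat
        if stand_pat >= beta:                  # standing pat already refutes this line
            return stand_pat
        alpha = max(alpha, stand_pat)
        for _action, s in forcing_successors(state):   # e.g., captures and checks only
            v = -quiescence(s, -beta, -alpha, evaluate, is_quiet, forcing_successors)
            if v >= beta:                      # beta cutoff, as in alpha-beta proper
                return v
            alpha = max(alpha, v)
        return alpha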