CSP and Games (159.302). Introduction: Constraint Satisfaction Problems. Source of contents: MIT OpenCourseWare.


2 CSP: General class of problems: binary CSP
Application areas of CSPs: scheduling tasks, robot planning tasks, puzzles, molecular structures, sensor interpretation tasks, etc.
[Figure: a constraint graph. Each node is a variable $V_i$ with values in domain $D_i$; arcs are unary or binary constraints.]
Unary constraints just cut down domains.

3 CSP: General class of problems: binary CSP
Basic problem: find a value $d_i \in D_i$ for each variable $V_i$ such that all constraints are satisfied (i.e. find a consistent labeling for the variables).
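The basic problem can be made concrete with a small sketch. The variable names, domains and "different value" constraints below are illustrative assumptions, not part of the slides; the code simply enumerates every labeling and keeps the consistent ones.

```python
from itertools import product

# Hypothetical toy CSP: three variables with "different value" binary constraints.
domains = {"V1": {"R", "G", "B"}, "V2": {"R", "G"}, "V3": {"G"}}
# Each binary constraint is a predicate on the values of a pair of variables.
constraints = {
    ("V1", "V2"): lambda a, b: a != b,
    ("V1", "V3"): lambda a, b: a != b,
    ("V2", "V3"): lambda a, b: a != b,
}

def is_consistent(labeling):
    """True if every binary constraint holds for a full labeling."""
    return all(ok(labeling[x], labeling[y]) for (x, y), ok in constraints.items())

def all_solutions():
    """Try every choice of d_i in D_i and keep the consistent labelings."""
    names = sorted(domains)
    found = []
    for values in product(*(sorted(domains[n]) for n in names)):
        labeling = dict(zip(names, values))
        if is_consistent(labeling):
            found.append(labeling)
    return found

print(all_solutions())  # [{'V1': 'B', 'V2': 'R', 'V3': 'G'}]
```

Exhaustive enumeration is only workable for tiny problems; it is shown here to pin down what "consistent labeling" means.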

4 CSP: N-Queens as CSP
Classic "benchmark" problem: place N queens on an N × N chessboard so that none can attack another.
[Figure: 4 × 4 chessboard with four queens placed.]
Variables: board positions on the N × N chessboard.
Domains: Queen or blank.
Constraints: two positions on a line (vertical, horizontal, diagonal) cannot both be Queen.
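A minimal sketch of the N-Queens constraint, assuming squares are written as 0-based (row, column) pairs (a representation chosen here for illustration). Only the squares labeled Queen need checking, since blank squares never violate the constraint.

```python
from itertools import combinations

def on_same_line(p, q):
    """True if squares p and q share a row, a column, or a diagonal."""
    (r1, c1), (r2, c2) = p, q
    return r1 == r2 or c1 == c2 or abs(r1 - r2) == abs(c1 - c2)

def is_legal(queens):
    """Check the binary constraint for every pair of squares labeled Queen."""
    return not any(on_same_line(p, q) for p, q in combinations(queens, 2))

# A known 4-queens solution and an obviously bad placement on the main diagonal.
print(is_legal([(0, 1), (1, 3), (2, 0), (3, 2)]))  # True
print(is_legal([(0, 0), (1, 1), (2, 2), (3, 3)]))  # False
```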

5 CSP: Line labelings as CSP
Labeling lines in a drawing as convex (+), concave (-), or boundary (>).
Variables: line junctions.
Domains: the set of legal labels for that junction type.
Constraints: shared lines between adjacent junctions must have the same label.
[Figure: all legal junction labels for four junction types.]

6 CSP: Scheduling as CSP
Choose times for activities (e.g. observations on the Hubble telescope, or terms in which to take required classes).
Variables: activities.
Domains: sets of start times (or "chunks" of time).
Constraints: 1. Activities that use the same resource cannot overlap in time. 2. Preconditions are satisfied.
[Figure: activities plotted against time.]

7 CSP: Graph Colouring as CSP
Pick colours for map regions, avoiding colouring adjacent regions with the same colour.
Variables: regions.
Domains: colours allowed.
Constraints: adjacent regions must have different colours.

8 CSP: 3-SAT as CSP
Boolean satisfiability problems: the original NP-complete problem.
Find values for boolean variables A, B, C, … that satisfy the formula, e.g. (A or B or !C) and (!A or C or B).
Variables: clauses.
Domains: boolean variable assignments that make the clause true.
Constraints: clauses with shared boolean variables must agree on the value of the variable.
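The clause-as-variable formulation can be sketched for the slide's formula. The literal encoding (variable, wanted value) and the function names are assumptions made for this example.

```python
from itertools import product

# Each clause is a list of (variable, wanted_value) literals; "!C" is ("C", False).
clauses = [
    [("A", True), ("B", True), ("C", False)],   # (A or B or !C)
    [("A", False), ("C", True), ("B", True)],   # (!A or C or B)
]

def clause_domain(clause):
    """All assignments to the clause's own variables that make the clause true."""
    names = [v for v, _ in clause]
    domain = []
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if any(assignment[v] == wanted for v, wanted in clause):
            domain.append(assignment)
    return domain

def agree(a, b):
    """Binary constraint: two clause values agree on every shared boolean variable."""
    return all(a[v] == b[v] for v in a.keys() & b.keys())

d1, d2 = clause_domain(clauses[0]), clause_domain(clauses[1])
solutions = [(a, b) for a in d1 for b in d2 if agree(a, b)]
print(len(d1), len(d2), len(solutions))  # 7 7 6
```

Each three-literal clause rules out exactly one of the eight assignments to its variables, so each domain has seven values; six joint values survive the agreement constraint.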

9 CSP: Model-based recognition as CSP
Find a given model in an edge image, with rotation and translation allowed.
Variables: edges in the model.
Domains: the set of edges in the image.
Constraints: the angle between model and image edges must match.

10 CSP: Good News / Bad News
Good news: CSPs are a very general and interesting class of problems.
Bad news: the class includes NP-hard (intractable) problems.
So, good behaviour is a function of the domain and not of the formulation as a CSP.

11-13 CSP: Example
Given 40 courses (8.01, 8.2, …, 6.840) and 10 terms (Fall 1, Spring 1, …, Spring 5), find a legal schedule.
Constraints:
Prerequisites
Courses offered on limited terms
Limited number of courses per term
Avoid time conflicts
Note: CSPs are not for expressing (soft) preferences (e.g. minimise difficulty, balance subject areas, etc.).

14-16 CSP: Example. Choice of variables & values
A. Variables: terms? Domains: legal combinations of, for example, 4 courses (but this is a huge set of values).
B. Variables: term slots? Subdivide terms into slots, e.g. 4 of them: (Fall 1, 1), (Fall 1, 2), (Fall 1, 3), (Fall 1, 4). Domains: courses offered during that term.
C. Variables: courses? Domains: terms or term slots (term slots allow expressing the constraint on the limited number of courses per term).

17-20 CSP: Example. Constraints
Use courses as variables and term slots as values.
Prerequisite: for pairs of courses that must be ordered, e.g. 6.001 → 6.034, the term of the first course must be before the term of the second.
Courses offered only in some terms: filter the domain.
Limit # courses: use each term slot only once, i.e. slot not equal for all pairs of variables.
Avoid time conflicts: term not equal, for pairs of courses offered at the same or overlapping times.
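The four kinds of constraint can be sketched as small predicates over term slots. The slot encoding (term index, slot index), the count of 4 slots per term, and the offering pattern for 6.034 are assumptions made for this illustration only.

```python
# Hypothetical term slots: (term_index, slot_index), 10 terms x 4 slots.
SLOTS = [(term, slot) for term in range(10) for slot in range(4)]

def prerequisite(before_slot, after_slot):
    """Prerequisite: the term of the first course is strictly earlier."""
    return before_slot[0] < after_slot[0]

def slot_not_equal(a, b):
    """Limit on courses per term: no two courses share a term slot."""
    return a != b

def term_not_equal(a, b):
    """Time conflict: two clashing courses must be taken in different terms."""
    return a[0] != b[0]

def filter_domain(slots, offered_terms):
    """Courses offered only in some terms: a unary constraint that cuts the domain."""
    return [s for s in slots if s[0] in offered_terms]

# Assumed example: 6.034 is offered only in even-numbered terms.
domain_6034 = filter_domain(SLOTS, offered_terms={0, 2, 4, 6, 8})
assignment = {"6.001": (0, 1), "6.034": (2, 0)}
print(prerequisite(assignment["6.001"], assignment["6.034"]))  # True
print(assignment["6.034"] in domain_6034)                      # True
```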

21 CSP (159.302): Solving CSPs

22 Solving CSPs
Approaches to solving CSPs are some combination of constraint propagation and search.
1. Constraint propagation: to eliminate values that could not be part of any solution.
2. Search: to explore valid assignments.

23-26 Solving CSPs: Constraint Propagation (aka Arc Consistency)
Arc consistency eliminates values from the domain of a variable that can never be part of a consistent solution.
A directed arc $(V_i, V_j)$, written $V_i \to V_j$, is arc consistent if for every $x \in D_i$ there exists some $y \in D_j$ such that the pair $(x, y)$ is allowed by the constraint on the arc.
We can achieve consistency on the arc by deleting values from $D_i$ (the domain of the variable at the tail of the constraint arc) that fail this condition.
Assume domains are of size at most $d$ and there are $e$ binary constraints. A simple algorithm for arc consistency is $O(ed^3)$; note that just verifying arc consistency takes $O(d^2)$ for each arc.
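The deletion rule can be sketched as follows. This is a plain repeat-until-stable loop written for clarity, not a tuned algorithm, and the function names and toy graph-colouring problem are assumptions for illustration.

```python
def revise(domains, allowed, vi, vj):
    """Make arc vi -> vj consistent; return True if the domain of vi lost a value."""
    keep = {x for x in domains[vi] if any(allowed(x, y) for y in domains[vj])}
    changed = keep != domains[vi]
    domains[vi] = keep
    return changed

def arc_consistency(domains, arcs, allowed):
    """Repeat passes over all directed arcs until a full pass deletes nothing."""
    changed = True
    while changed:
        changed = False
        for vi, vj in arcs:
            if revise(domains, allowed, vi, vj):
                changed = True
    return domains

# Toy graph-colouring problem; every edge carries a "different colour" constraint.
domains = {"V1": {"R", "G", "B"}, "V2": {"R", "G"}, "V3": {"G"}}
edges = [("V1", "V2"), ("V1", "V3"), ("V2", "V3")]
arcs = edges + [(b, a) for a, b in edges]   # two directed arcs per undirected edge
result = arc_consistency(domains, arcs, lambda x, y: x != y)
print(result)
```

The inner test in `revise` compares each value of one domain against each value of the other, which is where the $O(d^2)$ cost per arc comes from.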

27-32 CSP: Constraint Propagation Example (Graph Colouring)
Each variable is constrained to have values different from its neighbours (different-colour constraint).
Initial domains: V1 = {R, G, B}, V2 = {R, G}, V3 = {G}.
Each undirected constraint arc is really two directed constraint arcs; the effects shown below are from examining both arcs.

Arc examined    Value deleted    Domains afterwards (V1; V2; V3)
V1 - V2         none             R, G, B; R, G; G
V1 - V3         V1 (G)           R, B; R, G; G
V2 - V3         V2 (G)           R, B; R; G
V1 - V2         V1 (R)           B; R; G
V1 - V3         none             B; R; G
V2 - V3         none             B; R; G

In general, we need to make another pass through any arc whose head variable has changed, and we can stop only when no further changes are observed.

33-36 CSP: But arc consistency is not enough in general! (Graph Colouring)
Three variables V1, V2, V3 with a different-colour constraint between every pair:
Domains {R, G}, {R, G}, {R, G}: arc consistent but NO SOLUTIONS. We need one colour for each variable!
Domains {B, G}, {R, G}, {R, G}: arc consistent but 2 SOLUTIONS: (B, R, G) and (B, G, R).
Same domains, but assume B, R is not allowed: arc consistent but 1 SOLUTION.
Search: we need to apply search algorithms to find solutions (if there are any).
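The three cases can be checked by brute force. The sketch assumes the variables form a triangle indexed 0, 1, 2 and that the forbidden pair "B, R" sits on the arc between the first two variables; the slide does not say which arc it belongs to.

```python
from itertools import product

EDGES = [(0, 1), (0, 2), (1, 2)]   # triangle: every pair must differ

def is_arc_consistent(domains, forbidden=frozenset()):
    """Every value needs a supporting value across each directed arc."""
    def ok(i, x, j, y):
        return (x != y and (i, x, j, y) not in forbidden
                and (j, y, i, x) not in forbidden)
    arcs = EDGES + [(j, i) for i, j in EDGES]
    return all(any(ok(i, x, j, y) for y in domains[j])
               for i, j in arcs for x in domains[i])

def count_solutions(domains, forbidden=frozenset()):
    """Brute-force count of full labelings that satisfy all constraints."""
    total = 0
    for values in product(*domains):
        if all(values[i] != values[j]
               and (i, values[i], j, values[j]) not in forbidden
               for i, j in EDGES):
            total += 1
    return total

none = [{"R", "G"}, {"R", "G"}, {"R", "G"}]
two = [{"B", "G"}, {"R", "G"}, {"R", "G"}]
ban = frozenset({(0, "B", 1, "R")})   # assumed placement of the forbidden pair

print(is_arc_consistent(none), count_solutions(none))          # True 0
print(is_arc_consistent(two), count_solutions(two))            # True 2
print(is_arc_consistent(two, ban), count_solutions(two, ban))  # True 1
```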

37-42 CSP: Backtracking search
When we have too many values in the domains (and/or constraints are weak), arc consistency doesn't do much, so we need to search. The simplest approach is pure backtracking (depth-first search).
Example domains: V1 = {R, G, B}, V2 = {R, G}, V3 = {R, G}.
[Figure: search tree with one level each for V1, V2 and V3 assignments.]
Back up at an inconsistent assignment, e.g. a value for V2 that is inconsistent with V1 = R, or a value for V3 that is inconsistent with V2 = G.
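A minimal sketch of pure backtracking for "different value" constraints, assuming a fixed variable and value order. The data layout (lists for domains, a neighbour map) is an assumption for this example.

```python
def backtrack(order, domains, neighbours, assignment=None):
    """Depth-first search; back up as soon as an assignment is inconsistent."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(order):
        return dict(assignment)
    var = order[len(assignment)]            # fixed variable ordering
    for value in domains[var]:              # fixed value ordering
        # Check the new value against previously assigned neighbours only.
        if all(assignment.get(n) != value for n in neighbours[var]):
            assignment[var] = value
            found = backtrack(order, domains, neighbours, assignment)
            if found is not None:
                return found
            del assignment[var]             # undo and try the next value
    return None

domains = {"V1": ["R", "G", "B"], "V2": ["R", "G"], "V3": ["R", "G"]}
neighbours = {"V1": ["V2", "V3"], "V2": ["V1", "V3"], "V3": ["V1", "V2"]}
print(backtrack(["V1", "V2", "V3"], domains, neighbours))
# {'V1': 'B', 'V2': 'R', 'V3': 'G'}
```

With this ordering the search tries V1 = R and V1 = G in full before reaching V1 = B, which is the wasted work that forward checking reduces.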

43-45 Solving CSPs: Combine Backtracking & Constraint Propagation
A node in the BT tree is a partial assignment in which the domain of each assigned variable has been set (tentatively) to a singleton set.
Use constraint propagation (arc consistency) to propagate the effect of the tentative assignment, i.e. eliminate values inconsistent with current values.
How much propagation to do?
Answer: not much, just local propagation from domains with unique assignments, which is called forward checking (FC). This conclusion is not necessarily obvious, but generally holds in practice.

46-58 CSP: Backtracking with Forward Checking (BT-FC)
When examining an assignment $V_i = d_k$, remove any values inconsistent with that assignment from neighbouring domains in the constraint graph.
Initial domains: V1 = {R, G, B}, V2 = {R, G}, V3 = {R, G}.
1. V1 = R: we eliminate any values that are inconsistent with the assignment, leaving V2 = {G} and V3 = {G}.
2. V2 = G: the domain of V3 becomes empty. We have a conflict whenever a domain becomes empty.
3. When backing up, we need to restore domain values, since deletions were done to reach consistency with tentative assignments considered during search.
4. V1 = G: we eliminate G from V2 and V3, leaving V2 = {R} and V3 = {R}.
5. We now consider V2 = R and propagate. The domain of V3 is now empty, so we fail and back up.
6. So, we move to consider V1 = B and propagate. The propagation does not delete any values.
7. We pick V2 = R and propagate. This removes R from the domain of V3, leaving V3 = {G}.
8. We pick V3 = G and have a consistent assignment.
9. We can continue the process to find the other consistent solution.
With forward checking there is no need to check a new value against previous assignments. BT-FC is generally preferable to pure BT.
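The walkthrough can be sketched in code, again assuming "different value" constraints and a fixed variable order; the function and variable names are illustrative.

```python
def bt_fc(order, domains, neighbours, assignment=None):
    """Backtracking with forward checking for 'different value' constraints."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(order):
        return dict(assignment)
    var = order[len(assignment)]
    for value in domains[var]:
        # Forward check: prune this value from unassigned neighbouring domains.
        pruned = {n: [v for v in domains[n] if v != value]
                  for n in neighbours[var] if n not in assignment}
        if all(pruned.values()):            # no neighbouring domain became empty
            saved = {n: domains[n] for n in pruned}
            domains.update(pruned)
            assignment[var] = value
            found = bt_fc(order, domains, neighbours, assignment)
            if found is not None:
                return found
            del assignment[var]
            domains.update(saved)           # restore domains when backing up
    return None

domains = {"V1": ["R", "G", "B"], "V2": ["R", "G"], "V3": ["R", "G"]}
neighbours = {"V1": ["V2", "V3"], "V2": ["V1", "V3"], "V3": ["V1", "V2"]}
solution = bt_fc(["V1", "V2", "V3"], domains, neighbours)
print(solution)  # {'V1': 'B', 'V2': 'R', 'V3': 'G'}
```

Values remaining in a domain have already survived every earlier forward check, which is why the loop never compares a candidate value with previous assignments.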

59 CSP and Games (159.302): Solving CSPs: Other Strategies

60-63 Solving CSPs: BT-FC with Dynamic Ordering
Traditional backtracking uses a fixed ordering of variables and values, e.g. random order or placing variables with constraints first. You can usually do better by choosing an order dynamically as the search proceeds.
The ordering of variables can have a substantial effect on the cost of finding the answer. We can re-order variables based on information available during the search.
Most constrained variable: when doing forward checking, pick the variable with the fewest legal values to assign next (minimises the branching factor).
Least constraining value: choose the value that rules out the fewest values from neighbouring domains.
E.g. this combination improves feasible N-Queens performance from about n = 30 with just FC to about n = 1000 with FC and ordering.

64-66 Solving CSPs: BT-FC with Dynamic Ordering
The 4-Colour Map-Colouring Problem illustrates a simple situation for variable and value ordering. Colours: R, G, B, Y.
[Figure: partially coloured map.]
Which country should we colour next? E is the most constrained variable (smallest domain).
Which colour should we pick for it? Red, the least constraining value (eliminates the fewest values from neighbouring domains).
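Both ordering heuristics are short to state in code. The map state below is a made-up mid-search situation (it is not the map from the slide), chosen so that E has the smallest domain and R rules out the fewest neighbouring values.

```python
def most_constrained_variable(domains, assignment):
    """Pick the unassigned variable with the fewest remaining legal values."""
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=lambda v: len(domains[v]))

def least_constraining_values(var, domains, neighbours, assignment):
    """Order var's values by how few neighbouring values each would rule out."""
    def ruled_out(value):
        return sum(value in domains[n]
                   for n in neighbours[var] if n not in assignment)
    return sorted(domains[var], key=ruled_out)

# Hypothetical state after forward checking: A and B are already coloured.
assignment = {"A": "G", "B": "B"}
domains = {"A": ["G"], "B": ["B"], "C": ["R", "Y", "G"],
           "D": ["Y", "G", "B"], "E": ["R", "Y"]}
neighbours = {"E": ["A", "B", "C", "D"]}

print(most_constrained_variable(domains, assignment))                   # E
print(least_constraining_values("E", domains, neighbours, assignment))  # ['R', 'Y']
```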

67 Solving CSPs: Incremental Repair (Min-Conflict Heuristic)
1. Initialise a candidate solution using a "greedy" heuristic, to get a solution "near" a correct one.
2. Select a variable in conflict and assign it a value that minimises the number of conflicts (break ties randomly).
This heuristic can be used as part of a systematic backtracker that uses heuristics to do value ordering, or in a local hill-climber (without backup).
[Table: performance on N-Queens with a good initial guess; columns: Size (n), Sec. (Sparc 1).]
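A sketch of the hill-climbing variant for N-Queens, assuming one queen per row so that a state is a list of column indices. The step limit, seed and greedy initialisation are illustrative choices, and the search may return None if the limit is reached.

```python
import random

def conflicts(cols, row, col):
    """Queens attacking square (row, col); cols[r] is the queen's column in row r."""
    return sum(1 for r, c in enumerate(cols)
               if r != row and (c == col or abs(c - col) == abs(r - row)))

def min_conflicts(n, max_steps=10000, seed=0):
    rng = random.Random(seed)
    # Greedy initialisation: put each new queen on a least-attacked column.
    cols = []
    for row in range(n):
        cols.append(min(range(n), key=lambda c: conflicts(cols, row, c)))
    for _ in range(max_steps):
        bad = [r for r in range(n) if conflicts(cols, r, cols[r]) > 0]
        if not bad:
            return cols                      # no variable is in conflict
        row = rng.choice(bad)                # select a variable in conflict
        scores = [conflicts(cols, row, c) for c in range(n)]
        best = min(scores)
        # Break ties randomly among the minimum-conflict columns.
        cols[row] = rng.choice([c for c in range(n) if scores[c] == best])
    return None

print(min_conflicts(8))
```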

68 Solving CSPs: Min-Conflict Heuristic
The pure hill-climber (without backtracking) can get stuck in local minima. Remedies:
Add random moves to attempt to get out of minima; this is generally quite effective.
Use weights on violated constraints and increase a constraint's weight every cycle in which it remains violated.
Restart the search with a new random initial state.
GSAT is a randomised hill-climber used to solve SAT problems, and one of the most effective methods ever found for this problem. GSAT can solve SAT problems of mind-boggling complexity. It has set a new standard for classifying SAT problems as "hard", because almost any random problem is "easy" for GSAT.

69 Solving CSPs: GSAT as Heuristic Search
State space: the space of all full assignments to variables.
Initial state: a random full assignment.
Goal state: a satisfying assignment.
Actions: flip the value of one variable in the current assignment.
Heuristic: the number of satisfied clauses (constraints); we want to maximise this score. Alternatively, minimise the number of unsatisfied clauses (constraints).

70 Solving CSPs: Algorithm GSAT(F)
For i = 1 to MaxTries
    Select a complete random assignment A
    Score = number of satisfied clauses
    For j = 1 to MaxFlips
        If (A satisfies all clauses in F) { return A }
        Else {
            Flip a variable that maximises the Score
            Flip a randomly chosen variable if no variable flip increases the Score
        }
MaxTries and MaxFlips are user-defined. These guard against local minima in the search.
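A compact Python rendering of the pseudocode above, offered as a sketch. The clause encoding as (variable, wanted value) literals, the default limits and the seed are assumptions made for this example.

```python
import random

def num_satisfied(clauses, a):
    """Clauses are lists of (variable, wanted_value) literals."""
    return sum(any(a[v] == want for v, want in clause) for clause in clauses)

def score_after_flip(clauses, a, v):
    """Score that would result from flipping variable v (a is left unchanged)."""
    a[v] = not a[v]
    s = num_satisfied(clauses, a)
    a[v] = not a[v]
    return s

def gsat(clauses, variables, max_tries=10, max_flips=100, seed=0):
    rng = random.Random(seed)
    for _ in range(max_tries):
        a = {v: rng.choice([False, True]) for v in variables}
        for _ in range(max_flips):
            score = num_satisfied(clauses, a)
            if score == len(clauses):
                return a
            best = max(variables, key=lambda v: score_after_flip(clauses, a, v))
            if score_after_flip(clauses, a, best) <= score:
                best = rng.choice(variables)   # no flip improves: random move
            a[best] = not a[best]
    return None

# The formula from the 3-SAT slide: (A or B or !C) and (!A or C or B).
F = [[("A", True), ("B", True), ("C", False)],
     [("A", False), ("C", True), ("B", True)]]
model = gsat(F, ["A", "B", "C"])
print(model, num_satisfied(F, model))
```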

71 Solving CSPs: Algorithm WALKSAT(F)
For i = 1 to MaxTries
    Select a complete random assignment A
    Score = number of satisfied clauses
    For j = 1 to MaxFlips
        If (A satisfies all clauses in F) { return A }
        Else {
            With probability p    // GSAT
                Flip a variable that maximises the Score
                Flip a randomly chosen variable if no variable flip increases the Score
            With probability (1-p)    // Random Walk
                Pick a random unsatisfied clause C
                Flip a randomly chosen variable in C
        }
It turns out that adding more randomness is a more effective strategy!
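The same style of sketch for WALKSAT, mixing the greedy GSAT move with a random-walk move. The clause encoding, p = 0.5, the limits and the seed are all assumptions chosen for the example.

```python
import random

def satisfied(clause, a):
    return any(a[v] == want for v, want in clause)

def score_after_flip(clauses, a, v):
    """Number of satisfied clauses if variable v were flipped."""
    a[v] = not a[v]
    s = sum(satisfied(c, a) for c in clauses)
    a[v] = not a[v]
    return s

def walksat(clauses, variables, p=0.5, max_tries=10, max_flips=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(max_tries):
        a = {v: rng.choice([False, True]) for v in variables}
        for _ in range(max_flips):
            unsat = [c for c in clauses if not satisfied(c, a)]
            if not unsat:
                return a
            if rng.random() < p:
                # GSAT move: best flip, or a random variable if nothing improves.
                current = len(clauses) - len(unsat)
                var = max(variables, key=lambda v: score_after_flip(clauses, a, v))
                if score_after_flip(clauses, a, var) <= current:
                    var = rng.choice(variables)
            else:
                # Random walk: flip a random variable of a random unsatisfied clause.
                var, _want = rng.choice(rng.choice(unsat))
            a[var] = not a[var]
    return None

F = [[("A", True), ("B", True), ("C", False)],
     [("A", False), ("C", True), ("B", True)]]
model = walksat(F, ["A", "B", "C"])
print(model)
```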

72 CSP and Games (159.302): Introduction to Games
Approaches to building two-player games.

73 Games: Board Games & Search
1949: Shannon paper
1951: Turing paper
1958: Bernstein paper
55-60: Simon-Newell program (α-β McCarthy?)
66-67: MacHack 6 (MIT AI)
70's: NW Chess 4.5
80's: Cray Blitz
90's: Belle, Hitech, Deep Thought, Deep Blue
Topics: move generation, static evaluation, Min-Max, Alpha-Beta, practical matters.
[Image: Claude Shannon and his electromechanical mouse Theseus, one of the earliest experiments in artificial intelligence. Image Copyright 2001 Lucent Technologies, Inc. All rights reserved.]

74 Games: Game Tree Search
Initial state: initial board position and player.
Operators: one for each legal move.
Goal states: winning board positions.
Scoring function: assigns a numeric value to states.
Game tree: encodes all possible games.
We are not looking for a path, only the next move to make (one that hopefully leads to a winning position). Our best move depends on what the other player does.

75 Games: Move Generation
Chess: branching factor b = 36, depth d > 40. $36^{40}$ is big!
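To get a feel for the size of this number, a two-line check with Python's arbitrary-precision integers is enough. It uses d = 40, the lower bound given above.

```python
b, d = 36, 40
nodes = b ** d                 # exact integer, no overflow in Python
print(len(str(nodes)))         # number of decimal digits: 63
```

So a full game tree of that shape has on the order of $10^{62}$ leaves, far beyond exhaustive search.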

76 Games: Partial Game Tree for Tic-Tac-Toe
Even for this trivial game, the search tree is quite big.

77 Games: Scoring Function
Assigns a numerical value to a board position.

78 Games: Scoring Function: Static Evaluation
A linear function in which some set of coefficients is used to weight a number of "features" of the board position. On its own it is too weak to predict ultimate success.
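A linear static evaluation is just a weighted sum. In the sketch below the features are material differences and the weights are the conventional chess piece values; both are assumptions for illustration, since the slides do not give a feature set.

```python
# Assumed weights: conventional material values per piece type.
WEIGHTS = {"pawn": 1.0, "knight": 3.0, "bishop": 3.0, "rook": 5.0, "queen": 9.0}

def static_eval(features):
    """Linear evaluation: sum of weight * feature.
    Each feature is (own count - opponent count) for one piece type."""
    return sum(WEIGHTS[name] * value for name, value in features.items())

# One pawn up but a knight down.
print(static_eval({"pawn": 1, "knight": -1, "rook": 0}))  # -2.0
```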

79 Games: Limited look-ahead + Scoring
The Min-Max Algorithm

80 Games: Min-Max Algorithm
function MAX·VALUE(state, depth)
    if (depth == 0) then return EVAL(state)
    v = -∞
    for each s in SUCCESSORS(state) do
        v = MAX(v, MIN·VALUE(s, depth - 1))
    end
    return v

function MIN·VALUE(state, depth)
    if (depth == 0) then return EVAL(state)
    v = ∞
    for each s in SUCCESSORS(state) do
        v = MIN(v, MAX·VALUE(s, depth - 1))
    end
    return v
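The pseudocode translates almost line for line into Python. To keep the sketch self-contained, a game state is assumed to be a nested list whose leaves are numbers, so SUCCESSORS and EVAL become trivial; a real game would supply its own versions.

```python
import math

def max_value(state, depth):
    if depth == 0:
        return state                      # EVAL: a leaf is its own static value
    v = -math.inf
    for s in state:                       # SUCCESSORS: the child sub-lists
        v = max(v, min_value(s, depth - 1))
    return v

def min_value(state, depth):
    if depth == 0:
        return state
    v = math.inf
    for s in state:
        v = min(v, max_value(s, depth - 1))
    return v

# Two-ply toy tree: MAX chooses between min(2, 7) and min(1, 8).
tree = [[2, 7], [1, 8]]
print(max_value(tree, 2))  # 2
```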

81 Games: USCF Rating
Somehow, it seems as if brute-force search is all that matters.

82 Games: Deep Blue
32 SP2 processors, each with 8 dedicated chess processors = 256 CP.
50-100 billion moves in 3 min.
13-30 ply search.

83 Games: Alpha-Beta Pruning
α is the lower bound on the score; β is the upper bound on the score.
[Figure: two-level game tree with a max node over two min nodes; leaves 2, 7, 1 and one leaf marked "anything".]

84 Games: Alpha-Beta Pruning
function MAX·VALUE(state, α, β, depth)
    if (depth == 0) then return EVAL(state)
    for each s in SUCCESSORS(state) do
        α = MAX(α, MIN·VALUE(s, α, β, depth - 1))
        if (α ≥ β) then return α    // cut-off
    end
    return α

function MIN·VALUE(state, α, β, depth)
    if (depth == 0) then return EVAL(state)
    for each s in SUCCESSORS(state) do
        β = MIN(β, MAX·VALUE(s, α, β, depth - 1))
        if (β ≤ α) then return β    // cut-off
    end
    return β

α is the best score for MAX; β is the best score for MIN.
Initial call is MAX·VALUE(state, -∞, ∞, MAX·DEPTH).

85-98 Games: Alpha-Beta Pruning in action
[Figure: max root over two min nodes; leaves 2 and 7 under the left min node, 1 and "anything" under the right one. Each node is annotated with its current (α, β).]
1. We start with an initial call MAX·VALUE(state, -∞, ∞, MAX·DEPTH), so the root has (α, β) = (-∞, ∞).
2. MAX·VALUE now calls MIN·VALUE on the left successor with the same values of alpha and beta. MIN·VALUE now calls MAX·VALUE on its leftmost successor.
3. MAX·VALUE is at the leftmost leaf, whose value is 2, and so it returns that.
4. This first value, since it is less than ∞, becomes the new value of β in MIN·VALUE: (-∞, 2).
5. So now we call MAX·VALUE with the next successor, which is also a leaf, whose value is 7.
6. 7 is not less than 2 and so the final value of β is 2 for this node.
7. MIN·VALUE now returns 2 to its caller.
8. The calling MAX·VALUE now sets α to 2, since it is bigger than -∞. The range [α, β] = [2, ∞] says that the score will be greater than or equal to 2 (and less than ∞).
9. MAX·VALUE now calls MIN·VALUE on the right successor with the updated range (2, ∞).
10. MIN·VALUE calls MAX·VALUE on the left leaf and it returns a value of 1.
11. This is used to update β in MIN·VALUE, since it is less than ∞. At this point we have a range (2, 1) in which α = 2 is greater than β = 1.
12. This situation (β ≤ α) signals a cut-off in MIN·VALUE, which returns β (= 1) without looking at the right leaf.
So, basically, we had already found a move that guaranteed us a score ≥ 2, so that when we got into a situation where the score was guaranteed to be ≤ 1, we could stop.
A total of 3 static evaluations were needed instead of the 4 we would have needed under pure Min-Max.
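The trace can be reproduced with a short sketch that records every leaf handed to the static evaluator. States are assumed to be nested lists with numeric leaves, and the value 99 stands in for the leaf marked "anything"; both are illustrative choices.

```python
import math

evaluated = []   # leaves passed to the static evaluator, in order

def evaluate(state):
    evaluated.append(state)
    return state

def max_value(state, alpha, beta, depth):
    if depth == 0:
        return evaluate(state)
    for s in state:
        alpha = max(alpha, min_value(s, alpha, beta, depth - 1))
        if alpha >= beta:
            return alpha        # cut-off
    return alpha

def min_value(state, alpha, beta, depth):
    if depth == 0:
        return evaluate(state)
    for s in state:
        beta = min(beta, max_value(s, alpha, beta, depth - 1))
        if beta <= alpha:
            return beta         # cut-off
    return beta

tree = [[2, 7], [1, 99]]        # 99 plays the role of "anything"
value = max_value(tree, -math.inf, math.inf, 2)
print(value, evaluated)         # 2 [2, 7, 1]
```

The fourth leaf is never evaluated, matching the count of 3 static evaluations.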

99 Games: α-β (NegaMax form). Alpha-Beta Pruning in a more compact form
function ALPHA·BETA(state, α, β, depth)
    if (depth == 0) then return EVAL(state)
    for each s in SUCCESSORS(state) do
        α = MAX(α, -ALPHA·BETA(s, -β, -α, depth - 1))
        if (α ≥ β) then return α    // cut-off
    end
    return α

Here EVAL(state) must score the position from the point of view of the player to move; α is the best score so far for the player to move, and β is the best score for the opponent.
Initial call is ALPHA·BETA(state, -∞, ∞, MAX·DEPTH).
Basically, this exploits the idea that minimising is the same as maximising the negatives of the scores.
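A sketch of the NegaMax form on the same kind of toy tree. The explicit colour argument (+1 when MAX is to move, -1 when MIN is to move) is one common way to make the evaluation relative to the player to move; it is an addition for this example and not part of the slide's pseudocode.

```python
import math

def negamax(state, alpha, beta, depth, colour):
    """colour is +1 when MAX is to move and -1 when MIN is to move."""
    if depth == 0:
        return colour * state              # leaf value from the mover's viewpoint
    for s in state:
        # The child's score is negated because it belongs to the opponent.
        alpha = max(alpha, -negamax(s, -beta, -alpha, depth - 1, -colour))
        if alpha >= beta:
            return alpha                   # cut-off
    return alpha

tree = [[2, 7], [1, 99]]
print(negamax(tree, -math.inf, math.inf, 2, +1))  # 2
```

Without the negation of the recursive result, the two players would both be maximising the same score and the search would no longer match Min-Max.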

100 Games: Key points about α-β
1. Guaranteed to return the same value as Min-Max.
2. In a perfectly ordered tree, expected work is $O(b^{d/2})$ vs. $O(b^d)$ for Min-Max, so we can search twice as deep with the same effort!
3. With good move ordering, the actual running time is close to the optimistic estimate.
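The "twice as deep" claim follows directly from the exponents, as a quick order-of-magnitude check shows. The numbers ignore constant factors and assume perfect move ordering and an even depth.

```python
def minmax_nodes(b, d):
    """Leaves examined by plain Min-Max: b^d."""
    return b ** d

def alphabeta_best_case_nodes(b, d):
    """Order-of-magnitude best case for alpha-beta with even depth d: b^(d/2)."""
    return b ** (d // 2)

b = 36
# A Min-Max search to depth 4 costs about as much as alpha-beta to depth 8.
print(minmax_nodes(b, 4), alphabeta_best_case_nodes(b, 8))  # 1679616 1679616
```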

101 Games: Game Program
1. Move generator (ordered moves): 50%
2. Static evaluation: 40%
3. Search control: 10%
In practice, openings and end games are played by looking up moves in a database.
[All in place by the late 60's.]

102 Games: Move Generator
1. Legal moves
2. Ordered by most valuable victim / least valuable aggressor
3. Killer heuristic

103 Games: Static Evaluation
Initially: very complex
70's: very simple (material)
Now: deep searches: moderately complex (hardware); PC programs: elaborate, hand-tuned

104 Games: Practical matters
Variable branching
Iterative deepening: order the best move from the last search first; use the previous backed-up value to initialise [α, β]; keep track of repeated positions (transposition tables)
Horizon effect: quiescence; pushing the inevitable over the search horizon
Parallelisation

105 Games: Practical matters
Backgammon: involves randomness (dice rolls). A machine-learning based player was able to draw the world champion.
Bridge: involves hidden information (other players' cards) and communication during bidding. Computer players play well but do not bid well.
Go: no new elements, but a huge branching factor. No good computer players exist (as of the time these slides were written).

106 Games: Observations
Computers excel in well-defined activities where rules are clear: chess, mathematics.
Success comes after a long period of gradual refinement.
For more details on building game programs, visit: http://www.ics.uci.edu/~eppstein/180a/w99.html

