
Published by Valentina Prickett. Modified over 3 years ago.

1 Using Partial Order Bounding in Shogi
Game Programming Workshop 2003
Reijer Grimbergen, Kenji Hadano and Masanao Suetsugu
Department of Information Science, Saga University

2 Contents
- Why Partial Order Bounding?
- The problems of using a scalar evaluation function
- What is Partial Order Bounding?
- Using Partial Order Bounding in Shogi
- Implementation issues
- Results
- Conclusions and Future Work

3 Why Partial Order Bounding? Scalar evaluation function
- Perfect play in two-player perfect information games
  - Minimax search until the game-theoretical value of the current position is known
  - Infeasible for most interesting games
- Search needs to be cut off before the game-theoretical value is known
  - An evaluation function is needed to estimate the probability of winning when the search is terminated
- The evaluation function
  - Contains most of the domain-dependent knowledge
  - Is generally a weighted sum of feature values

4 Why Partial Order Bounding? Problems of a scalar evaluation
- Unstable positions
- Long term strategic features
  - Large weights will give tactical problems
  - Small weights make it impossible to follow long term plans
- Close to terminal positions
  - Sometimes a single feature is enough for a conclusion

Position  Material  Attack  Evaluation
P1        0         0       0
P2        500       -500    0
P3        [values garbled in source: 5000]

Possible solution: Partial Order Bounding
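The P1 and P2 rows of the table can be reproduced with a small sketch. It assumes unit weights for both features (the slide does not state the weights) and reads the rows as P1 = (0, 0) and P2 = (500, -500); the names `WEIGHTS` and `scalar_eval` are hypothetical.

```python
# Assumption: unit weights. The slide does not say which weights were used.
WEIGHTS = {"material": 1, "attack": 1}


def scalar_eval(features, weights=WEIGHTS):
    """Scalar evaluation: weighted sum of the feature values."""
    return sum(weights[name] * value for name, value in features.items())


p1 = {"material": 0, "attack": 0}
p2 = {"material": 500, "attack": -500}

print(scalar_eval(p1))  # 0
print(scalar_eval(p2))  # 0
```

Both positions collapse to the same score although their feature values differ, which is the loss of information the slide points at.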

5 What is Partial Order Bounding? Partial order evaluation
- Keep the complete set of feature values
- Compare the feature values to decide which position is better

[Figure: partial order over the features f1, f2, f3, f4]
[Table: feature values f1 to f4 for positions P1 and P2 with the comparison result; the three example rows give P1 > P2, P1 < P2 and P1 < P2. The cell values are garbled in the source.]

6 What is Partial Order Bounding? The problem
- Why is partial order evaluation not enough?
  - Which is better: P1 or P2?
- The problem: antichains
  - A subset of the partial order for which all pairs of distinct elements are incomparable
  - Example: {f2, f3} is an antichain

[Figure: partial order over the features f1, f2, f3, f4]

Pos  f1  f2  f3  f4
P1   10  50  20  0
P2   10  15  30  0
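The incomparability in this example can be made concrete with a minimal sketch. It reads the table as P1 = (10, 50, 20, 0) and P2 = (10, 15, 30, 0), and it simplifies the comparison to component-wise dominance over all four features; the partial order drawn on the slide is not recoverable, so the function `compare` is an illustration rather than the authors' comparison rule.

```python
def compare(a, b):
    """Component-wise comparison of two equal-length feature vectors.

    Returns '>' if a is at least as good everywhere and better somewhere,
    '<' for the reverse, '=' if the vectors are identical,
    and None if the two vectors are incomparable.
    """
    a_better = any(x > y for x, y in zip(a, b))
    b_better = any(x < y for x, y in zip(a, b))
    if a_better and b_better:
        return None  # each side wins on some feature
    if a_better:
        return ">"
    if b_better:
        return "<"
    return "="


p1 = (10, 50, 20, 0)
p2 = (10, 15, 30, 0)
print(compare(p1, p2))  # None: P1 wins on f2, P2 wins on f3
```

A search that backs up such values has no single best child to pick when the result is `None`, which is why comparison alone is not enough.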

7 What is Partial Order Bounding? Dealing with antichains
- Simple approach: keep partially ordered values in every node of the search tree
  - Leads to large sets of incomparable options
  - Reducing these sets leads to loss of information
- Partial Order Bounding
  - Separate comparison and value back-up
  - Define a target vector with targets for each of the feature values in the antichain
  - Use search to determine if the target can be reached

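8 What is Partial Order Bounding? Example of partial order bounding

[Figure: search tree with nodes A to G, searched for two targets, T1 = {5, 3} and T2 = {6, 4}. Leaf values: (11, 5), (5, 7), (6, 8), (4, 3). Each node is marked + or - for T1 and for T2. Leaves, in order: (T1 +, T2 +), (T1 +, T2 -), (T1 +, T2 +), (T1 -, T2 -). Remaining nodes: (T1 +, T2 -), (T1 -, T2 -), (T1 +, T2 -).]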
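The tree example can be replayed with a short sketch. Two things are assumed here because the figure did not survive extraction: the tree shape (a maximizing root with two minimizing children, each with two leaves, which is consistent with the +/- marks) and the rule that a leaf reaches a target when every component is at least the target value. The names `meets` and `pob` are hypothetical.

```python
def meets(values, target):
    """A leaf reaches the target if every feature is at least its target value."""
    return all(v >= t for v, t in zip(values, target))


def pob(node, target, maximizing=True):
    """Return True if the maximizing player can force a leaf that meets target.

    Leaves are tuples of feature values; inner nodes are lists of children.
    Only a boolean is backed up, so no vector comparison happens inside the tree.
    """
    if isinstance(node, tuple):
        return meets(node, target)
    results = (pob(child, target, not maximizing) for child in node)
    # The maximizer needs one successful child, the minimizer needs all of them.
    return any(results) if maximizing else all(results)


# Assumed shape: root (max) -> two min nodes -> two leaves each.
tree = [[(11, 5), (5, 7)], [(6, 8), (4, 3)]]
T1 = (5, 3)
T2 = (6, 4)

print(pob(tree, T1))  # True
print(pob(tree, T2))  # False
```

Under these assumptions the root succeeds for T1 and fails for T2, matching the marks listed for the figure.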

9 Partial Order Bounding in Shogi: Implementation decisions
- Which partial order evaluation to use?
- How to set the search targets?
- What to do if the search target is met or fails?
- What search depth should be used?

10 Partial Order Bounding in Shogi: Partial order evaluation
- We have used the following antichain
  - Material
  - Strength of attack
  - Strength of defense
- This partial order evaluation
  - Is representative
  - Has dominating features

11 Partial Order Bounding in Shogi: Setting the search targets
- Setting the target too low
  - Many moves for which the target is met: which one to choose?
- Setting the target too high
  - No moves for which the target is met: no move can be played
- Our solution
  - Perform a shallow α-β search and use the result as the first target
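One way to read "use the result as the first target" is sketched below. The slide does not say how the scalar result is turned into a target vector, so this sketch assumes the feature vector of the leaf that the α-β search settles on becomes the first target. The unit weights, the toy tree and the names `scalar` and `alphabeta` are all hypothetical.

```python
import math

# Assumption: unit weights over (material, attack, defense).
WEIGHTS = (1, 1, 1)


def scalar(values):
    """Scalar score of a feature vector: weighted sum."""
    return sum(w * v for w, v in zip(WEIGHTS, values))


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Alpha-beta search over a small explicit tree.

    Leaves are feature tuples; inner nodes are lists of children.
    Returns (score, leaf), where leaf is the feature vector behind the score.
    """
    if isinstance(node, tuple):
        return scalar(node), node
    best = -math.inf if maximizing else math.inf
    best_leaf = None
    for child in node:
        score, leaf = alphabeta(child, alpha, beta, not maximizing)
        if maximizing and score > best:
            best, best_leaf = score, leaf
        elif not maximizing and score < best:
            best, best_leaf = score, leaf
        if maximizing:
            alpha = max(alpha, best)
        else:
            beta = min(beta, best)
        if alpha >= beta:
            break  # remaining siblings cannot change the result
    return best, best_leaf


# Toy two-ply tree with made-up feature values.
tree = [[(100, 50, 0), (0, 200, 100)], [(300, -100, 0), (-200, 0, 0)]]
score, first_target = alphabeta(tree)
print(score)         # 150
print(first_target)  # (100, 50, 0)
```

Starting from a value the scalar search already considers reachable keeps the first target from being set far too high or far too low.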

12 Partial Order Bounding in Shogi: Success and failure
- POB is a series of searches with different bounds
- Problems in this approach
  - How to set the targets to minimize the number of iterations?
  - Which targets to increase or decrease?

Results per move over POB iterations 1 to 5 (T = target met, F = target failed; the iteration column of each entry is not recoverable from the source):
Move  Results
M1    T F
M2    F
M3    T F F T
M4    F F F

No general solution: tuning problem

13 Partial Order Bounding in Shogi: Search depth
- In POB there is no definite target check
  - A deeper search can reveal that the target is unreachable
- Optimization
  - Target is reached if the player to move has reached its target
  - Not very likely to avoid a search explosion
- Another tuning problem

14 Results: Implementation schemes
- Target settings
  - Scheme A (equal weight): increasing or decreasing all three search targets by 250
  - Scheme B (more weight to material): increasing or decreasing the material feature by 400 and attack and defense by 100
  - Scheme C (more weight to attack): increasing or decreasing the attack feature by 400 and material and defense by 100
  - Note: the defense feature did not give good results
    - Really part of the antichain?
- If the target fails or succeeds for all moves, the target changes are halved
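The three schemes and the halving rule can be written down as a small sketch. The step sizes come from the slide; everything about the control flow is an assumption: that the target is raised when all moves succeed and lowered when all fail, that the halved step is the one applied in those cases, and that the target is raised by the full step when several but not all moves succeed. The names `STEPS` and `next_target` are hypothetical.

```python
# Step sizes per scheme, ordered as (material, attack, defense).
STEPS = {
    "A": (250, 250, 250),  # equal weight
    "B": (400, 100, 100),  # more weight to material
    "C": (100, 400, 100),  # more weight to attack
}


def next_target(target, step, outcomes):
    """Compute the target and step for the next POB iteration.

    outcomes holds one boolean per candidate move: True if the move met
    the current target. Returns (new_target, new_step).
    """
    if all(outcomes):
        direction = 1  # target too low: every move reaches it
        step = tuple(s // 2 for s in step)
    elif not any(outcomes):
        direction = -1  # target too high: no move reaches it
        step = tuple(s // 2 for s in step)
    else:
        # Assumption: mixed outcome, raise the target to separate the moves.
        direction = 1
    new_target = tuple(t + direction * s for t, s in zip(target, step))
    return new_target, step


print(next_target((0, 0, 0), STEPS["C"], [True, True, True]))
# ((50, 200, 50), (50, 200, 50))
print(next_target((0, 0, 0), STEPS["C"], [True, False, True]))
# ((100, 400, 100), (100, 400, 100))
```

A caller would stop iterating once exactly one move meets the target; that stopping rule is also an assumption, since the slides only say that POB is a series of searches with different bounds.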

15 Results: Search depth
- 3-ply α-β search to determine the initial search targets
- 3, 4 and 5-ply searches for the POB iterations
- 50 test problems
  - The first (easiest) problem from Shukan Shogi 750 to 799

16 Results: Test problem results

3-ply POB              A      B      C
Solved                 17 15 [one of the three values is missing in the source]
Avg. time per problem  0:07   0:10   0:05

4-ply POB              A      B      C
Solved                 23     19     27
Avg. time per problem  1:00   1:47   0:48

5-ply POB              A      B      C
Solved                 27     23     25
Avg. time per problem  17:23  26:13  12:19

17 Results: Discussion
- 4-ply POB using scheme C gives the best results
  - 27 solved problems in 48 seconds on average
- Surprisingly, giving more weight to attack gives better results than giving more weight to material
  - Increasing by 400 not the best?
- Setting the search target has a big impact
  - For 4-ply POB there are only 6 problems that are solved by all three implementation schemes

18 Conclusions and Future Work
- POB cannot be considered a general solution to the problem of using scalar evaluation functions
  - Careful tuning is needed to use POB in a specific game
  - What to do if time runs out without finding a single best move?
- POB is an interesting search method for shogi
  - Searching different targets in parallel
  - Combining POB with a normal minimax search
