1 Artificial Intelligence in Games CA107 Topics in Computing Dr. David Sinclair School of Computer Applications David.Sinclair@computing.dcu.ie

2 What is Artificial Intelligence? There are many answers but the simplest definition is: A field of research whose goal is to make “machines” that do things that require intelligence if done by a human. Intelligence is the ability to learn and understand, to solve problems and make decisions.

3 Why add A.I. to games?
From the A.I. "side of the house":
–Games are excellent testbeds as they:
 - have well-defined rules generating a large search space;
 - are easily represented in a computer;
 - are "easy" to test.
From the Games "side of the house":
–A.I. can make the game much more enjoyable to play.

4 Search
The brute force approach of search has been highly effective in games such as Draughts and Chess.
–Draughts/Checkers: Chinook (World Champion).
–Chess: the best programs can hold their own with the best humans.
 Deep Blue II:
 - move generation and evaluation in hardware
 - parallel search in software

5 Total Search
From the starting position:
1. Generate every legal move for player 1.
2. For each legal move of player 1, generate every legal move for player 2.
3. Repeat steps 1 & 2 until the game reaches a definitive result.
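The exhaustive procedure above is only feasible for tiny games. As a minimal sketch, the following Python plays out every legal game of tic-tac-toe (a game small enough to search completely) and tallies the results; the board encoding is an illustrative choice:

```python
# Exhaustively play out every legal game of tic-tac-toe and tally results.
# Board is a 9-character string, "." for empty; X always moves first.
WIN_LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def outcomes(board=".........", player="X"):
    """Return (X wins, O wins, draws) counted over all complete games."""
    for a, b, c in WIN_LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return (1, 0, 0) if board[a] == "X" else (0, 1, 0)
    if "." not in board:
        return (0, 0, 1)                      # board full: a draw
    x = o = d = 0
    nxt = "O" if player == "X" else "X"
    for i, cell in enumerate(board):
        if cell == ".":                       # try every legal move in turn
            xi, oi, di = outcomes(board[:i] + player + board[i+1:], nxt)
            x, o, d = x + xi, o + oi, d + di
    return (x, o, d)

print(outcomes())  # (131184, 77904, 46080): 255,168 complete games
```

Even tic-tac-toe produces 255,168 complete games; the next slide shows why chess is hopelessly beyond this approach.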

6 Problem with Total Search
Not practical:
–A player in chess has, on average, 36 legal moves.
–A game could take 45 moves (90 plies) to reach a conclusion (an underestimate).
–Total number of positions = 36^90 ≈ 10^140.
–There are only ~10^81 atoms in the universe: you couldn't store all the positions in a computer the size of the universe.
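A quick back-of-the-envelope check of the numbers above (Python's arbitrary-precision integers handle 36^90 directly):

```python
import math

positions = 36 ** 90          # ~36 legal moves per position, 90 plies (45 moves each)
atoms = 10 ** 81              # rough count of atoms in the universe

print(math.log10(positions))  # ~140.1, i.e. about 10^140 positions
print(positions > atoms)      # True: far more positions than atoms to store them in
```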

7 Evaluation Functions
Searching from a position to a definitive result is not practical. Instead, generate all possible positions a fixed number of moves ahead.
–This builds a game tree.
For each terminal position in the game tree, calculate the likelihood that it will result in a win, loss or draw for the player to move.

8 Searching the Game Tree
[Figure: a three-level game tree with alternating MAX and MIN levels. Leaf evaluations (values such as -3, 2, 0, 4, -5, 1, -2) are backed up the tree: each MAX node takes the maximum of its children, each MIN node the minimum.]
This is the Minimax Algorithm.
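The backing-up rule described above can be written directly. This is a generic sketch in which `moves`, `apply_move` and `evaluate` are game-specific callbacks (hypothetical names, not from the lecture):

```python
def minimax(state, depth, maximizing, moves, apply_move, evaluate):
    """Back up leaf evaluations: MAX nodes take the largest child value,
    MIN nodes the smallest. Callbacks supply the game-specific details."""
    legal = list(moves(state))
    if depth == 0 or not legal:               # cutoff or terminal position
        return evaluate(state)
    children = (minimax(apply_move(state, m), depth - 1, not maximizing,
                        moves, apply_move, evaluate) for m in legal)
    return max(children) if maximizing else min(children)

# Toy tree: MAX chooses between two MIN nodes with leaves [3, 5] and [2, 9].
moves = lambda s: range(len(s)) if isinstance(s, list) else []
apply_move = lambda s, m: s[m]
evaluate = lambda s: s
print(minimax([[3, 5], [2, 9]], 2, True, moves, apply_move, evaluate))  # 3
```

MIN backs up 3 from [3, 5] and 2 from [2, 9]; MAX then picks 3.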

9 Improving Minimax
The Minimax Algorithm has various improvements that are used in practice:
–Alpha-Beta
–Principal Variation Search (PVS)
–Transposition Tables
–Killer Move Heuristics
With good move ordering, alpha-beta pruning can reduce the effective branching factor to roughly its square root, letting the search go about twice as deep in the same time.
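As a minimal sketch of the first improvement, alpha-beta pruning (using the same hypothetical game-specific callbacks as a plain minimax would): it returns the same value as minimax but skips branches that cannot affect the result.

```python
def alphabeta(state, depth, alpha, beta, maximizing, moves, apply_move, evaluate):
    """Minimax with alpha-beta pruning: same value, fewer nodes visited."""
    legal = list(moves(state))
    if depth == 0 or not legal:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for m in legal:
            value = max(value, alphabeta(apply_move(state, m), depth - 1,
                                         alpha, beta, False, moves, apply_move, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break              # remaining siblings cannot change the result
        return value
    value = float("inf")
    for m in legal:
        value = min(value, alphabeta(apply_move(state, m), depth - 1,
                                     alpha, beta, True, moves, apply_move, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

# Toy tree: MAX chooses between two MIN nodes with leaves [3, 5] and [2, 9].
moves = lambda s: range(len(s)) if isinstance(s, list) else []
apply_move = lambda s, m: s[m]
evaluate = lambda s: s
print(alphabeta([[3, 5], [2, 9]], 2, float("-inf"), float("inf"), True,
                moves, apply_move, evaluate))  # 3
```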

10 Computer Chess
Deep Blue II:
–256 dedicated chess processors generate moves and evaluate positions.
–Search process in software (PVS).
–Database of opening sequences.
–Databases of endgame sequences.
Deep Blue II can evaluate 200 million positions per second (about 36 billion in 3 minutes).
Deep Blue II can hold its own with the best players in the world, but it is not invincible!

11 Learning
Backgammon is a very interesting testing ground for computer game playing for two reasons:
–the stochastic nature of the game; and
–the experience that an accurate evaluation of a position is far more effective than a deep search.
Backgammon (TD-Gammon):
–In the top 10 (~6th) of players.
–One of the top human players says it has a better evaluation capability than he does.
–Has changed the way humans play backgammon.

12 Learning how to evaluate a position
TD-Gammon evaluates positions with a neural network that trained itself by playing over 200,000 games against itself.
–Starting from knowledge of only the rules and zero strategic/tactical knowledge, the network learned a number of elementary strategies and tactics during the first few thousand training games against itself.
–After several tens of thousands of training games, more sophisticated concepts began to emerge.
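TD-Gammon's learning signal is the temporal-difference rule. A tabular TD(0) sketch of the idea follows; TD-Gammon itself used a neural network rather than a table, and the states, rewards and step size here are illustrative assumptions:

```python
# Tabular TD(0) sketch of TD-Gammon's learning rule. TD-Gammon itself used
# a neural network; the states, rewards and step size here are illustrative.
def td0_update(values, trajectory, alpha=0.1, gamma=1.0):
    """Nudge each state's value toward reward + value of its successor."""
    for state, reward, next_state in trajectory:
        target = reward + gamma * values.get(next_state, 0.0)   # terminal -> 0
        values[state] = values.get(state, 0.0) + alpha * (target - values.get(state, 0.0))
    return values

# One hypothetical two-step game: state "a" leads to "b", which wins (reward 1).
values = td0_update({}, [("a", 0.0, "b"), ("b", 1.0, None)])
print(values)  # {'a': 0.0, 'b': 0.1}
```

Repeated over many self-play games, the win signal propagates backwards from terminal positions into earlier ones.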

13 Learning how to evaluate a position (contd.)
After 200,000 training games with a basic board encoding, the network was as strong as its predecessor NeuroGammon. NeuroGammon was trained on a corpus of expert games and used a sophisticated board encoding. When TD-Gammon was retrained using NeuroGammon's board encoding, TD-Gammon reached the level of strong master play.

14 Deep Anchors: TD-Gammon's influence on humans
What should white play when he rolls a double 4?

Move                   Estimate   Rollout
8-4*,8-4,11-7,11-7     +0.184     +0.139
8-4*,8-4,21-17,21-17   +0.238     +0.221

15 Opponent Modeling - Poker
Poker differs from games such as Chess and Draughts in two major respects:
–it is a game of imperfect information; and
–in practice, the game-theoretic optimal strategy does not work as well as a maximising strategy.
An essential element is bluffing (betting to give the impression that a bad hand is good) and sandbagging (betting to give the impression that a good hand is bad).
–To do this you need to model your opponent!

16 Properties of a World Class Poker program
–Hand Strength Assessment
–Hand Potential Assessment
–Betting Strategy
–Bluffing
–Unpredictability
–Opponent Modelling

17 Loki
Loki plays Texas Hold'em:
–Pre-flop: Each player is dealt two cards face down, followed by the first round of betting.
–Flop: Three community cards are dealt face up and a second round of betting occurs.
–Turn: A fourth community card is dealt face up and the third round of betting occurs.
–River: A final fifth community card is dealt face up and the final round of betting occurs.
There are 1326 possible combinations for the initial two cards.

18 Hand Strength Assessment in Loki
Loki played a million hands to calculate the approximate income rate of each starting hand.
After the flop, there are 47 remaining unknown cards and 1081 possible hands an opponent might hold. We can calculate how many of these hands, combined with the community cards, will beat our hand, tie with our hand, or be beaten by our hand.
–For example, if our hand is A♦-Q♣ and the flop is 3♥-4♣-J♥, then 444 cases would beat us, 9 would tie and the remaining 628 cases would lose to our hand. Therefore, counting ties as half, our hand strength is 0.585 (58.5%).
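The hand-strength arithmetic above is easy to reproduce. A sketch, with ties counted as half a win (the convention that yields the slide's 0.585); the function name is an illustrative choice:

```python
import math

def hand_strength(lose_to_us, tie_us, beat_us):
    """Fraction of possible opponent holdings our hand beats, ties counted half."""
    total = lose_to_us + tie_us + beat_us
    return (lose_to_us + tie_us / 2) / total

# 47 unseen cards after the flop -> C(47, 2) possible opponent hole-card pairs.
print(math.comb(47, 2))                      # 1081
print(round(hand_strength(628, 9, 444), 3))  # 0.585, as in the example above
```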

19 Opponent Modeling in Loki
A weight is assigned to each of the 1081 possible combinations of hole cards an opponent may hold.
–These weights are initialised from the 169 distinct income rates determined by simulation.
There are 36 possible classes of opponent action, depending on:
–their action (fold, call/check, bet/raise);
–how much the action costs (bets of 0, 1, >1); and
–when the action occurred (pre-flop, flop, turn, river).

20 Opponent Modeling in Loki (contd.)
Each action modifies the probabilities for each of the 1081 possible combinations of hole cards an opponent may hold.
We can make the opponent models interact: if one player's hand has a very high probability of containing a particular card (say, an ace), we can reduce the weights on other players' hands that also contain that card.

21 Intelligent Opponents?
A simple way to make an opponent appear intelligent is to use a stochastic state machine.
–Stochastic: contains a random element.
–State machine: a program that is always in one of a fixed set of states, and moves from state to state depending on how it is interacted with.

22 Example (loosely based on Civilisation II)
There is a collection of civilisations competing for shared resources. A civilisation's behaviour towards another civilisation is influenced by:
–the goodwill [0...100] between the two civilisations; and
–the expected/actual gain [-100...100].
An action will modify the goodwill:

Give technological gift   +25    Break a treaty with another   -20
Enter treaty              +20    Break a cease-fire            -30
Keep cease-fire           +10    Encroach on territory         -10
Assist this civilisation  +40    Attack this civilisation      -20
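The goodwill bookkeeping above can be sketched as a lookup table. The snake_case action names and the clamping of goodwill to the slide's [0...100] range are assumptions:

```python
# Goodwill bookkeeping for the action table above. The action names and
# the clamping of goodwill to [0, 100] are assumptions.
GOODWILL_DELTA = {
    "give_tech_gift": +25, "enter_treaty": +20, "keep_ceasefire": +10,
    "assist": +40, "break_treaty": -20, "break_ceasefire": -30,
    "encroach": -10, "attack": -20,
}

def apply_action(goodwill, action):
    """Return the goodwill value after an action, kept in the [0, 100] range."""
    return max(0, min(100, goodwill + GOODWILL_DELTA[action]))

print(apply_action(50, "assist"))           # 90
print(apply_action(90, "assist"))           # 100 (clamped)
print(apply_action(10, "break_ceasefire"))  # 0 (clamped)
```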

23 State machine
[Diagram: five states with per-state action probabilities, linked by transitions triggered by goodwill/gain thresholds (e.g. goodwill > 85, goodwill < 65, gain > 10, 30 < goodwill < 45 or gain > 60).]
–ALLIED: treaty = 0.8, assist = 0.8
–COOPERATIVE: treaty = 0.4, assist = 0.2
–NEUTRAL: treaty = 0.1, attack = 0.1, assist = 0.05
–AGGRESSIVE: attack = 0.6, ceasefire = 0.3, peace = 0.8
–HOSTILE: attack = 0.95, ceasefire = 0.05

24 Go
Go represents the biggest challenge yet to the application of A.I. in games. None of the existing techniques has proved successful. Go will require a combination of techniques:
–Pattern matching
–Search (forward pruning and focusing)
–Planning
–Resolving threats and plans

25 Go – the game
Go is played on a 19x19 grid of horizontal and vertical lines. Each player, black or white, places stones on the intersection points of the grid. Once a stone is placed it cannot be moved, unless it is captured.
Each stone, or set of vertically and/or horizontally connected stones, has a set of liberty points: the vertically and horizontally adjacent grid points that are unoccupied.

26 Go – the game (contd.)
To capture a group of stones, all you need do is reduce the group's number of liberty points to zero.
There are 2 restrictions on placing stones on the board:
–The first is that you cannot place a stone on a point that would result in it having no liberties. This is called suicide.
–The second is that you cannot immediately play a stone on a point that has just been captured. You must play the stone elsewhere on the board on the move immediately after the capture. Then you can return to the capture point.
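Counting a group's liberties, as the capture rule above requires, is a small flood fill. A sketch, where the board encoding (a dict from (row, col) to "B"/"W", empty points absent) is an assumption:

```python
# Flood fill to find a group's liberties. The board encoding (a dict from
# (row, col) to "B"/"W", empty points absent) is an assumption.
def liberties(board, start, size=19):
    """Return the set of empty points adjacent to the group containing start."""
    colour = board[start]
    group, frontier, libs = {start}, [start], set()
    while frontier:
        r, c = frontier.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < size and 0 <= nc < size):
                continue                         # off the board
            if (nr, nc) not in board:
                libs.add((nr, nc))               # empty neighbour: a liberty
            elif board[(nr, nc)] == colour and (nr, nc) not in group:
                group.add((nr, nc))              # same colour: same group
                frontier.append((nr, nc))
    return libs

# Two connected black stones in the corner, one white stone beside them.
board = {(0, 0): "B", (0, 1): "B", (1, 0): "W"}
print(liberties(board, (0, 0)))  # two liberties: (0, 2) and (1, 1)
```

The black group is captured exactly when this set becomes empty.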

27 Capture and liberties
[Figure: three black stones with a single remaining liberty at point a.] If white plays a stone at the point a, then the three black stones will be captured and removed from the board.

28 Eyes
[Figure: a black group enclosing two internal liberties, a and b.] This black group of stones can never be captured, since white would have to remove both the liberties at the a and b points at the same time. But white can only play one stone at a time, and white cannot play into a or b, as this is suicide.

29 The result
At each turn a player has the option of placing a stone on the board or passing (skipping their move). The game continues with both players placing stones on the board until both players pass consecutively.
The result is determined by each player adding up the territory they control plus the number of the opponent's stones they have captured. Each territory point controlled and each stone captured is worth one point.

30 Go – the great challenge
–Huge branching factor (~180 for Go, ~36 for Chess).
–Evaluation of a position:
 - in Chess there is a good correlation between the strength of a position and the number and quality of pieces.
 - in Go there is a poor correlation between the strength of a position and the number of stones and territory surrounded.
The best Go programs play at the standard of an average human player (despite a $1 million prize).

