
1 Artificial Intelligence for Games Online and local search
Patrick Olivier

2 Local search In many optimisation problems we care about the solution itself rather than the path to it. Previously the state space consisted of paths to the goal; now each state is a "complete" configuration. The approach is to start from one or more configurations satisfying the constraints, then use a local search algorithm to try to improve them. Local search keeps only a single "current" state rather than a path, and generally moves only to neighbours of that state, so it is very memory efficient even in very large problems. Although not systematic, local search algorithms can often find reasonable solutions in large or continuous state spaces where systematic algorithms are impractical.

3 Example problems Scheduling, layout, evolution, the travelling salesman, and n-queens. Scheduling a set of jobs. Layout of a factory floor (efficient routes, supplies, power, heating/cooling, etc.), or layout of an integrated circuit. Creatures evolving to suit their environment. A travelling salesman visiting all cities. N-queens... (next slide)

4 Example – N-queens Given an n×n chess board, place n queens without any being able to take another in a single chess move, i.e. only a single queen in any row, column, or diagonal. The left example is a 4-queens problem (n=4), showing three steps with the initial state topmost. Counting columns from the left: queen 4 is in the same row as queen 2, so move queen 4; queen 2 is then in the same diagonal as queen 1, so move queen 2; and so on.

5 Objective function Local search problems need an objective function.
This may be the nearness to the goal or simply the negative of the “cost” of a given state; a high objective value denotes a good solution. In addition to finding goals, local search algorithms are useful for solving pure optimisation problems, in which the aim is to find the best state according to an objective function. Maximising the negative of the cost is the same as minimising the cost.
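As a rough illustration (a minimal sketch, not from the slides; the coordinates and function names are made up), the travelling salesman objective can simply be the negative tour length, so maximising the objective minimises the cost:

import math

# Hypothetical sketch: objective = negative tour length for the TSP.
def tour_length(tour, coords):
    """Total length of the closed tour over the given city coordinates."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def objective(tour, coords):
    return -tour_length(tour, coords)   # higher objective = shorter (cheaper) tour

coords = {0: (0, 0), 1: (3, 0), 2: (3, 4)}
print(objective([0, 1, 2], coords))     # -12.0 for this 3-4-5 triangle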

6 Example objective functions
Scheduling: negative time to complete, or time spent working versus time wasted. Layout: number of items placed or amount of space used. Evolution: reproductive success of a species. Travelling salesman: negative number of cities visited twice or more. N-queens: number of non-attacking queens.

7 State space landscape To understand state space search, it is very useful to consider the state space landscape. Here, a one-dimensional state space: the vertical axis is the objective function, so our aim is to find the highest peak in elevation, the global maximum. A local maximum is locally the largest value and can be flat; on the way to the global maximum there may be “shoulder” areas of flat elevation before the landscape rises to a peak.

8 State space landscape Often more than 1D!
Here, a two-dimensional state space, plotted as a three-dimensional surface with the evaluation function represented as the vertical height. Peaks and valleys correspond to local maxima and local minima.

9 Hill-climbing search “Climbing Everest in thick fog with amnesia”
Repeatedly “climb” to the neighbouring state with the highest objective function until no neighbour has a higher value, i.e. climb uphill until a local maximum is reached. A greedy approach: it only ever moves “uphill”. Like “trying to find the top of Mount Everest (the global maximum) in a thick fog (only neighbours are visible) with amnesia (no search path is kept)”.
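A minimal steepest-ascent sketch of the idea (not the lecture's code; neighbours and objective stand in for problem-specific functions):

def hill_climb(state, neighbours, objective):
    """Greedily move to the best neighbour until none improves."""
    while True:
        best = max(neighbours(state), key=objective, default=state)
        if objective(best) <= objective(state):
            return state    # stuck: a local maximum (or plateau) reached
        state = best        # uphill move; no search path is kept (the "amnesia")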

10 Local maxima/minima Problem: depending on the initial state, hill climbing can get stuck at local maxima/minima. The 8-queens problem can be formulated for hill-climbing: typically one queen per column, with a successor function that generates every possible single-queen move (8 queens × 7 new positions = 56 successors). An example heuristic h(n) is the number of pairs of queens attacking each other, which can be recast as an objective to maximise, 1/(1 + h(n)). Upper board: h = 17, showing the value of h for each move of each queen within its column, with the best moves marked. Lower board: five states later, the search is stuck at a local minimum with h = 1, i.e. 1/(1 + h(n)) = 1/2, and every successor has a higher cost. Local maxima: peaks that are higher than their neighbours but lower than the global maximum. Plateaux: areas of the state space landscape where the evaluation function is flat, either a flat local maximum or a shoulder. It can be difficult for hill-climbing search to navigate its way off a plateau, since there is no direction for its search (allowing a limited number of sideways moves can help). Whether hill climbing succeeds therefore depends heavily on the initial state: fine from some starting positions, stuck from many others. A sketch of the heuristic follows below.
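For concreteness, a small sketch of the attacking-pairs heuristic described above (assuming the usual one-queen-per-column board representation, with one row index per column):

from itertools import combinations

def attacking_pairs(board):
    """h(n): number of pairs of queens attacking each other.
    board[c] is the row of the queen in column c."""
    h = 0
    for (c1, r1), (c2, r2) in combinations(enumerate(board), 2):
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2):   # same row or same diagonal
            h += 1
    return h

print(attacking_pairs([0] * 8))   # all queens in one row: 28 attacking pairs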

11 Local beam search Keep track of k states rather than just one:
start with k randomly generated states; at each iteration, generate all the successors of all k states; if any one is a goal state, stop; otherwise select the k best successors from the complete list and repeat (see the sketch below). This hedges your bets by starting in multiple places. Taking the best successors across all states is not the same as k parallel searches: useful information is passed between the k parallel threads. A problem is that all the searches can become concentrated in the wrong area. (A variant, stochastic beam search, chooses successors randomly, weighted by the objective function.)
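A minimal sketch of the loop (not the lecture's code; successors, objective and is_goal are assumed problem-specific, and the iteration cap is an illustrative safeguard):

import heapq

def local_beam_search(starts, successors, objective, is_goal, k, max_iters=1000):
    states = list(starts)
    for _ in range(max_iters):
        pool = [s for st in states for s in successors(st)]  # successors of ALL k states
        goals = [s for s in pool if is_goal(s)]
        if goals:
            return goals[0]
        if not pool:
            break
        states = heapq.nlargest(k, pool, key=objective)      # k best from the complete list
    return max(states, key=objective)                        # best found if no goal reached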

12 Simulated annealing search
Idea: escape local maxima by allowing some “bad” moves, but gradually decrease their frequency and range (applications include VLSI layout and scheduling). Hill-climbing algorithms that never make a “downhill” move are guaranteed to be incomplete; purely random walks are complete but very inefficient (the expected time can tend to infinity). The method is inspired by annealing in metallurgy, where materials are tempered by heating them to a high temperature and then cooling them slowly so that they coalesce into a low-energy crystalline state. Instead of picking the best move, the algorithm picks a random move and accepts it if it improves the situation; otherwise it accepts it with some probability < 1 that decreases exponentially with the “badness” (ΔE) of the move.
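A minimal sketch of the acceptance rule (the geometric cooling schedule and its parameter values are illustrative assumptions; random_neighbour and objective are problem-specific):

import math
import random

def simulated_annealing(state, random_neighbour, objective,
                        t0=1.0, cooling=0.995, t_min=1e-4):
    t = t0
    while t > t_min:
        nxt = random_neighbour(state)               # a random move, not the best one
        delta = objective(nxt) - objective(state)   # delta > 0 means the move improves things
        if delta > 0 or random.random() < math.exp(delta / t):
            state = nxt                             # bad moves accepted with probability e^(delta/t) < 1
        t *= cooling                                # gradually lower the temperature
    return state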

13 Simulated annealing example
Point-feature labelling: multiple candidate label positions are chosen, and the algorithm nudges the label boxes around.

14 Genetic algorithm search
A population of k randomly generated states; each state is represented as a string over a finite alphabet (often a string of 0s and 1s); an evaluation function (the fitness function) gives higher values for better states; the next generation of k states is produced by selection, crossover, and mutation, and the rates for each of these configure the search (elitism / crossover rate / mutation rate). A sketch follows below. This is a variant of stochastic beam search in which successor states are generated by combining two parent states. Population: k randomly generated states. Individual: a string over a finite alphabet. Fitness function: evaluates each state; the probability of being chosen for reproduction is directly related to fitness. Selection: random choice of parent pairs, weighted by fitness. Crossover: offspring formed from a mixture of the parents' code. Mutation: each location is subject to random mutation with a small independent probability.
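A minimal sketch over bit-strings (all parameter values are illustrative assumptions; fitness is assumed non-negative and not zero everywhere, and elitism is omitted for brevity):

import random

def genetic_algorithm(fitness, length=16, k=20, generations=100, mutation_rate=0.01):
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(k)]
    for _ in range(generations):
        weights = [fitness(ind) for ind in pop]               # fitness-proportionate selection
        nxt = []
        for _ in range(k):
            a, b = random.choices(pop, weights=weights, k=2)  # select a parent pair
            cut = random.randrange(1, length)                 # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < mutation_rate)  # small per-bit mutation
                     for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = genetic_algorithm(fitness=sum)   # e.g. "one-max": maximise the number of 1s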

15 Genetic algorithms in games
Genetic algorithms are computationally expensive, so they are primarily used as an offline form of learning. Cloak, Dagger & DNA (Oidian Systems): 4 DNA strands define opponent behaviour, and between battles the opponents play each other. Creatures (Millennium Interactive): genetic algorithms learn the weights in a neural network that defines creature behaviour. Cloak, Dagger and DNA was one of the first games ever to make use of genetic algorithms. Each player (human or computer) controls an army which can be split up into smaller units and has to try to conquer the playing field and its resource-generating factories in a Risk-style manner of play.

16 Online search All previous techniques have focused on offline reasoning (think first, then act). Now we will briefly look at online search (think, act, think, act, ...). It is advantageous in dynamic situations, or those with only partial information, and some problems can only be solved by an agent performing actions, not by an offline computational process.

17 “Real-time” search concepts
in A* the whole path is computed offline, before the agent walks the path. This solution is only valid for static worlds: if the world changes in the meantime, the initial path is no longer valid, e.g. new obstacles appear, or the position of the goal changes (a moving target).

18 “Real-time” definitions
off-line (non real-time): the solution is computed in a given amount of time before being executed
real-time: one move is computed at a time, and that move is executed before the next is computed
anytime: the algorithm constantly improves its solution through time, and is capable of providing the “current best” at any time

19 Agent-based (online) search
for example: a mobile robot, an NPC without perfect knowledge, or any agent that must act now with limited information. Planning and execution are interleaved. We could apply standard search techniques: best-first (but we know it performs poorly), depth-first (the agent has to physically backtrack), or A* (but the nodes in the fringe are not physically accessible). After each action, an online agent receives a percept telling it what state it has reached, so it can augment its map of the environment; to expand a node it must actually be there.

20 LRTA*: Learning Real-time A*
augment hill-climbing with memory: store a “current best estimate” h(n) for each state, follow the path suggested by the neighbours' estimates, and update the estimates based on experience (experience -> learning); this flattens out local maxima. At each time step, at node a: calculate the f-cost of each neighbour n as f(n) = cost(a->n) + h(n); find the neighbour b with the lowest f-cost, b = argmin_n f(n); update the estimate at the current node, h(a) = f(b); then move to that neighbour, a = b. A sketch of one step follows below.
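A minimal sketch of a single step (cost and h are assumed lookup tables: cost[(a, n)] is the edge cost and h[n] the current estimate; neighbours is problem-specific):

def lrta_star_step(a, neighbours, cost, h):
    """Update h(a) and return the neighbour to move to."""
    f = {n: cost[(a, n)] + h[n] for n in neighbours(a)}   # f(n) = cost(a->n) + h(n)
    b = min(f, key=f.get)                                 # neighbour with the lowest f-cost
    h[a] = f[b]                                           # learn: update the estimate at a
    return b                                              # move: a = b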

21 LRTA*: example [Figure: a row of five states over four time steps; the number inside each circle is its current h(n) value, updated step by step, e.g. 8 9 2 4 1 initially and 8 9 5 4 1 at the end.] Circles are the states; the number inside is the current h(n) heuristic value. Dark blue marks the current state, light blue the visited states. The update rule at each time step is as on the previous slide: f(n) = cost(a->n) + h(n), b = argmin_n f(n), h(a) = f(b), a = b.

22 Learning real-time A* [Diagrams incorrect]
Start state at the lower left, goal at the lower right; the initial heuristic value is the Manhattan distance to the goal. The same update rule applies at each time step: f(n) = cost(a->n) + h(n), b = argmin_n f(n), h(a) = f(b), a = b.

