Local Search and Optimization. Chapter 4. Mausam (based on slides of Padhraic Smyth, Stuart Russell, Rao Kambhampati, Raj Rao, Dan Weld…)



Outline
Local search techniques and optimization
– Hill-climbing
– Gradient methods
– Simulated annealing
– Genetic algorithms
– Issues with local search

Local search and optimization
Previous lecture: the path to the goal is the solution to the problem
– systematic exploration of the search space
This lecture: a state is the solution to the problem
– for some problems the path is irrelevant
– e.g., 8-queens
Different algorithms can be used
– Depth-first branch and bound
– Local search

Goal satisfaction vs. optimization
Goal satisfaction: reach the goal node
– Constraint satisfaction
Optimization: optimize(objective fn)
– Constraint optimization
You can go back and forth between the two problems
– Typically in the same complexity class

Local search and optimization
Local search
– Keep track of a single current state
– Move only to neighboring states
– Ignore paths
Advantages:
– Uses very little memory
– Can often find reasonable solutions in large or infinite (continuous) state spaces
“Pure optimization” problems
– All states have an objective function
– Goal is to find the state with the max (or min) objective value
– Does not quite fit into the path-cost/goal-state formulation
– Local search can do quite well on these problems

Trivial Algorithms
Random sampling
– Generate a state randomly
Random walk
– Randomly pick a neighbor of the current state
Both algorithms are asymptotically complete.

Hill-climbing (Greedy Local Search), max version
function HILL-CLIMBING(problem) returns a state that is a local maximum
  input: problem, a problem
  local variables: current, a node
                   neighbor, a node
  current ← MAKE-NODE(INITIAL-STATE[problem])
  loop do
    neighbor ← a highest-valued successor of current
    if VALUE[neighbor] ≤ VALUE[current] then return STATE[current]
    current ← neighbor
The min version reverses the inequalities and looks for the lowest-valued successor.
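The HILL-CLIMBING pseudocode can be sketched in Python roughly as follows. This is a minimal illustration rather than the lecture's own code: the names `hill_climbing`, `successors`, and `value`, and the toy integer landscape, are assumptions made for the example.

```python
def hill_climbing(initial, successors, value):
    """Steepest-ascent hill climbing (max version).

    `successors(state)` returns a list of neighboring states and
    `value(state)` returns the quantity to maximize. Both are
    hypothetical callables supplied by the caller.
    """
    current = initial
    while True:
        neighbors = successors(current)
        if not neighbors:
            return current
        best = max(neighbors, key=value)      # a highest-valued successor
        if value(best) <= value(current):     # no strictly better neighbor
            return current                    # current is a local maximum
        current = best


# Toy landscape (an assumption): maximize -(x - 3)^2 over the integers,
# where the neighbors of x are x - 1 and x + 1.
def value(x):
    return -(x - 3) ** 2


def successors(x):
    return [x - 1, x + 1]


print(hill_climbing(0, successors, value))  # prints 3
```

Because the toy landscape has a single peak, the search reaches it from any start; on a landscape with several peaks the same code stops at whichever local maximum it climbs first.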

Hill-climbing search
“a loop that continuously moves towards increasing value”
– terminates when a peak is reached
– a.k.a. greedy local search
Value can be either
– Objective function value
– Heuristic function value (minimized)
Hill climbing does not look ahead beyond the immediate neighbors
Can randomly choose among the set of best successors
– if multiple have the best value
“climbing Mount Everest in a thick fog with amnesia”

“Landscape” of search
Hill climbing gets stuck in local minima, depending on?

Example: n-queens
Put n queens on an n × n board with no two queens on the same row, column, or diagonal
Is it a satisfaction problem or an optimization problem?

Hill-climbing search: 8-queens problem
Need to convert to an optimization problem
– h = number of pairs of queens that are attacking each other
– h = 17 for the state shown on the slide

Search Space
State
– All 8 queens on the board in some configuration
Successor function
– Move a single queen to another square in the same column
Example of a heuristic function h(n):
– the number of pairs of queens that are attacking each other
– (so we want to minimize this)
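As a rough sketch of this formulation, assume a state is stored as a tuple in which position c holds the row of the queen in column c. That representation, and the names `attacking_pairs` and `queen_successors`, are choices made here for illustration; the slides do not fix them.

```python
from itertools import combinations


def attacking_pairs(state):
    """Heuristic h: number of pairs of queens attacking each other.

    state[c] is the row of the queen in column c, so two queens can
    never share a column; only rows and diagonals need checking.
    """
    return sum(
        1
        for (c1, r1), (c2, r2) in combinations(enumerate(state), 2)
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2)
    )


def queen_successors(state):
    """All states reachable by moving one queen within its own column."""
    n = len(state)
    result = []
    for col in range(n):
        for row in range(n):
            if row != state[col]:
                result.append(state[:col] + (row,) + state[col + 1:])
    return result


solution = (0, 4, 7, 5, 2, 6, 1, 3)
print(attacking_pairs(solution))            # prints 0: no queens attack
print(attacking_pairs((0,) * 8))            # prints 28: all queens in one row
print(len(queen_successors(solution)))      # prints 56: 8 columns x 7 other rows
```

With one queen per column, each state has 8 × 7 = 56 successors, and the state space has 8^8 states.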

Hill-climbing search: 8-queens problem
Is this a solution? What is h?

Hill-climbing on 8-queens
Randomly generated 8-queens starting states…
– 14% of the time it solves the problem
– 86% of the time it gets stuck at a local minimum
However…
– Takes only 4 steps on average when it succeeds
– And 3 on average when it gets stuck
– (for a state space with 8^8 ≈ 17 million states)

Hill Climbing Drawbacks
– Local maxima
– Plateaus
– Diagonal ridges

Escaping Shoulders: Sideways Move
If there are no downhill (uphill) moves, allow sideways moves in the hope that the algorithm can escape
– Need to place a limit on the number of consecutive sideways moves to avoid infinite loops
For 8-queens
– Now allow sideways moves with a limit of 100
– Raises the percentage of problem instances solved from 14% to 94%
– However…
  21 steps on average for every successful solution
  64 for each failure
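A minimal sketch of hill climbing with a sideways-move budget is shown below (max version). The function name, the choice to reset the counter after each uphill move, and the five-cell toy landscape are all assumptions for illustration. Ties are broken at random; with deterministic tie-breaking the search can bounce between two plateau states until the budget runs out.

```python
import random


def hill_climbing_sideways(start, successors, value, max_sideways=100, rng=random):
    """Hill climbing that tolerates a limited run of equal-valued moves."""
    current = start
    sideways = 0                                   # consecutive sideways moves so far
    while True:
        neighbors = successors(current)
        if not neighbors:
            return current
        best_value = max(value(n) for n in neighbors)
        if best_value < value(current):
            return current                         # strict local maximum
        if best_value == value(current):
            if sideways >= max_sideways:
                return current                     # sideways budget exhausted
            sideways += 1
        else:
            sideways = 0                           # uphill move resets the count
        # Pick randomly among the best successors.
        current = rng.choice([n for n in neighbors if value(n) == best_value])


# Toy landscape (an assumption): a plateau (a "shoulder") before the peak.
LANDSCAPE = [1, 2, 2, 2, 3]


def successors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]


def value(i):
    return LANDSCAPE[i]


print(hill_climbing_sideways(0, successors, value, max_sideways=0))  # prints 1
print(hill_climbing_sideways(0, successors, value, max_sideways=100))
```

With no sideways moves the search stops at the edge of the plateau (index 1); with a budget of 100 it almost always crosses the plateau and reaches the peak at index 4.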

Tabu Search
Prevent returning quickly to the same state
– Keep a fixed-length queue (“tabu list”)
– Add the most recent state to the queue; drop the oldest
– Never make a step that is currently tabu
Properties:
– As the size of the tabu list grows, hill-climbing asymptotically becomes “non-redundant” (won’t look at the same state twice)
– In practice, a reasonably sized tabu list (say 100 or so) improves the performance of hill climbing on many problems
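The tabu list can be sketched with a bounded `collections.deque`, which drops its oldest entry automatically. The variant below always moves to the best non-tabu neighbor, even when that neighbor is worse than the current state, and remembers the best state seen so far. That acceptance rule, the step limit, and the toy landscape are assumptions for illustration; the slide specifies only the queue itself.

```python
from collections import deque


def tabu_search(start, successors, value, tabu_size=100, max_steps=1000):
    """Greedy local search that refuses to revisit recently seen states."""
    current = best = start
    tabu = deque([start], maxlen=tabu_size)    # fixed-length queue; oldest drops off
    for _ in range(max_steps):
        candidates = [n for n in successors(current) if n not in tabu]
        if not candidates:
            break                              # every neighbor is tabu
        current = max(candidates, key=value)   # best non-tabu neighbor, even if worse
        tabu.append(current)
        if value(current) > value(best):
            best = current
    return best


# Toy landscape (an assumption): index 1 is a local maximum, index 4 the global one.
LANDSCAPE = [0, 3, 1, 2, 5, 4]


def successors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]


def value(i):
    return LANDSCAPE[i]


print(tabu_search(0, successors, value))  # prints 4
```

Plain hill climbing from index 0 would stop at index 1; here the tabu list forbids stepping back, so the search walks through the dip and finds index 4.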

Escaping Shoulders/Local Optima: Enforced Hill Climbing
Perform breadth-first search from a local optimum
– to find the next state with a better h value
Typically,
– prolonged periods of exhaustive search
– bridged by relatively quick periods of hill-climbing
Middle ground between local and systematic search

Hill-climbing: stochastic variations
Stochastic hill-climbing
– Random selection among the uphill moves
– The selection probability can vary with the steepness of the uphill move
To avoid getting stuck in local minima
– Random-walk hill-climbing
– Random-restart hill-climbing
– Hill-climbing with both

Hill-climbing: stochastic variations
– When the state-space landscape has local minima, any search that moves only in the greedy direction cannot be complete
– Random walk, on the other hand, is asymptotically complete
Idea: put random walk into greedy hill-climbing

Hill-climbing with random restarts
If at first you don’t succeed, try, try again!
Different variations
– For each restart: run until termination vs. run for a fixed time
– Run a fixed number of restarts or run indefinitely
Analysis
– Say each search has probability p of success
  E.g., for 8-queens, p = 0.14 with no sideways moves
– Expected number of restarts?
– Expected number of steps taken?
If you want to pick one local search algorithm, learn this one!!
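One way to answer the two analysis questions is to treat the runs as independent trials with success probability p (an assumption). The number of runs until the first success is then geometric with mean 1/p, so on average there are (1 − p)/p failed runs before the successful one. The helper below is a hypothetical worked calculation that plugs in the 8-queens averages quoted earlier in the lecture (4 steps per success and 3 per failure without sideways moves; 21 and 64 with them).

```python
def expected_restart_cost(p, steps_success, steps_failure):
    """Expected number of runs and total steps for random-restart hill climbing.

    Assumes independent runs that each succeed with probability p.
    """
    expected_runs = 1.0 / p                      # mean of a geometric distribution
    expected_failures = (1.0 - p) / p            # failed runs before the success
    expected_steps = steps_success + expected_failures * steps_failure
    return expected_runs, expected_steps


# 8-queens without sideways moves: p = 0.14, 4 steps per success, 3 per failure.
runs, steps = expected_restart_cost(0.14, 4, 3)
print(f"{runs:.2f} runs, {steps:.1f} steps")

# 8-queens with up to 100 sideways moves: p = 0.94, 21 and 64 steps.
runs_sw, steps_sw = expected_restart_cost(0.94, 21, 64)
print(f"{runs_sw:.2f} runs, {steps_sw:.1f} steps")
```

Under these assumptions the plain version needs roughly 7 runs and about 22 steps in total, while the sideways-move version needs barely more than one run and about 25 steps.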

Hill-climbing with random walk
At each step do one of the two
– Greedy: with prob. p move to the neighbor with the largest value
– Random: with prob. 1 − p move to a random neighbor
Hill-climbing with both
At each step do one of the three
– Greedy: move to the neighbor with the largest value
– Random walk: move to a random neighbor
– Random restart: resample a new current state

Simulated Annealing
Simulated annealing = physics-inspired twist on random walk
Basic ideas:
– like hill-climbing, identify the quality of the local improvements
– instead of picking the best move, pick one randomly
– say the change in the objective function is Δ
– if Δ is positive, then move to that state
– otherwise: move to this state with a probability that shrinks as Δ becomes more negative (e^(Δ/T) for temperature T)
  thus: worse moves (very large negative Δ) are executed less often
– however, there is always a chance of escaping from local maxima
– over time, make it less likely to accept locally bad moves
– (can also make the size of the move random, i.e., allow “large” steps in state space)

Physical Interpretation of Simulated Annealing
A physical analogy:
– imagine letting a ball roll downhill on the function surface
  this is like hill-climbing (for minimization)
– now imagine shaking the surface while the ball rolls, gradually reducing the amount of shaking
  this is like simulated annealing
Annealing = physical process of cooling a liquid or metal until particles achieve a certain frozen crystal state
Simulated annealing:
– free variables are like particles
– seek a “low energy” (high quality) configuration
– slowly reduce the temperature T, with particles moving around randomly

Simulated annealing
function SIMULATED-ANNEALING(problem, schedule) returns a solution state
  input: problem, a problem
         schedule, a mapping from time to temperature
  local variables: current, a node
                   next, a node
                   T, a “temperature” controlling the probability of downward steps
  current ← MAKE-NODE(INITIAL-STATE[problem])
  for t ← 1 to ∞ do
    T ← schedule[t]
    if T = 0 then return current
    next ← a randomly selected successor of current
    ΔE ← VALUE[next] − VALUE[current]
    if ΔE > 0 then current ← next
    else current ← next only with probability e^(ΔE/T)
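The SIMULATED-ANNEALING pseudocode maps onto Python fairly directly. The sketch below is illustrative only: the geometric cooling schedule with a cutoff, the toy objective, and all function names are assumptions, since the slides leave the schedule unspecified.

```python
import math
import random


def simulated_annealing(start, successors, value, schedule, rng=random):
    """Maximize `value`, accepting worse moves with probability e^(dE/T)."""
    current = start
    t = 1
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = rng.choice(successors(current))       # a randomly selected successor
        delta_e = value(nxt) - value(current)
        # Uphill moves are always taken; others only with probability e^(dE/T).
        if delta_e > 0 or rng.random() < math.exp(delta_e / T):
            current = nxt
        t += 1


def schedule(t):
    """Hypothetical schedule: geometric cooling, treated as 0 once T is tiny."""
    T = 5.0 * 0.99 ** t
    return T if T > 1e-3 else 0.0


# Toy objective (an assumption): maximize -(x - 3)^2 over the integers.
def value(x):
    return -(x - 3) ** 2


def successors(x):
    return [x - 1, x + 1]


result = simulated_annealing(20, successors, value, schedule, random.Random(0))
print(result)
```

Early on, while T is large, the search wanders freely around the peak; as T shrinks the acceptance probability for worse moves collapses and the run behaves like stochastic hill climbing, so it ends at or right next to x = 3.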

Temperature T
– high T: probability of a “locally bad” move is higher
– low T: probability of a “locally bad” move is lower
– typically, T is decreased as the algorithm runs longer, i.e., there is a “temperature schedule”

Simulated Annealing in Practice
– method proposed in 1983 by IBM researchers for solving VLSI layout problems (Kirkpatrick et al., Science, 220, 1983)
  in theory, it finds the global optimum with probability approaching 1 if T is lowered slowly enough
– other applications: traveling salesman, graph partitioning, graph coloring, scheduling, facility layout, image processing, …
– useful for some problems, but can be very slow
  slowness comes about because T must be decreased very gradually to retain optimality

Local beam search
Idea: keeping only one node in memory is an extreme reaction to memory problems
Keep track of k states instead of one
– Initially: k randomly selected states
– Next: determine all successors of the k states
– If any of the successors is a goal → finished
– Else select the k best from the successors and repeat
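The four steps of local beam search can be sketched as follows. The function signature, the iteration cap, and the toy problem (finding the integer 42 by unit steps) are assumptions for illustration; states must be hashable because successors are pooled in a set.

```python
import random


def local_beam_search(k, random_state, successors, value, is_goal, max_iters=1000):
    """Keep the k best states drawn from the pooled successors of the beam."""
    beam = [random_state() for _ in range(k)]          # k randomly selected states
    for _ in range(max_iters):
        # Pool the successors of all k states (duplicates collapse in the set).
        pool = {s for state in beam for s in successors(state)}
        if not pool:
            break
        for s in pool:
            if is_goal(s):
                return s                               # any goal successor ends the search
        beam = sorted(pool, key=value, reverse=True)[:k]   # keep the k best overall
    return max(beam, key=value)


# Toy problem (an assumption): reach TARGET by stepping +1 or -1.
rng = random.Random(0)
TARGET = 42


def random_state():
    return rng.randrange(100)


def successors(x):
    return [x - 1, x + 1]


def value(x):
    return -abs(x - TARGET)


def is_goal(x):
    return x == TARGET


print(local_beam_search(3, random_state, successors, value, is_goal))  # prints 42
```

Because the k survivors are chosen from the combined pool rather than per parent, a start state far from the goal quickly loses its place to successors of better states.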

Local Beam Search (contd.)
Not the same as k random-start searches run in parallel!
– Searches that find good states recruit other searches to join them
Problem: quite often, all k states end up on the same local hill
Idea: stochastic beam search
– Choose k successors randomly, biased towards good ones
Observe the close analogy to natural selection!

Hey! Perhaps sex can improve search?

Sure! Check out ye book.

Genetic algorithms
Twist on local search: a successor is generated by combining two parent states
A state is represented as a string over a finite alphabet (e.g., binary)
– 8-queens: state = positions of the 8 queens, one per column
Start with k randomly generated states (the population)
Evaluation function (fitness function):
– Higher values for better states
– Opposite to the heuristic function, e.g., number of non-attacking pairs in 8-queens
Produce the next generation of states by “simulated evolution”
– Random selection
– Crossover
– Random mutation

String representation
Can we evolve 8-queens through genetic algorithms?

Evolving 8-queens?
Sorry! Wrong queens

Genetic algorithms
Fitness function: number of non-attacking pairs of queens (min = 0, max = 8 × 7/2 = 28)
Selection probability = fitness / (sum of the population’s fitness values), e.g., 24/(sum) = 31%, 23/(sum) = 29%, etc.
Steps shown for the 8-queens problem:
– 4 states in the population
– 2 pairs of 2 states randomly selected based on fitness
– Random crossover points selected
– New states after crossover
– Random mutation applied
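The selection, crossover, and mutation steps can be sketched for 8-queens as below. The tuple encoding (one row index per column), the mutation rate, the population size, and every function name are assumptions for illustration; only the fitness function and the three-step structure come from the slides.

```python
import random
from itertools import combinations

N = 8
MAX_FITNESS = N * (N - 1) // 2          # 28 non-attacking pairs at best


def fitness(state):
    """Number of non-attacking pairs; state[c] is the row of column c's queen."""
    attacking = sum(
        1
        for (c1, r1), (c2, r2) in combinations(enumerate(state), 2)
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2)
    )
    return MAX_FITNESS - attacking


def crossover(x, y, rng):
    """Single-point crossover: prefix of x joined to suffix of y."""
    point = rng.randrange(1, N)
    return x[:point] + y[point:]


def mutate(state, rng, rate=0.1):
    """With probability `rate`, move one randomly chosen queen to a random row."""
    if rng.random() < rate:
        col = rng.randrange(N)
        state = state[:col] + (rng.randrange(N),) + state[col + 1:]
    return state


def next_generation(population, rng):
    """Fitness-proportionate selection, then crossover and mutation."""
    weights = [fitness(s) for s in population]
    if not any(weights):
        weights = None                   # all-zero fitness: fall back to uniform choice
    children = []
    for _ in range(len(population)):
        x, y = rng.choices(population, weights=weights, k=2)
        children.append(mutate(crossover(x, y, rng), rng))
    return children


def genetic_algorithm(k=20, generations=1000, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.randrange(N) for _ in range(N)) for _ in range(k)]
    for _ in range(generations):
        best = max(population, key=fitness)
        if fitness(best) == MAX_FITNESS:
            return best                  # a full solution
        population = next_generation(population, rng)
    return max(population, key=fitness)


best = genetic_algorithm(k=20, generations=200, seed=0)
print(best, fitness(best))
```

The run is not guaranteed to reach fitness 28 within the generation limit; it returns the fittest individual found in the final population.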

Genetic algorithms
Has the effect of “jumping” to a completely different new part of the search space (quite non-local)

Comments on Genetic Algorithms
A genetic algorithm is a variant of “stochastic beam search”
Positive points
– Random exploration can find solutions that local search can’t (primarily via crossover)
– Appealing connection to human evolution
  “neural” networks and “genetic” algorithms are metaphors!
Negative points
– Large number of “tunable” parameters
  difficult to replicate performance from one problem to another
– Lack of good empirical studies comparing to simpler methods
– Useful on some (small?) set of problems, but no convincing evidence that GAs are better than hill-climbing with random restarts in general

Optimization of Continuous Functions
Discretization
– use hill-climbing
Gradient descent
– make a move along the gradient (against it when minimizing)
– gradients: closed form or empirical


Gradient Descent
Assume we have a continuous function $f(x_1, x_2, \ldots, x_N)$ that we want to minimize over the continuous variables $x_1, x_2, \ldots, x_N$
1. Compute the gradients for all $i$: $\partial f(x_1, x_2, \ldots, x_N) / \partial x_i$
2. Take a small step downhill, against the direction of the gradient: $x_i \leftarrow x_i - \lambda \, \partial f(x_1, x_2, \ldots, x_N) / \partial x_i$
3. Repeat.
How to select $\lambda$
– Line search: successively double $\lambda$
– until $f$ starts to increase again
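The three steps, together with the doubling line search, can be sketched in plain Python. The central-difference gradient (one possible "empirical" gradient), the starting step size, the cap on doublings, and the quadratic test function are assumptions chosen for the example.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Empirical gradient of f at x via central differences."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def doubling_line_search(f, x, grad, lam=1e-3, max_doublings=40):
    """Double the step size until f stops decreasing along -grad."""
    def f_at(step):
        return f([xi - step * gi for xi, gi in zip(x, grad)])

    for _ in range(max_doublings):
        if f_at(2 * lam) >= f_at(lam):
            break                        # doubling again would not help
        lam *= 2
    return lam


def gradient_descent(f, x0, iters=100):
    x = list(x0)
    for _ in range(iters):
        grad = numerical_gradient(f, x)               # step 1: compute gradients
        lam = doubling_line_search(f, x, grad)        # choose the step size
        x = [xi - lam * gi for xi, gi in zip(x, grad)]  # step 2: move downhill
    return x                                          # step 3 is the loop itself


# Test function (an assumption): a quadratic bowl with its minimum at (1, -3).
def bowl(x):
    return (x[0] - 1) ** 2 + 2 * (x[1] + 3) ** 2


print(gradient_descent(bowl, [0.0, 0.0]))  # close to [1, -3]
```

A fixed small step would also work on this function; the line search mainly saves iterations when the initial step size is far too cautious.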

Newton-Raphson applied to function minimization
Newton-Raphson method: finds roots of a function (e.g., a polynomial)
– To find the roots of $g(x)$, start with $x$ and iterate $x \leftarrow x - g(x)/g'(x)$
– To minimize a function $f(x)$, we need to find the roots of the equation $f'(x) = 0$: $x \leftarrow x - f'(x)/f''(x)$
If $x$ is a vector then
– $x \leftarrow x - H_f(x)^{-1} \nabla f(x)$
Equivalent to fitting a quadratic function to $f$ in the local neighborhood of $x$
The Hessian $H_f(x)$ is costly to compute ($n^2$ second-derivative entries for an $n$-dimensional vector) → approximations
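For the one-dimensional case, the Newton-Raphson update for minimization can be sketched as below. The caller supplies the first and second derivatives in closed form; the function name, the stopping tolerance, and the two test functions are assumptions for illustration. The vector case would replace the division with a solve against the Hessian, which this sketch does not attempt.

```python
def newton_minimize(df, d2f, x, iters=20, tol=1e-12):
    """Find a stationary point of f by applying Newton-Raphson to f'(x) = 0.

    `df` and `d2f` are the first and second derivatives of f.
    """
    for _ in range(iters):
        step = df(x) / d2f(x)      # x <- x - f'(x) / f''(x)
        x -= step
        if abs(step) < tol:
            break
    return x


# f(x) = (x - 2)^2 + 1 is exactly quadratic, so one Newton step lands on the minimum.
print(newton_minimize(lambda x: 2 * (x - 2), lambda x: 2.0, 10.0))  # prints 2.0

# f(x) = x - ln(x) has f'(x) = 1 - 1/x and f''(x) = 1/x^2, with its minimum at x = 1.
print(newton_minimize(lambda x: 1 - 1 / x, lambda x: 1 / x ** 2, 0.5))
```

On the second function the error squares at every step (0.5, 0.25, 0.0625, …), which is the quadratic convergence that makes the method attractive when second derivatives are affordable.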