1 Greedy Algorithms Bioinformatics Algorithms © Jeff Parker, 2009 "Why should I care about posterity? What's posterity ever done for me?" - Groucho Marx

2 Question from Last Week
What are we supposed to take away from the discussion of Motif Finding? Objectives:
Understand the Motif Problem
Understand Exhaustive Search
Understand that some exhaustive searches are less exhausting than others
Understand Branch and Bound

3 Outline Objectives:
Understand why we cannot use backtracking
Understand what a Greedy Algorithm is
Understand the use of metrics
Understand alternatives – BFS, Best First, and A* search
Understand the biological problem
Understand some algorithms for sorting by reversals
What is a greedy algorithm?
Understand why aiming for local optimization can distort things
Alternatives to Greed

4 Backtracking's Limits
Backtracking can solve problems like the knight's tour, but it is not well suited for the 15 puzzle: there is no way to characterize a position as a dead end.
However, some positions are more promising than others.
The problem of sorting also has no dead ends.

5 Definition of Greedy Algorithm
Many algorithms make a sequence of choices among alternatives:
Sorting – which pair should we exchange?
Traveling Salesman – which city should we visit next?
Greedy algorithms make a locally optimal choice in the hope that it will lead to a globally optimal solution. That is, they look ahead one move.
Sometimes a greedy algorithm is optimal. Often it is not.
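As a concrete illustration (my own example, not from the slides), here is a minimal sketch of a greedy choice: coin change that always takes the largest coin that still fits. With US-style denominations the greedy answer happens to be optimal; with other coin systems it is not.

# Hypothetical illustration of a greedy choice (not from the deck).
def greedy_change(amount, coins=(25, 10, 5, 1)):
    """Return a list of coins chosen greedily for the given amount."""
    picked = []
    for c in sorted(coins, reverse=True):   # locally best choice first
        while amount >= c:
            picked.append(c)
            amount -= c
    return picked

# greedy_change(63) -> [25, 25, 10, 1, 1, 1], which is optimal.
# With coins (4, 3, 1) and amount 6, greedy gives [4, 1, 1] even though
# [3, 3] uses fewer coins: a locally optimal choice need not be globally optimal.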

6 Decision Trees
We may have a sequence of decisions to make:
Should I file the short form or the long form?
If I file the long form, should I fill out Schedule C?
Note that the tree below is not a data structure: it is an expansion of the logic of an algorithm.

7 Example
Consider a linear search. We can view this as a sequence of decisions.
In searching an array of N items for item x, there are 2N+1 outcomes: the item should be
Before the first
Same value as the first
After the first, but before the second
and so on

8 Linear Search
Linear_search ( int target, int list [] )
{
    int pos;
    // Scan until we pass the place target belongs
    // (the comparison assumes the list is kept in descending order)
    for (pos = 0; pos < MAX; pos++)
    {
        if ( target >= list[pos] ) break;
    }
    // Sort things out…
    if (pos < MAX)
    {
        if (target == list[pos])
            update list[pos];          // found it
        else
            insert before list[pos];
    }
    else  // pos == MAX: target comes after every entry
        insert at end;
}

9 High Level View
This gives a scrawny tree. The algorithm is greedy.
How many comparisons are needed to reach each possible outcome? Summing them gives 45.
We are looking at the sum of the path lengths: it gives a measure of the average complexity. There are two forms, internal and external path length, which are closely related.
Greedy algorithms work in phases. In each phase, a decision is made that appears to be good, without regard for future consequences.

10 Look at Binary Search
Two possible versions of main loop

/* Binary 1 - Forgetful Binary Search */
while ( top > bottom )
{
    middle = ( top + bottom ) / 2;
    if ( list[middle] < target )
        bottom = middle + 1;
    else
        top = middle;
}
// then sort things out…

/* Version 2 - Careful Binary Search - check middle entry */
while ( top > bottom )
{
    middle = ( top + bottom ) / 2;
    if ( list[middle] == target )
        break;
    if ( list[middle] < target )
        bottom = middle + 1;
    else    /* Cut down search space */
        top = middle - 1;
}
// Sort things out…

11 Which is better?
Either version is better than linear search for more than a few items.
To compare them, look at the average path length from the root to a decision. We sum up the lengths of all paths again: how many decisions does it take to reach every outcome?
First look at Forgetful search: its path lengths sum to 39.

12 Careful Binary Search
Careful search has fewer recursive calls, but each call does twice as much work. Its path lengths sum to 49.
(For larger sets, it does better than linear search.)

13 Balloon Dog Theorem
When you put one outcome close to the root, you may push others further away.

14 Problem
Our problem today is to find the smallest number of operations that will lead us from one sequence to another.
The basic operation is reversing a section of the sequence.

15 Representation
We will represent the sequences with signed integers.
One sequence (the turnip below) is presented in order: it is the goal.

16 Representation
We will represent the sequence with signed integers.
We can represent a sequence of signed integers as a sequence of unsigned integers.
Bracket the numbers with initial and final values that do not change.
We may compress an increasing or a decreasing run.
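One possible encoding, sketched as an assumption on my part (the slide does not spell out the scheme): replace each signed element +x with the pair 2x-1, 2x and each -x with the pair 2x, 2x-1, then bracket the result with 0 and 2n+1, which never move.

# A hypothetical encoding sketch; the exact scheme is an assumption.
def encode_signed(perm):
    """Turn a signed permutation into a bracketed unsigned one."""
    out = [0]                                # leading bracket never moves
    for x in perm:
        if x > 0:
            out.extend([2 * x - 1, 2 * x])   # +x -> 2x-1, 2x
        else:
            out.extend([-2 * x, -2 * x - 1]) # -x -> 2x, 2x-1
    out.append(2 * len(perm) + 1)            # trailing bracket never moves
    return out

# encode_signed([+1, -3, +2]) -> [0, 1, 2, 6, 5, 3, 4, 7]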

17 Search
At each step, we can reverse a subsequence. The problem is to minimize the number of steps.
Exhaustive search would follow all possible outcomes. This is simplest to organize as a BFS of the space.
This is called uninformed search: we pay no attention to the contents.
To compare positions, we need a metric.

18 Breadth First Search
He leapt onto his horse and galloped off madly in all directions. - Stephen Leacock
Systematically search each alternative:
Look at all boards one move away
Look at all boards two moves away
To implement BFS, use a queue:
Take the next board from the queue
Look at all boards one move away
Toss duplicates; insert the rest in the queue
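A minimal sketch of this uninformed search in Python (my own code, not the deck's): boards are kept as tuples so duplicates can be tossed with a set, and a plain FIFO queue drives the search.

from collections import deque

def bfs_reversal_sort(start, goal):
    """Return the fewest reversals needed to turn start into goal."""
    start, goal = tuple(start), tuple(goal)
    queue = deque([(start, 0)])
    seen = {start}                              # toss duplicates
    while queue:
        state, depth = queue.popleft()          # next board from the queue
        if state == goal:
            return depth
        n = len(state)
        for i in range(1, n - 1):               # keep the two brackets fixed
            for j in range(i + 1, n):
                nxt = state[:i] + state[i:j][::-1] + state[j:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return None

# bfs_reversal_sort([0, 3, 2, 1, 4], [0, 1, 2, 3, 4]) -> 1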

19 Metrics
BFS takes too long for many problems. How do we decide which move is better?
To measure progress, we use metrics:
Traveling Salesman – cost of the tour
Bubble Sort – the length of the sorted subarray
15 puzzle – number of tiles that are home
A greedy algorithm uses a metric to choose its next move.

20 Informed Search
These searches are "informed" by a measure of how close a position is to a solution.
In so-called "Hill Climbing", we follow the most promising path we can. While this can quickly lead to a better position, it often leaves us at a local max (min).
The hill we climb is not always the highest: we cannot always continue to increase (reduce) the metric.

21 Our Greedy Strategy
Our strategy will be a form of Depth First Search: at each stage, we will select the most promising next step.
Since there are no dead-ends, there is always hope.
We need a metric. How do we decide which permutation is close to solved?

22 Metric 1
First metric: the length of the run of items already in order.
We can increase this at each step.
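A small sketch of this metric (my reading of the slide): count how long a prefix of the permutation is already in its final place.

# Hypothetical helper for Metric 1 (my naming, not the deck's code).
def sortedPrefixLength(ar):
    """Length of the leading run ar[0], ar[1], ... that equals 0, 1, ..."""
    count = 0
    for i in range(len(ar)):
        if ar[i] != i:
            break
        count += 1
    return count

# sortedPrefixLength([0, 1, 2, 5, 4, 3, 6]) -> 3
# Reversing ar[3:6] turns the list into [0, 1, 2, 3, 4, 5, 6], raising the metric to 7.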

23 Metric 2
Look at breakpoints – places where abs(a[i] - a[i+1]) != 1

24 Finding Breakpoints
# Look at breakpoints - places where abs(a[i-1] - a[i]) != 1
def findBP(ar):
    """Find the breakpoints in a string of integers"""
    lst = []
    for i in xrange(1, len(ar)):
        if (abs(ar[i-1] - ar[i]) > 1):
            lst.append(i)
    return lst
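As a usage example, applying findBP to the permutation from the Output slide further on reproduces the eight breakpoints listed there:

lst = [0, 6, 1, 2, 5, 4, 9, 7, 8, 10, 3, 11]
bp = findBP(lst)
print(bp)        # [1, 2, 4, 6, 7, 9, 10, 11]
print(len(bp))   # 8 breakpoints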

25 Reversing String Segment
def reverseSegment(ar, strt, end):
    """Take an array ar, and reverse the segment ar[strt:end]"""
    sublst = ar[strt:end]
    # Now reverse the sublist using the Python slicing idiom
    sublst = sublst[::-1]
    # Print the list as the three components
    print ar[:strt], sublst, ar[end:],
    # We print the number of breakpoints in the caller
    return ar[:strt] + sublst + ar[end:]
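As a usage example (matching the first candidate on the Output slide), reversing between the first and third breakpoints lowers the breakpoint count from 8 to 7:

lst = [0, 6, 1, 2, 5, 4, 9, 7, 8, 10, 3, 11]
cand = reverseSegment(lst, 1, 4)    # prints [0] [2, 1, 6] [5, 4, 9, 7, 8, 10, 3, 11]
print(len(findBP(cand)))            # 7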

26 Sorting
def sortPermutation(ar):
    """Greedy algorithm to sort a permutation."""
    bp = findBP(ar)
    while (len(bp) > 0):
        bpLen = len(bp)
        bpMin = len(ar)
        minAr = []
        # Look at all possible reversals

27 All possible Reversals
        for i in xrange(bpLen):
            for j in xrange(i+1, bpLen):
                if (bp[i] < bp[j] - 1):
                    cand = reverseSegment(ar, bp[i], bp[j])
                    candBP = findBP(cand)
                    candBPLen = len(candBP)
                    if (candBPLen < bpMin):
                        bpMin = candBPLen
                        minAr = cand
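The listing stops here on the slides. A hedged guess at the loop tail (my sketch, not the original code): commit to the best candidate, recompute its breakpoints, and, simplifying the issue raised on the Termination slide, stop if no reversal lowered the count.

        # Hypothetical loop tail, continuing sortPermutation (my sketch).
        if bpMin >= bpLen:
            break                   # stuck: no reversal lowered the count
        ar = minAr
        bp = findBP(ar)
    return ar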

28 Output
List [0, 6, 1, 2, 5, 4, 9, 7, 8, 10, 3, 11]
BPs [1, 2, 4, 6, 7, 9, 10, 11] 8
[0] [2, 1, 6] [5, 4, 9, 7, 8, 10, 3, 11] w/ breakpoint count 7
[0] [4, 5, 2, 1, 6] [9, 7, 8, 10, 3, 11] w/ breakpoint count 8
…
[0, 6] [2, 1] [5, 4, 9, 7, 8, 10, 3, 11] w/ breakpoint count 8
[0, 6] [4, 5, 2, 1] [9, 7, 8, 10, 3, 11] w/ breakpoint count 8
…
[0, 6, 1, 2, 5, 4] [3, 10, 8, 7, 9] [11] w/ breakpoint count 7
[0, 6, 1, 2, 5, 4, 9] [8, 7] [10, 3, 11] w/ breakpoint count 7

29 Termination
It is not always possible to find a move that lowers the breakpoint count.
(We always have some move that leaves it fixed.)
How do we know that this will terminate?

30 Informed Search: Best First
Keep a table of positions we have already seen
Insert starting position in PQueue and table
While PQueue is not empty
    Select position from the PQueue
    While there are moves from here
        Generate next position
        If position is not in table
            Insert in the PQueue
We investigate multiple strands at the same time.
Like breadth first search, but informed by our notion of closeness: by placing the positions in a priority queue, we look at the most promising positions first.
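A compact Python sketch of the loop above (my own code; h and moves are placeholder names): a heap serves as the priority queue, and a set serves as the table of seen positions.

import heapq

def best_first(start, goal, h, moves):
    """Best First search: always expand the position with the smallest h."""
    start, goal = tuple(start), tuple(goal)
    table = {start}                       # positions we have already seen
    pq = [(h(start), start)]
    while pq:
        _, state = heapq.heappop(pq)      # most promising position first
        if state == goal:
            return state
        for nxt in moves(state):          # all positions one move away
            nxt = tuple(nxt)
            if nxt not in table:
                table.add(nxt)
                heapq.heappush(pq, (h(nxt), nxt))
    return None

For the reversal problem, moves(state) would generate every reversal that keeps the brackets fixed, and h(state) could be len(findBP(list(state))).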

31 Best First algorithm in action
Take the best position from the priority queue
Look at all boards one move away
For each position, check to see if we have seen it before
If not, insert it in the priority queue
Rank boards by their distance from a solution. We don't know how far it really is: we use our estimate h*(b).

32 Best First in action
Start with the center position (1)
Generate all outcomes: discard boards we have seen before
Place remaining outcomes in the priority queue
We select one of the cheapest (2)
Generate outcomes: toss duplicates
Select the new cheapest (3)
Note that 1, 2, 3 do not form a legal sequence of moves

33 A* Search
BFS finds the minimal solution, but it takes a long time.
Best First uses the function h*(b). It is faster, but its solution may not be the best.
A* is an informed search that will find an optimal solution.
One way to improve things is to improve h*(b). That is often difficult.
Instead, define a new priority function f*, where f*(b) = g*(b) + h*(b), and g*(b) is the best estimate of the number of steps required to reach this position.
Breadth First Search amounts to f*(b) = g*(b), and Best First Search amounts to f*(b) = h*(b).
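A sketch of the same loop reworked for A* (again my own code, with placeholder h and moves): the heap is ordered by f = g + h, and a board reached again by a shorter path is requeued at its better priority, which is the situation the next slide discusses.

import heapq

def a_star(start, goal, h, moves):
    """A* search: order the queue by f = g + h, where g counts steps so far."""
    start, goal = tuple(start), tuple(goal)
    best_g = {start: 0}
    pq = [(h(start), 0, start)]                    # entries are (f, g, state)
    while pq:
        f, g, state = heapq.heappop(pq)
        if state == goal:
            return g                               # optimal number of steps
        if g > best_g.get(state, g):
            continue                               # stale copy left in the PQ
        for nxt in moves(state):
            nxt = tuple(nxt)
            if g + 1 < best_g.get(nxt, float('inf')):
                best_g[nxt] = g + 1                # found a better path
                heapq.heappush(pq, (g + 1 + h(nxt), g + 1, nxt))
    return None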

34 A* Search
Use our new priority function f*, where f*(b) = g*(b) + h*(b), and g*(b) is the best estimate of the number of steps required to reach this position.
Why do we need to estimate g*(b)? Don't we know how long it took?
Shortcuts: you may find you can reach a position that took you 20 steps through another path that only takes 16 steps.
When you find a better path, update the stored board to point to the new, better solution. (Though this will happen with A* search, it will never happen with BFS. Why?)
Requeue the board at the new priority.
Not all implementations of a PQ have an easy way to update costs, but it turns out there is no harm done if you have multiple copies of a board in the PQ.