Slides for Chapter 4 Note to Instructors License © 2012 John S. Conery The slides in this Keynote document are based on copyrighted material from Explorations.

Slides for Chapter 4 Note to Instructors License © 2012 John S. Conery The slides in this Keynote document are based on copyrighted material from Explorations in Computing: An Introduction to Computer Science, by John S. Conery. These slides are provided free of charge to instructors who are using the textbook for their courses. Instructors may alter the slides for use in their own courses, including but not limited to: adding new slides, altering the wording or images found on these slides, or deleting slides. Instructors may distribute printed copies of the slides, either in hard copy or as electronic copies in PDF form, provided the copyright notice below is reproduced on the first slide. This Keynote document contains the slides for “A Journey of a Thousand Miles”, Chapter 4 of Explorations in Computing: An Introduction to Computer Science. The book invites students to explore ideas in computer science through interactive tutorials where they type expressions in Ruby and immediately see the results, either in a terminal window or a 2D graphics window. Instructors are strongly encouraged to have a Ruby session running concurrently with Keynote in order to give live demonstrations of the Ruby code shown on the slides.

Explorations in Computing © 2012 John S. Conery Iteration as a strategy for solving computational problems A Journey of a Thousand Miles ✦ Searching and Sorting ✦ The Linear Search Algorithm ✦ The Insertion Sort Algorithm ✦ Scalability ✦ Best Case, Worst Case

About the Title ✦ The chapter title is part of a quote from Tao Te Ching, by Lao-Tzu: “A journey of a thousand miles begins with a single step.”

Search ✦ Searching is a common operation in many different applications ❖ in iTunes, enter a string in the “search box” ❖ the program shows tunes that have that string in the song title, album title, artist name, or other fields

Search (cont’d) ✦ Some other examples: ❖ on-line dictionaries and catalogs ❖ the “find” command in a word processor or text editor (screenshots: Dictionary.app on Mac OS X systems; Find command in TextMate)

Defining Search ✦ The examples on the previous slides were similar to searches in everyday life ❖ look for a book, either on a bookshelf at home or in a library ❖ find a name in a phone book or a word in a dictionary ❖ search a file drawer to find customer information or student records ✦ What these problems have in common: ❖ we have a large collection of items ❖ we need to search the collection to find a single item that matches a certain condition (e.g. name of book, name of a person)

Other Kinds of Searches ✦ Not all searches fit this “paradigm” of looking for a specified item ✦ Perform a search using different information ❖ files may be organized by student name, but we need to search by file contents, e.g. find students whose advisor is Prof. X ✦ Search for items matching a general description ❖ select all customer records where the customer’s name starts with “A” ❖ find all customers with overdue accounts ✦ “I’ll know it when I see it” searches ❖ look for a good book to read (bookcase, bookstore, library,...) ❖ scan a menu at a restaurant

Search Algorithms ✦ As in real life, there are many variations and many different types of search algorithms ❖ search a list to see if it contains a specified number ❖ search a list to find the largest number ❖ find all numbers in a list that start with 541 ❖ find strings of digits that match a pattern (e.g. xxx-xx-xxxx) ❖ search an abstract “space” of solutions (e.g. best move in chess or shortest tour) ❖ machines can be taught to scan images for interesting items (no exact representation of the item to search for) ❖ “I’ll know it when I see it” -- “data mining” for interesting or unusual patterns (image source: solarsystem.nasa.gov)
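The first few variations in this list can be tried directly with Ruby's built-in array and string methods. The sketch below is only an illustration: the sample data and variable names are made up for this example and do not come from the textbook or the RubyLabs module.

```ruby
# Made-up sample data for illustration only.
numbers = [5413, 17, 541, 9020, 54100, 88]
ids     = ["123-45-6789", "hello", "987-65-4321", "12-345-6789"]

# Does the list contain a specified number?
has_17 = numbers.include?(17)

# What is the largest number in the list?
largest = numbers.max

# Which numbers start with the digits 541?
starts_541 = numbers.select { |n| n.to_s.start_with?("541") }

# Which strings match the digit pattern xxx-xx-xxxx?
pattern  = /\A\d{3}-\d{2}-\d{4}\z/
matching = ids.select { |s| s.match?(pattern) }

p has_17       # true
p largest      # 54100
p starts_541   # [5413, 541, 54100]
p matching     # ["123-45-6789", "987-65-4321"]
```

Each of these calls still has to look at the items one after another, which is the idea the rest of the chapter develops by hand.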

Sorting ✦ Sorting -- reorganizing information so it’s in a particular order -- is closely related to searching Click a column name to sort by that field

Iterative Algorithms ✦ The goal for this chapter: study two new algorithms based on iteration ✦ Build on the main idea from Chapter 3: ❖ repeating (iterating) a series of small steps can lead to the solution of important problems ✦ Search algorithm: linear search ✦ Sort algorithm: insertion sort ✦ The next chapter looks at more sophisticated techniques for both searching and sorting

Linear Search ✦ Linear search is the simplest, most straightforward search strategy ✦ As the name implies, the idea is to start at the beginning of a collection and compare items one after another ✦ Some terminology: ❖ the item we are looking for is known as the key ❖ this type of search is also sometimes called a scan ❖ if the key is not found the search fails

Linear Search in Ruby ✦ In Chapter 3 we saw that Ruby uses arrays to hold collections of data >> a = [8, 0, 9, 2, 7, 5, 3] => [8, 0, 9, 2, 7, 5, 3] >> a.class => Array ✦ The include? method does a linear search to see if an item is in an array >> a.include?(7) => true >> a.include?(4) => false

Array Indices ✦ In many situations we want to know where an item was found ✦ Instead of simply returning “true” or “false” we want a method to say something like “the thing you are looking for is in the second location in the array” ❖ an array location is known as an address or an index ✦ Computer scientists start labeling at 0 ❖ if an array has n items the addresses are 0 to n - 1

Index Expressions ✦ We can access any item in an array using an index expression ❖ if a is an array the expression a[i] means “the item at location i in the array” ❖ pronounced “a sub i”, from the mathematical notation $a_i$ >> a = ["apple", "lime", "kiwi", "orange", "ugli"] >> a[3] => "orange" >> a[3] = "tangerine" => "tangerine" >> a => ["apple", "lime", "kiwi", "tangerine", "ugli"] >> a[5] => nil (index labels shown in the figure: 0 1 2 3 4 5)

Index Expressions (cont’d) ✦ An index expression gives us an alternative syntax for accessing the items at the front or end of an array >> a = ["apple", "lime", "kiwi", "orange", "ugli"] >> a.first => "apple" >> a[0] => "apple" >> a.last => "ugli" >> a[-1] => "ugli" >> a[a.length-1] => "ugli"

The index Method ✦ If a is an array object, we can call a method named index to do a search and return the location of the item >> a => ["apple", "lime", "kiwi", "orange", "ugli"] >> a.index("kiwi") => 2 >> a.index("banana") => nil ✦ Other types of objects have index methods, too: >> s = "supercalifragilisticexpialidocious" >> s.index("list") => 14

A Project ✦ To investigate the linear search algorithm, we’ll write our own versions of the include? and index methods contains?(a,x) ❖ return true if array a contains item x ❖ return false if x is not in a search(a,x) ❖ return the location of item x if x is in a ❖ return nil if x is not in a

Conditional Expressions ✦ A useful construct in Ruby is a conditional expression ✦ One way to write a conditional expression is to attach a modifier to a statement ❖ the modifier consists of the keyword if followed by a Boolean expression ✦ There are many ways to write conditions -- we’ll see others later >> a.each { |x| puts x } 8 0 9 2 7 5 3 => [8, 0, 9, 2, 7, 5, 3] >> a.each { |x| puts x if x % 2 == 0 } 8 0 2 => [8, 0, 9, 2, 7, 5, 3]

Writing the contains? Method ✦ To implement our contains? method we can use each to iterate over the array ❖ each will visit every item, in order ❖ put a return statement in the block executed by each ❖ attach a modifier so the return is executed as soon as the item is found def contains?(a,k) a.each { |x| return true if x == k } return false end ✦ Notes on the code: ❖ a is the array to search and k is the key to look for ❖ return true breaks out of the loop early, if the key matches an item in the array ❖ return false is executed only if each gets through the entire array without finding k

contains? (cont’d) ✦ Let’s try it out on an array of numbers >> include IterationLab => Object >> a = [8, 0, 9, 2, 7, 5, 3] => [8, 0, 9, 2, 7, 5, 3] >> contains?(a, 7) => true >> contains?(a, 4) => false

Writing the search Method ✦ We want our search method to return the location of the match ✦ One way to write search: use an iterator named each_with_index ✦ For this project, though, we’ll write it using a while loop ❖ easier to attach a probe to count steps ❖ reuse the outline later, for insertion sort ✦ Optional project: ❖ get a copy of the source for contains? ❖ save it in a file named search.rb ❖ learn about each_with_index, use it to implement search >> Source.checkout("contains?", "search.rb")Saved a copy of source in search.rb=> true
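For instructors who want something to compare against the optional project, here is one possible sketch of search built on each_with_index. The method name search_with_index is hypothetical (chosen so it does not clash with the RubyLabs search method), and this is not the book's own solution.

```ruby
# A sketch of linear search using each_with_index.
# search_with_index is a made-up name for this example.
def search_with_index(a, k)
  # The block receives each item x together with its location i.
  a.each_with_index { |x, i| return i if x == k }
  return nil   # reached only if no item matched k
end

fruit = ["apple", "lime", "kiwi", "orange", "ugli"]
p search_with_index(fruit, "kiwi")     # 2
p search_with_index(fruit, "banana")   # nil
```

The structure mirrors contains?: the return inside the block ends the iteration early, and the statement after the iterator handles the unsuccessful case.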

search (cont’d) ✦ The basic plan is to use a variable named i to hold an index value ✦ Use a while loop to compare a[i] with the item we’re looking for ❖ return i if a[i] matches the item ❖ otherwise add 1 to i and repeat while i < a.length return i if a[i] == k i += 1 end Note: i += 1 means “add 1 to i”

search (cont’d) ✦ Here is a listing of the complete method ❖ the while loop from the previous slide is on lines 3 to 6 >> Source.listing("search") 1: def search(a, k) 2: i = 0 3: while i < a.length 4: return i if a[i] == k 5: i += 1 6: end 7: return nil 8: end => true ✦ i will (if necessary) have every value from 0 to a.length-1 ✦ Notice the return values: either i or nil

search (cont’d) ✦ Trying out the method: >> a = TestArray.new(5, :colors) => ["tan", "mint", "salmon", "chocolate", "wheat"] >> search(a, "salmon") => 2 >> search(a, "orange") => nil Other options are :cars, :fruit, :words, :elements TestArray is a class in the RubyLabs module

Performance ✦ How many comparisons will the linear search algorithm make as it searches through an array with n items? ❖ another way to phrase it: how many iterations will our Ruby method make? ✦ For an unsuccessful search: ❖ compare every item before returning false or nil ❖ i.e. make n comparisons ✦ For a successful search, anywhere between 1 and n ❖ search may get lucky and find the item in the first location ❖ similarly, it might be in the last location ❖ expect, on average: n / 2 comparisons
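These counts can be checked without RubyLabs by adding a counter to the search loop. The sketch below is an assumption-laden stand-in for the probe mechanism used later in the slides: counting_search is a hypothetical name, and it returns the comparison count alongside the result.

```ruby
# Linear search that also reports how many comparisons it made.
# counting_search is a made-up helper, not part of RubyLabs.
def counting_search(a, k)
  count = 0
  i = 0
  while i < a.length
    count += 1                       # one comparison of a[i] with k
    return [i, count] if a[i] == k
    i += 1
  end
  return [nil, count]
end

n = 1000
a = (1..n).to_a.shuffle

# Unsuccessful search: 0 is not in a, so every item is compared.
_, fail_count = counting_search(a, 0)
puts fail_count                      # 1000

# Successful searches: look up every item once and average the counts.
total = a.sum { |k| counting_search(a, k)[1] }
puts total / n.to_f                  # 500.5
```

The exact average over all keys is (n + 1) / 2, which is why n / 2 is a good rule of thumb for large arrays.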

Experiments with search ✦ We can attach a probe to one of the lines in the method to show the state of the search at each step ✦ A method named brackets will print an array, putting [ ] around some of the items ❖ brackets(a,i) makes a string showing every item in a ❖ puts [ before a[i] and ] at the end >> a => ["tan", "mint", "salmon", "chocolate", "wheat"] >> puts brackets(a, 2) tan mint [salmon chocolate wheat]

Experiments with search (cont’d) ✦ From the listing on a previous slide: 4: return i if a[i] == k ✦ Attach a probe to line 4, telling Ruby to print brackets around the region that has not been searched yet: >> Source.probe( "search", 4, "puts brackets(a,i)" ) => true >> trace { search(a, "chocolate") } [tan mint salmon chocolate wheat] tan [mint salmon chocolate wheat] tan mint [salmon chocolate wheat] tan mint salmon [chocolate wheat] => 3 Can you see the unsearched region getting smaller at each step? Do you see why the algorithm terminated when it did? Do you see why the result of calling search(a,"chocolate") is 3?

Experiments with search (cont’d) ✦ Here is the trace of an unsuccessful search: >> trace { search(a, "orange") } [tan mint salmon chocolate wheat] tan [mint salmon chocolate wheat] tan mint [salmon chocolate wheat] tan mint salmon [chocolate wheat] tan mint salmon chocolate [wheat] => nil ✦ If you want to do some more experiments: >> a = TestArray.new(10, :cars) => ["saturn", "ferrari", "bmw",... "honda"] How many comparisons will be made in an unsuccessful search of an array with n items?

Try Your Own Experiments ✦ If you don’t specify a type of string in a call to TestArray.new you get an array of numbers: >> a = TestArray.new(10) => [27, 5, 33, 39, 12, 51, 19, 20, 64, 9] ✦ Call a.random to get a random number from a TestArray ❖ a.random(:success) will return a number guaranteed to be in a ❖ a.random(:fail) will return a number that is not in a ✦ Optional project: ❖ attach a counter on line 4 (pass :count to the method that sets a probe) ❖ make TestArrays of varying size (100, 1000, or even 10,000 numbers) ❖ how many comparisons are made by an unsuccessful search? ❖ how many (on average) by a successful search? count { search(a, a.random(:success)) }

Another Optional Project ✦ Write a method named max that will find the largest item in an array ❖ here is an outline to get you started -- fill in the parts indicated by question marks ❖ there will be more than one line in the body of the loop def max(a) x = a[0] i = 1 while ?? ?? end return x end ✦ This method is another “variation on the theme” of linear search ❖ x will be the largest item seen so far ❖ on each iteration update x to be the current item if that item is greater than x ✦ Note: call Source.checkout("search", "max.rb") to get a copy of the search method to use as a template for your method
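For instructors, one possible way to fill in the outline is sketched below. It follows the hints on the slide (x holds the largest item seen so far) but is only one of several reasonable completions, not the book's official answer.

```ruby
# One possible completion of the max outline.
def max(a)
  x = a[0]                 # largest item seen so far
  i = 1
  while i < a.length
    x = a[i] if a[i] > x   # update x when the current item is larger
    i += 1
  end
  return x
end

p max([8, 0, 9, 2, 7, 5, 3])          # 9
p max(["apple", "ugli", "kiwi"])      # "ugli"
```

Unlike search, this loop can never stop early: every item has to be examined, so it always makes n - 1 comparisons on an array of n items.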

Recap: Linear Search ✦ The linear search algorithm looks for an item in an array ❖ start at the beginning (a[0], or “the left”) ❖ compare each item, moving systematically to the right (i += 1) ✦ Variations: ❖ return true as soon as the item is found ❖ return the location of the item as soon as it is found ❖ scan all items to look for the largest ✦ Performance when searching an array of n items: ❖ do n comparisons when the search is unsuccessful ❖ expect an average of n/2 comparisons for a successful search

Sorting ✦ The search algorithms shown on the previous slides are examples of linear algorithms ❖ start at the beginning of a collection ❖ systematically progress through the collection, all the way to the end if necessary ✦ A similar strategy can be used to sort the items in an array ✦ The next set of slides will introduce a very simple sorting algorithm known as insertion sort Basic idea: pick up an item, find the place it belongs, insert it back into the array Move to the next item and repeat

Insertion Sort ✦ The important property of the insertion sort algorithm: at any point in this algorithm part of the array is already sorted ✦ The item we currently want to find a place for will be called the key ❖ items to the left of the key are already sorted ❖ the goal on each iteration is to insert the key at its proper place in the sorted part ✦ Example (shown at right): ❖ when it is time to find a place for the J in this hand the portion to the left is sorted

Insertion Sort ✦ Here is a more precise statement of the insertion sort algorithm 1. the initial key is the second item in the array (the Q in this example) 2. use your left hand to pick up the key 3. scan left until you find an item lower than the one in your left hand, or the front of the array, whichever comes first 4. insert the key back into the array at this location 5. the new key is the item to the right of the location of the previous key 6. go back to step 2 ✦ This new version is precise enough that we can organize a Ruby method that will implement this algorithm

Insertion Sort in Ruby ✦ Here is the insertion sort method, with a few things still left to fill in: def isort(a) i = 1 while i < a.length key = a[i] remove key from a j = location for key insert key at a[j+1] i += 1 end return a end ✦ Example: when i = 3 ❖ key will be “J” ❖ j will be set to 0 (since a[0] = “9” is the first location with a card smaller than “J”) ❖ “J” will be inserted at location 1 ✦ A description like this that is part Ruby and part English is known as pseudocode
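To make the pseudocode concrete, here is a minimal runnable sketch that fills in the English steps with standard Array methods (delete_at and insert). The name isort_sketch is hypothetical, and the RubyLabs implementation shown on later slides is organized differently (it uses a helper method), so treat this as an illustration rather than the module's code.

```ruby
# A runnable sketch of the pseudocode; isort_sketch is a made-up name.
def isort_sketch(array)
  a = array.clone                      # leave the caller's array untouched
  i = 1
  while i < a.length
    key = a.delete_at(i)               # remove key from a
    j = i - 1
    j -= 1 while j >= 0 && key < a[j]  # scan left past items larger than key
    a.insert(j + 1, key)               # insert key at a[j+1]
    i += 1
  end
  return a
end

p isort_sketch([0, 8, 9, 5, 7, 2, 3])   # [0, 2, 3, 5, 7, 8, 9]
```

When the scan stops, j is either -1 (the front of the array was reached) or the location of the first item that is not larger than key, so j + 1 is where key belongs.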

Pseudocode vs Real Code ✦ When algorithms require more than a few lines of Ruby we will use pseudocode in the lecture slides ✦ The algorithms have all been fully implemented in a RubyLabs module ❖ you can call Source.listing or Source.checkout if you want to see the “gory details” >> Source.listing("isort") 1: def isort(array) 2: a = array.clone # make a copy of the input 3: i = 1 4: while i < a.length 5: insert_left(a, i) # find a place for a[i] 6: i += 1 7: end 8: return a 9: end => true

Insertion Sort Example ✦ The following pictures show an example of how insertion sort works (using a list of numbers instead of cards) ❖ when i is 3 the positions to the left (0 through 2) have been sorted ❖ the first statements in the body of the loop set key to 5 and remove it from a

Insertion Sort Example (cont’d) ❖ the algorithm looks to the left for a location to put the 5 ❖ j is set to 0 since a[0] is the first item smaller than 5 ❖ on the next iteration, i will be 4, and the sorted region has grown by 1 to include all items in locations 0 to 3

Helper Method ✦ The operations of removing the item from a[i], scanning left, and re-inserting are implemented in a method called insert_left ❖ programmers call these special-purpose methods “helper methods” ❖ not intended to be used on its own, but only during a sort ✦ You do not need to understand the details of this code def insert_left(a, i) x = a.slice!(i) # remove a[i] j = i-1 # scan from here while j >= 0 && less(x, a[j]) j = j-1 end a.insert(j+1, x) # insert x end ✦ The “big picture”: ❖ a call to insert_left(a,i) moves a[i] somewhere to the left ❖ this method uses iteration -- in fact it’s a type of linear search

Nested Loops ✦ At first glance it might seem that insertion sort is a “linear” algorithm like search and max ❖ it has a while loop that progresses through the array from left to right ✦ But it’s important to note what is happening in insert_left ❖ the step that finds the proper location for the current item is also a loop ❖ it scans left from location i, going all the way back to 0 if necessary ✦ An algorithm with one loop inside another is said to have nested loops def isort(a) i = 1 while i < a.length ... while j >= 0 && less(x,a[j]) ... end ... end return a end ❖ the inner loop shown here is the while loop in insert_left

Nested Loops (cont’d) ✦ The diagram at right helps visualize how many comparisons are in isort ❖ a dot in a square indicates a potential comparison ❖ for any value of i, the inner loop might have to compare key to values from i - 1 all the way down to 0 ✦ The number of dots in this diagram is half the number of cells below the empty top row ✦ In general, for an array with n items, the potential number of comparisons is $n(n-1)/2$ ❖ “snap off” the empty top row, leaving n*(n-1) cells, half of which contain dots
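The dot-counting argument can also be written as a sum. On the iteration with index i the inner loop makes at most i comparisons (against locations i - 1 down to 0), and i runs from 1 to n - 1. The derivation below uses standard notation and is offered as a supplement to the diagram, not as a reproduction of it.

```latex
\[
  \sum_{i=1}^{n-1} i \;=\; 1 + 2 + \cdots + (n-1) \;=\; \frac{n(n-1)}{2}
\]
% Example: for n = 5 the sum is 1 + 2 + 3 + 4 = 10 = (5 \cdot 4)/2.
```

This agrees with the picture: the n*(n-1) cells left after removing the top row form a rectangle, and the dots fill exactly half of it.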

Experiments with isort ✦ We can use the brackets method to watch the progress of isort ✦ From the listing: 4: while i < a.length 5: insert_left(a, i) # find a place for a[i] 6: i += 1 7: end ✦ Attach a probe on line 5, to tell Ruby to print brackets around part of a just before calling insert_left: >> Source.probe("isort", 5, "puts brackets(a,i)") => true

Experiments with isort (cont’d) ✦ Using trace to call isort to sort the array shown on the previous slides: >> a => [0, 8, 9, 5, 7, 2, 3] >> trace { isort(a) } 0 [8 9 5 7 2 3] 0 8 [9 5 7 2 3] 0 8 9 [5 7 2 3] 0 5 8 9 [7 2 3] 0 5 7 8 9 [2 3] 0 2 5 7 8 9 [3] => [0, 2, 3, 5, 7, 8, 9] As in search, the left bracket moves steadily to the right (i increases by 1 on each iteration) Can you see how the part to the left of i is sorted? Can you see that the sorted portion grows on each iteration, until finally the entire array is sorted?

Try Your Own Experiments ✦ The isort method works for strings, too: >> a = TestArray.new(5, :cars) => ["citroen", "infiniti", "acura", "rolls-royce", "opel"] >> trace { isort(a) } citroen [infiniti acura rolls-royce opel] citroen infiniti [acura rolls-royce opel] acura citroen infiniti [rolls-royce opel] acura citroen infiniti rolls-royce [opel] => ["acura", "citroen", "infiniti", "opel", "rolls-royce"] ✦ It might be easier to see if you use arrays of numbers >> a = TestArray.new(10) => [60, 49, 50, 29, 30, 25, 42, 31, 48, 65]

Counting Comparisons ✦ The helper method that compares two items is named less >> Source.listing("less") 1: def less(x, y) 2: return x < y 3: end => true ✦ Attach a counting probe to line 2: >> Source.probe("less", 2, :count) => true

Counting Comparisons (cont’d) ✦ Count the comparisons made when sorting the array of car names: >> count { isort(a) } => 6 ✦ Does that seem right? (the circled numbers mark the comparisons made on each iteration) >> trace { isort(a) } citroen [infiniti acura rolls-royce opel] ➀ citroen infiniti [acura rolls-royce opel] ➁➂ acura citroen infiniti [rolls-royce opel] ➃ acura citroen infiniti rolls-royce [opel] ➄➅ => ["acura", "citroen", "infiniti", "opel", "rolls-royce"]

Counting Comparisons (cont’d) ✦ There are 5 items in the array of cars ✦ The algorithm will do 4 iterations, so there will be a minimum of 4 comparisons ✦ The maximum number of comparisons is $(5 \times 4)/2 = 10$ ✦ 6 is somewhere in between these two extremes

Challenge ✦ Explain how many comparisons will be made in the general case (for any arbitrary size array) when isort is passed an array that is already sorted: >> a = TestArray.new(100) => [142, 617, 826,... 949, 550] >> count { isort(a.sort) } => ?? >> count { isort(a.sort.reverse) } => ?? You can predict exactly how many comparisons will be made without running isort The challenge is to explain why the predicted number of comparisons will be made

Estimating the Number of Comparisons ✦ The formula for the worst case number of comparisons is $n(n-1)/2 = n^2/2 - n/2$ ✦ For small arrays we should probably compute the exact answer (e.g. $(5 \times 4)/2 = 10$ for an array of 5 items) ✦ For large arrays the $n/2$ term doesn’t add very much, and we can get a good estimate by just computing $n^2/2$
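The gap between the exact formula and the estimate can be checked by counting comparisons directly. The sketch below does not use RubyLabs probes; count_isort_comparisons is a hypothetical helper that inlines the sort and the left scan and counts each less-than test, under the assumption that a reverse-ordered array is the worst case.

```ruby
# Counts the less-than comparisons made by an insertion sort.
# count_isort_comparisons is a made-up helper for this example.
def count_isort_comparisons(array)
  a = array.clone
  comparisons = 0
  i = 1
  while i < a.length
    x = a.delete_at(i)        # the item to place on this iteration
    j = i - 1
    while j >= 0
      comparisons += 1        # one comparison of x with a[j]
      break unless x < a[j]   # stop at the first item not larger than x
      j -= 1
    end
    a.insert(j + 1, x)
    i += 1
  end
  return comparisons
end

[10, 100, 1000].each do |n|
  worst = count_isort_comparisons((1..n).to_a.reverse)
  puts "n=#{n} counted=#{worst} exact=#{n * (n - 1) / 2} estimate=#{n * n / 2}"
end
# n=10 counted=45 exact=45 estimate=50
# n=100 counted=4950 exact=4950 estimate=5000
# n=1000 counted=499500 exact=499500 estimate=500000
```

The estimate is off by 10% when n is 10 but by only 0.1% when n is 1000, which is the sense in which the smaller term stops mattering.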

Big-Oh Notation ✦ Because the $n^2$ term in the equation is the dominant term when n is large, we can use it to estimate the number of comparisons ✦ Computer scientists use the notation $O(n^2)$ to mean “for large n the number of comparisons will be roughly $n^2$” ❖ pronounced “oh of n-squared” ❖ or sometimes “big oh of n-squared” ✦ There is a precise definition of what it means for an algorithm to be $O(n)$ or $O(n^2)$ ❖ for our purpose we’ll just use the notation informally ❖ for isort, the notation means “on the order of $n^2$ comparisons”

Scalability ✦ The fact that the number of comparisons grows as the square of the array size may not seem important ❖ for small to moderate size arrays it is not a big deal ❖ but execution time will start to be a factor for larger arrays We’ll revisit this idea after looking at more sophisticated sorting algorithms in the next chapter

Recap: Insertion Sort ✦ The insertion sort algorithm is another example of iteration ✦ It uses nested loops -- one loop inside another one ✦ The outer loop has the same structure as the iteration in linear search ❖ an array index i ranges from 1 up to n-1 ❖ at any time, the items to the left of i are sorted ❖ the inner loop moves a[i] to its proper location in the sorted region ❖ the size of the sorted region grows on each iteration ✦ The number of comparisons can be as small as n-1 ✦ In the worst case the number of comparisons is roughly $n^2/2$

Summary ✦ This set of slides introduced two new algorithms based on iteration ✦ Simple linear search scans an array from left to right ✦ A straightforward sorting algorithm known as insertion sort also involves a scan from left to right ❖ an important difference: there is a second loop inside the main loop ❖ the inner loop scans back from right to left to find the place for an item ✦ New “technology” introduced for this topic: ❖ array index ❖ conditional execution (modifiers with the keyword if ) ❖ pseudocode
