1 CSC 222: Computer Programming II Spring 2005 Searching and efficiency  sequential search  big-Oh, rate-of-growth  binary search  example: spell checker.

Slides:



Advertisements
Similar presentations
1 CSC 427: Data Structures and Algorithm Analysis Fall 2008 Inheritance and efficiency  ArrayList  SortedArrayList  tradeoffs with adding/searching.
Advertisements

Problem Solving 5 Using Java API for Searching and Sorting Applications ICS-201 Introduction to Computing II Semester 071.
Copyright © 2014, 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Starting Out with C++ Early Objects Eighth Edition by Tony Gaddis,
Chapter 9: Searching, Sorting, and Algorithm Analysis
CompSci Searching & Sorting. CompSci Searching & Sorting The Plan  Searching  Sorting  Java Context.
1 CSC 427: Data Structures and Algorithm Analysis Fall 2006 Inheritance and efficiency  ArrayList  SortedArrayList  tradeoffs with adding/searching.
Searching Kruse and Ryba Ch and 9.6. Problem: Search We are given a list of records. Each record has an associated key. Give efficient algorithm.
Algorithm Efficiency and Sorting
1 Searching Algorithms Overview  What is Searching?  Linear Search and its Implementation.  Brief Analysis of Linear Search.  Binary Search and its.
Searching Chapter Chapter Contents The Problem Searching an Unsorted Array Iterative Sequential Search Recursive Sequential Search Efficiency of.
Search Lesson CS1313 Spring Search Lesson Outline 1.Searching Lesson Outline 2.How to Find a Value in an Array? 3.Linear Search 4.Linear Search.
Chapter 8 ARRAYS Continued
Abstract Data Types (ADTs) Data Structures The Java Collections API
Searching – Linear and Binary Searches. Comparing Algorithms Should we use Program 1 or Program 2? Is Program 1 “fast”? “Fast enough”? P1P2.
1 CSC 222: Computer Programming II Spring 2004 Searching and efficiency  sequential search  big-Oh, rate-of-growth  binary search Class design  templated.
Dictionaries CS 105. L11: Dictionaries Slide 2 Definition The Dictionary Data Structure structure that facilitates searching objects are stored with search.
1 CSC 222: Object-Oriented Programming Spring 2012 Lists, data storage & access  ArrayList class  methods: add, get, size, remove, contains, set, indexOf,
Week 11 Introduction to Computer Science and Object-Oriented Programming COMP 111 George Basham.
CPT: Search/ Computer Programming Techniques Semester 1, 1998 Objectives of these slides: –to discuss searching: its implementation,
1 CSC 222: Object-Oriented Programming Spring 2013 Java interfaces & polymorphism  Comparable interface  defining an interface  implementing an interface.
(C) 2010 Pearson Education, Inc. All rights reserved. Java How to Program, 8/e.
Chapter 19 Searching, Sorting and Big O
1 CSC 222: Object-Oriented Programming Spring 2012 Searching and sorting  sequential search  algorithm analysis: big-Oh, rate-of-growth  binary search.
1 CSC 222: Computer Programming II Spring 2004 Pointers and linked lists  human chain analogy  linked lists: adding/deleting/traversing nodes  Node.
1 CSC 222: Object-Oriented Programming Spring 2012 recursion & sorting  recursive algorithms  base case, recursive case  silly examples: fibonacci,
1 CSC 221: Computer Programming I Spring 2008 ArrayLists and arrays  example: letter frequencies  autoboxing/unboxing  ArrayLists vs. arrays  example:
Today  Table/List operations  Parallel Arrays  Efficiency and Big ‘O’  Searching.
INTRODUCTION TO BINARY TREES P SORTING  Review of Linear Search: –again, begin with first element and search through list until finding element,
 2005 Pearson Education, Inc. All rights reserved Searching and Sorting.
 Pearson Education, Inc. All rights reserved Searching and Sorting.
Searching. RHS – SOC 2 Searching A magic trick: –Let a person secretly choose a random number between 1 and 1000 –Announce that you can guess the number.
C++ Programming: From Problem Analysis to Program Design, Second Edition Chapter 19: Searching and Sorting.
DATA STRUCTURE & ALGORITHMS (BCS 1223) CHAPTER 8 : SEARCHING.
CS 162 Intro to Programming II Searching 1. Data is stored in various structures – Typically it is organized on the type of data – Optimized for retrieval.
L. Grewe.  An array ◦ stores several elements of the same type ◦ can be thought of as a list of elements: int a[8]
1 CSC 221: Computer Programming I Fall 2004 Lists, data access, and searching  ArrayList class  ArrayList methods: add, get, size, remove  example:
1 CSC 221: Computer Programming I Spring 2008 Lists, data storage & access  ArrayList class  methods: add, get, size, remove, contains, set, indexOf,
ADSA: IntroAlgs/ Advanced Data Structures and Algorithms Objective –introduce algorithm design using basic searching and sorting, and remind.
CSC 211 Data Structures Lecture 13
1 CSC 427: Data Structures and Algorithm Analysis Fall 2008 Algorithm analysis, searching and sorting  best vs. average vs. worst case analysis  big-Oh.
Starting Out with C++ Early Objects Seventh Edition by Tony Gaddis, Judy Walters, and Godfrey Muganda Modified for use by MSU Dept. of Computer Science.
Topic 1 Object Oriented Programming. 1-2 Objectives To review the concepts and terminology of object-oriented programming To discuss some features of.
1 CSC 222: Object-Oriented Programming Spring 2013 Searching and sorting  sequential search vs. binary search  algorithm analysis: big-Oh, rate-of-growth.
CS261 Data Structures Ordered Bag Dynamic Array Implementation.
Review 1 Arrays & Strings Array Array Elements Accessing array elements Declaring an array Initializing an array Two-dimensional Array Array of Structure.
Lecture on Binary Search and Sorting. Another Algorithm Example SEARCHING: a common problem in computer science involves storing and maintaining large.
1 CSC 221: Computer Programming I Fall 2006 Lists, data storage & access  ArrayList class  methods: add, get, size, remove, contains, set, indexOf, toString.
Searching and Sorting Searching: Sequential, Binary Sorting: Selection, Insertion, Shell.
Binary search and complexity reading:
© 2005 Pearson Education, Inc., Upper Saddle River, NJ. All rights reserved. Data Structures for Java William H. Ford William R. Topp Chapter 4 Introduction.
1 CSC 421: Algorithm Design and Analysis Spring 2014 Brute force approach & efficiency  league standings example  KISS vs. generality  exhaustive search:
1. Searching The basic characteristics of any searching algorithm is that searching should be efficient, it should have less number of computations involved.
Searching Topics Sequential Search Binary Search.
Dictionaries CS 110: Data Structures and Algorithms First Semester,
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Starting Out with C++ Early Objects Seventh Edition by Tony Gaddis, Judy.
CSC 427: Data Structures and Algorithm Analysis
CSC 427: Data Structures and Algorithm Analysis
16 Searching and Sorting.
CSC 222: Object-Oriented Programming
CSC 427: Data Structures and Algorithm Analysis
Chapter 5 Ordered List.
CSC 222: Object-Oriented Programming
CSC 222: Object-Oriented Programming Fall 2017
Building Java Programs
Chapter 5 Ordered List.
CSE 143 Lecture 5 Binary search; complexity reading:
Building Java Programs
CSC 221: Computer Programming I Fall 2009
Sum this up for me Let’s write a method to calculate the sum from 1 to some n public static int sum1(int n) { int sum = 0; for (int i = 1; i
Presentation transcript:

1 CSC 222: Computer Programming II Spring 2005 Searching and efficiency  sequential search  big-Oh, rate-of-growth  binary search  example: spell checker

2 Searching a list suppose you have a list, and want to find a particular item, e.g.,  lookup a word in a dictionary  find a number in the phone book  locate a student's exam from a pile searching is a common task in computing  searching a database  checking a login password  lookup the value assigned to a variable in memory if the items in the list are unordered (e.g., added at random)  desired item is equally likely to be at any point in the list  need to systematically search through the list, check each entry until found  sequential search

3 Sequential search sequential search traverses the list from beginning to end  check each entry in the list  if matches the desired entry, then FOUND (return its index)  if traverse entire list and no match, then NOT FOUND (return -1) recall: the ArrayList class has an indexOf method /** * Performs sequential search on the array field named items desired item to be searched for index where desired first occurs, -1 if not found */ public int indexOf(Object desired) { for(int k=0; k < items.length; k++) { if (desired.equals(items[k])) { return k; } return -1; }

4 How efficient is sequential search? in the worst case:  the item you are looking for is in the last position of the list (or not found)  requires traversing and checking every item in the list  if 100 or 1,000 entries  NO BIG DEAL  if 10,000 or 100,000 entries  NOTICEABLE for this algorithm, the dominant factor in execution time is checking an item  the number of checks will determine efficiency in the average case? in the best case?

5 Big-Oh notation to represent an algorithm’s performance in relation to the size of the problem, computer scientists use Big-Oh notation an algorithm is O(N) if the number of operations required to solve a problem is proportional to the size of the problem sequential search on a list of N items requires roughly N checks (+ other constants)  O(N) for an O(N) algorithm, doubling the size of the problem requires double the amount of work (in the worst case)  if it takes 1 second to search a list of 1,000 items, then it takes 2 seconds to search a list of 2,000 items it takes 4 seconds to search a list of 4,000 items it takes 8 seconds to search a list of 8,000 items...

6 Searching an ordered list when the list is unordered, can't do any better than sequential search  but, if the list is ordered, a better alternative exists e.g., when looking up a word in the dictionary or name in the phone book  can take ordering knowledge into account  pick a spot – if too far in the list, then go backward; if not far enough, go forward binary search algorithm  check midpoint of the list  if desired item is found there, then DONE  if the item at midpoint comes after the desired item in the ordering scheme, then repeat the process on the left half  if the item at midpoint comes before the desired item in the ordering scheme, then repeat the process on the right half

7 Binary search /** * Performs binary search on a sorted list. items sorted list of Comparable items desired item to be searched for index where desired first occurs, -(insertion point)-1 if not found */ public static int binarySearch(List items, Comparable desired) { int left = 0;// initialize range where desired could be int right = items.length-1; while (left <= right) { int mid = (left+right)/2;// get midpoint value and compare int comparison = desired.compareTo(items[mid]); if (comparison == 0) {// if desired at midpoint, then DONE return mid; } else if (comparison < 0) {// if less than midpoint, focus on left half right = mid-1; } else {// otherwise, focus on right half left = mid + 1; } return /* CLASS EXERCISE */ ;// if reduce to empty range, NOT FOUND } the Collections utility class contains a binarySearch method  takes a List of Comparable items and the desired item ( List is an interface that specifies basic list operations, ArrayList implements List )  returns index of desired item if found, -(insertion_point) – 1 if not found

8 Visualizing binary search note: each check reduces the range in which the item can be found by half  see for demowww.creighton.edu/~davereed/csc107/search.html

9 How efficient is binary search? in the worst case:  the item you are looking for is in the first or last position of the list (or not found) start with N items in list after 1 st check, reduced to N/2 items to search after 2 nd check, reduced to N/4 items to search after 3 rd check, reduced to N/8 items to search... after log 2 N checks, reduced to 1 item to search again, the dominant factor in execution time is checking an item  the number of checks will determine efficiency in the average case? in the best case?

10 Big-Oh notation an algorithm is O(log N) if the number of operations required to solve a problem is proportional to the logarithm of the size of the problem binary search on a list of N items requires roughly log 2 N checks (+ other constants)  O(log N) for an O(log N) algorithm, doubling the size of the problem adds only a constant amount of work  if it takes 1 second to search a list of 1,000 items, then searching a list of 2,000 items will take time to check midpoint + 1 second searching a list of 4,000 items will take time for 2 checks + 1 second searching a list of 8,000 items will take time for 3 checks + 1 second...

11 Comparison: searching a phone book Number of entries in phone book Number of checks performed by sequential search Number of checks performed by binary search , ……… 10, , , ……… 1,000, to search a phone book of the United States (~280 million) using binary search? to search a phone book of the world (6 billion) using binary search?

12 Searching example for small N, the difference between O(N) and O(log N) may be negligible consider the following large-scale application: a spell checker  want to read in and store a dictionary of words  then, process a text file one word at a time if word is not found in dictionary, report as misspelled  since the dictionary file is large (117,777 words), the difference will be clear we can define a Dictionary class to store & access words  constructor, contains, size, displayAll, …  note: we can build a dictionary on top of an ArrayList of Strings ArrayList provides all the basic operations we require encapsulating into a Dictionary class allows us to hide later modifications

13 Dictionary class import java.util.ArrayList; import java.util.Scanner; import java.io.File; import java.io.FileNotFoundException; public class Dictionary { private ArrayList dict; public Dictionary() { // CONSTRUCTS EMPTY DICTIONARY } public Dictionary(String dictFile) { // CONSTRUCT DICTIONARY WITH WORDS FROM dictFile } public boolean contains(String word) { // RETURNS TRUE IF word IS STORED IN DICTIONARY } public int size() { // RETURNS NUMBER OF WORDS IN DICTIONARY } public void display() { // DISPLAYS ALL WORDS IN DICTIONARY, ONE PER LINE } implement the Dictionary class with the specified methods for now, utilize sequential search (e.g., contains ) download dictionary.txt dictionary.txt from the class directory

14 Spell checker import java.util.ArrayList; import java.util.Scanner; import java.io.File; import java.io.FileNotFoundException; import javax.swing.JOptionPane; public class SpellChecker { public static void main(String[] args) { String dictFile = JOptionPane.showInputDialog("Dictionary file:"); Dictionary dict = new Dictionary(dictFile); String textFile = JOptionPane.showInputDialog("Text file:"); try { System.out.println("CHECKING..."); Scanner infile = new Scanner(new File(textFile)); while (infile.hasNext()) { String word = strip(infile.next()); if (word.length() > 0 && !dict.contains(word)) { System.out.println(word); } infile.close(); System.out.println("...DONE"); } catch (FileNotFoundException exception) { System.out.println("NO SUCH FILE FOUND"); } private static String strip(String word) { // CLASS EXERCISE } stores words from a dictionary file processes a user- specified file, displaying each misspelled word note: need to strip non-letters and capitalization from file words before checking

15 In-class exercise copy the following files    add to the Dictionary project and spell check the Gettysburg address  is the delay noticeable? how about for Poe's The Cask of Amontillado ? now download and spell check the documents with that  largeDictionary.txt is more than twice as big (264,063 words)  does it take more than twice as long?

16 Dictionary w/ binary search since the dictionaries are in alphabetical order, we can use binary search  modify the Dictionary class to import the Collections utility class import java.util.Collections;  modify the contains method so that it uses Collections.binarySearch return (Collections.binarySearch(dict, word) >= 0); now spell check the documents using binary search  how much faster is it compared to sequential search  does largeDictionary.txt take more than twice as long?