JETT 2005 Session 5: Algorithms, Efficiency, Hashing and Hashtables.

Slides:



Advertisements
Similar presentations
Hash Tables CSC220 Winter What is strength of b-tree? Can we make an array to be as fast search and insert as B-tree and LL?
Advertisements

Hash Tables CS 310 – Professor Roch Weiss Chapter 20 All figures marked with a chapter and section number are copyrighted © 2006 by Pearson Addison-Wesley.
CSE 1302 Lecture 23 Hashing and Hash Tables Richard Gesick.
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
Data Structures Using C++ 2E
Searching Kruse and Ryba Ch and 9.6. Problem: Search We are given a list of records. Each record has an associated key. Give efficient algorithm.
1 Foundations of Software Design Fall 2002 Marti Hearst Lecture 18: Hash Tables.
Hashing Techniques.
Algorithmic Complexity Nelson Padua-Perez Bill Pugh Department of Computer Science University of Maryland, College Park.
CS 206 Introduction to Computer Science II 09 / 10 / 2008 Instructor: Michael Eckmann.
Introduction to Analysis of Algorithms
1 Chapter 9 Maps and Dictionaries. 2 A basic problem We have to store some records and perform the following: add new record add new record delete record.
Previous Lecture Revision Previous Lecture Revision Hashing Searching : –The Main purpose of computer is to store & retrieve –Locating for a record is.
Hashing Text Read Weiss, §5.1 – 5.5 Goal Perform inserts, deletes, and finds in constant average time Topics Hash table, hash function, collisions Collision.
Cmpt-225 Algorithm Efficiency.
CS 206 Introduction to Computer Science II 01 / 28 / 2009 Instructor: Michael Eckmann.
Hashing General idea: Get a large array
Data Structures Using C++ 2E Chapter 9 Searching and Hashing Algorithms.
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (excerpts) Advanced Implementation of Tables CS102 Sections 51 and 52 Marc Smith and.
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
© 2006 Pearson Addison-Wesley. All rights reserved10 A-1 Chapter 10 Algorithm Efficiency and Sorting.
Cmpt-225 Simulation. Application: Simulation Simulation  A technique for modeling the behavior of both natural and human-made systems  Goal Generate.
Hashing 1. Def. Hash Table an array in which items are inserted according to a key value (i.e. the key value is used to determine the index of the item).
1 Hash Tables  a hash table is an array of size Tsize  has index positions 0.. Tsize-1  two types of hash tables  open hash table  array element type.
Hash Table March COP 3502, UCF.
Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved ADT Implementation:
Data Structures and Algorithm Analysis Hashing Lecturer: Jing Liu Homepage:
CHAPTER 09 Compiled by: Dr. Mohammad Omar Alhawarat Sorting & Searching.
IT 60101: Lecture #151 Foundation of Computing Systems Lecture 15 Searching Algorithms.
Hashing Table Professor Sin-Min Lee Department of Computer Science.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (continued) Advanced Implementation of Tables.
1 Hash table. 2 Objective To learn: Hash function Linear probing Quadratic probing Chained hash table.
 DATA STRUCTURE DATA STRUCTURE  DATA STRUCTURE OPERATIONS DATA STRUCTURE OPERATIONS  BIG-O NOTATION BIG-O NOTATION  TYPES OF DATA STRUCTURE TYPES.
1 Hash table. 2 A basic problem We have to store some records and perform the following:  add new record  delete record  search a record by key Find.
Hashing1 Hashing. hashing2 Observation: We can store a set very easily if we can use its keys as array indices: A: e.g. SEARCH(A,k) return A[k]
Hashing Sections 10.2 – 10.3 CS 302 Dr. George Bebis.
1 5. Abstract Data Structures & Algorithms 5.2 Static Data Structures.
Hashing Hashing is another method for sorting and searching data.
HASHING PROJECT 1. SEARCHING DATA STRUCTURES Consider a set of data with N data items stored in some data structure We must be able to insert, delete.
Data Structures and Algorithms Hashing First Year M. B. Fayek CUFE 2010.
Hashing 8 April Example Consider a situation where we want to make a list of records for students currently doing the BSU CS degree, with each.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
1 Algorithms  Algorithms are simply a list of steps required to solve some particular problem  They are designed as abstractions of processes carried.
H ASH TABLES. H ASHING Key indexed arrays had perfect search performance O(1) But required a dense range of index values Otherwise memory is wasted Hashing.
CS 206 Introduction to Computer Science II 09 / 18 / 2009 Instructor: Michael Eckmann.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
CPSC 252 Hashing Page 1 Hashing We have already seen that we can search for a key item in an array using either linear or binary search. It would be better.
Hash Tables © Rick Mercer.  Outline  Discuss what a hash method does  translates a string key into an integer  Discuss a few strategies for implementing.
CS 206 Introduction to Computer Science II 01 / 30 / 2009 Instructor: Michael Eckmann.
ISOM MIS 215 Module 5 – Binary Trees. ISOM Where are we? 2 Intro to Java, Course Java lang. basics Arrays Introduction NewbieProgrammersDevelopersProfessionalsDesigners.
1 CSCD 326 Data Structures I Hashing. 2 Hashing Background Goal: provide a constant time complexity method of searching for stored data The best traditional.
1 Data Structures CSCI 132, Spring 2014 Lecture 34 Analyzing Hash Tables.
1 Hashing by Adlane Habed School of Computer Science University of Windsor May 6, 2005.
Week 12 - Friday.  What did we talk about last time?  Finished hunters and prey  Class variables  Constants  Class constants  Started Big Oh notation.
CS 150: Analysis of Algorithms. Goals for this Unit Begin a focus on data structures and algorithms Understand the nature of the performance of algorithms.
Searching Topics Sequential Search Binary Search.
TOPIC 5 ASSIGNMENT SORTING, HASH TABLES & LINKED LISTS Yerusha Nuh & Ivan Yu.
Algorithm Analysis with Big Oh ©Rick Mercer. Two Searching Algorithms  Objectives  Analyze the efficiency of algorithms  Analyze two classic algorithms.
Prof. Amr Goneid, AUC1 CSCI 210 Data Structures and Algorithms Prof. Amr Goneid AUC Part 5. Dictionaries(2): Hash Tables.
May 3rd – Hashing & Graphs
Big-O notation.
Cse 373 April 24th – Hashing.
structures and their relationships." - Linus Torvalds
Hash Tables.
Advanced Implementation of Tables
Advanced Implementation of Tables
structures and their relationships." - Linus Torvalds
Lecture-Hashing.
Presentation transcript:

JETT 2005 Session 5: Algorithms, Efficiency, Hashing and Hashtables

Today’s buzzwords Algorithm  A strategy to solve a problem  A systematic approach that describes the solution process Complexity and Efficiency  How does an algorithm scale when the input size grows?  How do two algorithms solving the same problem compare? Big Oh Notation  A theoretical measure for the execution of an algorithm, in terms of the problem size n – usually the number of items. Hashing function  A function that takes an object and generates a number (or some form of an address) to a location where the object should be placed Hash Table  A data structure that stores items in designated places using hashing functions for speeding up key-based search

Algorithms Every problem can be solved if given enough computing power But it doesn’t hurt to use as little as you can Problems do not have to be boring crunching numbers stuff Solutions are almost always elegant A good algorithm uses less computing resource (less memory, less CPU time)  Strangely, may not always result in more lines of code!

So – lets solve a fun problem! How would you take the mouse to the cheese?

An algorithm to solve this problem: 1. First, read in the maze and populate a 2D array of Rooms 2. Next, create an instance of a stack. 3. Place the Room/Location corresponding to the mouse position in the stack. 4. Now use the following strategy. Peek at the top room of the stack. If this room is not visited, do the following: 1. Mark it to be visited. If the room location is the same as the cheese location, then you are done. If not, find all the rooms that are accessible from this top room that are not visited (you can use a specific order, or random order - it does not matter). 2. Push all the neighboring un-visited rooms in the stack. 5. If the top room is already visited, pop it off the stack. Continue with Step If the stack becomes empty, guess what? There is no solution to the problem.

A data structure to solve these problems: Game trees Start goal

Algorithms to solve the problems Easiest: Depth First search  Simplest, guaranteed to find a solution  Easily implemented with a stack as we saw  Inefficient – lot of backtracking, will potentially visit all nodes! Pruning trees  Each node is given a weight based on how close it is to goal  Idea is to pick a node with the highest weight  Creating the tree is more difficult  May not always find the “best” solution

Complexity and Efficiency of Algorithms Complexity is not:  How hard the algorithm is to implement  How many lines of code it takes to implement it Complexity is:  How many operations the algorithm does to solve the problem  How much resources it takes to solve the problem A program is more complex (less efficient) if:  It performs more operations  Takes more time and memory  Scales worse as input size grows

How is Efficiency Measured? Big Oh Notation:  A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items.  Informally, saying some equation f(n) = O(g(n)) means it is less than some constant multiple of g(n).  The notation is read, "f of n is big oh of g of n".

So what does that mean? Two algorithms with the same efficiency in Big Oh notation will scale the same way There may be a slight difference in actual running times. For example, Bubblesort and Insertionsort are both O(n2), but Insertionsort on a given size input typically is faster However, as the input size grows, both algorithms show the same increase in runtime Provides a means to compare algorithms

Hash Table – the most efficient data structure for Collections Start with an array that holds the hash table. Use a hash function to take a key and map it to some index in the array. This function will generally map several different keys to the same index. If the desired record is in the location given by the index, then we’re finished, otherwise we must use some method to resolve the collision that may have occurred between two records wanting to go to the same location. This process is called hashing. To use hashing we must  find good hash functions  determine how to resolve collisions

Collision Resolution with Open Addressing Linear Probing: Linear probing starts with the hash address and searches sequentially for the target key or an empty position. The array should be considered circular, so that when the last location is reached, the search proceeds to the first location of the array. a b c d e f Clustering:

Collision Resolution with Open Addressing (Contd.) Quadratic Probing: If there is a collision at hash address h, quadratic probing goes to location h+1, h+4, h+9,…, that is, at locations h+ i 2 for i=1,2,... Other Methods:  Key-dependent increments;  Random probing

Chained Hash Tables

Birthday Surprise: How Collisions are Possible? If 24 or more randomly chosen people are in a room, what is the probability that two people in the class have the same birthday? For hashing, the birthday surprise says that for any problem of reasonable size, collisions will almost certainly occur.

So – how efficient are hash tables? Ordered Arrays Binary Trees Hash Tables FindO(log n) O(1) InsertO(n)O(log n)O(1) DeleteO(n)O(log n)O(1)

Moral of the Story Data Structures provide the vehicle for problem solving The algorithm is the route from the source to the destination Efficiency is the time you take to make the trip!  An interstate is more “efficient” than a state route