Lecture No.42 Data Structures Dr. Sohail Aslam.

Slides:



Advertisements
Similar presentations
Segmented Hash: An Efficient Hash Table Implementation for High Performance Networking Subsystems Sailesh Kumar Patrick Crowley.
Advertisements

Hashing Techniques.
Hashing CS 3358 Data Structures.
Hashing. 2 Searching Consider the problem of searching an array for a given value –If the array is not sorted, the search requires O(n) time If the value.
Hashing. Searching Consider the problem of searching an array for a given value –If the array is not sorted, the search requires O(n) time If the value.
1 CSE 326: Data Structures Hash Tables Autumn 2007 Lecture 14.
CS 206 Introduction to Computer Science II 11 / 17 / 2008 Instructor: Michael Eckmann.
Hashing COMP171 Fall Hashing 2 Hash table * Support the following operations n Find n Insert n Delete. (deletions may be unnecessary in some applications)
Hashing. 2 Preview A hash function is a function that: When applied to an Object, returns a number When applied to equal Objects, returns the same number.
CS 206 Introduction to Computer Science II 04 / 06 / 2009 Instructor: Michael Eckmann.
Hashing 1. Def. Hash Table an array in which items are inserted according to a key value (i.e. the key value is used to determine the index of the item).
Hash Table March COP 3502, UCF.
Data Structures and Algorithm Analysis Hashing Lecturer: Jing Liu Homepage:
IKI 10100: Data Structures & Algorithms Ruli Manurung (acknowledgments to Denny & Ade Azurat) 1 Fasilkom UI Ruli Manurung (Fasilkom UI)IKI10100: Lecture8.
1.  We’ll discuss the hash table ADT which supports only a subset of the operations allowed by binary search trees.  The implementation of hash tables.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
Peacock Hash: Deterministic and Updatable Hashing for High Performance Networking Sailesh Kumar Jonathan Turner Patrick Crowley.
1 Hash table. 2 Objective To learn: Hash function Linear probing Quadratic probing Chained hash table.
1 Hash table. 2 A basic problem We have to store some records and perform the following:  add new record  delete record  search a record by key Find.
ADSA: Hashing/ Advanced Data Structures and Algorithms Objectives – –introduce hashing, hash functions, hash tables, collisions, linear probing,
Segmented Hash: An Efficient Hash Table Implementation for High Performance Networking Subsystems Sailesh Kumar Patrick Crowley.
Hashing - 2 Designing Hash Tables Sections 5.3, 5.4, 5.4, 5.6.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
CSE 373 Data Structures and Algorithms Lecture 17: Hashing II.
CSCS-200 Data Structure and Algorithms Lecture
Hashing. Searching Consider the problem of searching an array for a given value If the array is not sorted, the search requires O(n) time If the value.
Hashing. 2 Preview A hash function is a function that: When applied to an Object, returns a number When applied to equal Objects, returns the same number.
Fundamental Structures of Computer Science II
Sections 10.5 – 10.6 Hashing.
CE 221 Data Structures and Algorithms
Hashing.
Hashing.
Data Structures Using C++ 2E
Tables and Dictionaries
Hashing.
Hashing Problem: store and retrieving an item using its key (for example, ID number, name) Linked List takes O(N) time Binary Search Tree take O(logN)
Hash Tables (Chapter 13) Part 2.
Hashing.
Handling Collisions Open Addressing SNSCT-CSE/16IT201-DS.
Data Structures Using C++ 2E
Quadratic probing Double hashing Removal and open addressing Chaining
Design and Analysis of Algorithms
Hash Table.
Hashing.
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Hash Tables.
Collision Resolution Neil Tang 02/18/2010
Resolving collisions: Open addressing
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Searching Tables Table: sequence of (key,information) pairs
CS202 - Fundamental Structures of Computer Science II
Hashing.
EE 312 Software Design and Implementation I
Hashing.
Algorithms: Design and Analysis
Collision Resolution Neil Tang 02/21/2008
Ch Hash Tables Array or linked list Binary search trees
Hashing.
Hashing.
Ch. 13 Hash Tables  .
Hash Maps Introduction
Hashing.
Data Structures and Algorithm Analysis Hashing
DATA STRUCTURES-COLLISION TECHNIQUES
EE 312 Software Design and Implementation I
Skip List: Implementation
Lecture-Hashing.
CSE 373: Data Structures and Algorithms
Presentation transcript:

Lecture No.42 Data Structures Dr. Sohail Aslam

Example Hash Functions If the keys are integers then key%T is generally a good hash function, unless the data has some undesirable features. For example, if T = 10 and all keys end in zeros, then key%T = 0 for all keys. In general, to avoid situations like this, T should be a prime number. End of lecture 41. Start of lecture 42.

Collision kiwi banana watermelon apple mango cantaloupe grapes Suppose our hash function gave us the following values: hash("apple") = 5 hash("watermelon") = 3 hash("grapes") = 8 hash("cantaloupe") = 7 hash("kiwi") = 0 hash("strawberry") = 9 hash("mango") = 6 hash("banana") = 2 kiwi banana watermelon apple mango cantaloupe grapes strawberry 1 2 3 4 5 6 7 8 9 hash("honeydew") = 6 • Now what?

Collision When two values hash to the same array location, this is called a collision Collisions are normally treated as “first come, first served”—the first value that hashes to the location gets it We have to find something to do with the second and subsequent values that hash to this same location.

Solution for Handling collisions Solution #1: Search from there for an empty location Can stop searching when we find the value or an empty location. Search must be wrap-around at the end.

Solution for Handling collisions Solution #2: Use a second hash function ...and a third, and a fourth, and a fifth, ...

Solution for Handling collisions Solution #3: Use the array location as the header of a linked list of values that hash to this location

Solution 1: Open Addressing This approach of handling collisions is called open addressing; it is also known as closed hashing. More formally, cells at h0(x), h1(x), h2(x), … are tried in succession where hi(x) = (hash(x) + f(i)) mod TableSize, with f(0) = 0. The function, f, is the collision resolution strategy.

Linear Probing We use f(i) = i, i.e., f is a linear function of i. Thus location(x) = (hash(x) + i) mod TableSize The collision resolution strategy is called linear probing because it scans the array sequentially (with wrap around) in search of an empty cell.

Linear Probing: insert Suppose we want to add seagull to this hash table Also suppose: hashCode(“seagull”) = 143 table[143] is not empty table[143] != seagull table[144] is not empty table[144] != seagull table[145] is empty Therefore, put seagull at location 145 robin sparrow hawk bluejay owl . . . 141 142 143 144 145 146 147 148 seagull

Linear Probing: insert Suppose you want to add hawk to this hash table Also suppose hashCode(“hawk”) = 143 table[143] is not empty table[143] != hawk table[144] is not empty table[144] == hawk hawk is already in the table, so do nothing. robin sparrow hawk seagull bluejay owl . . . 141 142 143 144 145 146 147 148

Linear Probing: insert Suppose: You want to add cardinal to this hash table hashCode(“cardinal”) = 147 The last location is 148 147 and 148 are occupied Solution: Treat the table as circular; after 148 comes 0 Hence, cardinal goes in location 0 (or 1, or 2, or ...) robin sparrow hawk seagull bluejay owl . . . 141 142 143 144 145 146 147 148

Linear Probing: find Suppose we want to find hawk in this hash table We proceed as follows: hashCode(“hawk”) = 143 table[143] is not empty table[143] != hawk table[144] is not empty table[144] == hawk (found!) We use the same procedure for looking things up in the table as we do for inserting them robin sparrow hawk seagull bluejay owl . . . 141 142 143 144 145 146 147 148

Linear Probing and Deletion If an item is placed in array[hash(key)+4], then the item just before it is deleted How will probe determine that the “hole” does not indicate the item is not in the array? Have three states for each location Occupied Empty (never used) Deleted (previously used)

Clustering One problem with linear probing technique is the tendency to form “clusters”. A cluster is a group of items not containing any open slots The bigger a cluster gets, the more likely it is that new values will hash into the cluster, and make it ever bigger. Clusters cause efficiency to degrade.

Quadratic Probing Quadratic probing uses different formula: Use F(i) = i2 to resolve collisions If hash function resolves to H and a search in cell H is inconclusive, try H + 12, H + 22, H + 32, … Probe array[hash(key)+12], then array[hash(key)+22], then array[hash(key)+32], and so on Virtually eliminates primary clusters

Collision resolution: chaining Each table position is a linked list Add the keys and entries anywhere in the list (front easiest) No need to change position! 4 10 123 key entry

Collision resolution: chaining Advantages over open addressing: Simpler insertion and removal Array size is not a limitation Disadvantage Memory overhead is large if entries are small. key entry key entry 4 key entry key entry 10 End of Lecture 42. key entry 123