Chapter 13 Hashing © 2011 Pearson Addison-Wesley. All rights reserved.


Hashing
- Hashing: enables access to table items in time that is relatively constant and independent of the items
- Hash function: maps the search key of a table item into a location that will contain the item
- Hash table: an array that contains the table items, as assigned by a hash function

Hashing
- A perfect hash function
  - Maps each search key into a unique location of the hash table
  - Possible if all the search keys are known
- Collisions
  - Occur when the hash function maps more than one item into the same array location
- Collision-resolution schemes
  - Assign locations in the hash table to items with different search keys when the items are involved in a collision
- Requirements for a hash function
  - Be easy and fast to compute
  - Place items evenly throughout the hash table

Hash Functions
- It is sufficient for hash functions to operate on integers
- Simple hash functions that operate on positive integers
  - Selecting digits: h(123654) = 14 (here, the first and last digits)
  - Folding: h(123654) = 1+2+3+6+5+4 = 21
  - Modulo arithmetic: h(123654) = 123654 % SIZE
- Converting a character string to an integer
  - If the search key is a character string, it can be converted into an integer before the hash function is applied
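The simple hash functions above can be sketched in Java as follows. This is a minimal illustration, not the textbook's code: the class and method names are hypothetical, selectDigits assumes the slide's example picks the first and last digits, and the base-31 string conversion is one common choice among several.

```java
// Illustrative sketch of the simple hash functions named on the slide.
// All names here are hypothetical; keys are assumed to be positive.
class SimpleHashFunctions {

    // Selecting digits: keep only the first and last decimal digits (assumed choice).
    static int selectDigits(int key) {
        int last = key % 10;
        int first = key;
        while (first >= 10) {
            first /= 10;
        }
        return first * 10 + last;
    }

    // Folding: add up the decimal digits of the key.
    static int fold(int key) {
        int sum = 0;
        while (key > 0) {
            sum += key % 10;
            key /= 10;
        }
        return sum;
    }

    // Modulo arithmetic: the remainder after dividing by the table size.
    static int modulo(int key, int tableSize) {
        return key % tableSize;
    }

    // String keys: treat the characters as digits in base 31 (an assumed base)
    // and reduce modulo the table size at every step to avoid overflow.
    static int hashString(String key, int tableSize) {
        int h = 0;
        for (int i = 0; i < key.length(); i++) {
            h = (31 * h + key.charAt(i)) % tableSize;
        }
        return h;
    }
}
```

With these definitions, selectDigits(123654) gives 14 and fold(123654) gives 21, matching the slide's examples.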

Resolving Collisions
- Two approaches to collision resolution
- Approach 1: Open addressing
  - A category of collision-resolution schemes that probe for an empty, or open, location in the hash table
  - The sequence of locations that are examined is the probe sequence
  - Linear probing
    - Searches the hash table sequentially, starting from the original location specified by the hash function
    - Possible problem: primary clustering
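A minimal sketch of linear probing, assuming integer keys, a modulo hash function, and no deletions. The class name LinearProbingTable and its methods are hypothetical, chosen only for illustration.

```java
// Illustrative open-addressing table with linear probing (no removal support).
class LinearProbingTable {
    private final Integer[] slots;

    LinearProbingTable(int size) {
        slots = new Integer[size];
    }

    // Inserts the key and returns the index where it ends up, or -1 if the table is full.
    int insert(int key) {
        int home = Math.floorMod(key, slots.length);
        for (int i = 0; i < slots.length; i++) {
            int probe = (home + i) % slots.length; // step one slot at a time, wrapping around
            if (slots[probe] == null || slots[probe] == key) {
                slots[probe] = key;
                return probe;
            }
        }
        return -1;
    }

    // Returns the index holding the key, or -1 if the key is absent.
    int find(int key) {
        int home = Math.floorMod(key, slots.length);
        for (int i = 0; i < slots.length; i++) {
            int probe = (home + i) % slots.length;
            if (slots[probe] == null) {
                return -1; // an empty slot ends the probe sequence
            }
            if (slots[probe] == key) {
                return probe;
            }
        }
        return -1;
    }
}
```

In a table of size 11, inserting 22, 33, and 44 fills slots 0, 1, and 2 because all three keys hash to 0. A later insert of 1 then lands in slot 3 even though its home slot is 1, which is the primary clustering the slide warns about.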

Resolving Collisions
[Figure: example of linear probing and the primary clustering it produces]

Resolving Collisions
- Approach 1: Open addressing (continued)
  - Quadratic probing
    - Searches the hash table beginning with the original location that the hash function specifies and continues at increments of 1², 2², 3², and so on
    - Possible problem: secondary clustering
  - Double hashing
    - Uses two hash functions
    - Searches the hash table starting from the location that one hash function determines and considers every nth location, where n is determined from a second hash function
  - Increasing the size of the hash table
    - The hash function must be applied to every item in the old hash table before the item is placed into the new hash table
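The quadratic probe sequence can be sketched as below, assuming a modulo hash function for the home location. The class and method names are hypothetical.

```java
// Illustrative generator for a quadratic probe sequence.
class QuadraticProbe {

    // Returns the first `count` locations examined for `key`:
    // home, home + 1^2, home + 2^2, home + 3^2, ... (all modulo the table size).
    static int[] probeSequence(int key, int tableSize, int count) {
        int home = Math.floorMod(key, tableSize);
        int[] sequence = new int[count];
        for (int i = 0; i < count; i++) {
            sequence[i] = (home + i * i) % tableSize;
        }
        return sequence;
    }
}
```

For key 7 in a table of size 11, the first four probes are 7, 8, 0, and 5. Two keys with the same home location follow exactly the same sequence, which is the source of secondary clustering.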

Rehashing
- Double the size of the hash table when the load factor reaches a chosen threshold, and rehash all old values into the new hash table
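A minimal sketch of rehashing, assuming linear probing, an initial capacity of 7, and a threshold of 2/3 taken from the load-factor guideline later in the deck. The class name, field names, and initial capacity are hypothetical choices.

```java
// Illustrative table that doubles its array and rehashes when it gets too full.
class RehashingTable {
    private static final double MAX_LOAD = 2.0 / 3.0; // assumed threshold
    private Integer[] slots = new Integer[7];          // assumed initial capacity
    private int count = 0;

    void insert(int key) {
        // Grow first if adding one more item would push the load factor past the threshold.
        if (count + 1 > MAX_LOAD * slots.length) {
            rehash();
        }
        if (place(slots, key)) {
            count++;
        }
    }

    // Every old item must be hashed again, because its location depends on the table size.
    private void rehash() {
        Integer[] old = slots;
        slots = new Integer[old.length * 2];
        for (Integer key : old) {
            if (key != null) {
                place(slots, key);
            }
        }
    }

    // Linear-probing placement; returns false if the key was already present.
    private static boolean place(Integer[] table, int key) {
        int home = Math.floorMod(key, table.length);
        for (int i = 0; i < table.length; i++) {
            int probe = (home + i) % table.length;
            if (table[probe] == null) {
                table[probe] = key;
                return true;
            }
            if (table[probe] == key) {
                return false;
            }
        }
        throw new IllegalStateException("table is full");
    }

    boolean contains(int key) {
        int home = Math.floorMod(key, slots.length);
        for (int i = 0; i < slots.length; i++) {
            Integer value = slots[(home + i) % slots.length];
            if (value == null) {
                return false;
            }
            if (value == key) {
                return true;
            }
        }
        return false;
    }

    int capacity() {
        return slots.length;
    }

    int size() {
        return count;
    }
}
```

Simply copying the old array into the larger one would not work: a key stored at key % 7 generally belongs at a different index once the size is 14.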

Resolving Collisions
[Figure: example of quadratic probing]

Resolving Collisions
- Example of double hashing with: h1(x) = x % 7; h3(x) = (h1(x) + i*h2(x)) % SIZE, where i = 0, 1, 2, ... is the probe number
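The probe formula can be sketched as follows. The slide gives h1 but not h2 or SIZE, so this example assumes SIZE = 7 and h2(x) = 5 - (x % 5), a typical textbook-style second hash function that never returns 0. Both are assumptions for illustration only, as are the class and method names.

```java
// Illustrative double-hashing probe computation for non-negative keys.
class DoubleHashingProbe {
    static final int SIZE = 7; // assumed table size, matching h1(x) = x % 7

    // Primary hash function, as given on the slide.
    static int h1(int x) {
        return x % SIZE;
    }

    // Secondary hash function (assumed, not from the slide): always in the range 1..5.
    static int h2(int x) {
        return 5 - (x % 5);
    }

    // Location examined on probe number i (i = 0 is the home location).
    static int probe(int x, int i) {
        return (h1(x) + i * h2(x)) % SIZE;
    }
}
```

For x = 58, h1 is 2 and the assumed h2 is 2, so the probe sequence is 2, 4, 6, 1, and so on. Because SIZE is prime, any step size from 1 to 5 eventually visits every location.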

Resolving Collisions
- Approach 2: Restructuring the hash table
  - Buckets
    - Each location in the hash table is itself an array called a bucket
    - Problem: a bucket has a fixed size, so it can overflow if it is too small and wastes storage if it is too large
  - Separate chaining
    - Each hash table location is a linked list
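A minimal sketch of separate chaining, assuming integer keys and a modulo hash function. The class name ChainedTable and its methods are hypothetical, and duplicate keys are not checked on insertion.

```java
import java.util.LinkedList;

// Illustrative hash table in which every array location holds a linked list (chain).
class ChainedTable {
    private final LinkedList<Integer>[] chains;

    @SuppressWarnings("unchecked")
    ChainedTable(int size) {
        chains = new LinkedList[size];
        for (int i = 0; i < size; i++) {
            chains[i] = new LinkedList<>();
        }
    }

    private int index(int key) {
        return Math.floorMod(key, chains.length);
    }

    // Insertion adds at the front of the chain, so it takes constant time.
    void insert(int key) {
        chains[index(key)].addFirst(key);
    }

    // Retrieval searches only the one chain the key hashes to.
    boolean contains(int key) {
        return chains[index(key)].contains(key);
    }

    // Removes one occurrence of the key; returns false if it was not present.
    boolean remove(int key) {
        return chains[index(key)].remove(Integer.valueOf(key));
    }

    int chainLength(int location) {
        return chains[location].size();
    }
}
```

Colliding keys simply share a chain, so the table never fills up the way an open-addressing table does, and removal needs no special marker for deleted slots.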

The Efficiency of Hashing
- An analysis of the average-case efficiency of hashing involves the load factor
- Load factor α
  - Ratio of the current number of items in the table to the maximum size of the array table
  - Measures how full a hash table is
  - Should not exceed 2/3
- Hashing efficiency for a particular search also depends on whether the search is successful
  - Unsuccessful searches generally require more time than successful searches

The Efficiency of Hashing
- Linear probing
  - Successful search: 0.5 * [1 + 1/(1 - α)]
  - Unsuccessful search: 0.5 * [1 + 1/(1 - α)²]
- Quadratic probing and double hashing
  - Successful search: -ln(1 - α) / α
  - Unsuccessful search: 1/(1 - α)
- Separate chaining
  - Insertion is O(1)
  - Retrievals and deletions
    - Successful search: 1 + (α/2)
    - Unsuccessful search: α
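To get a feel for these formulas, they can be evaluated for a given load factor. The sketch below transcribes them directly; the class and method names are hypothetical.

```java
// Illustrative evaluation of the average-case comparison counts for load factor alpha.
class HashingEfficiency {

    static double linearSuccessful(double alpha) {
        return 0.5 * (1 + 1 / (1 - alpha));
    }

    static double linearUnsuccessful(double alpha) {
        return 0.5 * (1 + 1 / ((1 - alpha) * (1 - alpha)));
    }

    // Applies to both quadratic probing and double hashing; Math.log is the natural logarithm.
    static double doubleHashingSuccessful(double alpha) {
        return -Math.log(1 - alpha) / alpha;
    }

    static double doubleHashingUnsuccessful(double alpha) {
        return 1 / (1 - alpha);
    }

    static double chainingSuccessful(double alpha) {
        return 1 + alpha / 2;
    }

    static double chainingUnsuccessful(double alpha) {
        return alpha;
    }
}
```

At α = 0.5, linear probing needs about 1.5 comparisons for a successful search and 2.5 for an unsuccessful one. At α = 0.9 the unsuccessful figure grows to about 50.5, which is why open-addressing tables are kept well below full.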

The Efficiency of Hashing
Figure 13-50: The relative efficiency of four collision-resolution methods

What Constitutes a Good Hash Function?
- A good hash function should
  - Be easy and fast to compute
  - Scatter the data evenly throughout the hash table
- General requirements of a hash function
  - The calculation of the hash function should involve the entire search key
  - If a hash function uses modulo arithmetic, the base (table size) should be prime
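One way to see why a prime table size helps is to count how many distinct locations a patterned set of keys reaches. The sketch below is illustrative only; the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative check of how well key % tableSize spreads a set of keys.
class TableSizeDemo {

    // Counts the distinct array locations that the keys hash to.
    static int distinctLocations(int[] keys, int tableSize) {
        Set<Integer> used = new HashSet<>();
        for (int key : keys) {
            used.add(Math.floorMod(key, tableSize));
        }
        return used.size();
    }
}
```

For the keys 10, 20, ..., 100, a table of size 10 sends every key to location 0, while a table of size 11 gives each key its own location. Keys that share a common factor with the table size pile up, and a prime size rules that out for all but multiples of the prime itself.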

Table Traversal: An Inefficient Operation Under Hashing
- Hashing as an implementation of the ADT table
  - For many applications, hashing provides the most efficient implementation
  - Hashing is not efficient for
    - Traversal in sorted order
    - Finding the item with the smallest or largest value in its search key
    - Range queries

The JCF Hashtable and TreeMap Classes
- JCF Hashtable implements a hash table
  - Maps keys to values
  - Large collection of methods
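The contrast between the two classes can be sketched with java.util.Hashtable and java.util.TreeMap. The helper class OrderedAccessDemo, its method names, and the sample data are hypothetical; the library calls used (put, keySet, the TreeMap copy constructor, firstKey, lastKey, subMap) are standard.

```java
import java.util.Hashtable;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative comparison of ordered operations on a Hashtable and a TreeMap.
class OrderedAccessDemo {

    // Made-up sample data for the example.
    static Hashtable<Integer, String> sample() {
        Hashtable<Integer, String> table = new Hashtable<>();
        table.put(42, "forty-two");
        table.put(17, "seventeen");
        table.put(99, "ninety-nine");
        table.put(63, "sixty-three");
        return table;
    }

    // A hash table keeps no order, so finding the smallest key means scanning every entry.
    static int smallestKey(Hashtable<Integer, String> table) {
        int smallest = Integer.MAX_VALUE;
        for (int key : table.keySet()) {
            smallest = Math.min(smallest, key);
        }
        return smallest;
    }

    // A TreeMap keeps its keys sorted, so a range query is a direct operation.
    // Returns the entries with fromKey <= key < toKey.
    static SortedMap<Integer, String> range(Hashtable<Integer, String> table, int fromKey, int toKey) {
        TreeMap<Integer, String> sorted = new TreeMap<>(table);
        return sorted.subMap(fromKey, toKey);
    }
}
```

With a TreeMap, firstKey() and lastKey() return the smallest and largest keys directly, and iterating over keySet() visits the keys in sorted order. A Hashtable offers fast lookup by key but none of these ordered operations.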

Summary
- Hashing as a table implementation calculates where the data item should be rather than searching for it
- A hash function should be easy to compute and should scatter the search keys evenly throughout the hash table
- A collision occurs when two different search keys hash into the same array location
- Hashing does not efficiently support operations that require the table items to be ordered
- Several independent organizations can be imposed on a given set of data