CSE 373 Data Structures and Algorithms Lecture 17: Hashing II.

Slides:



Advertisements
Similar presentations
1 Designing Hash Tables Sections 5.3, 5.4, Designing a hash table 1.Hash function: establishing a key with an indexed location in a hash table.
Advertisements

Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved Hash Tables,
Data Structures Using C++ 2E
Hashing as a Dictionary Implementation
Implementation of Linear Probing (continued) Helping method for locating index: private int findIndex(long key) // return -1 if the item with key 'key'
Hashing Techniques.
Hashing CS 3358 Data Structures.
CSE 373 Data Structures Lecture 10
© 2006 Pearson Addison-Wesley. All rights reserved13 A-1 Chapter 13 Hash Tables.
1 CSE 326: Data Structures Hash Tables Autumn 2007 Lecture 14.
Hashing Text Read Weiss, §5.1 – 5.5 Goal Perform inserts, deletes, and finds in constant average time Topics Hash table, hash function, collisions Collision.
CS 206 Introduction to Computer Science II 11 / 17 / 2008 Instructor: Michael Eckmann.
Introduction to Hashing CS 311 Winter, Dictionary Structure A dictionary structure has the form: (Key, Data) Dictionary structures are organized.
Hashing General idea: Get a large array
Data Structures Using C++ 2E Chapter 9 Searching and Hashing Algorithms.
Hash Tables. Container of elements where each element has an associated key Each key is mapped to a value that determines the table cell where element.
CS 206 Introduction to Computer Science II 04 / 06 / 2009 Instructor: Michael Eckmann.
§3 Separate Chaining ---- keep a list of all keys that hash to the same value struct ListNode; typedef struct ListNode *Position; struct HashTbl; typedef.
ICS220 – Data Structures and Algorithms Lecture 10 Dr. Ken Cosh.
1 Chapter 5 Hashing General ideas Methods of implementing the hash table Comparison among these methods Applications of hashing Compare hash tables with.
Data Structures and Algorithm Analysis Hashing Lecturer: Jing Liu Homepage:
IKI 10100: Data Structures & Algorithms Ruli Manurung (acknowledgments to Denny & Ade Azurat) 1 Fasilkom UI Ruli Manurung (Fasilkom UI)IKI10100: Lecture8.
1.  We’ll discuss the hash table ADT which supports only a subset of the operations allowed by binary search trees.  The implementation of hash tables.
DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
Hashing Dr. Yingwu Zhu.
1 Hash table. 2 Objective To learn: Hash function Linear probing Quadratic probing Chained hash table.
1 Symbol Tables The symbol table contains information about –variables –functions –class names –type names –temporary variables –etc.
1 CSE 326: Data Structures: Hash Tables Lecture 12: Monday, Feb 3, 2003.
1 HASHING Course teacher: Moona Kanwal. 2 Hashing Mathematical concept –To define any number as set of numbers in given interval –To cut down part of.
CSE373: Data Structures & Algorithms Lecture 17: Hash Collisions Kevin Quinn Fall 2015.
Hash Tables - Motivation
Hashing - 2 Designing Hash Tables Sections 5.3, 5.4, 5.4, 5.6.
Data Structures and Algorithms Hashing First Year M. B. Fayek CUFE 2010.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Hashing 1 Hashing. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
Hash Tables © Rick Mercer.  Outline  Discuss what a hash method does  translates a string key into an integer  Discuss a few strategies for implementing.
Hashing COMP171. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
Hash Tables Ellen Walker CPSC 201 Data Structures Hiram College.
1 Resolving Collision Although collisions should be avoided as much as possible, they are inevitable Need a strategy for resolving collisions. We look.
Chapter 11 (Lafore’s Book) Hash Tables Hwajung Lee.
Prof. Amr Goneid, AUC1 CSCI 210 Data Structures and Algorithms Prof. Amr Goneid AUC Part 5. Dictionaries(2): Hash Tables.
Fundamental Structures of Computer Science II
CE 221 Data Structures and Algorithms
Data Structures Using C++ 2E
Hashing Problem: store and retrieving an item using its key (for example, ID number, name) Linked List takes O(N) time Binary Search Tree take O(logN)
CSCI 210 Data Structures and Algorithms
Open Addressing: Quadratic Probing
Hashing CSE 2011 Winter July 2018.
Hashing - resolving collisions
Hash Tables (Chapter 13) Part 2.
Handling Collisions Open Addressing SNSCT-CSE/16IT201-DS.
Data Structures Using C++ 2E
Hash tables Hash table: a list of some fixed size, that positions elements according to an algorithm called a hash function … hash function h(element)
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Collision Resolution Neil Tang 02/18/2010
Resolving collisions: Open addressing
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Data Structures and Algorithms
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
CSE 326: Data Structures Hashing
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
Tree traversal preorder, postorder: applies to any kind of tree
Collision Resolution Neil Tang 02/21/2008
Ch Hash Tables Array or linked list Binary search trees
Data Structures and Algorithm Analysis Hashing
CSE 373: Data Structures and Algorithms
Lecture No.42 Data Structures Dr. Sohail Aslam.
Presentation transcript:

CSE 373 Data Structures and Algorithms Lecture 17: Hashing II

Hash versus tree 2  Which is better, a hash set or a tree set? HashTree

Implementing Set ADT (Revisited) 3 InsertRemoveSearch Unsorted array O(1)O(n)O(n)O(n)O(n) Sorted arrayO(log n + n) O(log n) Linked listO(1)O(n)O(n)O(n)O(n) BST (if balanced) O(log n) Hash tableO(1)

Probing hash tables 4  Alternative strategy for collision resolution: try alternative cells until empty cell found  cells h 0 (x), h 1 (x), h 2 (x),... tried in succession, where:  h i (x) = (hash(x) + f(i)) % TableSize  f is collision resolution strategy  Because all data goes in table, bigger table needed

Linear probing 5  linear probing: resolve collisions in slot i by putting colliding element into next available slot (i+1, i+2,...)  Pseudocode for insert: first probe = h(value) while (table[probe] occupied) probe = (probe + 1) % TableSize table[probe] = value  add 41, 34, 7, 18, then 21, then 57  lookup/search algorithm modified - have to loop until we find the element or an empty slot  What happens when the table gets mostly full?

Linear probing 6  f(i) = i  Probe sequence: 0 th probe = h(x) mod TableSize 1 th probe = (h(x) + 1) mod TableSize 2 th probe = (h(x) + 2) mod TableSize... i th probe = (h(x) + i) mod TableSize

Deletion in Linear Probing 7  To delete 18, first search for 18  18 found in bucket 8  What happens if we set bucket 8 to null?  What will happen when we search for 57?

Deletion in Linear Probing (2) 8  Instead of setting bucket 8 to null, place a special marker there  When lookup encounters marker, it ignores it and continues search  What should insert do if it encounters marker?  Too many markers degrades performance – rehash if there are too many X 957

Primary clustering problem 9  clustering: nodes being placed close together by probing, which degrades hash table's performance  add 89, 18, 49, 58, 9  now searching for the value 28 will have to check half the hash table! no longer constant time...

Linear probing – clustering 10 no collision collision in small cluster collision in large cluster

Alternative probing strategy 11  Primary clustering occurs with linear probing because the same linear pattern:  if a slot is inside a cluster, then the next slot must either:  also be in that cluster, or  expand the cluster  Instead of searching forward in a linear fashion, consider searching forward using a quadratic function

Quadratic probing 12  quadratic probing: resolving collisions on slot i by putting the colliding element into slot i+1, i+4, i+9, i+16,...  add 89, 18, 49, 58, 9  49 collides (89 is already there), so we search ahead by +1 to empty slot 0  58 collides (18 is already there), so we search ahead by +1 to occupied slot 9, then +4 to empty slot 2  9 collides (89 is already there), so we search ahead by +1 to occupied slot 0, then +4 to empty slot 3  What is the lookup algorithm?

Quadratic probing in action 13

Quadratic probing 14  f(i) = i 2  Probe sequence: 0 th probe = h(x) mod TableSize 1 th probe = (h(x) + 1) mod TableSize 2 th probe = (h(x) + 4) mod TableSize 3 th probe = (h(x) + 9) mod TableSize... i th probe = (h(x) + i 2 ) mod TableSize

Quadratic probing benefit 15  If one of h + i 2 falls into a cluster, this does not imply the next one will  For example, suppose an element was to be inserted in bucket 23 in a hash table with 31 buckets  The sequence in which the buckets would be checked is: 23, 24, 27, 1, 8, 17, 28, 10, 25, 11, 30, 20, 12, 6, 2, 0

Quadratic probing benefit 16  Even if two buckets are initially close, the sequence in which subsequent buckets are checked varies greatly  Again, with TableSize = 31, compare the first 16 buckets which are checked starting with elements 22 and 23: 22 22, 23, 26, 0, 7, 16, 27, 9, 24, 10, 29, 19, 11, 5, 1, , 24, 27, 1, 8, 17, 28, 10, 25, 11, 30, 20, 12, 6, 2, 0  Quadratic probing solves the problem of primary clustering

Quadratic probing drawbacks 17  Suppose we have 8 buckets: 1 2 % 8 = 1, 2 2 % 8 = 4, 3 2 % 8 = 1  In this case, we are checking bucket h(x) + 1 twice having checked only one other bucket  No guarantee that (h(x) + i 2 ) % TableSize will cycle through 0, 1,..., TableSize – 1

Quadratic probing 18  Solution:  require that TableSize be prime  (h(x) + i 2 ) % TableSize for i = 0,..., (TableSize – 1)/2 will cycle through (TableSize + 1)/2 values before repeating  Example with TableSize = 11: 0, 1, 4, 9, 16 ≡ 5, 25 ≡ 3, 36 ≡ 3  With TableSize = 13: 0, 1, 4, 9, 16 ≡ 3, 25 ≡ 12, 36 ≡ 10, 49 ≡ 10  With TableSize = 17: 0, 1, 4, 9, 16, 25 ≡ 8, 36 ≡ 2, 49 ≡ 15, 64 ≡ 13, 81 ≡ 13 Note: the symbol ≡ means "% TableSize"

Hashing practice problem 19  Draw a diagram of the state of a hash table of size 10, initially empty, after adding the following elements.  h(x) = x mod 10 as the hash function.  Assume that the hash table uses linear probing. 7, 84, 31, 57, 44, 19, 27, 14, and 64  Repeat the problem above using quadratic probing.

Double hashing 20  double hashing: resolve collisions on slot i by applying a second hash function  f(i) = i * g(x) where g is a second hash function  limitations on what g can evaluate to?  recommended: g(x) = R – (x % R), where R prime smaller than TableSize  Psuedocode for double hashing: if (table is full) error probe = h(value) offset = g(value) while (table[probe] occupied) probe = (probe + offset) % TableSize table[probe] = value

Double Hashing Example h(x) = x % 7 and g(x) = 5 – (x % 5) Probes

Double hashing 22  f(i) = i * g(x)  Probe sequence: 0 th probe = h(x) % TableSize 1 th probe = (h(x) + g(x)) % TableSize 2 th probe = (h(x) + 2*g(x)) % TableSize 3 th probe = (h(x) + 3*g(x)) % TableSize... i th probe = (h(x) + i*g(x)) % TableSize