Hashing CSE 2011 Winter 2011 5 July 2018.

Hashing
BST, AVL trees: O(log N) for insertions, deletions and searches.
Hashing is a technique for performing insertions, deletions and searches in constant average time.
Finding the minimum, finding the maximum, and printing the whole collection in sorted order in linear time are not supported.
A hash table data structure consists of:
- Hash function h
- Array of size N (bucket array)

Example
We design a hash table for a dictionary storing items (SIN, Name), where SIN (social insurance number) is a ten-digit positive integer.
Our hash table uses an array of size N = 10,000 and the hash function h(x) = x mod N.
We use chaining to handle collisions.
Assuming integer keys, how do we map keys to hash table entries?

Index   Items
0       (empty)
1       025-612-0001
2       (empty)
3       (empty)
4       451-229-0004, 981-101-0004
…
9997    (empty)
9998    200-751-9998
9999    (empty)
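The mapping asked about in this example can be sketched in a few lines of Python. This is only an illustration: reading a SIN as an integer by dropping its dashes is an assumption, and the names `sin_to_int` and `h` are hypothetical.

```python
N = 10_000  # table size from the example

def sin_to_int(sin: str) -> int:
    """Drop the dashes and read the digits as one integer (an assumption)."""
    return int(sin.replace("-", ""))

def h(x: int) -> int:
    """Hash function from the example: h(x) = x mod N."""
    return x % N

# The four keys shown in the example table.
for sin in ["025-612-0001", "451-229-0004", "981-101-0004", "200-751-9998"]:
    print(sin, "->", h(sin_to_int(sin)))
```

With N = 10,000 the hash value is simply the last four digits of the key, which is why 451-229-0004 and 981-101-0004 collide in cell 4.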

Hash Functions and Hash Tables
A hash function h maps keys of a given type to integers in a fixed interval [0, N - 1].
Example: h(x) = x mod N is a hash function for integer keys.
The integer h(x) is called the hash value of key x.
The goal of a hash function is to uniformly disperse keys in the range [0, N - 1].
A hash table for a given key type consists of:
- Hash function h
- Array of size N
A collision occurs when two keys in the dictionary have the same hash value.
Collision handling schemes:
- Chaining: colliding items are stored in a sequence
- Open addressing: the colliding item is placed in a different cell of the table

Design Issues
- Hash function
  - For integer keys (compression functions)
  - For strings
- Collision handling
  - Separate chaining
  - Probing (open addressing)
    - Linear probing
    - Quadratic probing
    - Double hashing
- Table size (should be a prime number)

Hash Functions
Division: h2(y) = y mod N
The size N of the hash table is usually chosen to be a prime number to minimize the number of collisions. The reason has to do with number theory and is beyond the scope of this course.
Multiply, Add and Divide (MAD): h2(y) = (ay + b) mod N
a and b are nonnegative integers such that a mod N ≠ 0. Otherwise, every integer would map to the same value, b mod N.
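A minimal sketch of the two compression functions, with N = 13, a = 3 and b = 5 chosen arbitrarily for illustration (the slide does not fix these values). The function names are hypothetical.

```python
def division_hash(y: int, N: int) -> int:
    """Division method: h2(y) = y mod N."""
    return y % N

def mad_hash(y: int, N: int, a: int, b: int) -> int:
    """Multiply, Add and Divide: h2(y) = (a*y + b) mod N, with a mod N != 0."""
    if a % N == 0:
        raise ValueError("a mod N must be nonzero")
    return (a * y + b) % N

N = 13
# A valid choice of a spreads consecutive keys across the table.
spread = [mad_hash(y, N, a=3, b=5) for y in range(5)]

# The degenerate case a mod N = 0, computed directly to show the problem:
# the a*y term vanishes mod N, so every key lands on b mod N.
degenerate = [(13 * y + 5) % N for y in range(5)]

print(spread)
print(degenerate)
```

The second list contains the same value five times, which is the failure the condition a mod N ≠ 0 rules out.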

Collision Handling
Collisions occur when different elements are mapped to the same cell.
Separate chaining: let each cell in the table point to a linked list of entries that map there.

Index   Chain
0       (empty)
1       025-612-0001
2       (empty)
3       (empty)
4       451-229-0004 -> 981-101-0004

Separate chaining is simple, but requires additional memory outside the table.

Separate Chaining
Use chaining to set up lists of items with the same index.
The expected search/insertion/removal time is O(n/N), provided that the indices are uniformly distributed, where
- N = hash table size
- n = number of elements in the table
If n = O(N), the expected running time is O(1).
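Separate chaining can be sketched as below. This is an illustrative version, not the course's implementation: it uses Python lists as the chains instead of linked lists, assumes integer keys with h(k) = k mod N, and the class name `ChainedHashTable` is hypothetical.

```python
class ChainedHashTable:
    """Hash table with separate chaining; each bucket is a list of (key, value)."""

    def __init__(self, N: int = 13):
        self.N = N                             # table size
        self.buckets = [[] for _ in range(N)]  # one chain per cell
        self.n = 0                             # number of stored items

    def _index(self, key: int) -> int:
        return key % self.N                    # h(k) = k mod N

    def put(self, key: int, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                       # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.n += 1

    def get(self, key: int):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

    def remove(self, key: int):
        bucket = self.buckets[self._index(key)]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self.n -= 1
                return v
        return None

table = ChainedHashTable(10_000)
table.put(4512290004, "first")    # both keys hash to cell 4
table.put(9811010004, "second")
print(len(table.buckets[4]), table.get(9811010004))
```

Each operation scans only one chain, so its cost is proportional to that chain's length, which is n/N on average when the indices are uniformly distributed.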

Load Factor – Separate Chaining
Define the load factor λ = n/N, where
- n = number of elements in the hash table
- N = hash table size (prime number)
To obtain the best performance with separate chaining, ensure λ ≤ 1.
As we add more elements to the hash table, λ goes up, so we rehash: allocate a bigger table, define a new hash function, and copy the elements to the new array.
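Rehashing a chained table might look like the sketch below. The bucket layout (lists of integer keys), the helper names, and the choice of 11 as the new prime size are all assumptions made for illustration.

```python
def load_factor(buckets) -> float:
    """λ = n / N for a chained table stored as a list of lists."""
    n = sum(len(bucket) for bucket in buckets)
    return n / len(buckets)

def rehash(buckets, new_N: int):
    """Allocate a bigger table and re-insert every key with h(k) = k mod new_N."""
    new_buckets = [[] for _ in range(new_N)]
    for bucket in buckets:
        for key in bucket:
            new_buckets[key % new_N].append(key)
    return new_buckets

# Six keys in a table of size N = 5, so λ = 1.2 > 1.
old = [[] for _ in range(5)]
for key in [3, 8, 13, 7, 22, 40]:
    old[key % 5].append(key)

new = rehash(old, 11) if load_factor(old) > 1 else old
print(load_factor(old), load_factor(new))
```

Changing N changes the hash function itself, so every key has to be re-inserted rather than copied cell by cell.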

Collision Handling
- Separate chaining
- Probing (open addressing)
  - Linear probing
  - Quadratic probing
  - Double hashing

Linear Probing
Linear probing handles collisions by placing the colliding item in the next (circularly) available table cell.
Each table cell inspected is referred to as a "probe".
Colliding items lump together; future collisions will cause a longer sequence of probes.
Example: h(x) = x mod 13
Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order.
Remove 44, 32, 73, 31.

Index   0   1   2   3   4   5   6   7   8   9   10  11  12
Key             41          18  44  59  32  22  31  73

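Linear Probing Example
[Figure: the 13-cell table from the previous slide, holding 41, 18, 44, 59, 32, 22, 31, 73, with the keys 31, 73, 44, 32 shown separately.]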
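The insertions in this example can be replayed with a short sketch. The function name `insert` is hypothetical, and the table stores bare keys for simplicity.

```python
N = 13
table = [None] * N   # None marks an empty cell

def insert(key: int) -> int:
    """Place key in the first free cell at or after h(key) = key mod N."""
    i = key % N
    for _ in range(N):
        if table[i] is None:
            table[i] = key
            return i
        i = (i + 1) % N          # wrap around circularly
    raise RuntimeError("table is full")

for key in [18, 41, 22, 44, 59, 32, 31, 73]:
    insert(key)

print(table)
```

Keys 44, 32, 31 and 73 all collide and end up in cells 6, 8, 10 and 11, forming the contiguous cluster in cells 5 to 11.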

Search with Linear Probing
Consider a hash table A that uses linear probing.
get(k)
We start at cell h(k).
We probe consecutive locations until one of the following occurs:
- An item with key k is found, or
- An empty cell (null) is found, or
- N cells have been unsuccessfully probed

Algorithm get(k)
    i ← h(k)
    p ← 0
    repeat
        c ← A[i]
        if c = ∅
            return NULL
        else if c.key() = k
            return c.element()
        else
            i ← (i + 1) mod N
            p ← p + 1
    until p = N
    return NULL
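A direct Python rendering of get(k), as a sketch. It assumes integer keys with h(k) = k mod N and stores each item as a (key, element) tuple; the element strings are made up for the demonstration.

```python
def get(A, k):
    """Search table A for key k using linear probing; return its element or None."""
    N = len(A)
    i = k % N                    # start at cell h(k)
    for _ in range(N):           # at most N probes
        c = A[i]
        if c is None:            # empty cell: k is not in the table
            return None
        if c[0] == k:            # found the key
            return c[1]
        i = (i + 1) % N          # probe the next cell, circularly
    return None                  # N cells probed without success

# The table from the h(x) = x mod 13 example.
A = [None] * 13
for cell, key in {2: 41, 5: 18, 6: 44, 7: 59, 8: 32, 9: 22, 10: 31, 11: 73}.items():
    A[cell] = (key, f"item{key}")

print(get(A, 31))   # probes cells 5 through 10
print(get(A, 5))    # probes cells 5 through 12, then hits an empty cell
```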

Removal and Insertion with Probing
remove(k)
- Call get(k) to get the element.
- Should we set the now empty cell to NULL? No. It would mess up the search procedure. See example on the next slide.
- Return the element.
A cell has three states:
- null: brand new, never used. get(x) stops when a null cell is reached.
- in use: currently used.
- available: previously used, now available but unused. get(x) continues the search when encountering an available cell.
Example of available cells: key has value -1.

Example with remove(k)
[Figure: get(31) on the linear probing table holding 41, 18, 44, 59, 32, 22, 31, 73, with the keys 31, 73, 44, 32 shown separately.]

Linear Probing: Removal and Insertion
To handle insertions and deletions, we mark deleted cells as "available" instead of null.
remove(k)
- We search for a cell with key k.
- If such an item is found, we mark the cell as "available" and we return the element.
- Else, we return NULL.
put(k, e)
- If the table is not full, we start at cell h(k).
- If this cell is occupied, we probe consecutive cells until a cell i is found that is either null or marked as "available".
- We store item (k, e) in cell i.
Java code: 9.2.6
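The three cell states can be sketched as follows. This is an illustration under assumptions: it uses a sentinel object `AVAILABLE` in place of the key value -1 mentioned earlier, assumes h(k) = k mod N, and, like the slide's put(k, e), does not check whether k is already stored further along the probe sequence. The class name is hypothetical.

```python
AVAILABLE = object()   # marker for a previously used, now free cell

class LinearProbingTable:
    def __init__(self, N: int = 13):
        self.N = N
        self.cells = [None] * N        # None = never used

    def _probe(self, k: int):
        """Yield the N cell indices visited for key k, starting at h(k)."""
        i = k % self.N
        for _ in range(self.N):
            yield i
            i = (i + 1) % self.N

    def get(self, k: int):
        for i in self._probe(k):
            c = self.cells[i]
            if c is None:              # null cell: stop
                return None
            if c is not AVAILABLE and c[0] == k:
                return c[1]
        return None                    # available cells are skipped over

    def remove(self, k: int):
        for i in self._probe(k):
            c = self.cells[i]
            if c is None:
                return None
            if c is not AVAILABLE and c[0] == k:
                self.cells[i] = AVAILABLE   # mark, do not reset to None
                return c[1]
        return None

    def put(self, k: int, e) -> None:
        for i in self._probe(k):
            c = self.cells[i]
            if c is None or c is AVAILABLE:
                self.cells[i] = (k, e)
                return
        raise RuntimeError("table is full")

t = LinearProbingTable(13)
for key in [18, 41, 22, 44, 59, 32, 31, 73]:
    t.put(key, f"e{key}")

t.remove(44)            # cell 6 becomes AVAILABLE
found = t.get(31)       # search passes through cell 6 and still reaches cell 10
t.put(57, "e57")        # 57 mod 13 = 5; cell 5 is in use, cell 6 is reused
print(found, t.cells[6])
```

Had remove(44) reset cell 6 to None, get(31) would have stopped there and wrongly reported that 31 is absent.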

Load Factor – Linear Probing
Define the load factor λ = n/N, where
- n = number of elements in the hash table
- N = hash table size (prime number)
To obtain the best performance with linear probing, ensure that λ ≤ 0.5.
As we add more elements to the hash table, λ goes up, so we rehash: allocate a bigger table, define a new hash function, and copy the elements to the new array.
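For a linear probing table, the rehash step could be sketched like this. Growing to the first prime at or above twice the old size is an assumed policy, not one stated in the slides, and the helper names are hypothetical. The table stores bare integer keys.

```python
def is_prime(m: int) -> bool:
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def next_prime(m: int) -> int:
    """Smallest prime >= m."""
    while not is_prime(m):
        m += 1
    return m

def rehash(cells):
    """Re-insert every key into a larger table using h(k) = k mod new_N."""
    new_N = next_prime(2 * len(cells))
    new_cells = [None] * new_N
    for k in cells:
        if k is None:
            continue
        i = k % new_N
        while new_cells[i] is not None:   # linear probing in the new table
            i = (i + 1) % new_N
        new_cells[i] = k
    return new_cells

# The h(x) = x mod 13 example table: n = 8, N = 13, so λ is about 0.62.
cells = [None, None, 41, None, None, 18, 44, 59, 32, 22, 31, 73, None]
lam = sum(c is not None for c in cells) / len(cells)
if lam > 0.5:
    cells = rehash(cells)

print(len(cells), sum(c is not None for c in cells) / len(cells))
```

After rehashing into 29 cells the load factor drops to 8/29, and only 44 and 73 still collide.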

Next time …
- Probing (open addressing)
  - Linear probing
  - Quadratic probing
  - Double hashing
- Rehashing
- Hash functions for strings
For a brief comparison of hash tables and self-balancing binary search trees (such as AVL trees), see http://en.wikipedia.org/wiki/Associative_array#Efficient_representations