Hashtables

An abstract data type that supports the following operations:
– Insert
– Find
– Remove
Search trees can be used for the same operations, but they require an order relation to be defined on the elements and take logarithmic time per operation. Hashtables do not require an order relation on the elements, and all operations take O(1) time on average.

Direct Access Tables
Assume that the keys are distinct numbers in the range U = {1, 2, 3, …, m}. Use an array of size m and place the element with key k at index k of the array.
– O(1) time for all operations
– Problem: wasteful when few of the m possible keys are actually stored, and impractical if m is very large
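A minimal Java sketch of a direct access table may make the idea concrete. The class name `DirectAccessTable` and the choice of `String` values are assumptions made for illustration; they do not come from the slides.

```java
// Hypothetical illustration: keys are integers in 1..m and index the array directly.
final class DirectAccessTable {
    private final String[] slots;   // slots[k] holds the value stored under key k, or null

    DirectAccessTable(int m) {
        slots = new String[m + 1];  // index 0 stays unused so that keys 1..m map to themselves
    }

    void insert(int key, String value) { slots[key] = value; }  // O(1)
    String find(int key) { return slots[key]; }                 // O(1), null if absent
    void remove(int key) { slots[key] = null; }                 // O(1)

    public static void main(String[] args) {
        DirectAccessTable t = new DirectAccessTable(10);
        t.insert(3, "three");
        System.out.println(t.find(3));  // three
        t.remove(3);
        System.out.println(t.find(3));  // null
    }
}
```

The array has one entry per possible key, so memory use is proportional to m regardless of how many keys are stored, which is the problem named above.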

Hashtables
Main idea: instead of using the keys themselves as indices into the table, use a hash function h to map keys to indices in a table of m slots.
Note: U is the set of all possible keys, so it is usually much larger than m.

Simple Uniform Hashing
We assume a hash function that, given a key, hashes it into any of the m slots with equal probability. We will try to provide some reasonable hash functions later.

Hash functions
The hash function is responsible for mapping keys to integers (slot numbers). A good hash function must have the following properties:
1. Easy to evaluate: h(x) can be computed in O(1)
2. Uniform distribution over all the table slots
3. Similar keys are mapped to different slots

Hash functions
The first step is to represent the key as a natural number. For example, if S is a String, we can interpret it as an integer value computed from its characters.
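The formula on this slide is not available here, so the following Java sketch uses one common choice as an assumption: read the string as a number written in some radix (128 here), most significant character first, and reduce it modulo the table size m as you go. The names `StringKeyDemo` and `slotFor` are hypothetical.

```java
// Hypothetical illustration: a string interpreted as a radix-128 number, reduced mod m.
final class StringKeyDemo {
    // Horner's rule: process one character at a time and reduce modulo m at each step,
    // so the intermediate value never overflows.
    static int slotFor(String s, int radix, int m) {
        long h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = (h * radix + s.charAt(i)) % m;
        }
        return (int) h;
    }

    public static void main(String[] args) {
        // "pt" as a radix-128 number is 112 * 128 + 116 = 14452, and 14452 mod 701 = 432
        System.out.println(slotFor("pt", 128, 701));  // 432
    }
}
```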

Collisions
Mapping keys to indices can cause collisions: two keys collide if the hash function maps them to the same index.
Solutions:
– Chaining
– Open addressing

Collision resolution - Chaining
– All keys that have the same hash value are placed in a linked list
– Insertion can be done at the beginning of the list in O(1) time
– Search time is proportional to the length of the list

Collision resolution by chaining
Let T be a hash table with 9 slots and h(k) = k mod 9. Insert the elements 6, 43, 23, 62, 1, 13, 34, 55, 25:
h(6) = 6 mod 9 = 6
h(43) = 43 mod 9 = 7
h(23) = 23 mod 9 = 5
h(62) = 62 mod 9 = 8
h(1) = 1 mod 9 = 1
h(13) = 13 mod 9 = 4
h(34) = 34 mod 9 = 7
h(55) = 55 mod 9 = 1
h(25) = 25 mod 9 = 7
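The example above can be replayed with a small chained table. This is only a sketch: the class `ChainedHashTable` and its method names are hypothetical, and `insert` assumes the key is not already present, which is why it can run in O(1) without searching the chain first.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical illustration of chaining with h(k) = k mod m.
final class ChainedHashTable {
    private final List<LinkedList<Integer>> slots = new ArrayList<>();

    ChainedHashTable(int m) {
        for (int i = 0; i < m; i++) slots.add(new LinkedList<>());
    }

    private int h(int k) { return Math.floorMod(k, slots.size()); }

    void insert(int k) { slots.get(h(k)).addFirst(k); }            // O(1): new key goes to the head
    boolean find(int k) { return slots.get(h(k)).contains(k); }    // walks a single chain
    boolean remove(int k) { return slots.get(h(k)).remove(Integer.valueOf(k)); }
    List<Integer> chain(int slot) { return slots.get(slot); }

    public static void main(String[] args) {
        ChainedHashTable t = new ChainedHashTable(9);
        for (int k : new int[] {6, 43, 23, 62, 1, 13, 34, 55, 25}) t.insert(k);
        System.out.println(t.chain(7));  // [25, 34, 43]
        System.out.println(t.chain(1));  // [55, 1]
    }
}
```

Because insertion happens at the head, each chain lists its keys in reverse order of arrival.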

Analysis
The load factor α of a hashtable is defined as the number of elements stored in the table, n, divided by the number of slots, m: α = n/m.
With chaining, a search takes Θ(1 + α) time on average under the assumption of simple uniform hashing.

Division method
An appropriate hash function for a hashtable that uses chaining is the division method: h(k) = k mod m.
– Powers of 10 and 2 should be avoided as values of m
– Good values of m are primes not close to powers of 2

Open Addressing
Each element occupies a single slot in the hashtable; no chaining is done.
To insert an element, we probe the table according to the hash function until an empty slot is found.
The hash function is now a function of both the key k and the number i of attempts made so far in the insertion process: h(k, i).

Hash Insert
HashInsert(T, k) {
    for (int i = 0; i < m; i++) {
        int j = h(k, i);        // i-th slot in the probe sequence of k
        if (T[j] == null) {     // empty slot found
            T[j] = k;
            return j;
        }
    }
    error "hashtable overflow"; // all m probes hit occupied slots
}

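Hash Search
HashSearch(T, k) {
    for (int i = 0; i < m; i++) {
        int j = h(k, i);
        if (T[j] == null)
            return NOT_FOUND;   // an empty slot ends the probe sequence
        else if (T[j] == k)
            return j;
    }
    return NOT_FOUND;           // probed all m slots without finding k
}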

Linear probing
Linear probing takes an ordinary hash function h', such as one using the division method, and turns it into h(k, i) = (h'(k) + i) mod m.
If a slot is occupied, we try the subsequent slot, and so on; thus the initial slot determines the entire probing sequence for insertion and search.

Linear Probing
Easy to implement, but suffers from primary clustering: an empty slot that follows a run of occupied slots is more likely to be filled next than any other slot, so runs of occupied slots tend to grow.

Linear Probing
Given a hash function h', the linear probing scheme is simply h(k, i) = (h'(k) + i) mod m, for i = 0, 1, …, m − 1.

Exercise
You are given a hash table T with m = 11 slots. Demonstrate inserting the following elements using linear probing and the hash function h'(k) = k mod m:
– 10, 22, 31, 4, 15, 28, 17, 88, 59

Solution
h(10,0) = (10 mod 11 + 0) mod 11 = 10
h(22,0) = (22 mod 11 + 0) mod 11 = 0
h(31,0) = (31 mod 11 + 0) mod 11 = 9
h(4,0) = (4 mod 11 + 0) mod 11 = 4
h(15,0) = (15 mod 11 + 0) mod 11 = 4
h(15,1) = (15 mod 11 + 1) mod 11 = 5
h(28,0) = (28 mod 11 + 0) mod 11 = 6
h(17,0) = (17 mod 11 + 0) mod 11 = 6
h(17,1) = (17 mod 11 + 1) mod 11 = 7
h(88,0) = (88 mod 11 + 0) mod 11 = 0
h(88,1) = (88 mod 11 + 1) mod 11 = 1
h(59,0) = (59 mod 11 + 0) mod 11 = 4
h(59,1) = (59 mod 11 + 1) mod 11 = 5
h(59,2) = (59 mod 11 + 2) mod 11 = 6
h(59,3) = (59 mod 11 + 3) mod 11 = 7
h(59,4) = (59 mod 11 + 4) mod 11 = 8
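The exercise can be checked with a runnable version of the insert and search routines. The sketch below assumes h(k, i) = (k mod m + i) mod m; the class name `LinearProbingTable` and the convention of returning -1 for "not found" or "table full" are choices made for illustration.

```java
// Hypothetical illustration of open addressing with linear probing.
final class LinearProbingTable {
    private final Integer[] slots;  // null marks an empty slot

    LinearProbingTable(int m) { slots = new Integer[m]; }

    // h(k, i) = (h'(k) + i) mod m with h'(k) = k mod m
    private int probe(int k, int i) {
        return (Math.floorMod(k, slots.length) + i) % slots.length;
    }

    // Returns the slot that received k, or -1 if all m probes hit occupied slots.
    int insert(int k) {
        for (int i = 0; i < slots.length; i++) {
            int j = probe(k, i);
            if (slots[j] == null) { slots[j] = k; return j; }
        }
        return -1;
    }

    // Returns the slot holding k, or -1 if k is not in the table.
    int search(int k) {
        for (int i = 0; i < slots.length; i++) {
            int j = probe(k, i);
            if (slots[j] == null) return -1;  // an empty slot ends the probe sequence
            if (slots[j] == k) return j;
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbingTable t = new LinearProbingTable(11);
        for (int k : new int[] {10, 22, 31, 4, 15, 28, 17, 88, 59}) {
            System.out.print(t.insert(k) + " ");  // 10 0 9 4 5 6 7 1 8
        }
    }
}
```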

Quadratic Probing
Quadratic probing again starts from an initial hash function h', and the hash function is now h(k, i) = (h'(k) + c1·i + c2·i²) mod m.
Choosing a subsequent slot once a slot is full depends on the probe number i.
Quadratic probing involves a secondary form of clustering, since the initial probe alone determines the entire probing sequence.

Quadratic Probing
Given a hash function h', quadratic probing is done by h(k, i) = (h'(k) + c1·i + c2·i²) mod m, where c1 and c2 ≠ 0 are constants.

Example
You are given a hash table T with 11 slots. Demonstrate inserting the following elements using quadratic probing and the hash function h(k, i) = (k mod 11 + i + 3i²) mod 11:
– 10, 22, 31, 4, 15, 28, 17, 88, 59

h(10,0) = (10 mod 11 + 0) mod 11 = 10
h(22,0) = (22 mod 11 + 0) mod 11 = 0
h(31,0) = (31 mod 11 + 0) mod 11 = 9
h(4,0) = (4 mod 11 + 0) mod 11 = 4
h(15,0) = (15 mod 11 + 0) mod 11 = 4
h(15,1) = (15 mod 11 + 1 + 3·1²) mod 11 = 8
h(28,0) = (28 mod 11 + 0) mod 11 = 6
h(17,0) = (17 mod 11 + 0) mod 11 = 6
h(17,1) = (17 mod 11 + 1 + 3·1²) mod 11 = 10
h(17,2) = (17 mod 11 + 2 + 3·2²) mod 11 = 9
h(17,3) = (17 mod 11 + 3 + 3·3²) mod 11 = 3
h(88,0) = (88 mod 11 + 0) mod 11 = 0
h(88,1) = (88 mod 11 + 1 + 3·1²) mod 11 = 4
h(88,2) = (88 mod 11 + 2 + 3·2²) mod 11 = 3
h(88,3) = (88 mod 11 + 3 + 3·3²) mod 11 = 8
h(88,4) = (88 mod 11 + 4 + 3·4²) mod 11 = 8
h(88,5) = (88 mod 11 + 5 + 3·5²) mod 11 = 3
h(88,6) = (88 mod 11 + 6 + 3·6²) mod 11 = 4
h(88,7) = (88 mod 11 + 7 + 3·7²) mod 11 = 0
h(88,8) = (88 mod 11 + 8 + 3·8²) mod 11 = 2
h(59,0) = (59 mod 11 + 0) mod 11 = 4
h(59,1) = (59 mod 11 + 1 + 3·1²) mod 11 = 8
h(59,2) = (59 mod 11 + 2 + 3·2²) mod 11 = 7
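The probe values listed above are consistent with the constants c1 = 1 and c2 = 3, that is, h(k, i) = (k mod 11 + i + 3i²) mod 11. Taking those constants as an assumption, the following Java sketch replays the insertions and reports the slot each key ends up in. The class and method names are hypothetical.

```java
// Hypothetical illustration of quadratic probing with c1 = 1 and c2 = 3 (assumed constants).
final class QuadraticProbingDemo {
    static int probe(int k, int i, int m) {
        return Math.floorMod(k % m + i + 3 * i * i, m);
    }

    // Inserts the keys in order into an empty table of m slots and returns,
    // for each key, the slot it was placed in (-1 if no free slot was probed).
    static int[] insertAll(int[] keys, int m) {
        Integer[] table = new Integer[m];  // null marks an empty slot
        int[] usedSlot = new int[keys.length];
        for (int n = 0; n < keys.length; n++) {
            usedSlot[n] = -1;
            for (int i = 0; i < m; i++) {
                int j = probe(keys[n], i, m);
                if (table[j] == null) {
                    table[j] = keys[n];
                    usedSlot[n] = j;
                    break;
                }
            }
        }
        return usedSlot;
    }

    public static void main(String[] args) {
        int[] slots = insertAll(new int[] {10, 22, 31, 4, 15, 28, 17, 88, 59}, 11);
        System.out.println(java.util.Arrays.toString(slots));  // [10, 0, 9, 4, 8, 6, 3, 2, 7]
    }
}
```

Key 88 needs nine probes because its sequence revisits slots 3, 4, 8 and 0 before reaching the free slot 2; unlike linear probing, a quadratic probe sequence is not guaranteed to visit every slot.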

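Double Hashing
Given two hash functions h1 and h2, the probe sequence is h(k, i) = (h1(k) + i·h2(k)) mod m.
Problem: h2(k) and m should not have any common divisors; otherwise the probe sequence does not reach every slot.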

Double Hashing
Example 1: select m to be a power of 2, and design h2 to produce odd numbers.
Example 2: select m to be prime and m' to be m − 1, for example with h1(k) = k mod m and h2(k) = 1 + (k mod m').
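As a sketch of the second example, assume h1(k) = k mod m and h2(k) = 1 + (k mod (m − 1)) with m prime, so that h2(k) is always between 1 and m − 1 and shares no divisor with m. The Java code below lists the full probe sequence for one key; the class and method names are hypothetical.

```java
// Hypothetical illustration of double hashing: h(k, i) = (h1(k) + i * h2(k)) mod m.
final class DoubleHashingDemo {
    static int probe(int k, int i, int m) {
        int h1 = Math.floorMod(k, m);
        int h2 = 1 + Math.floorMod(k, m - 1);  // assumed second hash function, never 0
        return (int) ((h1 + (long) i * h2) % m);
    }

    // The m slots probed for key k, in order.
    static int[] probeSequence(int k, int m) {
        int[] seq = new int[m];
        for (int i = 0; i < m; i++) seq[i] = probe(k, i, m);
        return seq;
    }

    public static void main(String[] args) {
        // k = 14, m = 13: h1 = 1, h2 = 1 + (14 mod 12) = 3
        System.out.println(java.util.Arrays.toString(probeSequence(14, 13)));
        // [1, 4, 7, 10, 0, 3, 6, 9, 12, 2, 5, 8, 11]
    }
}
```

Every slot from 0 to 12 appears exactly once in the sequence, which is what the "no common divisors" requirement guarantees.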

Analysis
In open addressing the load factor α cannot be more than 1.
Assuming uniform hashing and α < 1, insertion and unsuccessful search require at most 1/(1 − α) probes on average.
A successful search takes at most (1/α) ln(1/(1 − α)) probes on average.

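Analysis
When the table is 50% full, an unsuccessful search requires at most 1/(1 − 0.5) = 2 probes on average, and a successful search fewer than 1.387.
When the table is 90% full, an unsuccessful search requires at most 1/(1 − 0.9) = 10 probes on average, and a successful search fewer than 2.559.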
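To get a feel for how quickly the cost grows with the load factor, the standard open-addressing bounds under the uniform hashing assumption, 1/(1 − α) for an unsuccessful search and (1/α) ln(1/(1 − α)) for a successful one, can be evaluated directly. The class `ProbeBounds` is a hypothetical helper for illustration.

```java
// Hypothetical helper: upper bounds on the expected number of probes in open addressing,
// assuming uniform hashing and a load factor 0 < alpha < 1.
final class ProbeBounds {
    static double unsuccessful(double alpha) {
        return 1.0 / (1.0 - alpha);
    }

    static double successful(double alpha) {
        return Math.log(1.0 / (1.0 - alpha)) / alpha;
    }

    public static void main(String[] args) {
        for (double alpha : new double[] {0.5, 0.9}) {
            System.out.printf("alpha=%.1f unsuccessful<=%.3f successful<=%.3f%n",
                    alpha, unsuccessful(alpha), successful(alpha));
        }
    }
}
```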

Problems with open addressing
If an element is deleted, we cannot simply clear its slot, since later searches for keys whose probe sequence passed through that slot may fail.
Rehashing the table on every deletion would ruin the running time.
Solution: mark the slot with a special DELETED value.
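A minimal sketch of the DELETED marker, using linear probing with h(k, i) = (k mod m + i) mod m as an assumed probe function. The class name `TombstoneTable` is hypothetical, and `insert` assumes the key is not already in the table.

```java
// Hypothetical illustration: open addressing with a DELETED marker (tombstone).
final class TombstoneTable {
    private static final Object DELETED = new Object();  // marks a slot whose key was removed
    private final Object[] slots;                        // null marks a never-used slot

    TombstoneTable(int m) { slots = new Object[m]; }

    private int probe(int k, int i) {
        return (Math.floorMod(k, slots.length) + i) % slots.length;
    }

    // Places k in the first empty or DELETED slot of its probe sequence.
    boolean insert(int k) {
        for (int i = 0; i < slots.length; i++) {
            int j = probe(k, i);
            if (slots[j] == null || slots[j] == DELETED) { slots[j] = k; return true; }
        }
        return false;  // table overflow
    }

    // Returns the slot holding k, or -1. DELETED slots are skipped, not treated as empty.
    int search(int k) {
        for (int i = 0; i < slots.length; i++) {
            int j = probe(k, i);
            if (slots[j] == null) return -1;
            if (slots[j] != DELETED && slots[j].equals(k)) return j;
        }
        return -1;
    }

    boolean delete(int k) {
        int j = search(k);
        if (j < 0) return false;
        slots[j] = DELETED;  // keep the probe chain intact for keys stored further along
        return true;
    }
}
```

With m = 11, the keys 4, 15 and 26 all start at slot 4 and land in slots 4, 5 and 6. After deleting 15, a search for 26 still succeeds because it steps over the marker in slot 5 instead of stopping there.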

Rehashing
If we do not know the number of elements in advance, we use a technique similar to the one used in vectors: once the load factor reaches some predefined threshold, rehash the data into a larger hashtable.
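A short sketch of rehashing on a chained table. The threshold of 0.75, the initial size of 4 and the doubling policy are assumptions chosen to keep the code small; following the division-method advice, a real table would rather grow to a prime near the doubled size. The class name `ResizingHashSet` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: a chained hash set that rehashes into a larger table
// once the load factor would exceed MAX_LOAD.
final class ResizingHashSet {
    private static final double MAX_LOAD = 0.75;  // assumed threshold
    private List<List<Integer>> slots = newSlots(4);
    private int size = 0;

    private static List<List<Integer>> newSlots(int m) {
        List<List<Integer>> s = new ArrayList<>(m);
        for (int i = 0; i < m; i++) s.add(new ArrayList<>());
        return s;
    }

    private static int h(int k, int m) { return Math.floorMod(k, m); }

    boolean contains(int k) { return slots.get(h(k, slots.size())).contains(k); }

    boolean insert(int k) {
        if (contains(k)) return false;
        if ((double) (size + 1) / slots.size() > MAX_LOAD) rehash(2 * slots.size());
        slots.get(h(k, slots.size())).add(k);
        size++;
        return true;
    }

    // Every key must be hashed again, because h depends on the number of slots.
    private void rehash(int newM) {
        List<List<Integer>> bigger = newSlots(newM);
        for (List<Integer> chain : slots) {
            for (int k : chain) bigger.get(h(k, newM)).add(k);
        }
        slots = bigger;
    }

    int capacity() { return slots.size(); }
}
```

A single rehash costs O(n), but because the table grows geometrically the cost averages out to O(1) per insertion, as with vectors.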

Example
Given a set S of unique integers and a number z, find x, y ∈ S such that x + y = z.
– An efficient worst case algorithm
– An efficient average case algorithm

An efficient worst case algorithm
1. Sort all elements in S.
2. For every x in S, search for y = z − x in S using binary search.
– Total of O(n log n)

An efficient average case algorithm
1. We use a hash table where m is of order n. For all x ∈ S we execute insert(x).
2. For all x ∈ S we execute search(z − x).
– Total: O(n) in the average case
– Total: O(n²) in the worst case
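The two steps can be sketched in Java with `java.util.HashSet` standing in for the hash table. The sketch assumes that x and y must be two different elements of S, which the slide does not state; the names `PairSum` and `findPair` are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the average-case O(n) algorithm.
final class PairSum {
    // Returns {x, y} with x + y == z and x != y (assumption), or null if no such pair exists.
    static int[] findPair(int[] s, int z) {
        Set<Integer> table = new HashSet<>();
        for (int x : s) table.add(x);          // step 1: insert(x) for all x in S
        for (int x : s) {                      // step 2: search(z - x) for all x in S
            long y = (long) z - x;             // long arithmetic avoids int overflow
            if (y != x && y >= Integer.MIN_VALUE && y <= Integer.MAX_VALUE
                    && table.contains((int) y)) {
                return new int[] {x, (int) y};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] pair = findPair(new int[] {1, 5, 9, 14}, 15);
        System.out.println(pair[0] + " + " + pair[1]);  // 1 + 14
    }
}
```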

Example
Given a set S of sortable items, we are asked whether all items in S are unique.
1. Sort the elements of S.
2. Iterate over the sorted elements, checking for adjacent equal values.
– Execution time: O(n log n)

Example
1. Use a hash table where m is of order n. For all x ∈ S we execute insert(x).
We modify the insert operation to signal if x already exists in the table (every insert includes a search operation).
– Execution time: O(n) in the average case
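In Java, `java.util.HashSet.add` already behaves like the modified insert: it returns `false` when the element is present. A minimal sketch, with the hypothetical names `Uniqueness` and `allUnique`:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration: uniqueness check in O(n) average time.
final class Uniqueness {
    static <T> boolean allUnique(Iterable<T> items) {
        Set<T> seen = new HashSet<>();
        for (T item : items) {
            if (!seen.add(item)) return false;  // add signals that item was already in the table
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allUnique(java.util.Arrays.asList(3, 1, 2)));  // true
        System.out.println(allUnique(java.util.Arrays.asList(3, 1, 3)));  // false
    }
}
```

Unlike the sorting approach, this needs only equality and a hash code for the items, not an order relation.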

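Java hashCode
Each Java object has a method public int hashCode(), which is defined in class Object and is supported for the purposes of hashtables and hashmaps.
The default implementation typically returns a number derived from the object's identity (historically described as based on its memory location); the values are distinct for distinct objects as far as is practical, but they are not guaranteed to be unique.
If two objects are equal according to equals(), they must have the same hash code.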

Java hashCode
It is not required that distinct objects have distinct hash codes, but distinct hash codes improve the performance of hashtables.
Can the hash code of an object change throughout its life cycle?
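On the closing question: the contract of `Object.hashCode` requires the value to stay the same during one execution of a program as long as no information used in `equals` comparisons is modified. One way to stay on the safe side is to compute the hash code only from final fields, as in the sketch below. The class `GridPoint` is a hypothetical example type.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical example type: equals and hashCode are both derived from the same final fields,
// so equal objects have equal hash codes and the hash code never changes.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        Set<GridPoint> set = new HashSet<>();
        set.add(new GridPoint(1, 2));
        System.out.println(set.contains(new GridPoint(1, 2)));  // true
    }
}
```

If a field used by `hashCode` were mutable and changed while the object was stored in a `HashSet` or used as a `HashMap` key, later lookups would start from the wrong slot and could miss the object.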