Hashing CS 110: Data Structures and Algorithms First Semester, 2010-2011.



Hashing ► In a dictionary, if it can be arranged such that the key is also the index to the array that stores the entries, searching and inserting items would be very fast ► Example: empdata[1000] index = employee ID number ► Search for employee with ID number 500 ► return empdata[500] ► Running Time: O(1)
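
A minimal Java sketch of the employee example above, assuming IDs fall in the range 0 to 999. The class name `EmployeeDirectory`, its method names, and the sample name "Cruz" are made up for illustration and are not from the slides.

```java
// Direct addressing: the employee ID itself is the array index.
// EmployeeDirectory, insert, and find are illustrative names only.
class EmployeeDirectory {
    private final String[] empdata = new String[1000]; // valid IDs: 0..999

    void insert(int id, String name) {
        empdata[id] = name;   // a single array write: O(1)
    }

    String find(int id) {
        return empdata[id];   // a single array read: O(1); null if no entry
    }

    public static void main(String[] args) {
        EmployeeDirectory dir = new EmployeeDirectory();
        dir.insert(500, "Cruz");            // made-up sample data
        System.out.println(dir.find(500));  // prints Cruz
    }
}
```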

Hash Table ► A data structure implemented as an array of objects, where the search keys correspond to the array indices ► Insert and find operations involve straightforward array accesses: O(1) time complexity

About Hash Tables ► The first example was relatively easy because the employee number is an integer that can be used directly as an array index ► A few problems may arise in other situations

About Hash Tables ► Problem 1: the possible integer key values might be too large, so creating an array that covers all of them might be impractical ► Need to map large integer values to smaller array indices ► Problem 2: What if the key is a word written in the English alphabet (e.g. a last name)? ► Need to map names to integers (indices)

Large Values to Small Values ► Hash function: converts a number from a large range into a number from a smaller range (the range of array indices) ► Size of the array ► Rule of thumb: the array size should be about twice the size of the data set ► For 50,000 words, use an array of 100,000 elements

Hash Function and Modulo ► Simplest Hash Function: achieved by using the modulo function (returns the remainder) ► For example, 33 % 10 = 3 ► General Formula: LargeNumber % SmallRange
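
The general formula above can be sketched in Java as follows. The class and method names are assumptions for illustration, and the sketch assumes non-negative keys.

```java
// Modulo hash: maps a key from a large range into 0..tableSize-1.
// ModuloHash and hash are illustrative names; keys are assumed non-negative.
class ModuloHash {
    static int hash(int largeNumber, int smallRange) {
        return largeNumber % smallRange;  // remainder after division
    }

    public static void main(String[] args) {
        System.out.println(hash(33, 10));          // prints 3
        System.out.println(hash(1234567, 100000)); // prints 34567
    }
}
```

In Java, `%` can return a negative result for a negative key; `Math.floorMod(key, tableSize)` keeps the result inside the index range in that case.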

Hash Functions for Names ► Sum of Digits Method ► Map the letters A to Z to the numbers 1 to 26 (a=1, b=2, c=3, etc.) ► Add up the values of the letters in the word ► For example, “cats”: c=3, a=1, t=20, s=19, and 3 + 1 + 20 + 19 = 43 ► “cats” will be stored using index 43 ► Use modulo to map to a smaller array
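
A short Java sketch of this method, assuming the word contains only the letters A to Z in either case. The names `LetterSumHash`, `letterSum`, and `hash` are illustrative.

```java
// Letter-sum hash for words: a=1, b=2, ..., z=26, then modulo the table size.
// Assumes the input contains only letters A-Z or a-z.
class LetterSumHash {
    static int letterSum(String word) {
        int sum = 0;
        for (int i = 0; i < word.length(); i++) {
            char c = Character.toLowerCase(word.charAt(i));
            sum += c - 'a' + 1;  // 'a' -> 1, 'b' -> 2, ...
        }
        return sum;
    }

    static int hash(String word, int tableSize) {
        return letterSum(word) % tableSize;
    }

    public static void main(String[] args) {
        System.out.println(letterSum("cats")); // prints 43
        System.out.println(hash("cats", 10));  // prints 3
    }
}
```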

Collisions ► Problem ► Too many words with the same index ► “was”, “tin”, “give”, “tend”, “moan”, “tick” and several other words add to 43 ► These are called collisions: case where two different search keys hash to the same index value

Collisions ► Can occur even when dealing with integers ► Suppose the size of the hash table is 100 ► Keys 158 and 358 hash to the same value (58) when using the modulo hash function

Collision Resolution Policy ► Need to know what to do when a collision occurs; i.e. during an insert operation; What if the array slot is already occupied? ► Most common policy: go to the next available slot ► “wrap around” the array if necessary

Collision Resolution Policy ► Consequence: when searching, apply the hash function, then check whether the element in that slot is the one you are looking for ► If not, try the next slots ► How do you know if the element is not in the array?

Probe Sequence ► The sequence of array indices (slots) that are tried, in order, for a given key value ► The first index in the probe sequence is the home position, which is the value of the hash function ► The next indices are the alternative slots

Probe Sequence ► Suppose the array size is 10, and the hash function is h(K) = K % 10 ► The probe sequence for K=25 is: ► 5,6,7,8,9,0,1,2,3,4 ► Here, we assume the most common collision resolution policy of going to the next slot: p(K,i) = i ► Goal: exhaust array slots
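
The probe sequence for this policy can be computed with a few lines of Java. This is a sketch under the slide's assumptions (h(K) = K % M and an offset of i for the i-th probe); the class and method names are illustrative.

```java
// Computes the full linear-probing probe sequence for a key.
// ProbeSequence, p, and probeSequence are illustrative names.
class ProbeSequence {
    // Offset of the i-th probe from the home position (next-slot policy).
    static int p(int key, int i) {
        return i;
    }

    static int[] probeSequence(int key, int m) {
        int home = key % m;                   // home position h(K)
        int[] seq = new int[m];
        for (int i = 0; i < m; i++) {
            seq[i] = (home + p(key, i)) % m;  // wrap around the array
        }
        return seq;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(probeSequence(25, 10)));
        // prints [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
    }
}
```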

Hash Table Operations ► Insert object Obj with key value K (M is the size of the array HT)
    home ← h(K)
    for i ← 0 to M-1 do
        pos ← (home + p(K,i)) % M
        if HT[pos] is null then
            HT[pos] ← Obj
            break
        else if HT[pos].getKey() = K then
            throw exception “error” // or overwrite it
    // if the loop ends without a break, the table is full

Hash Table Operations ► Finding an object with key value K (M is the size of the array HT)
    home ← h(K)
    for i ← 0 to M-1 do
        pos ← (home + p(K,i)) % M
        if HT[pos] is null then
            throw exception “not found”
        else if HT[pos].getKey() = K then
            return HT[pos]
    throw exception “not found” // every slot was probed
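
The insert and find operations might look like this in Java. This is a sketch, not the course's implementation: it assumes integer keys, string values, and the next-slot policy, and it checks for an empty slot before reading the key. The class `LinearProbingTable` and the choice of exception types are assumptions.

```java
// Hash table with linear probing (insert and find only).
// All names and exception types here are illustrative choices.
class LinearProbingTable {
    private static final class Entry {
        final int key;
        final String value;
        Entry(int key, String value) { this.key = key; this.value = value; }
    }

    private final Entry[] ht;

    LinearProbingTable(int m) { ht = new Entry[m]; }

    private int h(int key) { return Math.floorMod(key, ht.length); }

    void insert(int key, String value) {
        int home = h(key);
        for (int i = 0; i < ht.length; i++) {
            int pos = (home + i) % ht.length;      // p(K,i) = i
            if (ht[pos] == null) {                 // free slot: store here
                ht[pos] = new Entry(key, value);
                return;
            }
            if (ht[pos].key == key) {
                throw new IllegalArgumentException("duplicate key " + key);
            }
        }
        throw new IllegalStateException("table is full");
    }

    String find(int key) {
        int home = h(key);
        for (int i = 0; i < ht.length; i++) {
            int pos = (home + i) % ht.length;
            if (ht[pos] == null) break;            // empty slot ends the search
            if (ht[pos].key == key) return ht[pos].value;
        }
        throw new java.util.NoSuchElementException("not found: " + key);
    }

    public static void main(String[] args) {
        LinearProbingTable t = new LinearProbingTable(10);
        t.insert(25, "first");   // home slot 5
        t.insert(35, "second");  // slot 5 is taken, goes to slot 6
        System.out.println(t.find(35)); // prints second
    }
}
```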

Hash Table Operations ► Although insert and find run in O(1) time under typical conditions, the time complexity in the worst case is O(n) ► Something to think about: characterize the worst-case scenarios for insert and find
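
One way to explore the question is to count how many slots an insert examines. The sketch below, with illustrative names, inserts keys that all share the same home position into a table of size 10, which is one scenario (not necessarily the only one) where the cost grows with the number of stored elements.

```java
// Counts the slots examined by a linear-probing insert.
// WorstCaseProbes and insertCountingProbes are illustrative names.
class WorstCaseProbes {
    // Stores key in the first free slot and returns how many slots were examined.
    static int insertCountingProbes(Integer[] table, int key) {
        int m = table.length;
        int home = key % m;
        for (int i = 0; i < m; i++) {
            int pos = (home + i) % m;
            if (table[pos] == null) {
                table[pos] = key;
                return i + 1;
            }
        }
        throw new IllegalStateException("table is full");
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[10];
        // 0, 10, 20, 30, 40 all hash to home position 0.
        for (int key = 0; key < 50; key += 10) {
            int probes = insertCountingProbes(table, key);
            System.out.println(key + " -> " + probes + " probe(s)");
        }
    }
}
```

Each new key in this run has to step past every key inserted before it, so the k-th insert examines k slots.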

Removing Elements ► Removing an element from a hash table during a delete operation poses a problem ► If we set the corresponding hash table entry to null, then succeeding find operations might not work properly ► Recall that for the find algorithm, seeing a null means the target element is not found, but in fact the element might be in a later slot of its probe sequence

Removing Elements ► Solution: tombstone ► Arrange it so that deleted entries seem null when inserting, but don’t seem null when searching ► Requires a simple flag on the objects stored
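
A sketch of the tombstone idea in Java, using a `deleted` flag on each stored entry. The names are illustrative, `find` returns null instead of throwing to keep the example short, and `insert` assumes the key is not already in the table.

```java
// Linear probing with tombstones: a deleted entry is reusable on insert
// but does not stop a search. All names here are illustrative.
class TombstoneTable {
    private static final class Entry {
        final int key;
        final String value;
        boolean deleted;  // the tombstone flag
        Entry(int key, String value) { this.key = key; this.value = value; }
    }

    private final Entry[] ht;

    TombstoneTable(int m) { ht = new Entry[m]; }

    private int slot(int key, int i) {
        return (Math.floorMod(key, ht.length) + i) % ht.length;
    }

    // Assumes key is not already stored in the table.
    void insert(int key, String value) {
        for (int i = 0; i < ht.length; i++) {
            int pos = slot(key, i);
            if (ht[pos] == null || ht[pos].deleted) {  // tombstone seems null
                ht[pos] = new Entry(key, value);
                return;
            }
        }
        throw new IllegalStateException("table is full");
    }

    String find(int key) {
        for (int i = 0; i < ht.length; i++) {
            Entry e = ht[slot(key, i)];
            if (e == null) return null;                // truly empty: stop
            if (!e.deleted && e.key == key) return e.value;
            // tombstone or other key: keep probing
        }
        return null;
    }

    boolean remove(int key) {
        for (int i = 0; i < ht.length; i++) {
            Entry e = ht[slot(key, i)];
            if (e == null) return false;
            if (!e.deleted && e.key == key) {
                e.deleted = true;                      // leave a tombstone
                return true;
            }
        }
        return false;
    }
}
```

With keys 25 and 35 in a table of size 10, removing 25 leaves a tombstone in slot 5, so a later search for 35 still continues on to slot 6.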

Hash Tables in Java ► java.util.Hashtable ► Important methods for Hashtable class ► put(Object key, Object entry) ► Object get(Object key) ► remove(Object key) ► boolean containsKey(Object key)
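
A brief usage sketch of these four methods. It uses the generic form `Hashtable<String, Integer>` rather than the raw `Object` signatures listed above, and writes the class name fully qualified so that no import line is needed. The demo class and the sample entries are made up for illustration.

```java
// Basic use of java.util.Hashtable: put, get, containsKey, remove.
// HashtableDemo and the sample data are illustrative only.
class HashtableDemo {
    static java.util.Hashtable<String, Integer> sample() {
        java.util.Hashtable<String, Integer> table = new java.util.Hashtable<>();
        table.put("cats", 43);  // key -> entry
        table.put("was", 43);
        return table;
    }

    public static void main(String[] args) {
        java.util.Hashtable<String, Integer> table = sample();
        System.out.println(table.get("cats"));         // prints 43
        System.out.println(table.containsKey("dogs")); // prints false
        table.remove("cats");
        System.out.println(table.containsKey("cats")); // prints false
    }
}
```

`get` returns null when the key is absent, and `Hashtable` rejects null keys and null values with a `NullPointerException`.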

Summary ► Hash tables implement the dictionary abstract data type and enable O(1) insert, find, and remove operations under typical conditions ► Caveat: O(n) in the worst case because of the possibility of collisions

Summary ► Requires a hash function (maps keys to array indices) and a collision resolution policy ► A probe sequence is the sequence of array slots that an object could occupy, given its key ► In Java: use the Hashtable class