Hash Tables © Rick Mercer.  Outline  Discuss what a hash method does  translates a string key into an integer  Discuss a few strategies for implementing.


Hash Tables © Rick Mercer

Hash Tables: A "faster" implementation of a Map

Outline
- Discuss what a hash method does
  - translates a string key into an integer
- Discuss a few strategies for implementing a hash table
  - linear probing
  - quadratic probing
  - separate chaining hashing

Big-O complexity for various Map implementations

Data Structure          put    get    delete
Unsorted Array
Sorted Array
Unsorted Linked List
Sorted Linked List
Binary Search Tree

Hash Tables  Hash table: another structure for storing data  Provides virtually direct access to objects based on a key (a unique String or Integer)  key could be your SID, your telephone number, social security number, account number, …  keys must be unique  Each key is associated with (mapped to) an object

Hashing  Must convert keys such as " " into an integer index from 0 to some reasonable size  Elements can be found, inserted, and removed using the integer index as an array index  Insert (called put), find (get), and remove must use the same "address calculator"  which we call the Hash function  Good structure for implementing a dictionary

Hashing  Can make a String or Integer object into a key by "hashing" the key to get an int  Ideally, every key has a unique hash value  Then the hash value could be used as an array index, however,  you cannot rely on every key "hashing" to a unique integer  but can usually get close enough  Still n eed a way to handle "collisions" "abc" may hash to the same int as "cba" –if a lousy hash function is used

Hash Tables: Runtime Efficient
- Lookup time does not grow as n increases (on average, given a good hash function and a table that is not too full)
- A hash table supports
  - fast retrieval: O(1)
  - fast deletion: O(1)
  - fast insertion: O(1)
- Could use String keys, since each ASCII character equals some unique integer
  - "able" = 97 + 98 + 108 + 101 == 404

Hash method works something like this
- Convert a String key into an integer that will be in the range of 0 through the maximum capacity - 1
- Assume the array capacity is 9997
- Domain: "AAAAAAAA" .. "zzzzzzzz"
- Range: 0 .. 9996
[Figure: hash(key) maps keys from the domain to array indexes in the range; the two example indexes shown are 8482 and 1273.]

Hash method  What if the ASCII value of individual chars of the string key added up to a number from ("A") 65 to possibly 488 ("zzzz") 4 chars max  If the array has size = 309, mod the sum "abba": 390 % TABLE_SIZE = 81 "abcd": 394 % TABLE_SIZE = 85 "able": 404 % TABLE_SIZE = 95  These array indices store these keys abba abcd able

A terrible hash

    public void testHash() {
      // Anagrams such as "abba" and "baab" always collide
      assertEquals(81, hash("abba"));
      assertEquals(81, hash("baab"));
      assertEquals(85, hash("abcd"));
      assertEquals(86, hash("abce"));
      assertEquals(308, hash("IKLT"));
      assertEquals(308, hash("KLMP"));
    }

    private final int TABLE_SIZE = 309;

    public int hash(String key) {
      // return an int in the range of 0..TABLE_SIZE-1
      int result = 0;
      int n = key.length();
      for (int j = 0; j < n; j++)
        result += key.charAt(j); // add up the characters
      return result % TABLE_SIZE;
    }

Collisions  A good hash method  executes quickly  distributes keys equitably  But you still have to handle collisions when two keys have the same hash value  the hash method is not guaranteed to return a unique integer for each key  example: simple hash method with "baab" and "abba"  There are several ways to handle collisions  let us first examine linear probing
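To make "distributes keys equitably" concrete, here is a minimal sketch that compares the additive hash from the slides with a position-sensitive one. The class name `PolynomialHash`, the method names, and the multiplier 31 are illustrative assumptions, not part of the original slides.

```java
// Hypothetical comparison; polyHash is an assumed alternative, not the slides' method.
class PolynomialHash {
    static final int TABLE_SIZE = 309;

    // The slides' approach: add the character codes, so anagrams always collide.
    static int sumHash(String key) {
        int result = 0;
        for (int j = 0; j < key.length(); j++) {
            result += key.charAt(j);
        }
        return result % TABLE_SIZE;
    }

    // Position-sensitive variant: multiply the running total by 31 before
    // adding each character, so the order of the characters matters.
    static int polyHash(String key) {
        int result = 0;
        for (int j = 0; j < key.length(); j++) {
            result = 31 * result + key.charAt(j); // int overflow is possible for long keys
        }
        // floorMod keeps the index non-negative even if result overflowed
        return Math.floorMod(result, TABLE_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(sumHash("abba") + " " + sumHash("baab"));   // prints "81 81"
        System.out.println(polyHash("abba") + " " + polyHash("baab")); // two different indexes
    }
}
```

Even with a better hash method, two different keys can still land on the same index, so a collision strategy is still required.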

Linear Probing: Dealing with Collisions
- Collision: when an element to be inserted hashes to an array position that is already occupied
- Linear probing: search sequentially for an unoccupied position
  - use wraparound

A hash table after three insertions using the too simple hash code method
- Insert objects with these three keys: "abba", "abcd", "abce"
    [81] "abba"
    ...
    [85] "abcd"
    [86] "abce"

Collision occurs while inserting "baab"
- Can't insert "baab" where it hashes to: the same slot as "abba"
- Linear probe forward by 1, inserting it at the next available slot
  - try [81], put in [82]
    [81] "abba"
    [82] "baab"
    ...
    [85] "abcd"
    [86] "abce"

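Wrap around when collision occurs at end
- Insert "KLMP" and "IKLT", both of which have a hash value of 308
- "KLMP" takes [308], the last slot; "IKLT" collides there and wraps around to [0]
    [0]   "IKLT"
    ...
    [81]  "abba"
    [82]  "baab"
    ...
    [85]  "abcd"
    [86]  "abce"
    ...
    [308] "KLMP"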
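The insertions traced on the last few slides can be reproduced with a short sketch of linear probing with wraparound. The class name `LinearProbeTable` and the choice to have `put` return the slot index are assumptions made for illustration; the hash method is the additive one from the slides.

```java
// Hypothetical sketch of put with linear probing; not the slides' implementation.
class LinearProbeTable {
    private final String[] keys;

    LinearProbeTable(int capacity) {
        keys = new String[capacity];
    }

    // The "terrible" additive hash from the slides.
    int hash(String key) {
        int result = 0;
        for (int j = 0; j < key.length(); j++) {
            result += key.charAt(j);
        }
        return result % keys.length;
    }

    // Stores the key and returns the slot it ended up in, or -1 if the table is full.
    int put(String key) {
        int start = hash(key);
        for (int i = 0; i < keys.length; i++) {
            int index = (start + i) % keys.length; // % gives the wraparound
            if (keys[index] == null || keys[index].equals(key)) {
                keys[index] = key;
                return index;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        LinearProbeTable table = new LinearProbeTable(309);
        for (String key : new String[] {"abba", "abcd", "abce", "baab", "KLMP", "IKLT"}) {
            System.out.println(key + " -> [" + table.put(key) + "]");
        }
    }
}
```

Run in that order, "baab" lands in [82] and "IKLT" wraps around to [0], matching the slides.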

Find object with key "baab"
- "baab" still hashes to 81, but since [81] does not hold it, linear probe to [82]
- At this point, you could return a reference to it or remove it
    [0]   "IKLT"
    ...
    [81]  "abba"
    [82]  "baab"
    ...
    [85]  "abcd"
    [86]  "abce"
    ...
    [308] "KLMP"

Find and Remove an element
- Follow the same path to find an item
  - if the linear probe reaches a slot that was never occupied, the item is not in the table and the search is done
- To remove an element, follow the same path
  - if found, mark the element deleted somehow
- Three possible states when looking at slots
  - the slot was never occupied (can be used in put, or denotes that the search is over)
  - the slot is occupied (if the key matches, stop; otherwise proceed to the next slot)
  - the slot was occupied, but the element there has been removed
- We could call this last kind a tombstoned slot

Linear Probe Implementation
- Could have a linear probing, array based, implementation
- Each array element references a HashTableNode with some boolean instance variables to indicate which of the three states it is in: active, available, or tombstoned
  - this allows linear probes to continue past removed elements

    private class HashTableNode {
      private String key;
      Object data;
      // both false means the slot is available (never occupied)
      private boolean active, tombstone;

      public HashTableNode() { // All nodes in array initialized this way
        key = null;
        data = null;
        active = false;
        tombstone = false;
      }
    }
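The three slot states can be sketched as follows. This is a simplified illustration, not the slides' implementation: it stores keys only, and it swaps the two boolean flags for an enum (`State`), which is an assumption made to keep the example short. The names `TombstoneTable`, `find`, `put`, and `remove` are likewise hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: find and remove with tombstones under linear probing.
class TombstoneTable {
    enum State { NEVER_USED, ACTIVE, TOMBSTONE }

    private final String[] keys;
    private final State[] states;

    TombstoneTable(int capacity) {
        keys = new String[capacity];
        states = new State[capacity];
        Arrays.fill(states, State.NEVER_USED);
    }

    private int hash(String key) {
        int result = 0;
        for (int j = 0; j < key.length(); j++) {
            result += key.charAt(j);
        }
        return result % keys.length;
    }

    // Returns the slot holding the key, or -1 if it is not in the table.
    int find(String key) {
        int start = hash(key);
        for (int i = 0; i < keys.length; i++) {
            int index = (start + i) % keys.length;
            if (states[index] == State.NEVER_USED) {
                return -1; // never occupied: the key cannot be further along
            }
            if (states[index] == State.ACTIVE && keys[index].equals(key)) {
                return index;
            }
            // tombstone or a different key: keep probing
        }
        return -1;
    }

    // Returns false only when the table has no free slot.
    boolean put(String key) {
        if (find(key) >= 0) {
            return true; // already stored
        }
        int start = hash(key);
        for (int i = 0; i < keys.length; i++) {
            int index = (start + i) % keys.length;
            if (states[index] != State.ACTIVE) { // never used or tombstoned
                keys[index] = key;
                states[index] = State.ACTIVE;
                return true;
            }
        }
        return false;
    }

    // Marks the slot as a tombstone so later probes continue past it.
    boolean remove(String key) {
        int index = find(key);
        if (index < 0) {
            return false;
        }
        keys[index] = null;
        states[index] = State.TOMBSTONE;
        return true;
    }
}
```

After putting "abba" and "baab" and then removing "abba", a search for "baab" still succeeds at [82] because the probe passes over the tombstone at [81] instead of stopping there.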

Array based implementation has Clustering Problem
- Used slots tend to cluster with linear probing

Quadratic Probing  Quadratic probing eliminates the primary clustering problem  Assume hVal is the value of the hash function  Instead of linear probing which searches for an open slot in a linear fashion like this hVal + 1, hVal + 2, hVal + 3, hVal + 4,...  add index values in increments of hVal + 1 2, hVal + 2 2, hVal + 3 2, hVal + 4 2,...

Does it work?
- Quadratic probing works if
  - the table size is prime
    - studies show the prime numbered table size removes some of the non-randomness of hash functions
  - and the table is never more than half full
- Probes 1, 4, 9, 16, 25, 36, 49, ... slots away
- So make your table twice as big as you need
- insert, find, remove are O(1) on average
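As a small illustration of the probe order, the sketch below lists the slots visited after a collision at hVal. The class and method names (`QuadraticProbe`, `probeSequence`) are hypothetical, and wrapping with `%` is assumed in the same way as for linear probing.

```java
import java.util.Arrays;

// Hypothetical helper that lists the slots quadratic probing would try.
class QuadraticProbe {
    // Returns the first `count` slots probed after a collision at hVal.
    static int[] probeSequence(int hVal, int tableSize, int count) {
        int[] slots = new int[count];
        for (int i = 1; i <= count; i++) {
            slots[i - 1] = (hVal + i * i) % tableSize; // offsets 1, 4, 9, 16, ...
        }
        return slots;
    }

    public static void main(String[] args) {
        // A collision at [81] in a table of size 309
        System.out.println(Arrays.toString(probeSequence(81, 309, 4)));  // prints [82, 85, 90, 97]
        // A collision at the last slot wraps around
        System.out.println(Arrays.toString(probeSequence(308, 309, 4))); // prints [0, 3, 8, 15]
    }
}
```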

Separate Chaining Hashing
- Separate chaining hashing is an alternative to probing
- Maintain an array of linked lists
- Always hash to the same place and insert at the beginning (or end) of the linked list
- The linked list needs add and remove methods

An Array of LinkedList Objects Implementation
- An array of linked lists
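A minimal sketch of separate chaining, assuming `java.util.LinkedList` for the chains and storing keys only. The class name `ChainedHashSet` and its methods are hypothetical, and the hash method is again the simple additive one from the earlier slides.

```java
import java.util.LinkedList;

// Hypothetical sketch of separate chaining: one linked list per array slot.
class ChainedHashSet {
    private final LinkedList<String>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedHashSet(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    private int hash(String key) {
        int result = 0;
        for (int j = 0; j < key.length(); j++) {
            result += key.charAt(j);
        }
        return result % buckets.length;
    }

    // Inserts at the beginning of the chain; returns false if already present.
    boolean add(String key) {
        LinkedList<String> chain = buckets[hash(key)];
        if (chain.contains(key)) {
            return false;
        }
        chain.addFirst(key);
        return true;
    }

    boolean contains(String key) {
        return buckets[hash(key)].contains(key);
    }

    boolean remove(String key) {
        return buckets[hash(key)].remove(key);
    }

    // Number of keys sharing this key's slot.
    int chainLength(String key) {
        return buckets[hash(key)].size();
    }
}
```

With a capacity of 309, "abba" and "baab" both hash to slot 81 and simply share one chain of length 2; no probing and no tombstones are needed.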