1 Hashing Techniques: Implementation
- Implementing Hash Functions
- Implementing Hash Tables
- Implementing Chained Hash Tables
- Implementing Open Hash Tables

Presentation transcript:

2 Implementing Hash Functions
Recall that search keys can be letters, strings, or structures such as records and containers. We therefore make the search key as general as possible in our implementation: type Object. How, then, can basic values and objects be represented uniformly? We define wrapper classes for the basic values so that the hash function is generic. Java's own wrapper classes are not adequate for our purpose.

3 Implementing Hash Functions (cont'd)
For convenience, we represent our hash function $h$ as the composition of two functions $f$ and $g$, i.e. $h = g \circ f$. That is, given $f$, $g$ and a key $k$, the hash value of $k$ is $h(k) = g(f(k))$.
Decomposing $h$ gives a good separation of concerns. Ideally, $f$ and $g$ are independent of each other. With care, we can design $f$ and $g$ so that the composition $h = g \circ f$ is a good hash function.

4 Implementing Hash Functions (cont'd)
For a set $K$ of keys and the set $Z$ of non-negative integers, we define $f : K \to Z$.
The function $g$ maps $Z$ into the table indices: $g : Z \to \{0, 1, \dots, n-1\}$, where $n$ is the table length.
Since Java's Object class defines a method hashCode(), we can define f to correspond to it. Note that the choice of f depends on the characteristics of its domain, the key set $K$.
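The decomposition can be sketched in a few lines. This is a minimal illustration, not the course library's code: the class name `ComposedHash` and the use of `ToIntFunction` to plug in f are assumptions made for the example. Because Java hash codes may be negative, g uses `Math.floorMod` so that its result always lies in 0..n-1.

```java
import java.util.function.ToIntFunction;

// Hypothetical sketch: h = g o f, with f supplied from outside so that
// f and g stay independent of each other.
final class ComposedHash<K> {
    private final ToIntFunction<K> f; // f : K -> integer (depends on the key type)
    private final int n;              // table length (depends only on the table)

    ComposedHash(ToIntFunction<K> f, int n) {
        if (n <= 0) throw new IllegalArgumentException("table length must be positive");
        this.f = f;
        this.n = n;
    }

    // g : integer -> {0, ..., n-1}; floorMod never returns a negative index.
    int g(int z) {
        return Math.floorMod(z, n);
    }

    // h(k) = g(f(k))
    int h(K key) {
        return g(f.applyAsInt(key));
    }
}
```

Swapping f (for example `String::length` versus `String::hashCode`) changes how keys are turned into integers without touching g, which is the separation of concerns the slide describes.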

5 Hash Functions with Integral Keys
The integral data types in Java are byte, short, int, long and char. Their underlying representation can be viewed as an integer, so we take $f(k) = k$.
The wrapper class for these types can be written:

    public class Int extends AbstractObject {
        protected int value;

        public int hashCode() {
            return value;
        }
    }

In this case the hashCode method simply returns the content of the value field.
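A small sketch of f for integral keys follows. The class name `IntegralKeys` is hypothetical. For byte, short, char and int the identity works directly because each widens to int without loss; a long has 64 bits, so the sketch assumes it is folded into 32 bits with the same XOR-fold that `java.lang.Long.hashCode` uses, rather than simply truncated.

```java
// Hypothetical sketch of f(k) = k for integral keys.
final class IntegralKeys {
    // byte, short, char and int all widen to int, so the key is its own hash.
    static int f(int k) {
        return k;
    }

    // A long does not fit in an int: fold the high 32 bits into the low 32 bits.
    static int f(long k) {
        return (int) (k ^ (k >>> 32));
    }
}
```

With plain truncation, keys that differ only in their upper 32 bits would all receive the same hash; the fold lets those bits influence the result.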

6 Hash Functions with String Keys
A string in Java is a sequence of characters, and there are many ways to define a function from strings to integers. One is the folding technique, which sums the character codes of the string. Given a string $s = c_0 c_1 c_2 \dots c_{m-1}$ of length $m$ and a table of length $n$, we have
$f(s) = (c_0 + c_1 + c_2 + \dots + c_{m-1}) \bmod n$
A plain sum spreads keys poorly (anagrams collide, and short strings yield only small values) and can overflow for very long strings, so we use an alternative that weights the first three characters:
$f(s) = (c_0 + 27\,c_1 + 729\,c_2) \bmod n$
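The two string functions can be compared side by side. This is a sketch under assumptions: the class and method names are made up for the example, the reduction mod n is left out (it is taken to be the job of g), and characters missing from a string shorter than three are treated as contributing 0.

```java
// Hypothetical sketch of the two string hash functions, before reduction mod n.
final class StringFolding {
    // Folding: the sum of all character codes.
    static int sumOfChars(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // Weighted alternative: c0 + 27*c1 + 729*c2, using at most three characters.
    static int firstThreeWeighted(String s) {
        int hash = 0;
        int weight = 1; // 27^0, then 27^1, then 27^2
        for (int i = 0; i < 3 && i < s.length(); i++) {
            hash += weight * s.charAt(i);
            weight *= 27;
        }
        return hash;
    }
}
```

For "abc" and "cba" the plain sum is 294 in both cases, while the weighted form gives 74914 and 73458. The weighted form has its own weakness: strings that share their first three characters, such as "abcd" and "abcz", always collide.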

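7 Hash Functions with String Keys (cont'd)

    public class Str extends AbstractObject {
        protected String value;

        // Weights the first three characters by 1, 27 and 729.
        // A string shorter than three characters contributes only the
        // characters it has, so charAt() is never called out of range.
        // The "mod n" step of the formula is applied later by g(), using
        // the table length, so it is not repeated here.
        public int hashCode() {
            int hashVal = 0;
            int weight = 1;
            for (int i = 0; i < 3 && i < value.length(); i++) {
                hashVal += weight * value.charAt(i);
                weight *= 27;
            }
            return hashVal;
        }
    }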

8 Implementing Hash Tables: The Hierarchy Tree
Interfaces: MyComparable, Container, SearchableContainer, HashTable
Abstract classes: AbstractObject, AbstractContainer, AbstractSearchableContainer, AbstractHashTable
Concrete classes: ChainedHashTable, OpenScatterTable

9 The HashTable Interface
From the introductory sessions, the interface HashTable extends the SearchableContainer interface. It has the following definition:

    public interface HashTable extends SearchableContainer {
        double getLoadFactor();
    }

HashTable introduces getLoadFactor(), which returns the load factor of the hash table. An abstract class, AbstractHashTable, is defined, from which the concrete hash table implementations are derived.
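The slides do not show a body for getLoadFactor(). A minimal sketch, assuming the usual definition of load factor as the number of stored items divided by the table length, could look as follows; `LoadFactorSketch` and its parameter names are hypothetical.

```java
// Hypothetical sketch: load factor = stored items / table length.
final class LoadFactorSketch {
    static double loadFactor(int count, int length) {
        if (length <= 0) throw new IllegalArgumentException("table length must be positive");
        // Cast before dividing; count / length alone would be integer division.
        return (double) count / length;
    }
}
```

Under this definition a chained table can have a load factor above 1 (for example 12 items in 8 chains gives 1.5), whereas an open-addressed table cannot exceed 1.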

10 The AbstractHashTable Class

    public abstract class AbstractHashTable
            extends AbstractSearchableContainer
            implements HashTable {

        public abstract int getLength();

        // f: maps a key to an integer.
        protected final int f(Object obj) {
            return obj.hashCode();
        }

        // g: maps an integer to an index in 0 .. getLength()-1.
        // Math.floorMod replaces Math.abs(i) % getLength(), which returns a
        // negative index when i is Integer.MIN_VALUE.
        protected final int g(int i) {
            return Math.floorMod(i, getLength());
        }

        // h = g o f
        protected final int h(Object obj) {
            return g(f(obj));
        }
        // ...
    }
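The reduction step g has to return a valid index for every possible int, including Integer.MIN_VALUE, whose absolute value cannot be represented as an int. The sketch below contrasts the two common ways of writing the reduction; the class and method names are made up for the illustration.

```java
// Hypothetical sketch contrasting two ways to reduce a hash code to an index.
final class IndexReduction {
    // Math.abs(Integer.MIN_VALUE) is still Integer.MIN_VALUE, so this can go negative.
    static int absThenMod(int i, int n) {
        return Math.abs(i) % n;
    }

    // floorMod returns a value in 0..n-1 for every int i when n > 0.
    static int floorMod(int i, int n) {
        return Math.floorMod(i, n);
    }
}
```

For ordinary negative values both versions return a valid index, although not the same one: for i = -7 and n = 5 they give 2 and 3. Only for Integer.MIN_VALUE does the first version fail, returning -8 for n = 10, while floorMod returns 2.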

11 The ChainedHashTable

    public class ChainedHashTable extends AbstractHashTable {
        // LinkedList is the course library's list class (it provides purge,
        // append, extract and getHead), not java.util.LinkedList.
        protected LinkedList[] array;

        // Creates one empty chain per table position.
        public ChainedHashTable(int size) {
            array = new LinkedList[size];
            for (int j = 0; j < size; j++)
                array[j] = new LinkedList();
        }

        public int getLength() {
            return array.length;
        }

        // Empties every chain and resets the item count.
        public void purge() {
            for (int i = 0; i < getLength(); i++)
                array[i].purge();
            super.count = 0;
        }
    }

12 Chained Hash Tables: Insertion & Deletion

    public class ChainedHashTable extends AbstractHashTable {
        protected LinkedList[] array;

        // Appends the object to the chain selected by its hash value.
        public void insert(Comparable comparable) {
            array[h(comparable)].append(comparable);
            super.count++;
        }

        // Removes the object from the chain selected by its hash value.
        public void withdraw(Comparable comparable) {
            array[h(comparable)].extract(comparable);
            super.count--;
        }
    }

13 Chained Hash Tables: Retrieval

    public class ChainedHashTable extends AbstractHashTable {
        protected LinkedList[] array;

        // Walks the chain selected by the hash value and returns the first
        // stored object equal to the argument, or null if there is none.
        public Comparable find(Comparable comparable) {
            for (LinkedList.Element chain = array[h(comparable)].getHead();
                    chain != null; chain = chain.getNext()) {
                Comparable comparable1 = (Comparable) chain.getDatum();
                if (comparable.isEQ(comparable1))
                    return comparable1;
            }
            return null;
        }
    }
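The chained table above depends on the course library (AbstractHashTable, its LinkedList, isEQ). For experimenting without that library, the following self-contained sketch reproduces the same idea with standard classes. It is not the slides' implementation: `SimpleChainedTable` is a hypothetical name, `java.util.LinkedList` stands in for the library list, `equals()` stands in for `isEQ()`, and `withdraw` returns a boolean instead of relying on the list to signal a missing item.

```java
import java.util.LinkedList;

// Hypothetical stand-alone sketch of separate chaining.
final class SimpleChainedTable<K> {
    private final LinkedList<K>[] chains;
    private int count;

    @SuppressWarnings("unchecked")
    SimpleChainedTable(int length) {
        chains = (LinkedList<K>[]) new LinkedList[length];
        for (int i = 0; i < length; i++) {
            chains[i] = new LinkedList<>();
        }
    }

    // h = g o f: hashCode() plays the role of f, floorMod the role of g.
    private int h(Object key) {
        return Math.floorMod(key.hashCode(), chains.length);
    }

    void insert(K key) {
        chains[h(key)].add(key); // append to the chain for this index
        count++;
    }

    // Returns the stored item equal to key, or null if it is absent.
    K find(K key) {
        for (K candidate : chains[h(key)]) {
            if (candidate.equals(key)) return candidate;
        }
        return null;
    }

    // Removes one occurrence of key; the count changes only if something was removed.
    boolean withdraw(K key) {
        boolean removed = chains[h(key)].remove(key);
        if (removed) count--;
        return removed;
    }

    double getLoadFactor() {
        return (double) count / chains.length;
    }
}
```

With a table of length 4, the keys 1, 5 and 9 all land in chain 1, so find(5) has to walk past 1 first. Colliding keys cost extra comparisons within one chain but never displace items in other chains.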

14 Open Hash Table Implementation

    public class OpenScatterTable extends AbstractHashTable {
        protected Entry array[];

        // Possible states of a table cell.
        static final int empty = 0;
        static final int occupied = 1;
        static final int deleted = 2;

        protected static final class Entry {
            int state;
            Comparable object;

            Entry() {
                state = empty;
            }

            // Returns the cell to its initial, empty state.
            void purge() {
                state = empty;
                object = null;
            }
        }
    }

15 Open Hash Table: Initialization

    public class OpenScatterTable extends AbstractHashTable {
        protected Entry array[];

        // Creates one empty Entry per table position.
        public OpenScatterTable(int size) {
            array = new Entry[size];
            for (int j = 0; j < size; j++)
                array[j] = new Entry();
        }

        // Marks every cell empty and resets the item count.
        public void purge() {
            for (int i = 0; i < getLength(); i++)
                array[i].purge();
            super.count = 0;
        }
    }

16 Open Hash Tables: Finding Empty Cells

    public class OpenScatterTable extends AbstractHashTable {

        // Offset of the i-th probe; c(i) = i gives linear probing.
        protected static int c(int i) {
            return i;
        }

        // Returns the index of the first cell on the probe sequence of obj
        // that is not occupied (it may be empty or deleted).
        protected int findUnoccupied(Object obj) {
            int i = h(obj);
            for (int j = 0; j < super.count + 1; j++) {
                int k = (i + c(j)) % getLength();
                if (array[k].state != occupied)
                    return k;
            }
            throw new ContainerFullException();
        }
    }
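The probe sequence that findUnoccupied() walks can be listed explicitly. The sketch below assumes linear probing, c(j) = j, as on the slide; `LinearProbe` and `probeSequence` are hypothetical names introduced for the illustration.

```java
// Hypothetical sketch: the cells visited by linear probing from a home position.
final class LinearProbe {
    // Probe offset for attempt j; the identity gives linear probing.
    static int c(int j) {
        return j;
    }

    // Returns the indices (home + c(j)) % length for j = 0 .. length-1.
    static int[] probeSequence(int home, int length) {
        int[] sequence = new int[length];
        for (int j = 0; j < length; j++) {
            sequence[j] = (home + c(j)) % length;
        }
        return sequence;
    }
}
```

For a home position of 3 in a table of length 5, the sequence is 3, 4, 0, 1, 2. Every cell is visited exactly once, which is why probing count + 1 cells is enough to find an unoccupied one whenever the table is not full.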

17 Open Hash Tables: Insertion

    public class OpenScatterTable extends AbstractHashTable {

        // Stores the object in the first unoccupied cell on its probe sequence.
        public void insert(Comparable comparable) {
            if (super.count == getLength()) {
                throw new ContainerFullException();
            } else {
                int i = findUnoccupied(comparable);
                array[i].state = occupied;
                array[i].object = comparable;
                super.count++;
                return;
            }
        }
    }

18 Open Hash Tables: Retrieval

    public class OpenScatterTable extends AbstractHashTable {

        // Returns the index of the cell holding an object equal to the
        // argument, or -1 if the probe sequence reaches an empty cell first.
        protected int findMatch(Comparable comparable) {
            int i = h(comparable);
            for (int j = 0; j < getLength(); j++) {
                int k = (i + c(j)) % getLength();
                if (array[k].state == empty)
                    break;
                if (array[k].state == occupied &&
                        comparable.isEQ(array[k].object))
                    return k;
            }
            return -1;
        }

        public Comparable find(Comparable comparable) {
            int i = findMatch(comparable);
            if (i >= 0)
                return array[i].object;
            else
                return null;
        }
    }

19 Open Hash Tables: Deletion

    public class OpenScatterTable extends AbstractHashTable {

        public void withdraw(Comparable comparable) {
            if (super.count == 0)
                throw new ContainerEmptyException();
            // findInstance() is not shown on these slides.
            int i = findInstance(comparable);
            if (i < 0) {
                throw new IllegalArgumentException("object not found");
            } else {
                // Mark the cell deleted rather than empty, so that probe
                // sequences passing through it are not cut short.
                array[i].state = deleted;
                array[i].object = null;
                super.count--;
                return;
            }
        }
    }
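The role of the deleted state is easiest to see in a stand-alone example. The following sketch is not the slides' OpenScatterTable: `SimpleOpenTable` is a hypothetical class that stores plain int keys, uses linear probing, and returns booleans instead of throwing the library's exceptions.

```java
// Hypothetical stand-alone sketch of open addressing with deleted markers.
final class SimpleOpenTable {
    private static final int EMPTY = 0, OCCUPIED = 1, DELETED = 2;
    private final int[] state; // all cells start as EMPTY (0)
    private final int[] keys;
    private int count;

    SimpleOpenTable(int length) {
        state = new int[length];
        keys = new int[length];
    }

    private int h(int key) {
        return Math.floorMod(key, keys.length);
    }

    // Places key in the first cell on its probe sequence that is not occupied.
    boolean insert(int key) {
        if (count == keys.length) return false; // table full
        for (int j = 0; j < keys.length; j++) {
            int k = (h(key) + j) % keys.length;
            if (state[k] != OCCUPIED) {
                state[k] = OCCUPIED;
                keys[k] = key;
                count++;
                return true;
            }
        }
        return false; // not reached while count < length
    }

    // Stops at an EMPTY cell but probes past DELETED cells.
    int findMatch(int key) {
        for (int j = 0; j < keys.length; j++) {
            int k = (h(key) + j) % keys.length;
            if (state[k] == EMPTY) break;
            if (state[k] == OCCUPIED && keys[k] == key) return k;
        }
        return -1;
    }

    // Marks the cell DELETED instead of EMPTY so later keys stay reachable.
    boolean withdraw(int key) {
        int k = findMatch(key);
        if (k < 0) return false;
        state[k] = DELETED;
        count--;
        return true;
    }
}
```

In a table of length 5, the keys 1, 6 and 11 all have home position 1 and end up in cells 1, 2 and 3. After withdraw(6), cell 2 is marked deleted, so findMatch(11) still probes past it and returns 3. Had the cell been reset to empty, the search for 11 would stop at cell 2 and wrongly report that the key is absent. A later insert(16) reuses the deleted cell 2.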

20 Review Questions
- Why are the methods f, g and h on slide 10 declared final? Might someone not want to override them?
- Compare the insert() methods of ChainedHashTable and OpenScatterTable. Are they both correct?
- How would you modify the findUnoccupied() method to ensure that it is efficient?
- Why does find() on slide 13 use the isEQ() method instead of the equals() method of Java's Object class?
- The withdraw() method of OpenScatterTable will be very inefficient if there are many deletions. Why?

21 Exercises
1. Why does the Int wrapper class, in particular, extend the AbstractObject class?
2. Are the wrapper classes Int and Str concrete classes or abstract classes? Why?
3. What does the hashCode() method of the Object class return by default? Why do we have to redefine it in our hash table implementation?
4. If the cost of inserting a record r in a hash table is O(1), what is the cost of deleting r?
5. Can the load factor of an efficiently implemented open-addressed hash table be 1? Explain your answer.
6. In some hashing schemes, table size has no effect on collisions. Explain.