1 CSE 331 Hash codes; annotations slides created by Marty Stepp based on materials by M. Ernst, S. Reges, D. Notkin, R. Mercer, Wikipedia

Presentation transcript:


2 Surprising result #1
    Point p = new Point(3, 4);
    Set<Point> set = new HashSet<Point>();
    set.add(p);
    System.out.println(set.contains(new Point(3, 4)));  // true
    p.translate(2, 2);
    System.out.println(set.contains(new Point(5, 6)));  // false
Where did p go? What is wrong?

3 Hashing
hash: To map a value to a specific integer index.
- hash table: An array that stores elements via hashing. The internal data structure used by HashSet and HashMap.
- hash function: An algorithm that maps values to indexes. A possible hash function for integers: HF(i) → i % length
    set.add(11);  // 11 % 10 == 1
    set.add(49);  // 49 % 10 == 9
    set.add(24);  // 24 % 10 == 4
    set.add(7);   // 7 % 10 == 7
- What is a hash function for a string? For other kinds of objects?

4 Efficiency of hashing
    public static int HF(int i) {  // int hash function
        return Math.abs(i) % elements.length;
    }
- Add: simply set elements[HF(i)] = i;
- Search: check if elements[HF(i)] == i
- Remove: set elements[HF(i)] = 0;
Runtime of add, contains, and remove: O(1)!
Are there any potential problems with hashing?
- collisions: Multiple element values can map to the same bucket.
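To make the collision problem concrete, here is a minimal sketch of the direct-indexed table described on this slide. The class names IntTable and IntTableDemo, the lowercase method name hf, and the table size of 10 are assumptions for illustration. Like the slide's version, it uses 0 as the "empty" marker, so it cannot store 0 itself.

```java
// Hypothetical sketch of the slide's table: one int per index, no collision handling.
class IntTable {
    private final int[] elements = new int[10]; // 0 marks an empty slot

    // The slide's hash function: map an int to an index in the array.
    int hf(int i) {
        return Math.abs(i) % elements.length;
    }

    void add(int i)         { elements[hf(i)] = i; }
    boolean contains(int i) { return elements[hf(i)] == i; }
    void remove(int i)      { elements[hf(i)] = 0; }
}

class IntTableDemo {
    public static void main(String[] args) {
        IntTable set = new IntTable();
        set.add(11);                          // stored at index 1
        set.add(21);                          // also index 1, so it overwrites 11
        System.out.println(set.contains(21)); // true
        System.out.println(set.contains(11)); // false: lost to the collision
    }
}
```

Every operation touches exactly one array slot, which is where the O(1) runtime comes from, but two values that share an index silently replace each other.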

5 Chaining
chaining: Resolving collisions by storing a list at each index.
- Add/search/remove must traverse lists, but the lists are short.
- Impossible to "run out" of indexes, unlike with probing.
- Alternative to chaining: probing (choosing the next available index).
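A minimal sketch of chaining for a set of ints, assuming a fixed number of buckets and no resizing. The class name ChainedIntSet is hypothetical, and Math.floorMod is used here instead of Math.abs so that negative values also map to a valid index.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch: separate chaining with one list per index.
class ChainedIntSet {
    private final List<Integer>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedIntSet(int capacity) {
        buckets = new List[capacity];
        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // Pick the list that the value hashes to.
    private List<Integer> bucketFor(int value) {
        return buckets[Math.floorMod(value, buckets.length)];
    }

    boolean add(int value) {
        List<Integer> bucket = bucketFor(value);
        if (bucket.contains(value)) {
            return false;            // already present; sets hold no duplicates
        }
        bucket.add(value);
        return true;
    }

    boolean contains(int value) {
        return bucketFor(value).contains(value);
    }

    boolean remove(int value) {
        // Integer.valueOf selects remove(Object), not remove(int index).
        return bucketFor(value).remove(Integer.valueOf(value));
    }
}
```

With 10 buckets, 11 and 21 land in the same list and both remain findable; the cost is a short list traversal on each operation.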

6 The hashCode method
From the Object class:
    public int hashCode()
Returns an integer hash code for this object.
- We can call hashCode on any object to find the index where it "prefers" to be placed in a hash table.

7 Hash function for objects
    public static int HF(Object o) {
        return Math.abs(o.hashCode()) % elements.length;
    }
- Add: simply set elements[HF(o)] = o;
- Search: check if elements[HF(o)].equals(o)
- Remove: set elements[HF(o)] = null;
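One edge case in the formula above is worth a sketch: Math.abs(Integer.MIN_VALUE) is still negative in Java, so an object whose hash code is Integer.MIN_VALUE produces a negative index. Math.floorMod is one way to avoid that. The class name IndexDemo and the table length of 10 are assumptions for illustration.

```java
// Hypothetical comparison of two ways to turn a hash code into an array index.
class IndexDemo {
    static final int LENGTH = 10; // assumed table size

    // The slide's formula.
    static int absIndex(Object o) {
        return Math.abs(o.hashCode()) % LENGTH;
    }

    // Alternative: floorMod always returns a value in [0, LENGTH).
    static int floorModIndex(Object o) {
        return Math.floorMod(o.hashCode(), LENGTH);
    }

    public static void main(String[] args) {
        // An Integer's hash code is its own value.
        Object worstCase = Integer.valueOf(Integer.MIN_VALUE);
        System.out.println(absIndex(worstCase));      // -8, not a valid index
        System.out.println(floorModIndex(worstCase)); // 2
    }
}
```

For non-negative hash codes the two formulas agree; for other negative hash codes they may pick different (but both valid) indexes.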

8 Surprising result #1
    Point p = new Point(3, 4);
    Set<Point> set = new HashSet<Point>();
    set.add(p);
    System.out.println(set.contains(new Point(3, 4)));  // true
    p.translate(2, 2);
    System.out.println(set.contains(new Point(5, 6)));  // false
- The code breaks because the point is put into a certain bucket when its state is (3, 4), but it isn't in the bucket that is returned when hashCode is called on an object with state (5, 6).
In the slide's diagram, HF(p) == 5 when p is (3, 4), but HF(p) == 8 when p is (5, 6).
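The failure can be reproduced with a small self-contained sketch. MutablePoint is a hypothetical stand-in for the Point class (its hashCode formula is an assumption), and the second half shows one common workaround: remove the element, mutate it, then add it back.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical mutable point with consistent equals and hashCode.
class MutablePoint {
    private int x, y;

    MutablePoint(int x, int y) { this.x = x; this.y = y; }

    void translate(int dx, int dy) { x += dx; y += dy; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MutablePoint)) return false;
        MutablePoint p = (MutablePoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() { return 31 * x + y; }
}

class MutableKeyDemo {
    public static void main(String[] args) {
        Set<MutablePoint> set = new HashSet<>();
        MutablePoint p = new MutablePoint(3, 4);
        set.add(p);
        p.translate(2, 2);   // hash code changes while p sits in the set
        System.out.println(set.contains(new MutablePoint(5, 6))); // false
        System.out.println(set.contains(p));                      // false
        System.out.println(set.size());                           // 1: p is still stored

        // Workaround: take the element out before changing it.
        Set<MutablePoint> safe = new HashSet<>();
        MutablePoint q = new MutablePoint(3, 4);
        safe.add(q);
        safe.remove(q);
        q.translate(2, 2);
        safe.add(q);
        System.out.println(safe.contains(new MutablePoint(5, 6))); // true
    }
}
```

Making value classes immutable avoids the problem entirely, since the hash code can then never change after insertion.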

9 Surprising result #2
    // assuming that Time is a class we have written,
    // and that Time does have a proper equals method
    Time t1 = new Time(11, 30, "AM");
    Time t2 = new Time(11, 30, "AM");
    Set<Time> set = new HashSet<Time>();
    set.add(t1);
    System.out.println(set.contains(t1));  // true
    System.out.println(set.contains(t2));  // false
What is wrong?

10 Implementing hashCode
hashCode's implementation depends on the object's type/state.
- A String's hashCode combines the character values of its letters in a weighted sum.
- A Point's hashCode produces a weighted sum of its x/y coordinates.
- A Double's hashCode converts the number into bits and returns a value based on them.
- A collection's hashCode combines the hash codes of its elements.
You can override hashCode in your classes.
- Effective Java Tip #9: Always override hashCode when you override equals.
The default implementation from class Object typically derives the integer code from the object's memory address.
- Why might this not be ideal?

11 The hashCode contract
The general contract of hashCode is that it must be:
- Self-consistent (produces the same result on each call):
    o.hashCode() == o.hashCode()
  ...so long as o doesn't change between the calls.
- Consistent with equality:
    a.equals(b) implies that a.hashCode() == b.hashCode()
    !a.equals(b) does NOT necessarily imply that a.hashCode() != b.hashCode() (why not?)

12 Surprising result #2
    // assuming that Time is a class we have written,
    // and that Time does have a proper equals method
    Time t1 = new Time(11, 30, "AM");
    Time t2 = new Time(11, 30, "AM");
    Set<Time> set = new HashSet<Time>();
    set.add(t1);
    System.out.println(set.contains(t1));  // true
    System.out.println(set.contains(t2));  // false
- The code breaks because Time has no hashCode method, so each object's hash code is the default one from Object (typically based on its memory address). This is inconsistent with equals, so the set cannot find seemingly equal objects.
In the slide's diagram, 11:30 AM is stored at HF(t1) == 2, while HF(t2) == 7.
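A sketch of both the problem and the fix, under the assumption that Time has fields hour, minute, and amPm. The class names BrokenTime and FixedTime are hypothetical, and Objects.hash is just one convenient way to write a hashCode that is consistent with equals.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical Time with equals but no hashCode (inherits Object's).
class BrokenTime {
    final int hour, minute;
    final String amPm;

    BrokenTime(int hour, int minute, String amPm) {
        this.hour = hour;
        this.minute = minute;
        this.amPm = amPm;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BrokenTime)) return false;
        BrokenTime t = (BrokenTime) o;
        return hour == t.hour && minute == t.minute && amPm.equals(t.amPm);
    }
}

// Same class with a hashCode built from the fields that equals compares.
class FixedTime extends BrokenTime {
    FixedTime(int hour, int minute, String amPm) { super(hour, minute, amPm); }

    @Override
    public int hashCode() { return Objects.hash(hour, minute, amPm); }
}

class TimeSetDemo {
    public static void main(String[] args) {
        Set<BrokenTime> broken = new HashSet<>();
        broken.add(new BrokenTime(11, 30, "AM"));
        // Almost certainly false: the two objects have unrelated default hash codes.
        System.out.println(broken.contains(new BrokenTime(11, 30, "AM")));

        Set<FixedTime> fixed = new HashSet<>();
        fixed.add(new FixedTime(11, 30, "AM"));
        System.out.println(fixed.contains(new FixedTime(11, 30, "AM"))); // true
    }
}
```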

13 hashCode implementation 1
Possible implementation of hashCode for Time objects:
    public int hashCode() {
        return new Random().nextInt();
    }
- Does this meet the general contract of hashCode?
- Is this a good hashCode function? Why or why not? In what cases does this hashCode produce poor results?

14 hashCode implementation 2
Possible implementation of hashCode for Time objects:
    public int hashCode() {
        return 42;
    }
- Does this meet the general contract of hashCode?
- Is this a good hashCode function? Why or why not? In what cases does this hashCode produce poor results?

15 hashCode implementation 3
Possible implementation of hashCode for Time objects:
    public int hashCode() {
        return hour + minute;
    }
- Does this meet the general contract of hashCode?
- Is this a good hashCode function? Why or why not? In what cases does this hashCode produce poor results?

16 hashCode implementation 4
Recommended implementation of hashCode for Time objects:
    public int hashCode() {
        return ... * amPm.hashCode() + 67 * hour + minute;
    }
- All fields of the object should be incorporated into the hash code.
- The code should weight each field by multiplying it by a different prime number, to reduce collisions between unequal objects. For example, we don't want 11:05 AM to collide with 5:11 PM if possible.
We prefer to multiply by primes because they wrap more unevenly when they exceed the array's size.
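A small sketch comparing the plain sum with a prime-weighted sum. The multiplier 67 on hour comes from the slide; the multiplier 31 on amPm.hashCode() is an assumption chosen for illustration, and the class and method names are hypothetical.

```java
// Hypothetical helper comparing two hash formulas for a Time's fields.
class HashCompare {
    // Unweighted: hour and minute are interchangeable.
    static int sumHash(int hour, int minute) {
        return hour + minute;
    }

    // Weighted: each field gets its own prime multiplier (31 is an assumed value).
    static int weightedHash(int hour, int minute, String amPm) {
        return 31 * amPm.hashCode() + 67 * hour + minute;
    }

    public static void main(String[] args) {
        // 11:05 and 5:11 collide under the plain sum: both give 16.
        System.out.println(sumHash(11, 5) == sumHash(5, 11)); // true

        // The weighted version separates 11:05 AM from 5:11 PM.
        System.out.println(
            weightedHash(11, 5, "AM") == weightedHash(5, 11, "PM")); // false
    }
}
```

Weighting does not eliminate collisions (no int-valued hash can), but it stops simple field swaps from producing the same code.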

17 hashCode tricks
If one of your object's fields is an object, call its hashCode:
    public int hashCode() {  // TimeSpan
        return ... * amPm.hashCode() + ...;
    }
To incorporate an array, use Arrays.hashCode:
    private String[] addresses;
    public int hashCode() {
        return 3137 * Arrays.hashCode(addresses) + ...
    }
    // also Arrays.deepHashCode for multi-dim arrays
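The reason for Arrays.hashCode is that an array's own equals and hashCode are inherited from Object and ignore the contents. A short sketch, with made-up address strings and a hypothetical class name:

```java
import java.util.Arrays;

class ArrayHashDemo {
    public static void main(String[] args) {
        String[] a = {"12 Oak St", "9 Elm Ave"};   // illustrative data
        String[] b = a.clone();                     // same contents, different object

        System.out.println(a.equals(b));                                // false
        System.out.println(Arrays.equals(a, b));                        // true
        System.out.println(Arrays.hashCode(a) == Arrays.hashCode(b));   // true

        // Nested arrays need the "deep" variants.
        int[][] g1 = {{1, 2}, {3, 4}};
        int[][] g2 = {{1, 2}, {3, 4}};
        System.out.println(Arrays.deepEquals(g1, g2));                          // true
        System.out.println(Arrays.deepHashCode(g1) == Arrays.deepHashCode(g2)); // true
    }
}
```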

18 hashCode tricks 2
To incorporate a double or boolean, use the hashCode method from the Double or Boolean wrapper classes:
    public int hashCode() {  // BankAccount
        return 37 * new Double(balance).hashCode()
             + new Boolean(isCheckingAccount).hashCode() + ...;
    }
If your hash code is expensive to compute, consider caching it.
    private int myHashCode = ...;  // pre-compute once
    public int hashCode() {
        return myHashCode;
    }
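On Java 8 and later, the wrapper classes also offer static Double.hashCode(double) and Boolean.hashCode(boolean), which avoid creating wrapper objects (the wrapper constructors are deprecated in newer JDKs). The sketch below combines that with caching; the BankAccount fields follow the slide, while the immutability and the equals method are assumptions added so the cached value can never go stale.

```java
// Hypothetical immutable BankAccount with a pre-computed hash code.
final class BankAccount {
    private final double balance;
    private final boolean isCheckingAccount;
    private final int cachedHash; // safe to cache only because the fields are final

    BankAccount(double balance, boolean isCheckingAccount) {
        this.balance = balance;
        this.isCheckingAccount = isCheckingAccount;
        // Static helpers give the same values as the wrapper objects' hashCode().
        this.cachedHash = 37 * Double.hashCode(balance)
                        + Boolean.hashCode(isCheckingAccount);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BankAccount)) return false;
        BankAccount other = (BankAccount) o;
        // Double.compare agrees with Double.hashCode on NaN and signed zeros.
        return Double.compare(balance, other.balance) == 0
            && isCheckingAccount == other.isCheckingAccount;
    }

    @Override
    public int hashCode() {
        return cachedHash;
    }
}
```

If the fields could change after construction, the cached value would have to be recomputed on every change, or the cache dropped.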

19 String's hashCode
The hashCode function for String objects looks like this:
    public int hashCode() {
        int hash = 0;
        for (int i = 0; i < this.length(); i++) {
            hash = 31 * hash + this.charAt(i);
        }
        return hash;
    }
- Early versions of Java examined only the first 16 characters. For some common data this led to poor hash table performance.
- As with any general hashing function, collisions are possible. Example: "Ea" and "FB" have the same hash value.
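The "Ea"/"FB" collision can be checked by hand: 31 * 'E' + 'a' = 31 * 69 + 97 = 2236, and 31 * 'F' + 'B' = 31 * 70 + 66 = 2236. The sketch below re-implements the loop as a standalone method (stringHash is a hypothetical name) and compares it with the built-in String.hashCode.

```java
class StringHashDemo {
    // Same loop as on the slide, written as a static helper.
    static int stringHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            hash = 31 * hash + s.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(stringHash("Ea"));                    // 2236
        System.out.println(stringHash("FB"));                    // 2236
        System.out.println(stringHash("Ea") == "Ea".hashCode()); // true
    }
}
```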

20 Annotations

21 Annotations
annotation: Markup that provides information to the compiler.
- Can also be used for deployment-time or run-time processing.
Common uses for annotations:
- To detect problems or errors in code
- To suppress compiler warnings
- For unit tests, e.g. JUnit

22 Annotation syntax
    @AnnotationName
    @AnnotationName(param = value, ..., param = value)
An annotation can be placed on:
- a class
- a method
- a field
- a local variable, ...

23 Common annotations
The following annotation types come with the JDK:
- @SuppressWarnings: turn off compiler warnings
- @Override: a superclass's method that is being overridden
- @Deprecated: code that is discouraged from use
- @Documented: sets an annotation to appear in Javadoc
- @Retention: makes annotation data available at runtime

24 @Override
Whenever you override a superclass's method, such as equals or hashCode, you should annotate it with @Override.
- If you do, the compiler will produce an error if you don't override it properly (misspelled name, wrong parameter/return types, etc.).
    @Override
    public boolean equal(Object other) {...  // error; should be 'equals'
- This also applies to methods you implement from an interface.
    @Override
    public int compareTo(Time other) {...
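A sketch of the kind of mistake @Override catches. The first class declares equals with the wrong parameter type, which compiles but only overloads the method; collections keep calling Object's equals(Object). Both class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Mistake: parameter type is OverloadedTime, not Object, so this is an overload.
// Adding @Override to this method would turn the mistake into a compile error.
class OverloadedTime {
    final int hour;
    OverloadedTime(int hour) { this.hour = hour; }

    public boolean equals(OverloadedTime other) {
        return other != null && hour == other.hour;
    }
}

// Correct: the signature matches Object's, and @Override lets the compiler verify it.
class OverriddenTime {
    final int hour;
    OverriddenTime(int hour) { this.hour = hour; }

    @Override
    public boolean equals(Object o) {
        return o instanceof OverriddenTime && hour == ((OverriddenTime) o).hour;
    }

    @Override
    public int hashCode() { return hour; }
}

class OverrideDemo {
    public static void main(String[] args) {
        List<OverloadedTime> bad = new ArrayList<>();
        bad.add(new OverloadedTime(11));
        System.out.println(bad.contains(new OverloadedTime(11)));  // false

        List<OverriddenTime> good = new ArrayList<>();
        good.add(new OverriddenTime(11));
        System.out.println(good.contains(new OverriddenTime(11))); // true
    }
}
```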

25 Creating an annotation type
    public @interface Name {}
Example:
    public @interface GradingScript {}
    @GradingScript
    public class TestElection {...}
Most programmers rarely need to create annotations.

26 An annotation with params
    public @interface Name {
        type name();                 // parameters
        type name() default value;   // optional
    }
Example:
    public @interface ClassPreamble {
        String author();
        String date();
        int currentRevision() default 1;
        String lastModified() default "N/A";
        String[] reviewers();
    }

27 Using custom annotations
    @ClassPreamble(
        author = "John Doe",
        date = "3/17/2002",
        currentRevision = 6,
        lastModified = "4/12/2004",
        reviewers = {"Alice", "Bob", "Cindy"}
    )
    public class FamilyTree {... }
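To show how such an annotation could be read back at run time, here is a self-contained sketch. The @Retention(RetentionPolicy.RUNTIME) and @Target lines are assumptions added for this example (without runtime retention, reflection would not see the annotation), and PreambleDemo is a hypothetical class name.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME) // assumed: keep the data available to reflection
@Target(ElementType.TYPE)           // assumed: allow it on classes and interfaces only
@interface ClassPreamble {
    String author();
    String date();
    int currentRevision() default 1;
    String lastModified() default "N/A";
    String[] reviewers();
}

@ClassPreamble(
    author = "John Doe",
    date = "3/17/2002",
    currentRevision = 6,
    lastModified = "4/12/2004",
    reviewers = {"Alice", "Bob", "Cindy"}
)
class FamilyTree { }

class PreambleDemo {
    public static void main(String[] args) {
        // Look up the annotation instance attached to the class.
        ClassPreamble p = FamilyTree.class.getAnnotation(ClassPreamble.class);
        System.out.println(p.author() + ", revision " + p.currentRevision());
        // prints "John Doe, revision 6"
        System.out.println(String.join(", ", p.reviewers()));
        // prints "Alice, Bob, Cindy"
    }
}
```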

28 Prof. Ernst's annotations
UW's own Prof. Michael Ernst and his research team have contributed a set of custom annotations that can be used to provide sophisticated type checking and nullness checking for Java. The slide's table lists annotations for:
- a class of objects that cannot ...
- a temporarily ...
- a value for which null can/cannot ...
- security and ...
- tagging regular expressions
- flyweighted objects