1 HASHING Course teacher: Moona Kanwal. 2 Hashing Mathematical concept –To define any number as set of numbers in given interval –To cut down part of.

Slides:



Advertisements
Similar presentations
Hash Tables.
Advertisements

Part II Chapter 8 Hashing Introduction Consider we may perform insertion, searching and deletion on a dictionary (symbol table). Array Linked list Tree.
What we learn with pleasure we never forget. Alfred Mercier Smitha N Pai.
© 2004 Goodrich, Tamassia Hash Tables1  
Nov 12, 2009IAT 8001 Hash Table Bucket Sort. Nov 12, 2009IAT 8002  An array in which items are not stored consecutively - their place of storage is calculated.
Log Files. O(n) Data Structure Exercises 16.1.
Hashing COMP171. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
Hashing Techniques.
Hashing CS 3358 Data Structures.
Dictionaries and Hash Tables1  
© 2006 Pearson Addison-Wesley. All rights reserved13 A-1 Chapter 13 Hash Tables.
1 CSE 326: Data Structures Hash Tables Autumn 2007 Lecture 14.
CS 206 Introduction to Computer Science II 11 / 17 / 2008 Instructor: Michael Eckmann.
Hash Tables1 Part E Hash Tables  
Hash Tables1 Part E Hash Tables  
Hashing COMP171 Fall Hashing 2 Hash table * Support the following operations n Find n Insert n Delete. (deletions may be unnecessary in some applications)
Introduction to Hashing CS 311 Winter, Dictionary Structure A dictionary structure has the form: (Key, Data) Dictionary structures are organized.
Hash Tables1 Part E Hash Tables  
COMP 171 Data Structures and Algorithms Tutorial 10 Hash Tables.
Hashing General idea: Get a large array
Data Structures Using C++ 2E Chapter 9 Searching and Hashing Algorithms.
Dictionaries 4/17/2017 3:23 PM Hash Tables  
CS 206 Introduction to Computer Science II 04 / 06 / 2009 Instructor: Michael Eckmann.
1. 2 Problem RT&T is a large phone company, and they want to provide enhanced caller ID capability: –given a phone number, return the caller’s name –phone.
ICS220 – Data Structures and Algorithms Lecture 10 Dr. Ken Cosh.
1 Chapter 5 Hashing General ideas Methods of implementing the hash table Comparison among these methods Applications of hashing Compare hash tables with.
Data Structures and Algorithm Analysis Hashing Lecturer: Jing Liu Homepage:
IKI 10100: Data Structures & Algorithms Ruli Manurung (acknowledgments to Denny & Ade Azurat) 1 Fasilkom UI Ruli Manurung (Fasilkom UI)IKI10100: Lecture8.
CS212: DATA STRUCTURES Lecture 10:Hashing 1. Outline 2  Map Abstract Data type  Map Abstract Data type methods  What is hash  Hash tables  Bucket.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
Hash Tables1   © 2010 Goodrich, Tamassia.
Hashing COMP171. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
Hashing Sections 10.2 – 10.3 CS 302 Dr. George Bebis.
Hashing Hashing is another method for sorting and searching data.
© 2004 Goodrich, Tamassia Hash Tables1  
Searching Given distinct keys k 1, k 2, …, k n and a collection of n records of the form »(k 1,I 1 ), (k 2,I 2 ), …, (k n, I n ) Search Problem - For key.
CS201: Data Structures and Discrete Mathematics I Hash Table.
Data Structures and Algorithms Hashing First Year M. B. Fayek CUFE 2010.
Chapter 10 Hashing. The search time of each algorithm depend on the number n of elements of the collection S of the data. A searching technique called.
Hashing Basis Ideas A data structure that allows insertion, deletion and search in O(1) in average. A data structure that allows insertion, deletion and.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Chapter 5: Hashing Collision Resolution: Open Addressing Extendible Hashing Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Lydia Sinapova,
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Hashing 1 Hashing. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
Hashing COMP171. Hashing 2 Hashing … * Again, a (dynamic) set of elements in which we do ‘search’, ‘insert’, and ‘delete’ n Linear ones: lists, stacks,
Hashing. Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string.
Nov 22, 2010IAT 2651 Java Collections. Nov 22, 2010IAT 2652 Data Structures  With a collection of data, we often want to do many things –Organize –Iterate.
Hash Tables Ellen Walker CPSC 201 Data Structures Hiram College.
1 Resolving Collision Although collisions should be avoided as much as possible, they are inevitable Need a strategy for resolving collisions. We look.
Hashing (part 2) CSE 2011 Winter March 2018.
Data Structures Using C++ 2E
Hashing, Hash Function, Collision & Deletion
Hashing CSE 2011 Winter July 2018.
Lecture No.43 Data Structures Dr. Sohail Aslam.
Data Structures Using C++ 2E
Review Graph Directed Graph Undirected Graph Sub-Graph
Dictionaries Dictionaries 07/27/16 16:46 07/27/16 16:46 Hash Tables 
© 2013 Goodrich, Tamassia, Goldwasser
Dictionaries 9/14/ :35 AM Hash Tables   4
Hash Table.
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
Dictionaries 1/17/2019 7:55 AM Hash Tables   4
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
CS202 - Fundamental Structures of Computer Science II
Hash Tables Computer Science and Engineering
EE 312 Software Design and Implementation I
Data Structures – Week #7
What we learn with pleasure we never forget. Alfred Mercier
EE 312 Software Design and Implementation I
Presentation transcript:

1 HASHING Course teacher: Moona Kanwal

2 Hashing Mathematical concept –To define any number as set of numbers in given interval –To cut down part of number –Used in discreet maths, e.g graph theory, set theory –Used in Searching technique –Used in encryption methods

3 Hash Functions and Hash Tables Hashing has 2 major components –Hash function h –Hash Table Data Structure of size N A hash function h maps keys (a identifying element of record set) to hash value or hash key which refers to specific location in Hash table Example: h(x)  x mod N is a hash function for integer keys The integer h(x) is called the hash value of key x

4 Hash Functions and Hash Tables A hash table data structure is an array or array type ADTof some fixed size, containing the keys. An array in which records are not stored consecutively - their place of storage is calculated using the key and a hash function Key hash function array index

5 Hashed key: the result of applying a hash function to a key Keys and entries are scattered throughout the array Contains the main advantages of both Arrays and Trees Mainly the topic of hashing depends upon the two main factors / parts (a) Hash Function(b) Collision Resolution Table Size is also an factor (miner) in Hashing, which is 0 to tablesize-1.

6 Table Size Hash table size –Should be appropriate for the hash function used –Too big will waste memory; too small will increase collisions and may eventually force rehashing (copying into a larger table)

7 Example We design a hash table for a dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer The actual data is not stored in hash table Pin points the location of actual data or set of data Our hash table uses an array of size N  10,000 and the hash function h(x)  last four digits of x     …

8 Hash Function The mapping of keys into the table is called Hash Function A hash function, –Ideally, it should distribute keys and entries evenly throughout the table –It should be easy and quick to compute. –It should minimize collisions, where the position given by the hash function is already occupied –It should be applicable to all objects

9 Different types of hash functions are used for the mapping of keys into tables. (a) Division Method (b) Mid-square Method (c) Folding Method

10 1. Division Method Choose a number m larger than the number n of keys in k. The number m is usually chosen to be a prime no. The hash function H is defined as, H(k) = k(mod m) or H(k) = k(mod m) + 1 Denotes the remainder, when k is divided by m 2 nd formula is used when range is from 1 to m.

11 Example: Elements are: 3205, 7148, 2345 Table size: 0 – 99 (prime) m = 97 (prime) H(3205)= 4,H(7148)=67,H(2345)=17 For 2 nd formula add 1 into the remainders.

12 2. Folding Method The key k is partitioned into no. of parts Then add these parts together and ignoring the last carry. One can also reverse the first part before adding (right or left justified. Mostly right) H(k) = k 1 + k 2 + ………. + k n

13 Example: H(3205)=32+05=37 or H(3250)=32+50=82 H(7148)=71+43=19 or H(7184)=71+84=55 H(2345)=23+45=77 or H(2354)=23+54=68

14 3. Mid-Square Method The key k is squared. Then the hash function H is defined as H(k) = l The l is obtained by deleting the digits from both ends of K 2. The same position must be used for all the keys.

15 Example: k: k 2 : H(k): th and 5 th digits have been selected. From the right side.

16 Collision Resolution Strategies If two keys map on the same hash table index then we have a collision. As the number of elements in the table increases, the likelihood of a collision increases - so make the table as large as practical Collisions may still happen, so we need a collision resolution strategy

17 Two approaches are used to resolve collisions. (a) Separate chaining: chain together several keys/entries in each position. (b) Open addressing: store the key/entry in a different position. Probing: If the table position given by the hashed key is already occupied, increase the position by some amount, until an empty position is found

18 Open Addressing Types of open addressing are 1. Linear Probing 2. Quadratic Probing 3. Double Hashing.

19 1. Linear Probing Locations are checked from the hash location k to the end of the table and the element is placed in the first empty slot If the bottom of the table is reached, checking “wraps around” to the start of the table. Modulus is used for this purpose Thus, if linear probing is used, these routines must continue down the table until a match or empty location is found

20 Linear probing is guaranteed to find a slot for the insertion if there still an empty slot in the table. Even though the hash table size is a prime number is probably not an appropriate size; the size should be at least 30% larger than the maximum number of elements ever to be stored in the table. If the load factor is greater than 50% - 70% then the time to search or to add a record will increase.

21 H(k)=h, h+1, h+2, h+3,……, h+I However, linear probing also tends to promote clustering within the table

22 2. Quadratic Probing Quadratic probing is a solution to the clustering problem –Linear probing adds 1, 2, 3, etc. to the original hashed key –Quadratic probing adds 1 2, 2 2, 3 2 etc. to the original hashed key However, whereas linear probing guarantees that all empty positions will be examined if necessary, quadratic probing does not

23 If the table size is prime, this will try approximately half the table slots. More generally, with quadratic probing, insertion may be impossible if the table is more than half-full! H(k) = h, h+1, h+4, h+5, h+6,……, h+i 2

24 3. Double Hashing 2 nd hash function H’ is used to resolve the collision. Here H’(k) = h’ ≠ m Therefore we can search the locations with addresses, H’(k) = h, h+h’, h+2h’, h+3h’,……. If m is prime, then this sequence access all the locations.

25 Double Hashing Double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series (h  jd(k)) mod N for j  0, 1, …, N  1 The secondary hash function d ( k ) cannot have zero values The table size N must be a prime to allow probing of all the cells Common choice of compression map for the secondary hash function: d 2 ( k )  k mod q where –q  N –q is a prime The possible values for d 2 ( k ) are 1, 2, …, q

26 Consider a hash table storing integer keys that handles collision with double hashing –N  13 –h(k)  k mod 13 –d(k)  k mod 7 Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order Example of Double Hashing

27 Applications of Hashing Compilers use hash tables to keep track of declared variables A hash table can be used for on-line spelling checkers — if misspelling detection (rather than correction) is important, an entire dictionary can be hashed and words checked in constant time Game playing programs use hash tables to store seen positions, thereby saving computation time if the position is encountered again Hash functions can be used to quickly check for inequality — if two elements hash to different values they must be different