Ihab Mohammed and Safaa Alwajidi. Introduction: Hash tables are dictionary structures that store objects with keys and provide very fast access.

Presentation transcript:

Ihab Mohammed and Safaa Alwajidi

Introduction Hash tables are dictionary structures that store objects with keys and provide very fast access. The idea of a hash table is as follows: First, we have a universe U of n objects, and each object has a key. Second, we want to store a subset of U, named S, in a structure in the computer that has m buckets (locations). [Figure: the universe U (Object 1, Object 2, …, Object n, each with a key) and the table S (Object 1, Object 2, …, Object m).]

Introduction Third, a hash function h maps a key u from the set U to the location of an object in S: S[h(u)] = Object. To see things clearly, let's take an example: if U is the set of integers from 1 to 100 and we want to store a set of ten of these numbers in the hash table S using the modulo 10 operation, then h(u) = u mod 10. Now let's try to store the number 33: h(33) = 33 mod 10 = 3, which means that the number 33 is stored in location 3 in S.
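The example above can be sketched in a few lines of Python. The names `h` and `table` are illustrative choices, not part of the slides, and the table is modelled as a plain list of ten slots.

```python
def h(u: int, m: int = 10) -> int:
    """Hash function from the example: the key modulo the table size."""
    return u % m

table = [None] * 10        # S with 10 locations, all initially empty
key = 33
table[h(key)] = key        # h(33) = 33 mod 10 = 3, so 33 goes to location 3
print(h(key), table[3])    # prints: 3 33
```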

Collision Problem To store the number 56: h(56) = 56 mod 10 = 6, which means that the number 56 is stored in location 6 in S. Watch what happens when we try to store the number 43: h(43) = 43 mod 10 = 3, but location 3 already holds the value 33, so we have a collision.
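A minimal sketch of how the collision shows up in code, assuming a table with one key per slot and no collision handling yet. The helper `try_store` is a hypothetical name introduced only for this illustration.

```python
def h(u, m=10):
    """Hash function from the example: key modulo table size."""
    return u % m

def try_store(table, key):
    """Store key at h(key); return False if another key already occupies that slot."""
    slot = h(key, len(table))
    if table[slot] is not None and table[slot] != key:
        return False       # collision: the slot is taken by a different key
    table[slot] = key
    return True

table = [None] * 10
results = [(k, try_store(table, k)) for k in (33, 56, 43)]
print(results)             # [(33, True), (56, True), (43, False)]
```

The third entry reports the collision: 43 hashes to location 3, which already holds 33.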

Collision Solution Hash table theory is about two things: resolving collisions, and choosing a hash function that reduces collisions. History: hashing started in 1953 with some groups at IBM. The early implementations were simple, since the hash function was simple, but they came with no performance guarantees. If we map the keys of a big universe U to a small set S = {0, ..., s − 1}, then it is unavoidable that many universe elements are mapped to the same element of S.

Hash Table Design Collisions can be resolved in two ways: Chaining: use a linked list in the location where collisions occur to store the multiple (collided) objects. Open Addressing: use a sequence of alternative addresses for the same key u, so if h1(u) is already occupied when a collision occurs, use h2(u), then h3(u), and so on.

Chaining In chaining, the hash function takes you to the correct location in the primary structure (array); then you have to search the secondary structure (linked list) for the correct object. A balanced search tree can be used as the secondary structure, with search time O(log n), so the total search time is O(1) for the hash function + O(log n) for the balanced search tree. Note: with a good hash function and a not-too-small hash table, most buckets are expected to be almost empty, which shows that a linked list is enough as the secondary search structure.
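A minimal sketch of chaining, under a few assumptions: keys are integers, the hash function is the key modulo the table size as in the earlier example, and a Python list stands in for the linked list in each bucket. The class name `ChainedHashTable` is illustrative.

```python
class ChainedHashTable:
    def __init__(self, m=10):
        # Primary structure: an array of m buckets, each a (possibly empty) chain.
        self.buckets = [[] for _ in range(m)]

    def _bucket(self, key):
        # The hash function selects the bucket in O(1).
        return self.buckets[key % len(self.buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:               # key already present: overwrite its value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))    # otherwise extend the chain

    def find(self, key):
        # Secondary search: scan the chain of the selected bucket.
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False

t = ChainedHashTable()
t.insert(33, "a")
t.insert(43, "b")      # collides with 33 in bucket 3, so it joins the same chain
t.delete(33)
print(t.find(43))      # prints: b
```

Deletion only touches the chain of one bucket, so other keys that collided with the deleted key remain reachable.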

Open Addressing In open addressing, no secondary structure is needed, which makes it a simple method. However, this method does not support deletion. To store an object, a sequence of addresses (hash functions) is tried until an empty location is found. [Figure: objects 43 and 73 are inserted along their address sequences h1(43), h2(43) and h1(73), h2(73), h3(73). After object 43 is deleted, the search for object 73 along h1(73), h2(73) reaches the now-empty location and stops before finding it. Houston, we have a problem!]
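The deletion problem can be reproduced with a small sketch. The probe sequence used here, linear probing with the i-th address (key + i) mod 10, is an assumption made for illustration; the slides do not say which sequence h1, h2, h3 stands for. The function names are likewise hypothetical.

```python
M = 10

def probe(key, i):
    """i-th address in the probe sequence, i = 0, 1, ... (linear probing assumed)."""
    return (key + i) % M

def insert(table, key):
    for i in range(M):
        slot = probe(key, i)
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("table is full")

def search(table, key):
    for i in range(M):
        slot = probe(key, i)
        if table[slot] is None:
            return None        # an empty slot ends the search
        if table[slot] == key:
            return slot
    return None

def naive_delete(table, key):
    """Delete by simply emptying the slot; this is what breaks later searches."""
    slot = search(table, key)
    if slot is not None:
        table[slot] = None

table = [None] * M
insert(table, 43)              # lands in location 3
insert(table, 73)              # location 3 is taken, so 73 lands in location 4
naive_delete(table, 43)        # location 3 becomes empty again
print(search(table, 73))       # prints: None, although 73 is still in location 4
```

The search for 73 starts at location 3, finds it empty, and gives up, even though 73 was stored one step further along its sequence.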

Back to the Future (Chaining) Avoid open addressing: the small space advantage of avoiding pointers never outweighs the fundamental disadvantage of losing deletions. Chaining variants: Two-Way Chaining: each element of the universe is assigned to two possible buckets, and objects are inserted into the bucket with fewer elements. Sequence of Hash Tables: if the entry in the first hash table is occupied, go to the second hash table, which uses a different hash function, and so on. This variant is also convenient for parallelization.
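Two-way chaining could be sketched as follows. The two hash functions `h1` and `h2` (last decimal digit and tens digit of the key) are arbitrary choices made for this illustration, not the ones used in the slides, and Python lists again stand in for the chains.

```python
M = 10

def h1(u):
    return u % M               # assumed first hash function: last digit

def h2(u):
    return (u // M) % M        # assumed second hash function: tens digit

class TwoWayChaining:
    def __init__(self):
        self.buckets = [[] for _ in range(M)]

    def find(self, key):
        # A key can only be in one of its two candidate buckets.
        return key in self.buckets[h1(key)] or key in self.buckets[h2(key)]

    def insert(self, key):
        if self.find(key):
            return
        a, b = self.buckets[h1(key)], self.buckets[h2(key)]
        # Insert into whichever candidate bucket currently holds fewer elements.
        (a if len(a) <= len(b) else b).append(key)

t = TwoWayChaining()
for k in (33, 43, 53):
    t.insert(k)
print(t.buckets[3], t.buckets[4], t.buckets[5])   # [33] [43] [53]
```

All three keys share the first candidate bucket 3, but the second candidate lets 43 and 53 move to emptier buckets, so no chain grows beyond one element here.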

Universal Families of Hash Functions Uniform Hashing Model (used up to the end of the 1970s): the hash values of the elements are independent random values, uniformly distributed over the available addresses. The underlying assumption was that any hash function that is complicated enough will behave like a random assignment, mixing the values of the input set sufficiently well. This assumption is incorrect, because the set of values is concrete and not uniformly distributed in the universe U.

Universal Families of Hash Functions Choose a hash function at random from a family of hash functions in which, for any input set, the values of the hash functions are well distributed with high probability. For any fixed choice of hash function, there exists a bad set of keys that all hash to the same slot. Crucial Property: for a family of hash functions F to distribute a subset of U well over S, a function f chosen uniformly at random from F must satisfy the following:
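For reference, the condition usually stated for a universal family can be written as below. This is the common textbook form and is given here as an assumption about what the slide intends; the constant c and the notation are not taken from the slides.

```latex
\Pr_{f \in F}\bigl[\, f(x) = f(y) \,\bigr] \;\le\; \frac{c}{s}
\qquad \text{for all } x, y \in U \text{ with } x \neq y,
```

where s is the number of locations in S and f is drawn uniformly at random from F. In words: any two distinct keys collide with probability at most a constant times 1/s.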

Universal Families of Hash Functions

Family Example Properties of a universal family of hash functions: it must be small and have a convenient parameterization, so that we can easily select a random function from the family; and it must be easy to evaluate. Assume that U = {0, …, p−1} for some large prime p, chosen slightly less than the square root of the maximum integer the machine arithmetic can handle (so that the product of two such numbers does not overflow). Assume that S = {0, …, s−1} with s ≤ p. The simplest family is: F_ps = {h_a: U → S | h_a(x) = (ax mod p) mod s, 1 ≤ a ≤ p − 1}. There are p−1 functions in this family.
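A minimal sketch of this family in Python. The prime p = 2^31 − 1 is an assumption chosen for illustration (it lies just below 2^32, the square root of 2^64); the function names `make_hash` and `random_hash` are hypothetical.

```python
import random

P = 2_147_483_647          # the prime 2**31 - 1 (assumed choice of p)

def make_hash(a, s):
    """Return h_a(x) = ((a * x) mod p) mod s for a fixed parameter a."""
    if not 1 <= a <= P - 1:
        raise ValueError("a must satisfy 1 <= a <= p - 1")
    def h_a(x):
        return ((a * x) % P) % s
    return h_a

def random_hash(s, rng=random):
    """Pick one of the p - 1 functions of the family uniformly at random."""
    return make_hash(rng.randint(1, P - 1), s)

h = make_hash(3, 10)
print(h(33))               # (3 * 33 mod p) mod 10 = 99 mod 10 = 9
```

Selecting a function only requires drawing the single integer a, and evaluating it costs one multiplication and two remainders, which matches the two properties listed above.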

Universal Families of Hash Functions Summary: Theorem: a hash table with chaining, using a universal family of hash functions, stores a set of n elements in a table of size s, supports the operations find, insert, and delete in expected time O(1 + n/s) per operation, and requires space O(n + s).