Design & Analysis of Algorithms COMP 482 / ELEC 420 John Greiner.

Slides:



Advertisements
Similar presentations
Hash Tables CS 310 – Professor Roch Weiss Chapter 20 All figures marked with a chapter and section number are copyrighted © 2006 by Pearson Addison-Wesley.
Advertisements

ArrayLists David Kauchak cs201 Spring Extendable array Arrays store data in sequential locations in memory Elements are accessed via their index.
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
Hashing as a Dictionary Implementation
Hashing Chapters What is Hashing? A technique that determines an index or location for storage of an item in a data structure The hash function.
UMass Lowell Computer Science Graduate Analysis of Algorithms Prof. Karen Daniels Spring, 2010 Lecture 3 Tuesday, 2/9/10 Amortized Analysis.
CPSC 411, Fall 2008: Set 6 1 CPSC 411 Design and Analysis of Algorithms Set 6: Amortized Analysis Prof. Jennifer Welch Fall 2008.
Hash Table indexing and Secondary Storage Hashing.
UMass Lowell Computer Science Graduate Analysis of Algorithms Prof. Karen Daniels Spring, 2005 Lecture 3 Tuesday, 2/8/05 Amortized Analysis.
Tirgul 9 Amortized analysis Graph representation.
UMass Lowell Computer Science Graduate Analysis of Algorithms Prof. Karen Daniels Spring, 2009 Lecture 3 Tuesday, 2/10/09 Amortized Analysis.
© 2006 Pearson Addison-Wesley. All rights reserved13 A-1 Chapter 13 Hash Tables.
Theory I Algorithm Design and Analysis (8 – Dynamic tables) Prof. Th. Ottmann.
Hashing (Ch. 14) Goal: to implement a symbol table or dictionary (insert, delete, search)  What if you don’t need ordered keys--pred, succ, sort, select?
CS333 / Cutler Amortized Analysis 1 Amortized Analysis The average cost of a sequence of n operations on a given Data Structure. Aggregate Analysis Accounting.
Andreas Klappenecker [based on the slides of Prof. Welch]
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
17.Amortized analysis Hsu, Lih-Hsing. Computer Theory Lab. Chapter 17P.2 The time required to perform a sequence of data structure operations in average.
UMass Lowell Computer Science Analysis of Algorithms Prof. Karen Daniels Fall, 2001 Lecture 2 (Part 2) Tuesday, 9/11/01 Amortized Analysis.
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
1. 2 Problem RT&T is a large phone company, and they want to provide enhanced caller ID capability: –given a phone number, return the caller’s name –phone.
CS2110 Recitation Week 8. Hashing Hashing: An implementation of a set. It provides O(1) expected time for set operations Set operations Make the set empty.
CS 473Lecture 121 CS473-Algorithms I Lecture 12 Amortized Analysis.
IKI 10100: Data Structures & Algorithms Ruli Manurung (acknowledgments to Denny & Ade Azurat) 1 Fasilkom UI Ruli Manurung (Fasilkom UI)IKI10100: Lecture8.
DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI.
Introduction to Algorithms 6.046J/18.401J/SMA5503 Lecture 14 Prof. Charles E. Leiserson.
Amortized Analysis The problem domains vary widely, so this approach is not tied to any single data structure The goal is to guarantee the average performance.
Amortized Analysis Typically, most data structures provide absolute guarantees on the worst case time for performing a single operation. We will study.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
Chapter 5: Hashing Collision Resolution: Separate Chaining Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Lydia Sinapova, Simpson College.
TECH Computer Science Dynamic Sets and Searching Analysis Technique  Amortized Analysis // average cost of each operation in the worst case Dynamic Sets.
1 CSE 326: Data Structures: Hash Tables Lecture 12: Monday, Feb 3, 2003.
Hashing1 Hashing. hashing2 Observation: We can store a set very easily if we can use its keys as array indices: A: e.g. SEARCH(A,k) return A[k]
1 5. Abstract Data Structures & Algorithms 5.2 Static Data Structures.
Can’t provide fast insertion/removal and fast lookup at the same time Vectors, Linked Lists, Stack, Queues, Deques 4 Data Structures - CSCI 102 Copyright.
Hashing Hashing is another method for sorting and searching data.
Hashing as a Dictionary Implementation Chapter 19.
Searching Given distinct keys k 1, k 2, …, k n and a collection of n records of the form »(k 1,I 1 ), (k 2,I 2 ), …, (k n, I n ) Search Problem - For key.
CS201: Data Structures and Discrete Mathematics I Hash Table.
CSC 427: Data Structures and Algorithm Analysis
David Luebke 1 11/26/2015 Hash Tables. David Luebke 2 11/26/2015 Hash Tables ● Motivation: Dictionaries ■ Set of key/value pairs ■ We care about search,
Chapter 5: Hashing Part I - Hash Tables. Hashing  What is Hashing?  Direct Access Tables  Hash Tables 2.
Hash Tables CSIT 402 Data Structures II. Hashing Goal Perform inserts, deletes, and finds in constant average time Topics Hash table, hash function, collisions.
Chapter 11 Hash Tables © John Urrutia 2014, All Rights Reserved1.
Advanced Algorithm Design and Analysis (Lecture 12) SW5 fall 2004 Simonas Šaltenis E1-215b
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Hashing Fundamental Data Structures and Algorithms Margaret Reid-Miller 18 January 2005.
Amortized Analysis Some of the slides are from Prof. Leong Hon Wai’s resources at National University of Singapore.
CSCE 411H Design and Analysis of Algorithms Set 6: Amortized Analysis Prof. Evdokia Nikolova* Spring 2013 CSCE 411H, Spring 2013: Set 6 1 * Slides adapted.
CMSC 341 Amortized Analysis.
H ASH TABLES. H ASHING Key indexed arrays had perfect search performance O(1) But required a dense range of index values Otherwise memory is wasted Hashing.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
CPSC 252 Hashing Page 1 Hashing We have already seen that we can search for a key item in an array using either linear or binary search. It would be better.
Hash Tables ADT Data Dictionary, with two operations – Insert an item, – Search for (and retrieve) an item How should we implement a data dictionary? –
Amortized Analysis. Problem What is the time complexity of n insert operations into an dynamic array, which doubles its size each time it is completely.
Introduction to Algorithms Amortized Analysis My T. UF.
Amortized Analysis In amortized analysis, the time required to perform a sequence of operations is averaged over all the operations performed Aggregate.
Amortized Analysis and Heaps Intro David Kauchak cs302 Spring 2013.
Introduction to Algorithms: Amortized Analysis. Introduction to Algorithms Amortized Analysis Dynamic tables Aggregate method Accounting method Potential.
Amortized Analysis.
Andreas Klappenecker [partially based on the slides of Prof. Welch]
CSCE 411 Design and Analysis of Algorithms
Presentation by Marty Krogel
Amortized Analysis The problem domains vary widely, so this approach is not tied to any single data structure The goal is to guarantee the average performance.
Chapter 17 Amortized Analysis Lee, Hsiu-Hui
Hashing Exercises.
Dictionaries Dictionaries 07/27/16 16:46 07/27/16 16:46 Hash Tables 
CSCE 411 Design and Analysis of Algorithms
CSCE 411 Design and Analysis of Algorithms
Lecture-Hashing.
Presentation transcript:

Design & Analysis of Algorithms COMP 482 / ELEC 420 John Greiner

Python Dictionaries Same idea as Java/C++ hash map, C# dictionary, Perl hash, … How is this “magic” implemented? 2 d = {“snow” : 7, “apple” : 3, “white” : 20} d[“white”] = 30 d[“prince”] = 16 d[“apple”]

Combination of Ideas Hash table & hash function Dynamic table & amortization 3 To do: [CLRS] 11,17 #4

Hash Tables & Hash Functions 4 “snow” hash 3 “apple” hash 7 “prince” hash 3 “white” hash 2 d = {“snow” : 7, “apple” : 3, “white” : 20} d[“white”] = 29 d[“prince”] = 16 d[“apple”] 0 9 “white”:29 “apple”:3 “snow”:7, “prince”: Chains:

Access Time Depends on Chain Length ? What property do we want of our hash function? ? ? How long is each chain? ? ? How much time per access? ? 5

Creating a Good Hash Function is Difficult Generally, just use those in libraries. key.__hash__() % table_size 6 class string: def __hash__(self): if not self: return 0 # empty value = ord(self[0]) << 7 for char in self: value = c_mul( , value) ^ ord(char) value = value ^ len(self) if value == -1: # reserved error code value = -2 return value

Dynamic Table Motivation Typically, don’t know how much data we’ll have. –Want underlying hash table to grow, so average chain size is bounded. –Want to retain constant-time indexing. Focus on the latter goal first. –We’ll have to do a little extra to combine hash tables & dynamic tables. 7

Adding Data when Dynamic Table is Full Must be contiguous for constant-time indexing. 8 data

Adding Data when Dynamic Table is Full What’s wrong with just using space at end of array? 9 data other data That memory might already be in use.

Adding Data when Dynamic Table is Full So, grab needed space elsewhere & copy everything. 10 data copy Double the space.

Cost of a Series of Operations Initially: Table size = 5, table empty 11

Cost of a Series of Operations Add 5 data items, cost =

Cost of a Series of Operations Add 5 more data items, cost = copy

Cost of a Series of Operations Add 5 more data items, cost = copy

Cost of a Series of Operations Add 5 more data items, cost = Total costs: Copying = 15 Adding = 20 Total = 35

Cost of Copying is Proportional to Adding 16 Each copy costs 2 × #adds since previous copy. copy Use expensive ops sufficiently infrequently. Buy Add 1 Get Copy 2 free!

Amortized Cost Dynamic tables: O(1) amortized time to add data 17 Cost of series of n operations = m  Each operation has amortized cost = m/n

Dynamic Hash Tables 18 d = {“snow” : 7, “apple” : 3, “white” : 20} d[“white”] = 29 d[“prince”] = 16 d[“apple”] “snow” hash 3 “apple” hash 7 “prince” hash 3 “white” hash 2 “white”:29 “apple”:3 “snow”:7, “prince”:16 Must rehash all data. Hash table “full enough”. Load factor = #items/size “snow” hash 3 “apple” hash 2 “prince” hash 1 “white” hash 2 “apple”:3,“white”:29 “snow”:7 “prince”:16

Hash Table & Dynamic Table Odds & Ends Hash table size: prime or power-of-2? Chaining vs. open addressing Dynamic table expansion factor Dynamic table contraction Python’s dictionary/hash table implementation 19 Hash tables:Dynamic tables: Python dict. & set … Python list … Memory heaps for GC When want static memory size.

Multipop 20 3 ops: Push(S,x)Pop(S)Multi-pop(S,k) Worst-case cost: O(1) O(min(|S|,k) = O(n) Amortized cost: ?

Incrementing Binary Counter 21 CounterBits changed in increment Bit position # times bit changes Another way to sum the costs?

Amortized Analysis Approaches 22 Aggregate: 1.For all sequences of m ops, find maximum sum of actual costs. Potentially difficult to know actual costs or bound well. 2.Result is sum/m. Sufficient for many commonly-used examples.

Amortized Analysis Approaches 23 Accounting: 1.Compute actual costs c i of each kind of op. 2.Define accounting costs ĉ i of each kind of op, such that can assign credits ĉ i -c i consistently to data elts. Can “overpay” on some ops, and use credits to “underpay” on other ops later. Poor definitions lead to loose bounds. 3.O(max acct. cost.)

Amortized Analysis Approaches 24 Potential: 1.Define potential function Φ(D), such that Φ(D i )≥ Φ(D 0 ). Essentially assigns “credits” to data structure, rather than operations. Poor definition leads to loose bounds. 2.Calculate accounting costs ĉ i =c i +ΔΦ of each kind of op. 3.O(max acct. cost.) Complicated approach necessary for more complicated data structures.