May 3rd – Hashing & Graphs

Slides:



Advertisements
Similar presentations
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
Advertisements

Hashing Techniques.
COMP 171 Data Structures and Algorithms Tutorial 10 Hash Tables.
Hashing General idea: Get a large array
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
MA/CSSE 473 Day 28 Hashing review B-tree overview Dynamic Programming.
IKI 10100: Data Structures & Algorithms Ruli Manurung (acknowledgments to Denny & Ade Azurat) 1 Fasilkom UI Ruli Manurung (Fasilkom UI)IKI10100: Lecture8.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (continued) Advanced Implementation of Tables.
1 HASHING Course teacher: Moona Kanwal. 2 Hashing Mathematical concept –To define any number as set of numbers in given interval –To cut down part of.
ITEC 2620A Introduction to Data Structures Instructor: Prof. Z. Yang Course Website: 2620a.htm Office: TEL 3049.
Searching Given distinct keys k 1, k 2, …, k n and a collection of n records of the form »(k 1,I 1 ), (k 2,I 2 ), …, (k n, I n ) Search Problem - For key.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Hashing by Rafael Jaffarove CS157b. Motivation  Fast data access  Search  Insertion  Deletion  Ideal seek time is O(1)
CSC 413/513: Intro to Algorithms Hash Tables. ● Hash table: ■ Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
CSC 213 – Large Scale Programming. Today’s Goal  Review when, where, & why we use Map s  Why Sequence -based approach causes problems  How hash can.
1 What is it? A side order for your eggs? A form of narcotic intake? A combination of the two?
CS203 Lecture 14. Hashing An object may contain an arbitrary amount of data, and searching a data structure that contains many large objects is expensive.
Instructor: Lilian de Greef Quarter: Summer 2017
Sections 10.5 – 10.6 Hashing.
CE 221 Data Structures and Algorithms
Hashing (part 2) CSE 2011 Winter March 2018.
Chapter 27 Hashing Jung Soo (Sue) Lim Cal State LA.
Hashing.
Data Structures Using C++ 2E
Hash table CSC317 We have elements with key and satellite data
Hashing CSE 2011 Winter July 2018.
School of Computer Science and Engineering
Lecture No.43 Data Structures Dr. Sohail Aslam.
CS 332: Algorithms Hash Tables David Luebke /19/2018.
Hashing - resolving collisions
Hashing - Hash Maps and Hash Functions
October 20th – Hashing Conclusion + memory Analysis
Cse 373 April 24th – Hashing.
Cse 373 May 15th – Iterators.
October 30th – Priority QUeues
Data Structures Using C++ 2E
Review Graph Directed Graph Undirected Graph Sub-Graph
Week 11 - Friday CS221.
Hashing Exercises.
Cse 373 April 26th – Exam Review.
November 1st – Exam Review
Hash functions Open addressing
Quadratic probing Double hashing Removal and open addressing Chaining
Advanced Associative Structures
Hash Table.
Hash Table.
Chapter 28 Hashing.
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Hash Tables.
Data Structures and Algorithms
Chapter 21 Hashing: Implementing Dictionaries and Sets
Resolving collisions: Open addressing
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Data Structures and Algorithms
Hash Tables and Associative Containers
CS202 - Fundamental Structures of Computer Science II
Advanced Implementation of Tables
ITEC 2620M Introduction to Data Structures
Advanced Implementation of Tables
Hashing Sections 10.2 – 10.3 Lecture 26 CS302 Data Structures
Symbol Table 薛智文 (textbook ch#2.7 and 6.5) 薛智文 96 Spring.
Ch Hash Tables Array or linked list Binary search trees
slides created by Marty Stepp
Ch. 13 Hash Tables  .
Instructor: Dr. Michael Geiger Spring 2019 Lecture 34: Exam 3 Preview
17CS1102 DATA STRUCTURES © 2018 KLEF – The contents of this presentation are an intellectual and copyrighted property of KL University. ALL RIGHTS RESERVED.
Chapter 13 Hashing © 2011 Pearson Addison-Wesley. All rights reserved.
Lecture-Hashing.
CSE 373: Data Structures and Algorithms
Presentation transcript:

May 3rd – Hashing & Graphs Cse 373 May 3rd – Hashing & Graphs

Assorted Minutiae Exams are in my office, you can pick them up during any of my office hours. HW4 out tomorrow morning Regrade requests processed H2P2 and H3P1 will be back next week Finishing discussion on hashing today, so section tomorrow will be all examples.

make Up Assignment Make up assignment end of Week 7 Choose either a coding assignment or a write-up assignment Will overwrite your lowest grade for either EC will be possible on these assignments, so it could give bonus points

Todays lecture Hashing considerations Introduction to graphs

Hash Function In reality, good hash functions are difficult to produce We want a hash that distributes our data evenly throughout the space Usually, our hash function returns some integer, which must then be modded to our table size Needs to incorporate all the data in the keys

Hash Function You will not have to produce hash functions, but you should recognize good ones They run in constant time They evenly distribute the data They return an integer These hash functions are chosen in advance, you should not pick a hash function relative to your data

Collisions Hash table methods are defined by how they handle collisions Two main approaches Probing Chaining

Collisions Probing Linear probing Try the appropriate hash table row first Increase the index by one until a spot is found Guaranteed to find a spot if it is available If the array is too full, its operations reach O(n) time

Collisions Probing Quadratic Probing Rather than increasing by one each time, we increase by the squares k+1, k+4, k+9, k+16, k+25 Certain tables can cause secondary clustering Can fail to insert if the table is over half full

Collisions Probing Secondary Hashing If two keys collide in the hash table, then a secondary hash indicates the probing size Need to be careful, possible for infinite loops with a very empty array

Collisions Chaining Rather than probing for an open position, we could just save multiple objects in the same position Some data structure is necessary here Commonly a linked list, AVL tree or secondary hash table. Resizing isn’t necessary, but if you don’t, you will get O(n) runtime.

primality Array sizes We normally choose our hash tables to have prime size This is because for any number we pick, so long as it is not a multiple of our table size, they must be coprime Two numbers x and y are coprime if they do not share any common factors. If the hash table size and the secondary hash value are coprime, then the search will succeed if there is space available However, many primes cause secondary clustering when used with quadratic probing

Load Factor When discussing hash table efficiency, we call the proportion of stored data to table size the load factor. It is represented by the Greek character lambda (λ). We’ve discussed this a bit implicitly before What are good load-factor (λ) values for each of our collision techniques?

Load Factor Linear Probing? Quadratic Probing? Secondary Hashing? Chaining? What are the tradeoffs? Memory efficiency Failure rate Access times?

Load Factor Linear Probing? 0.25 < λ < 0.5 Quadratic Probing? 0.10 < λ < 0.30 Secondary Hashing? 0.25 < λ < 0.5 Chaining? 3.0 < λ < 10 Because we allow multiple items in each space, we can increase memory efficiency by taking advantage As long as there are a constant number in each space, we get O(1) runtimes.

Load Factor As with most array data structures, you will need to resize when they get too full Here, these resizes are often for performance, rather than failure. Hash table maintenance is important Resizing is costly (but still O(n)) because you have to resize the array and rehash every element into the new table.

Hash tables Hash tables are a good overall data structure Can provide O(1) access times Can be memory inefficient Probing can fail, and delete with probing mechanisms is difficult Chaining can be a good balance, but there is a lot of overhead maintaining all those data structures

Hash tables Understand these tradeoffs and how these implementations work Section tomorrow will provide practice problems for each of these hash table methods

Graphs A graph is composed of two things A set of vertices A set of edges (which are vertex tuples) Trees are types of graphs Each of the nodes is a vertex Each pointer from parent to child is an edge Represented as G(V,E) to indicate that V is the set of vertices and E is the set of edges

Graphs C B A D What this graphs vertices and edges?

Graphs What this graphs vertices and edges? V = {A, B, C, D} E = {(A,B) , (A,C), (A,D)}

Graphs What this graphs vertices and edges? V = {A, B, C, D} E = {(B,A) , (C,A), (D,A)}?

Graphs In that graph, the order did not matter This is called an undirected graph In undirected graphs, an edge from (A,B) means that there must be an edge from (B,A) In implementation, both of these edges need to be present, but if you indicate that a graph is undirected, you do not need to indicate both in your list of edges

Graphs C B A D Is this graph a tree?

Graphs Is this graph a tree? C A D B Is this graph a tree? Yes, you can make the shape by moving vertices around

Graphs C A D B Is this graph a tree?

Graphs Is this graph a tree? C A D B Is this graph a tree? No, there is no way to rearrange these vertices because there is a cycle

Graphs A path is a set of edges which goes from one vertex to another in a graph A cycle is a path that starts and ends on the same vertex. A graph is acyclic if no cycles exist in the graph

Graphs C A D B What is the cycle here?

Graphs C A D B What is the cycle here? (A,D) (D,C) (C,A)

Graphs In paths and cycles, the sets of edges must pass from one vertex to another, i.e. each edge must share a vertex with some other edge. (A,B) (B,C) is a path from A to C, while (A,B) (D,C) is not. There is no way to get from B to D

Graphs Trees are acyclic graphs Graphs can be traversed Breadth-first Depth-first Edges can also have weights Path finding on a map Route-optimization problems

Next CLass More graphs! Discuss implementation approaches Prepare for path finding problem for next week