CS 206 Introduction to Computer Science II 11 / 16 / 2009 Instructor: Michael Eckmann.


1 CS 206 Introduction to Computer Science II 11 / 16 / 2009 Instructor: Michael Eckmann

2 Michael Eckmann - Skidmore College - CS 206 - Fall 2009
Today’s Topics
– Questions? Comments?
– Graphs
  – Shortest path algorithms: Dijkstra's
– Hash tables

3 Graphs
The shortest path could be in terms of path length (number of edges between vertices):
– Initialize all lengths to infinity
– Process the graph in a BFS order starting at the given vertex
– When we visit a vertex, also replace its length with the current path length
This is just BFS while also keeping track of path lengths. Let's code this now.

4 Graphs
Recall the implementation of breadth first search:
– Keep a queue which will hold the vertices to be visited
– Keep a list of vertices as they are visited
– BFS algorithm:
  – Mark all vertices as unvisited
  – Initially enqueue a vertex into the queue and mark it as waiting
  – While the queue is not empty:
    – Dequeue a vertex from the queue
    – Put it in the visited list and mark it as visited
    – Enqueue all adjacent vertices of the just-dequeued vertex that are marked as unvisited
    – Mark the vertices just enqueued as waiting
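The BFS-with-path-lengths idea from the last two slides might look like this in Java (class and method names are my own; a sketch, not the course's actual code):

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

public class BfsShortestPath {
    // Returns the number of edges on the shortest path from start to every
    // vertex; unreachable vertices stay at Integer.MAX_VALUE ("infinity").
    static int[] shortestPathLengths(int[][] adj, int start) {
        int[] len = new int[adj.length];
        Arrays.fill(len, Integer.MAX_VALUE);   // initialize all lengths to infinity
        boolean[] visited = new boolean[adj.length];
        Queue<Integer> queue = new ArrayDeque<>();
        len[start] = 0;
        visited[start] = true;                 // mark as waiting
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : adj[v]) {
                if (!visited[w]) {
                    visited[w] = true;         // mark as waiting
                    len[w] = len[v] + 1;       // record length when first reached
                    queue.add(w);
                }
            }
        }
        return len;
    }

    public static void main(String[] args) {
        // small made-up directed graph: 0->1, 0->3, 1->4, 2->5, 3->2
        int[][] adj = { {1, 3}, {4}, {5}, {2}, {}, {} };
        System.out.println(Arrays.toString(shortestPathLengths(adj, 0)));
        // prints [0, 1, 2, 1, 2, 3]
    }
}
```

Because BFS visits vertices in order of increasing distance, the first time a vertex is reached its length is already minimal.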

5 Graphs
Example on the board and then pseudocode for Dijkstra's algorithm.
vi means vertex i, and (j) means an edge of weight j:
v0 -> v1(2), v3(1)
v1 -> v3(3), v4(10)
v2 -> v0(4), v5(5)
v3 -> v2(2), v4(2), v5(8), v6(4)
v4 -> v6(6)
v5 -> null
v6 -> v5(1)
Dijkstra's algorithm, given a starting vertex, will find the minimum weight paths from that starting vertex to all other vertices.

6 We can reuse the code we wrote to set vertices as visited or unvisited. We can reuse our code that handles a directed graph, but we must add the ability for it to be a weighted graph. We need a “minimum” priority queue, that is, one that returns the item with the lowest priority on any given dequeue(). We need a way to store all the path lengths, and we initially set all the minimum path lengths to Integer.MAX_VALUE (this acts as infinity: if we ever calculate a lesser-weight path, we store that lesser weight instead.)

7 Dijkstra's algorithm pseudocode (given a startV)

set all vertices to unvisited and all pathLens to MAX
set pathLen from startV to startV to be 0
add (item=startV, priority=0) to PQ
while (PQ not empty) {
    v = remove the lowest priority vertex from PQ
        (repeat until we get an unvisited vertex out)
    set v to visited
    for all unvisited adjacent vertices (adjV) of v {
        if (current pathLen from startV to adjV >
            weight of the edge from v to adjV + pathLen from startV to v) {
            set adjV's pathLen from startV to be
                weight of the edge from v to adjV + pathLen from startV to v
            add (item=adjV, priority=pathLen just calculated) to PQ
        } // end if
    } // end for
} // end while
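The pseudocode above might be coded like this, using java.util.PriorityQueue (already a minimum PQ) and the graph from slide 5; class, method, and field names are my own:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class Dijkstra {
    static class Edge {
        final int to, weight;
        Edge(int to, int weight) { this.to = to; this.weight = weight; }
    }

    static int[] shortestWeights(List<List<Edge>> adj, int startV) {
        int[] pathLen = new int[adj.size()];
        Arrays.fill(pathLen, Integer.MAX_VALUE);        // all pathLens start at MAX
        boolean[] visited = new boolean[adj.size()];
        // entries are {vertex, priority}; ordered by priority (a min-heap)
        PriorityQueue<int[]> pq =
                new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        pathLen[startV] = 0;
        pq.add(new int[]{startV, 0});
        while (!pq.isEmpty()) {
            int v = pq.remove()[0];
            if (visited[v]) continue;                   // skip until unvisited comes out
            visited[v] = true;
            for (Edge e : adj.get(v)) {
                if (!visited[e.to] && pathLen[v] + e.weight < pathLen[e.to]) {
                    pathLen[e.to] = pathLen[v] + e.weight;
                    pq.add(new int[]{e.to, pathLen[e.to]});
                }
            }
        }
        return pathLen;
    }

    // the directed, weighted graph from slide 5
    static List<List<Edge>> slideGraph() {
        List<List<Edge>> adj = new ArrayList<>();
        for (int i = 0; i < 7; i++) adj.add(new ArrayList<>());
        adj.get(0).add(new Edge(1, 2));  adj.get(0).add(new Edge(3, 1));
        adj.get(1).add(new Edge(3, 3));  adj.get(1).add(new Edge(4, 10));
        adj.get(2).add(new Edge(0, 4));  adj.get(2).add(new Edge(5, 5));
        adj.get(3).add(new Edge(2, 2));  adj.get(3).add(new Edge(4, 2));
        adj.get(3).add(new Edge(5, 8));  adj.get(3).add(new Edge(6, 4));
        adj.get(4).add(new Edge(6, 6));
        adj.get(6).add(new Edge(5, 1));
        return adj;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shortestWeights(slideGraph(), 0)));
        // prints [0, 2, 3, 1, 3, 6, 5]
    }
}
```

A vertex may be enqueued several times with different priorities; the `if (visited[v]) continue;` check discards the stale entries, matching the "do this until we get an unvisited vertex out" line of the pseudocode.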

8 What if we wanted to not only display the minimum weight path to each vertex, but also the actual path (with the intervening vertices)? e.g. the minimum weight from v0 to v5 is 6, and the path is: v0 to v3 to v6 to v5

9 What if we wanted to not only display the minimum weight path to each vertex, but also the actual path (with the intervening vertices)?
Notice that the minimum path from v0 to v3 is: v0 to v3, with a weight of 1.
Notice that the minimum path from v0 to v6 is: v0 to v3 to v6, with a weight of 5.
Notice that the minimum path from v0 to v5 is: v0 to v3 to v6 to v5, with a weight of 6.
See how each of these just extends the previous one by one more edge. For example, if we end up at v5 from v6, we get to v6 the same way we would have gotten to v6 (with the minimum weight). So it is enough to record, for each vertex, which vertex we came from.
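One way to recover the actual path is a predecessor array: whenever the relaxation step updates a vertex's pathLen, record which vertex it came from, then walk backwards from the target. A sketch (the prev values hard-coded below are what that bookkeeping would produce on slide 5's graph starting from v0; names are my own):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PathRebuild {
    // prev[w] is the vertex we arrived from on w's minimum-weight path.
    // Walk back from target to start, then reverse by popping off a stack.
    static String pathTo(int[] prev, int start, int target) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int v = target; v != start; v = prev[v]) stack.push(v);
        stack.push(start);
        StringBuilder sb = new StringBuilder();
        while (!stack.isEmpty()) {
            sb.append("v").append(stack.pop());
            if (!stack.isEmpty()) sb.append(" to ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // predecessors for slide 5's graph from v0:
        // prev[1]=0, prev[3]=0, prev[2]=3, prev[4]=3, prev[6]=3, prev[5]=6
        int[] prev = { -1, 0, 3, 0, 3, 6, 3 };
        System.out.println(pathTo(prev, 0, 5)); // prints v0 to v3 to v6 to v5
    }
}
```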

10 Hashing is used to allow very efficient insertion, removal, and retrieval of items. Consider retrieval (searching) with several structures:
– To find data in an unordered linear list structure: O(n)
– To find data in an ordered linear list structure (binary search): O(log n)
– To find data in a balanced BST: O(log n) (note that searching a heap for an arbitrary item is O(n); a heap only gives fast access to the minimum or maximum)
What orders are better than log n?
Hashes

11 Hashing is used to allow
– inserting an item
– removing an item
– searching for an item
all in constant time (in the average case). Hashing does not provide efficient sorting, nor efficient finding of the minimum or maximum item, etc.
Hashes

12 We want to insert our items (of any type: String, int, double, etc.) into a structure that allows fast retrieval. Terms:
– Hash table: an array of references to objects (items); table_size is the number of places to store items
– Hash function: calculates a hash value (an integer) based on some key data about the item we are adding to the hash table
– Hash value: the value returned by the hash function
  – the hash value must be an integer value within [0, table_size – 1]
  – this gives us the index in the hash table where we wish to store the item
Hashes

13 Just to give an idea of how to insert and retrieve items in a hash table (this does not use a good hash function):
– Consider our items to be simply ints
– Consider our hash function to be f(x) = x % n, where n is the size of our hash table array (this is not a typical hash function)
– The hash value it returns is the index where we wish to store our item
– Example on the board (assume n=8; add items 24, 3, 17, 31)
Then we can repeat the same computation to see if a particular item is in our hash table.
Hashes
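The board example might look like this in Java (names mine; n = 8, items are ints, collisions ignored for now, which works here because 24, 3, 17, 31 all land in different slots):

```java
public class IntHash {
    // toy hash table of ints with hash function f(x) = x % N
    static final int N = 8;
    static int[] table = new int[N];
    static boolean[] used = new boolean[N];   // tracks which slots hold an item

    static void insert(int item) {
        table[item % N] = item;               // 24 -> 0, 3 -> 3, 17 -> 1, 31 -> 7
        used[item % N] = true;
    }

    static boolean contains(int item) {
        // recompute the same index and check what is stored there
        return used[item % N] && table[item % N] == item;
    }

    public static void main(String[] args) {
        for (int item : new int[]{24, 3, 17, 31}) insert(item);
        System.out.println(contains(17));     // prints true  (slot 1 holds 17)
        System.out.println(contains(11));     // prints false (slot 3 holds 3)
    }
}
```

Retrieval repeats the hash computation rather than searching, which is why it takes constant time.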

14 In our example (n=8; items 24, 3, 17, 31), what if we needed to insert item 11 into our hash? There'd be a collision (11 % 8 == 3, and index 3 is taken). There are several strategies to handle collisions.
– Assume the hash value computed was H
– The chosen strategy affects how retrieval is handled too
– Open addressing (aka a probing hash table)
  – Linear probing: place the item in the next open slot, trying H+1, H+2, H+3, ...
  – Quadratic probing: place the item in the next open slot, trying H+1², H+2², H+3², H+4², ...
  – Wraparound is allowed
Hashes
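A sketch of the linear probing case (names mine; assumes the table never fills completely, so a probe always finds an open slot):

```java
public class LinearProbing {
    // open addressing: on a collision at H, try H+1, H+2, ... wrapping around
    static Integer[] table = new Integer[8];  // null marks an open slot

    static void insert(int item) {
        int h = item % table.length;
        while (table[h] != null) h = (h + 1) % table.length;  // probe next slot
        table[h] = item;
    }

    static boolean contains(int item) {
        int h = item % table.length;
        // retrieval must probe in exactly the same order as insertion did
        for (int i = 0; i < table.length && table[h] != null; i++) {
            if (table[h] == item) return true;
            h = (h + 1) % table.length;
        }
        return false;   // hit an open slot (or scanned the whole table)
    }

    public static void main(String[] args) {
        for (int item : new int[]{24, 3, 17, 31, 11}) insert(item);
        // 11 % 8 == 3 collides with 3, so 11 lands in the next slot, index 4
        System.out.println(contains(11));     // prints true
    }
}
```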

15 There are several strategies to handle collisions.
– Another technique besides open addressing is separate chaining:
  – Each array element stores a linked list of the items that hash to that index
Examples of these techniques on the board.
Hashes
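Separate chaining might be sketched like this (names mine; same toy f(x) = x % n hash function as before):

```java
import java.util.LinkedList;

public class SeparateChaining {
    // each array element stores a linked list of the items hashing there
    @SuppressWarnings("unchecked")
    static LinkedList<Integer>[] table = new LinkedList[8];

    static void insert(int item) {
        int h = item % table.length;
        if (table[h] == null) table[h] = new LinkedList<>();
        table[h].add(item);              // a collision just extends the chain
    }

    static boolean contains(int item) {
        int h = item % table.length;     // search only the one chain at index h
        return table[h] != null && table[h].contains(item);
    }

    public static void main(String[] args) {
        for (int item : new int[]{24, 3, 17, 31, 11}) insert(item);
        // 3 and 11 share index 3; both live on that slot's list
        System.out.println(contains(11));   // prints true
    }
}
```

Unlike probing, chains never spill into other slots, but retrieval now walks a (hopefully short) list.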

16 Typically we won't know our keys ahead of time (before we create our hash). But if we did know all our keys ahead of time, would that help us in any way?
Hashes

17 Let's come up with a hash table to store Strings.
– We'll need to come up with the size of our table
– We'll need to decide whether we will use separate chaining or open addressing hashing
– We'll need to create a hash function (we'll talk about strategies for creating good hash functions next time)
We'll also allow insertion and retrieval (determining whether an item exists in the hash).
Hashes
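One possible starting point, assuming separate chaining and a deliberately simple placeholder hash function (sum of character codes mod table size; not one of the better strategies deferred to next time, and all names are my own):

```java
import java.util.LinkedList;

public class StringHash {
    static final int TABLE_SIZE = 101;
    @SuppressWarnings("unchecked")
    static LinkedList<String>[] table = new LinkedList[TABLE_SIZE];

    // placeholder hash function: sum of char codes, modded into range
    static int hash(String key) {
        int sum = 0;
        for (int i = 0; i < key.length(); i++) sum += key.charAt(i);
        return sum % TABLE_SIZE;         // hash value in [0, TABLE_SIZE - 1]
    }

    static void insert(String key) {
        int h = hash(key);
        if (table[h] == null) table[h] = new LinkedList<>();
        if (!table[h].contains(key)) table[h].add(key);  // no duplicates
    }

    static boolean contains(String key) {
        int h = hash(key);
        return table[h] != null && table[h].contains(key);
    }

    public static void main(String[] args) {
        insert("apple");
        insert("pear");
        System.out.println(contains("pear"));   // prints true
        System.out.println(contains("plum"));   // prints false
    }
}
```

Note one weakness already visible: anagrams like "silent" and "listen" always collide under this hash function, which is one reason better strategies are needed.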

18 Strategies for best performance
– We want items to be distributed evenly (uniformly) throughout the hash table, and we want few (or no) collisions; that depends on our data items, our choice of hash function, and the size of our hash table
– We also need to decide whether to use a probing (linear or quadratic) hash table, or a hash table where collisions are handled by adding the item to a list at that index (hash value); another method is called double hashing
– If these choices are made well, retrieval time is constant on average, but the worst case is O(n)
– We also need to consider the computation needed to insert (computing the hash value)
Hashes

