Final Exam Review CS 3610/5610N Dr. Jundong Liu.

1 Final Exam Review CS 3610/5610N Dr. Jundong Liu

2 About the final exam
Coverage: after midterm + heap operations. Time: Thursday, April 30, 8-10am. Preparation: focus on lecture notes, projects, and homework.

3 Structure of the exam
True/False and fill in the blanks (~15%); algorithms’ properties (~20%); operations on certain inputs (~30%); proof of a certain claim (~10%); code analysis (10-15%); code writing (10-15%); other.

4 Search & Hashing: Objectives
Learn the various search algorithms. Explore how to implement the sequential and binary search algorithms. Discover how the sequential and binary search algorithms perform. Become aware of the lower bound on comparison-based search algorithms. Learn about hashing.

5 Sequential Search Analysis
Sequential search algorithm performance: examine the worst case and the average case by counting the number of key comparisons. Unsuccessful search (search item not in list): makes n comparisons. Successful search: the best case makes one key comparison; the worst case makes n comparisons.
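The comparison counts above can be checked with a small sketch. The struct SearchResult and the function seqSearch are hypothetical names chosen for this illustration, not the textbook's code.

```cpp
#include <cstddef>
#include <vector>

// Result of a sequential search: index of the match (or -1 if absent)
// and the number of key comparisons performed.
struct SearchResult {
    int index;
    std::size_t comparisons;
};

// Hypothetical helper: scans the list front to back, counting comparisons.
SearchResult seqSearch(const std::vector<int>& list, int item) {
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i < list.size(); ++i) {
        ++comparisons;                        // one key comparison per element
        if (list[i] == item)
            return {static_cast<int>(i), comparisons};
    }
    return {-1, comparisons};                 // unsuccessful search: n comparisons
}
```

Searching for the first element costs one comparison; searching for the last element, or for an item that is not in the list, costs n.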

6 Sequential Search Analysis (cont’d.)
Determining the average number of comparisons (cont’d.)
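The formula from this slide did not survive in the transcript. A standard derivation, assuming a successful search in which the item is equally likely to be at any of the n positions (finding it at position i costs i comparisons), is:

```latex
\[
\text{average} \;=\; \frac{1 + 2 + \cdots + n}{n}
\;=\; \frac{1}{n}\cdot\frac{n(n+1)}{2}
\;=\; \frac{n+1}{2}
\]
```

Under that assumption, a successful sequential search examines about half the list on average.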

7 Binary Search
Performed only on ordered lists. Uses the divide-and-conquer technique.
Figures: FIGURE 9-1 List of length 12; FIGURE 9-2 Search list, list[0]...list[11]; FIGURE 9-3 Search list, list[6]...list[11]

8 Binary Search: Analysis
Worst-case complexity? At each level of the recursion, we split the array in half (divide by two). Therefore the maximum recursion depth is floor(log2 n) and the worst case is O(log2 n). The average case is also O(log2 n). Data Structures Using C++ 2E
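A minimal iterative sketch of the halving step, assuming a sorted std::vector<int>; the function name binarySearch is chosen for illustration and the textbook's version may differ in detail.

```cpp
#include <vector>

// Returns the index of item in the sorted vector list, or -1 if it is absent.
int binarySearch(const std::vector<int>& list, int item) {
    int first = 0;
    int last = static_cast<int>(list.size()) - 1;
    while (first <= last) {
        int mid = first + (last - first) / 2;  // midpoint without overflowing first + last
        if (list[mid] == item)
            return mid;
        if (list[mid] < item)
            first = mid + 1;                    // continue in the upper half
        else
            last = mid - 1;                     // continue in the lower half
    }
    return -1;
}
```

Each pass of the loop halves the remaining range, which is where the logarithmic bound comes from.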

9 Lower Bound on Comparison-Based Search Algorithms
Worst-case complexity

10 Hashing
Algorithm of order one (on average). Requires data to be specially organized. Hash table: helps organize data; stored in an array; denoted by HT. Hash function: arithmetic function denoted by h; applied to key X to compute h(X), read as "h of X"; h(X) gives the address of the item.

11 Hashing (cont’d.)
Synonym: distinct keys X1 and X2 are synonyms if h(X1) = h(X2). Overflow: occurs if bucket t is full. Collision: occurs if h(X1) = h(X2) for distinct keys X1 and X2. Overflow and collision occur at the same time if r = 1 (bucket size = one).

12 Hashing: two issues
Choosing a hash function. Main objectives: choose an easy-to-compute hash function; minimize the number of collisions. Handling overflow.

13 Collision Resolution
Desirable to minimize the number of collisions. Collisions are unavoidable in reality: a hash function always maps a larger domain onto a smaller range. Collision resolution technique categories: open addressing (closed hashing), where data is stored within the hash table; chaining (open hashing), where data is organized in linked lists and the hash table is an array of pointers to the linked lists.

14 Linear Probing
Starting at location t, search the array sequentially to find the next available slot. Assume a circular array: if the lower portion of the array is full, the search can continue in the top portion of the array using the mod operator. Starting at t, check array locations using the probe sequence t, (t + 1) % HTSize, (t + 2) % HTSize, . . ., (t + j) % HTSize.
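The probe sequence can be sketched as an insert routine. This is an illustration under stated assumptions, not the textbook's code: keys are non-negative ints, the hash function is h(X) = X % HTSize, and EMPTY and linearProbeInsert are hypothetical names.

```cpp
#include <vector>

const int EMPTY = -1;   // assumed sentinel for a free slot (keys are non-negative)

// Inserts key using the probe sequence t, (t + 1) % HTSize, (t + 2) % HTSize, ...
// Returns the slot used, or -1 if every slot is occupied.
int linearProbeInsert(std::vector<int>& HT, int key) {
    const int HTSize = static_cast<int>(HT.size());
    if (HTSize == 0)
        return -1;
    const int t = key % HTSize;               // assumed hash function h(X) = X % HTSize
    for (int j = 0; j < HTSize; ++j) {
        int pos = (t + j) % HTSize;           // wraps around: circular array
        if (HT[pos] == EMPTY) {
            HT[pos] = key;
            return pos;
        }
    }
    return -1;                                // table is full
}
```

In a table of size 5, inserting 3, 8, and 13 (all hashing to slot 3) places them in slots 3, 4, and 0: the third insert wraps around through the mod operator.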

15 Linear Probing (cont’d.)
Improving linear probing: skip array positions by a fixed constant (c) instead of one; random probing; re-hashing.

16 Quadratic Probing
Suppose an item with key X is hashed at t (h(X) = t and 0 <= t <= HTSize - 1) and position t is already occupied. Starting at position t, linearly search the array at locations (t + 1) % HTSize, (t + 2^2) % HTSize = (t + 4) % HTSize, (t + 3^2) % HTSize = (t + 9) % HTSize, . . ., (t + i^2) % HTSize. Probe sequence: t, (t + 1) % HTSize, (t + 2^2) % HTSize, (t + 3^2) % HTSize, . . ., (t + i^2) % HTSize.
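To make the probe sequence concrete, the sketch below lists its first few positions. The function name quadraticProbeSequence and its parameters are hypothetical, chosen only for this illustration.

```cpp
#include <vector>

// Returns the first `count` positions of the quadratic probe sequence
// t, (t + 1^2) % HTSize, (t + 2^2) % HTSize, ..., starting at home position t.
std::vector<int> quadraticProbeSequence(int t, int HTSize, int count) {
    std::vector<int> sequence;
    for (int i = 0; i < count; ++i)
        sequence.push_back((t + i * i) % HTSize);
    return sequence;
}
```

With t = 0 and HTSize = 7, the first seven probes are 0, 1, 4, 2, 2, 4, 1: the first four are distinct, after which positions repeat and slots 3, 5, and 6 are never reached.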

17 Quadratic Probing (cont’d.)
Reduces primary clustering. Does not probe all positions in the table. However, when the table size b is prime, the first b/2 probes, including the initial location h(k), all land on distinct locations; after that, probe locations may repeat. As a result, there is no guarantee of finding an empty cell once the table is more than half full. After a considerable number of unsuccessful probes, assume the table is full and stop the insertion (and search).

18 Quadratic Probing (cont’d.)
Primary clustering; secondary clustering

19 Linear open addressing (linear probing): search, insert and delete

20 Collision Resolution: Chaining (Open Hashing)
Hash table HT: array of pointers. For each j, where 0 <= j <= HTSize - 1, HT[j] is a pointer to a linked list. FIGURE 9-10 Linked hash table
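A minimal sketch of chaining follows. It swaps the slide's array of raw pointers for a std::vector of std::list, and assumes non-negative int keys with h(X) = X % HTSize; the class name ChainedHashTable is hypothetical.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Chained hash table: HT[j] holds the linked list of keys that hash to j.
class ChainedHashTable {
public:
    explicit ChainedHashTable(int size) : HT(size) {}

    void insert(int key) { HT[h(key)].push_front(key); }

    bool search(int key) const {
        const std::list<int>& chain = HT[h(key)];
        return std::find(chain.begin(), chain.end(), key) != chain.end();
    }

    void remove(int key) { HT[h(key)].remove(key); }

private:
    // Assumed hash function; valid only for key >= 0.
    int h(int key) const { return key % static_cast<int>(HT.size()); }

    std::vector<std::list<int>> HT;
};
```

Colliding keys simply share a chain, so insertion never fails because of overflow; the cost is the extra link stored with every item.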

21 Collision Resolutions
Advantages of chaining in comparison with quadratic probing. Disadvantage of chaining: if the item size is small, space is wasted.

22 Selection Sort: Array-Based Lists
Selection sort operation: find the location of the smallest element in the unsorted portion of the list and move it to the top of the unsorted portion. First time: locate the smallest item in the entire list. Second time: locate the smallest item in the list starting from the second element, and so on.
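The operation described above can be sketched as follows. The helper name minLocation follows the function named later in the deck's analysis slide; the signatures here are assumptions, and std::swap stands in for the deck's swap function.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Index of the smallest element in list[first..last] (inclusive bounds).
std::size_t minLocation(const std::vector<int>& list,
                        std::size_t first, std::size_t last) {
    std::size_t minIndex = first;
    for (std::size_t loc = first + 1; loc <= last; ++loc)
        if (list[loc] < list[minIndex])
            minIndex = loc;
    return minIndex;
}

// Repeatedly moves the smallest unsorted element to the top of the unsorted part.
void selectionSort(std::vector<int>& list) {
    if (list.size() < 2)
        return;
    for (std::size_t loc = 0; loc + 1 < list.size(); ++loc) {
        std::size_t minIndex = minLocation(list, loc, list.size() - 1);
        std::swap(list[loc], list[minIndex]);
    }
}
```

The outer loop runs n-1 times and performs one swap per pass, while minLocation scans the whole unsorted portion each time.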

23 FIGURE 10-1 List of 8 elements
FIGURE 10-2 Elements of list during the first iteration; FIGURE 10-3 Elements of list during the second iteration

24 Analysis: Selection Sort
Search algorithms are concerned with the number of key (item) comparisons. Sorting algorithms are concerned with the number of key comparisons and the number of data movements. Analysis of selection sort: function swap makes 3(n-1) item assignments; function minLocation makes O(n^2) key comparisons.

25 Insertion Sort
Attempts to improve on the high number of key comparisons in selection sort. Sorts the list by moving each element to its proper place. Given a list of length eight: FIGURE 10-4 list

26 Insertion Sort: Insert
Three strategies to find the proper place: search from the rear (using arrays, as in the book); search from the front (using linked lists, as in the book); binary search.

27 Insertion Sort: Array-Based Lists
Elements list[0], list[1], list[2], list[3] are in order. Consider element list[4], the first element of the unsorted list. FIGURE 10-5 list elements while moving list[4] to its proper place

28 Insertion Sort: Linked List-Based Lists
If the list is stored in an array: traverse the list in either direction using an index variable. If the list is stored in a linked list: traverse the list in only one direction, starting at the first node, because the links go in only one direction. FIGURE Linked list

29 Insertion Sort: Best case and worst case
Best case: sorted array. Search from rear: (n-1) comparisons and 0 data movements. Worst case: reversely sorted array. Search from rear: (n-1) + (n-2) + (n-3) + … + 1 = n(n-1)/2 comparisons and movements.
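The best-case and worst-case counts can be reproduced with an instrumented array-based insertion sort that searches from the rear. The struct SortCounts is a hypothetical helper, and "movements" here counts only the shifts of list elements (not the copies to and from the temporary), which is an assumption about how the slide counts them.

```cpp
#include <cstddef>
#include <vector>

// Tallies of key comparisons and element shifts made while sorting.
struct SortCounts {
    std::size_t comparisons = 0;
    std::size_t movements = 0;
};

// Insertion sort that searches for the proper place from the rear
// of the sorted portion, counting comparisons and shifts.
SortCounts insertionSort(std::vector<int>& list) {
    SortCounts counts;
    for (std::size_t firstOutOfOrder = 1; firstOutOfOrder < list.size();
         ++firstOutOfOrder) {
        int temp = list[firstOutOfOrder];
        std::size_t location = firstOutOfOrder;
        while (location > 0) {
            ++counts.comparisons;
            if (list[location - 1] <= temp)
                break;                               // proper place found
            list[location] = list[location - 1];     // shift one slot to the right
            ++counts.movements;
            --location;
        }
        list[location] = temp;
    }
    return counts;
}
```

For n = 5, a sorted input gives 4 comparisons and 0 shifts, while a reversed input gives 4 + 3 + 2 + 1 = 10 of each.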

30 Shellsort
Takes advantage of the best case of insertion sort. Uses global jumps to bring the input quickly to an almost-sorted state. Jumps are controlled by step sizes, e.g. 30, 13, 5, 3, 1. The final step is an insertion sort, applied when the input is almost sorted. Data Structures Using C++ 2E
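A minimal sketch of the idea: a gapped insertion sort run once per step size. The caller supplies the step sizes, and the sketch assumes the last one is 1 so that the final pass is a plain insertion sort; the function name shellSort and its signature are illustrative.

```cpp
#include <cstddef>
#include <vector>

// Shellsort: for each gap, insertion-sort the elements that are `gap` apart.
// Assumes gaps are listed in decreasing order and end with 1.
void shellSort(std::vector<int>& list, const std::vector<std::size_t>& gaps) {
    for (std::size_t gap : gaps) {
        if (gap == 0)
            continue;                          // ignore a meaningless step size
        for (std::size_t i = gap; i < list.size(); ++i) {
            int temp = list[i];
            std::size_t j = i;
            while (j >= gap && list[j - gap] > temp) {
                list[j] = list[j - gap];       // jump `gap` positions at a time
                j -= gap;
            }
            list[j] = temp;
        }
    }
}
```

Early passes with large gaps move elements long distances cheaply, so the last pass with gap 1 runs close to insertion sort's best case.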

31 Shellsort (cont’d.) FIGURE 10-19 Lists during Shellsort

32 Quicksort: Array-Based Lists

33 QuickSort: implementation issues
How to choose the pivot element each time? After the pivot is decided, how to move the elements so that the array is separated into two sub-arrays? Time complexity: what are the determining factors that produce the best and worst performance?

34 Quicksort: choose the Pivot
Determine the pivot; there are many different approaches: the first element of the current sub-array; the last element; the middle element (textbook version); the median of the first, middle, and last elements; a randomly chosen element. In this textbook, the middle element is chosen as the pivot and then swapped with the first element.

35 Quicksort: element movements in the Partition procedure
Again, many different solutions. Commonality: maintain three array segments: the elements smaller than the pivot, the elements bigger than the pivot, and the elements still to be explored. Difference: how these three segments are maintained. In this textbook, the areas are kept as: pivot | smaller elements | bigger | unexplored
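The layout pivot | smaller | bigger | unexplored can be sketched as below, with the middle element chosen as the pivot and swapped to the front. This is written from the description on these slides rather than copied from the textbook; partitionList and quickSort are illustrative names, and elements equal to the pivot land in the "bigger" segment.

```cpp
#include <utility>
#include <vector>

// Partitions list[first..last] around the middle element and returns the
// pivot's final index. During the loop the layout is:
//   pivot | smaller elements | bigger elements | unexplored
int partitionList(std::vector<int>& list, int first, int last) {
    std::swap(list[first], list[(first + last) / 2]);   // move pivot to the front
    int pivot = list[first];
    int smallIndex = first;                              // end of the "smaller" segment
    for (int index = first + 1; index <= last; ++index) {
        if (list[index] < pivot) {
            ++smallIndex;
            std::swap(list[smallIndex], list[index]);
        }
    }
    std::swap(list[first], list[smallIndex]);            // put pivot between the segments
    return smallIndex;
}

void quickSort(std::vector<int>& list, int first, int last) {
    if (first < last) {
        int pivotLocation = partitionList(list, first, last);
        quickSort(list, first, pivotLocation - 1);
        quickSort(list, pivotLocation + 1, last);
    }
}
```

The partition pass touches every element of the sub-array once, so it is linear in the sub-array's length.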

36 QuickSort The divide, conquer and combine steps for QuickSort.
What’s the complexity of the “Partition” routine? What’re the worst-case and best-case complexities, and when do they happen (depending on the choice of pivot)? What’s the average-case complexity?

37 MergeSort (recursive version)
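The slide's content is missing from the transcript. A minimal array-based sketch of recursive merge sort is given here for review purposes; mergeHalves and mergeSort are illustrative names, and the caller is assumed to pass a non-empty range.

```cpp
#include <cstddef>
#include <vector>

// Merges the sorted runs list[first..mid] and list[mid+1..last] (inclusive).
void mergeHalves(std::vector<int>& list, std::size_t first,
                 std::size_t mid, std::size_t last) {
    std::vector<int> merged;
    merged.reserve(last - first + 1);
    std::size_t i = first, j = mid + 1;
    while (i <= mid && j <= last)
        merged.push_back(list[i] <= list[j] ? list[i++] : list[j++]);
    while (i <= mid)
        merged.push_back(list[i++]);           // leftovers from the left run
    while (j <= last)
        merged.push_back(list[j++]);           // leftovers from the right run
    for (std::size_t k = 0; k < merged.size(); ++k)
        list[first + k] = merged[k];
}

// Divide: split in half. Conquer: sort each half. Combine: merge.
// Assumes first <= last are valid indices of a non-empty list.
void mergeSort(std::vector<int>& list, std::size_t first, std::size_t last) {
    if (first >= last)
        return;
    std::size_t mid = first + (last - first) / 2;
    mergeSort(list, first, mid);
    mergeSort(list, mid + 1, last);
    mergeHalves(list, first, mid, last);
}
```

Unlike quicksort, the split point does not depend on the data, so the recursion is always balanced.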

38 Complexity of MergeSort
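The derivation on this slide is not in the transcript. The standard argument, assuming n is a power of two and that merging two halves of total length n costs cn for some constant c, is:

```latex
\[
T(n) = 2\,T\!\left(\tfrac{n}{2}\right) + cn, \qquad T(1) = c
\]
\[
T(n) = 2^{k}\,T\!\left(\tfrac{n}{2^{k}}\right) + k\,cn
\;\;\xrightarrow{\;k = \log_2 n\;}\;\;
T(n) = cn + cn\log_2 n = O(n \log n)
\]
```

The bound holds in the best, average, and worst cases, because the split is always in the middle.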

39 MergeSort, Cont’d
Recursive version: using arrays; using linked lists. Iterative version (basic idea).

40 Heap
Complete binary tree + heap property. Routines: Heapify O(lg n); Build-Heap O(n); Heap-Sort O(n lg n). Implementing a priority queue: Maximum O(1); Extract-Max O(lg n); Insert O(lg n); Increase-Key O(lg n).
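The three routines can be sketched for a max-heap stored in a 0-based array, where the children of index i are 2i + 1 and 2i + 2. The function names heapify, buildHeap, and heapSort mirror the routine names on the slide; the signatures are assumptions.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Restores the max-heap property at index i within a[0..heapSize-1],
// assuming both subtrees of i are already heaps. O(lg n).
void heapify(std::vector<int>& a, std::size_t heapSize, std::size_t i) {
    while (true) {
        std::size_t left = 2 * i + 1, right = 2 * i + 2, largest = i;
        if (left < heapSize && a[left] > a[largest]) largest = left;
        if (right < heapSize && a[right] > a[largest]) largest = right;
        if (largest == i)
            return;
        std::swap(a[i], a[largest]);
        i = largest;                           // continue sifting down
    }
}

// Heapifies every internal node from the last one up to the root. O(n).
void buildHeap(std::vector<int>& a) {
    for (std::size_t i = a.size() / 2; i-- > 0; )
        heapify(a, a.size(), i);
}

// Repeatedly moves the maximum to the end and shrinks the heap. O(n lg n).
void heapSort(std::vector<int>& a) {
    buildHeap(a);
    for (std::size_t end = a.size(); end > 1; --end) {
        std::swap(a[0], a[end - 1]);
        heapify(a, end - 1, 0);
    }
}
```

Extract-Max for a priority queue is the body of the heapSort loop applied once: swap the root with the last element, shrink the heap, and heapify the root.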

41 Comparison-based sort

42 Decision tree model

43 Lower bound of comparison-based sort
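The slide body is missing from the transcript. The usual decision-tree argument runs as follows: a comparison-based sort of n distinct keys must distinguish all n! orderings, so its decision tree has at least n! leaves, and a binary tree of height h has at most 2^h leaves.

```latex
\[
2^{h} \ge n! \quad\Longrightarrow\quad h \ge \log_2 (n!)
\]
\[
\log_2 (n!) \;\ge\; \log_2\!\left(\left(\tfrac{n}{2}\right)^{n/2}\right)
\;=\; \tfrac{n}{2}\,\log_2 \tfrac{n}{2}
\;=\; \Omega(n \log n)
\]
```

Here h is the worst-case number of comparisons, so no comparison-based sort can beat n log n in the worst case by more than a constant factor.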

44 Graph Theory: Objectives
Learn about graphs. Become familiar with the basic terminology of graph theory. Discover how to represent graphs in computer memory.

45 Graph Theory: Objectives (cont’d.)
Examine and implement various graph traversal algorithms. Learn how to implement a shortest path algorithm. Examine and implement the minimum spanning tree algorithm.

46 Graph Definitions and Notations
A graph G is a pair G = (V, E), where V is a finite nonempty set, called the set of vertices of G, and E ⊆ V × V is the set of edges of G. G is called trivial if it has only one vertex. Directed graph (digraph): the elements in the set of edges of G are ordered pairs. Undirected graph: the pairs are not ordered.

47 FIGURE 12-3 Various undirected graphs
FIGURE 12-4 Various directed graphs

48 Graph Definitions and Notations (cont’d.)
Undirected graph: edges drawn using lines. Directed graph: edges drawn using arrows. u and v adjacent, if… Definition of loop: an edge incident on a single vertex. e1 and e2 called parallel edges, if… Simple graph: no loops, no parallel edges.

49 Graph Definitions and Notations (cont’d.)
Undirected graph G is connected if a path exists from any vertex to any other vertex. Component of G: a maximal subset of connected vertices. Directed graph G is strongly connected if any two vertices in G are connected.

50 Graph Representation
Graphs are represented in computer memory in two common ways: adjacency matrices and adjacency lists.

51 Adjacency Matrices
Let G be a graph with n vertices, where n > zero. Let V(G) = {v1, v2, ..., vn}. Adjacency matrix
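The matrix definition itself did not survive in the transcript. The standard definition, using the symbol A_G as an assumed name for the n x n adjacency matrix of G, is:

```latex
\[
A_G(i, j) =
\begin{cases}
1 & \text{if } (v_i, v_j) \in E(G) \\
0 & \text{otherwise}
\end{cases}
\qquad 1 \le i, j \le n
\]
```

For an undirected graph the matrix is symmetric, since (v_i, v_j) and (v_j, v_i) denote the same edge.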

52 Adjacency Lists (cont’d.)
FIGURE 12-6 Adjacency list of graph G3 of Figure 12-4; FIGURE 12-5 Adjacency list of graph G2 of Figure 12-4

53 Operations on Graphs
Commonly performed operations: create graph (store the graph in computer memory using a particular graph representation); clear graph (makes the graph empty); determine if the graph is empty; traverse graph; print graph.

54 Graph Traversals vs. Binary tree traversals
Two most common graph traversal algorithms: depth first traversal in graphs corresponds to pre-order traversal in a binary tree (parent, left sub-tree, right sub-tree); breadth first traversal in graphs corresponds to level-order traversal in a binary tree (parent, children, grandchildren, ...).

55 Depth First Traversal Similar to binary tree preorder traversal
FIGURE 12-7 Directed graph G3

56 Depth First Traversal (cont’d.)
General algorithm for depth first traversal at a given node v. Recursive algorithm.
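The algorithm itself is missing from the transcript. A minimal recursive sketch, assuming the graph is stored as adjacency lists in a std::vector<std::vector<int>> with vertices numbered from 0 (the alias Graph and the function names dft and depthFirstTraversal are illustrative):

```cpp
#include <vector>

using Graph = std::vector<std::vector<int>>;   // g[v] lists the vertices adjacent to v

// Depth first traversal at a given node v: visit v, then recurse into
// each unvisited adjacent vertex.
void dft(const Graph& g, int v, std::vector<bool>& visited, std::vector<int>& order) {
    visited[v] = true;
    order.push_back(v);
    for (int w : g[v])
        if (!visited[w])
            dft(g, w, visited, order);
}

// Traverses the whole graph, restarting from each vertex not yet visited.
std::vector<int> depthFirstTraversal(const Graph& g) {
    std::vector<bool> visited(g.size(), false);
    std::vector<int> order;
    for (int v = 0; v < static_cast<int>(g.size()); ++v)
        if (!visited[v])
            dft(g, v, visited, order);
    return order;
}
```

The visited array is what separates this from a tree preorder traversal: a graph can reach the same vertex along several paths.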

57 Breadth First Traversal
Similar to traversing a binary tree level by level (called level-order traversal). Nodes at each level are visited from left to right. All nodes at any level i are visited before visiting nodes at level i + one.

58 Breadth First Traversal (cont’d.)
General search algorithm Breadth first search algorithm with a queue
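A minimal queue-based sketch, assuming the same adjacency-list representation with vertices numbered from 0; the alias Graph and the function name breadthFirstTraversal are illustrative, and only vertices reachable from start are visited.

```cpp
#include <queue>
#include <vector>

using Graph = std::vector<std::vector<int>>;   // g[v] lists the vertices adjacent to v

// Breadth first traversal from start: vertices are visited in order of
// their distance (in edges) from start.
std::vector<int> breadthFirstTraversal(const Graph& g, int start) {
    std::vector<bool> visited(g.size(), false);
    std::vector<int> order;
    std::queue<int> pending;
    visited[start] = true;
    pending.push(start);
    while (!pending.empty()) {
        int u = pending.front();
        pending.pop();
        order.push_back(u);
        for (int w : g[u]) {
            if (!visited[w]) {
                visited[w] = true;             // mark on enqueue to avoid duplicates
                pending.push(w);
            }
        }
    }
    return order;
}
```

Replacing the queue with a stack turns the same general search skeleton into a depth first search.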

59 Connected components (CC) and spanning trees (ST)
Concept of CC and ST. DFT and BFT can be used to retrieve connected components. DFT and BFT can be used to generate spanning trees; what do the resulting STs look like?

60 Minimum spanning tree Prim’s algorithm
Keep the set T as a single tree; grow it into an MST. At each step, find the shortest edge connecting T and non-T and include it in T. Understand the procedure.
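The procedure can be sketched with an O(n^2) adjacency-matrix version. The assumptions here are mine: the graph is undirected with positive integer weights, weights[u][v] == 0 means "no edge", and the function primMSTWeight (an illustrative name) returns only the total weight of the tree.

```cpp
#include <limits>
#include <vector>

// Prim's algorithm: grows a single tree T from source, each step adding the
// cheapest edge that connects T to a vertex outside T.
// Returns the total weight of the spanning tree of source's component.
int primMSTWeight(const std::vector<std::vector<int>>& weights, int source) {
    const int n = static_cast<int>(weights.size());
    if (n == 0)
        return 0;
    const int INF = std::numeric_limits<int>::max();
    std::vector<bool> inT(n, false);
    std::vector<int> best(n, INF);     // cheapest known edge from T to each vertex
    best[source] = 0;
    int total = 0;
    for (int step = 0; step < n; ++step) {
        int u = -1;
        for (int v = 0; v < n; ++v)    // pick the non-T vertex closest to T
            if (!inT[v] && (u == -1 || best[v] < best[u]))
                u = v;
        if (best[u] == INF)
            break;                     // remaining vertices are unreachable
        inT[u] = true;
        total += best[u];
        for (int v = 0; v < n; ++v)    // update edges crossing from T to non-T
            if (!inT[v] && weights[u][v] != 0 && weights[u][v] < best[v])
                best[v] = weights[u][v];
    }
    return total;
}
```

Note that best[v] stores the weight of a single edge into T, not a path length from the source; that is the key difference from Dijkstra's algorithm.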

61 Shortest path problem
Single-pair shortest path. Single-source shortest paths: Dijkstra’s algorithm. All-pairs shortest paths. Impact of negative edges.

62 Dijkstra’s algorithm What is the input constraint for Dijkstra’s algorithm? Why is it necessary? Understand the procedure Shortest distances vs. shortest paths (project 6)
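A minimal priority-queue sketch for review, assuming non-negative integer edge weights (the input constraint the slide asks about) and adjacency lists of (neighbor, weight) pairs. The names WeightedGraph, dijkstra, and pred are illustrative, and this is not the course's project code.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// g[u] holds (neighbor, weight) pairs; all weights are assumed to be >= 0.
using WeightedGraph = std::vector<std::vector<std::pair<int, int>>>;

// Returns the shortest distance from source to every vertex
// (numeric_limits<int>::max() if unreachable). pred[v] receives the vertex
// before v on a shortest path, or -1, so paths can be rebuilt by walking back.
std::vector<int> dijkstra(const WeightedGraph& g, int source, std::vector<int>& pred) {
    const int INF = std::numeric_limits<int>::max();
    std::vector<int> dist(g.size(), INF);
    pred.assign(g.size(), -1);
    using Entry = std::pair<int, int>;                   // (distance, vertex)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pending;
    dist[source] = 0;
    pending.push({0, source});
    while (!pending.empty()) {
        Entry top = pending.top();
        pending.pop();
        int d = top.first, u = top.second;
        if (d > dist[u])
            continue;                                    // stale queue entry
        for (const auto& edge : g[u]) {
            int v = edge.first, w = edge.second;
            if (d + w < dist[v]) {                       // relax edge (u, v)
                dist[v] = d + w;
                pred[v] = u;
                pending.push({dist[v], v});
            }
        }
    }
    return dist;
}
```

The dist vector answers the "shortest distances" question, while following pred back from a target to the source recovers an actual shortest path. With a negative edge, a vertex already removed from the queue could later be reachable by a cheaper route, which is why the non-negativity constraint is needed.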

63 About the exam
Coverage: after midterm + heap operations. Time: Thursday, next week, 8-10am. Preparation: focus on lecture notes, projects, and homework. Any questions?

