More on Data Structures in C (CS-2301, B-term 2008): More on Lists and Trees; Introduction to Hash Tables. CS-2301, System Programming for Non-majors

1 More on Data Structures in C (CS-2301, B-term 2008)
More on Lists and Trees
Introduction to Hash Tables
CS-2301, System Programming for Non-majors
(Slides include materials from The C Programming Language, 2nd ed., by Kernighan and Ritchie, and from C: How to Program, 5th ed., by Deitel and Deitel)

3 Linked List (review)
- Linear data structure
- Easy to grow and shrink
- Easy to add and delete items
- Time to search for an item: O(n)
- “Big-O” notation means “order of”

4 Definition: Big-O
- “Of the order of …”
- A characterization of the number of operations in an algorithm in terms of the number of data items involved
- O(n) means that the number of operations to complete the algorithm is proportional to n
- E.g., searching a list with n items requires, on average, about n/2 comparisons with payloads when the item is present, which is still proportional to n
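To make the n/2 figure concrete, here is a minimal sketch that counts payload comparisons during a linear search. The names listItem, search, and averageComparisons and the int payload are assumptions for illustration; the slides do not define them.

```c
#include <stddef.h>

/* Hypothetical list node; the slides only name "payload" and "next". */
struct listItem {
    int payload;
    struct listItem *next;
};

/* Linear search: returns the matching item, or NULL if key is absent.
 * *comparisons receives the number of payload comparisons made. */
struct listItem *search(struct listItem *head, int key, size_t *comparisons)
{
    size_t count = 0;
    struct listItem *p;

    for (p = head; p != NULL; p = p->next) {
        count++;
        if (p->payload == key)
            break;
    }
    *comparisons = count;
    return p;
}

/* Links items[0..n-1] into a list holding 0, 1, ..., n-1, searches for
 * every key once, and returns the average number of comparisons:
 * (1 + 2 + ... + n) / n = (n + 1) / 2. */
double averageComparisons(struct listItem *items, size_t n)
{
    size_t total = 0, c, i;

    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++) {
        items[i].payload = (int)i;
        items[i].next = (i + 1 < n) ? &items[i + 1] : NULL;
    }
    for (i = 0; i < n; i++) {
        search(&items[0], (int)i, &c);
        total += c;
    }
    return (double)total / (double)n;
}
```

For a 100-item list the average is 50.5 comparisons, and a search for a missing key makes all 100. Both grow in proportion to n, which is what O(n) expresses.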

5 Big-O (continued)
- O(n): proportional to n, i.e., linear
- O(n^2): proportional to n^2, i.e., quadratic
- O(k^n): proportional to k^n, i.e., exponential
- …
- O(log n): proportional to log n, i.e., sublinear
- O(n log n): worse than O(n), better than O(n^2)
- O(1): independent of n, i.e., constant

6 Anecdote & Questions
- In the design of electronic adders, what is the order of the carry-propagation?
- What is the order of floating point divide?
- What is the order of floating point square root?
- What program have we studied in this course that is O(2^n), i.e., exponential?

7 Questions on Big-O?

8 Back to Linked List Review
- Linear data structure
- Easy to grow and shrink
- Easy to add and delete items
- Time to search for an item: O(n)

9 Linked List (continued)
    struct listItem *head;
[Diagram: head points to a chain of four list items, each with a payload field and a next pointer]

10 Doubly-Linked List (review)
    struct listItem *head, *tail;
[Diagram: a chain of four list items, each with prev and next pointers and a payload, with head and tail referring to the ends of the chain]

11 AddAfter(item *p, item *new)
Simple linked list:
    { new->next = p->next;
      p->next = new;
    }
Doubly-linked list:
    { new->next = p->next;
      new->prev = p->next->prev;
      p->next = p->next->prev = new;
    }

12 AddAfter(item *p, item *new)
Simple linked list:
    { new->next = p->next;
      p->next = new;
    }
Doubly-linked list:
    { new->next = p->next;
      new->prev = p;
      if (p->next != NULL)      /* p may be the last item */
          p->next->prev = new;
      p->next = new;
    }
[Diagram: three doubly-linked items, each with prev, next, and payload]
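The doubly-linked AddAfter can be written out as a compilable function. This is a sketch under assumptions: the item typedef with an int payload is hypothetical, and the parameter is renamed to newItem (the slide calls it new) so that the code also compiles as C++.

```c
#include <stddef.h>

/* Hypothetical node type; field names follow the slide diagrams. */
typedef struct item {
    int payload;
    struct item *prev;
    struct item *next;
} item;

/* Insert newItem immediately after p in a doubly-linked list. */
void AddAfter(item *p, item *newItem)
{
    newItem->next = p->next;
    newItem->prev = p;
    if (p->next != NULL)          /* p may be the tail: no successor to fix */
        p->next->prev = newItem;
    p->next = newItem;
}
```

If p was the tail, the caller still has to point its own tail variable at newItem; this function only touches the items themselves.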

16 deleteNext(item *p)
Simple linked list:
    { if (p->next != NULL)
          p->next = p->next->next;
    }
Doubly-linked list:
- Complicated
- Easier to deleteItem

17 deleteItem(item *p)
Simple linked list:
- Not possible without having a pointer to previous item!
Doubly-linked list:
    { if (p->next != NULL)
          p->next->prev = p->prev;
      if (p->prev != NULL)
          p->prev->next = p->next;
    }
[Diagram: three doubly-linked items, each with prev, next, and payload]
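The slide's deleteItem fixes the neighbours but leaves head and tail alone. One way to handle the end cases is sketched below; the list header struct and the extra parameter are assumptions added for illustration, not part of the slides.

```c
#include <stddef.h>

typedef struct item {
    int payload;
    struct item *prev;
    struct item *next;
} item;

/* Hypothetical list header holding the head and tail pointers. */
typedef struct {
    item *head;
    item *tail;
} list;

/* Unlink p from the list. The item itself is not freed; the caller
 * decides whether to free it or reuse it. */
void deleteItem(list *l, item *p)
{
    if (p->next != NULL)
        p->next->prev = p->prev;
    else
        l->tail = p->prev;        /* p was the last item */

    if (p->prev != NULL)
        p->prev->next = p->next;
    else
        l->head = p->next;        /* p was the first item */

    p->prev = p->next = NULL;     /* leave no stale links in p */
}
```

Deleting the only item leaves both head and tail NULL, so an empty list needs no special representation.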

20 Special Cases of Linked Lists
Queue:
- Items always added to tail
- Items always removed from head
Stack:
- Items always added to head
- Items always removed from head
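Both special cases fit in a few lines on top of a singly linked list. The function names push, pop, enqueue, and dequeue and the queue struct below are assumptions for illustration; the slides only describe where items are added and removed.

```c
#include <stddef.h>

typedef struct item {
    int payload;
    struct item *next;
} item;

/* Stack: items are added to and removed from the head. */
void push(item **head, item *p)
{
    p->next = *head;
    *head = p;
}

item *pop(item **head)
{
    item *p = *head;
    if (p != NULL)
        *head = p->next;
    return p;                 /* NULL if the stack is empty */
}

/* Queue: items are added at the tail and removed from the head. */
typedef struct {
    item *head;
    item *tail;
} queue;

void enqueue(queue *q, item *p)
{
    p->next = NULL;
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;          /* queue was empty */
    q->tail = p;
}

item *dequeue(queue *q)
{
    item *p = q->head;
    if (p != NULL) {
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;   /* queue is now empty */
    }
    return p;                 /* NULL if the queue was empty */
}
```

Every operation here is O(1) because the queue keeps a tail pointer; without it, enqueue would have to walk the whole list.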

21 Bubble Sort a Linked List
    item *BubbleSort(item *p) {
        if (p->next != NULL) {
            item *q = p->next, *qq = p;
            for (; q != NULL; qq = q, q = q->next)
                if (p->payload > q->payload) {
                    /* swap p and q */
                }
            p->next = BubbleSort(p->next);
        }
        return p;
    }

22 Bubble Sort a Linked List
    item *BubbleSort(item *p) {
        if (p->next != NULL) {
            item *q = p->next, *qq = p;
            for (; q != NULL; qq = q, q = q->next)
                if (p->payload > q->payload) {
                    item *temp = p->next;
                    p->next = q->next;
                    q->next = temp;
                    qq->next = p;
                    p = q;
                }
            p->next = BubbleSort(p->next);
        }
        return p;
    }

23 Potential Exam Questions
- Analyze BubbleSort to determine if it is correct, and fix it if incorrect.
  - Hint: you need to define “correct”
  - Hint 2: you need to define a loop invariant to convince yourself
- Draw a diagram showing the nodes, pointers, and actions of the algorithm
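For comparison while working the exercise, here is one possible bubble sort for a linked list. It is a sketch under assumptions and not necessarily the repair the instructor had in mind: it swaps payloads instead of relinking nodes, it is iterative rather than recursive, and the item type with an int payload is hypothetical.

```c
#include <stddef.h>

typedef struct item {
    int payload;
    struct item *next;
} item;

/* Bubble sort by swapping payloads rather than relinking nodes.
 * Loop invariant: the items from "sorted" to the end of the list hold
 * the largest payloads, in ascending order. */
item *BubbleSort(item *head)
{
    item *sorted = NULL;      /* first item of the already-sorted tail */
    item *p;

    if (head == NULL)
        return NULL;

    while (head->next != sorted) {
        /* One pass: bubble the largest unsorted payload to the right. */
        for (p = head; p->next != sorted; p = p->next) {
            if (p->payload > p->next->payload) {
                int temp = p->payload;
                p->payload = p->next->payload;
                p->next->payload = temp;
            }
        }
        sorted = p;           /* p now holds the largest unsorted payload */
    }
    return head;
}
```

Here "correct" means that on return the list contains the same payloads as before, in ascending order. Because only payloads move, the head pointer never changes, which sidesteps the pointer bookkeeping the exam question asks you to examine.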

24 Observations
- What is the order of the Bubble Sort algorithm?
  - Answer: O(n^2)
- Note that Quicksort is faster
  - Pages 87 & 110 in Kernighan and Ritchie
  - Potential exam question: why?

25 Questions?

26 Binary Tree (review)
- A linked list but with two links per item
    struct treeItem {
        type payload;
        struct treeItem *left;
        struct treeItem *right;
    };
[Diagram: a binary tree of seven items, each with left and right pointers and a payload]

27 Binary Trees (continued)
- Two-dimensional data structure
- Easy to grow and shrink
- Easy to add and delete items at leaves
- More work needed to insert or delete branch nodes
- Search time is O(log n)
  - If tree is reasonably balanced
  - Degenerates to O(n) in worst case if unbalanced
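The difference between the balanced and the degenerate case can be seen with a small binary search tree. This sketch assumes an int payload ordered with < and uses the hypothetical names insert and find; it does no rebalancing, so the shape of the tree depends entirely on the insertion order.

```c
#include <stddef.h>
#include <stdlib.h>

struct treeItem {
    int payload;
    struct treeItem *left;
    struct treeItem *right;
};

/* Insert value into the tree rooted at *root; duplicates are ignored.
 * Returns the item holding value, or NULL if malloc fails. */
struct treeItem *insert(struct treeItem **root, int value)
{
    while (*root != NULL) {
        if (value == (*root)->payload)
            return *root;
        root = (value < (*root)->payload) ? &(*root)->left
                                          : &(*root)->right;
    }
    *root = malloc(sizeof **root);
    if (*root != NULL) {
        (*root)->payload = value;
        (*root)->left = (*root)->right = NULL;
    }
    return *root;
}

/* Search for value; *steps receives the number of items visited. */
struct treeItem *find(struct treeItem *t, int value, int *steps)
{
    *steps = 0;
    while (t != NULL) {
        (*steps)++;
        if (value == t->payload)
            return t;
        t = (value < t->payload) ? t->left : t->right;
    }
    return NULL;
}
```

Inserting 4, 2, 6, 1, 3, 5, 7 gives a balanced tree in which finding 7 visits 3 items. Inserting 1 through 7 in ascending order gives a chain of right links, and finding 7 visits all 7 items, the O(n) worst case.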

29 Order of Traversing Binary Trees
In-order:
- Traverse left sub-tree (in-order)
- Visit node itself
- Traverse right sub-tree (in-order)
Pre-order:
- Visit node first
- Traverse left sub-tree
- Traverse right sub-tree
Post-order:
- Traverse left sub-tree
- Traverse right sub-tree
- Visit node last
Homework #5
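The three orders differ only in where the visit happens relative to the two recursive calls. In this sketch, "visiting" a node appends its one-character payload to a buffer; the char payload and the function names are assumptions for illustration.

```c
#include <stddef.h>

struct treeItem {
    char payload;
    struct treeItem *left;
    struct treeItem *right;
};

/* Each function appends the visited payloads to out and returns the
 * position just past the last character written. The caller supplies a
 * buffer with room for every node plus a terminating '\0'. */
char *inOrder(const struct treeItem *t, char *out)
{
    if (t == NULL)
        return out;
    out = inOrder(t->left, out);
    *out++ = t->payload;            /* visit between the sub-trees */
    return inOrder(t->right, out);
}

char *preOrder(const struct treeItem *t, char *out)
{
    if (t == NULL)
        return out;
    *out++ = t->payload;            /* visit first */
    out = preOrder(t->left, out);
    return preOrder(t->right, out);
}

char *postOrder(const struct treeItem *t, char *out)
{
    if (t == NULL)
        return out;
    out = postOrder(t->left, out);
    out = postOrder(t->right, out);
    *out++ = t->payload;            /* visit last */
    return out;
}
```

For a tree whose root is '*' with left child '+' (children 'a' and 'b') and right child 'c', in-order yields a+b*c, pre-order yields *+abc, and post-order yields ab+c*. Terminate the buffer yourself, e.g. `*inOrder(root, buf) = '\0';`.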

30 Example of Binary Tree
    x = (a.real*b.imag - b.real*a.imag) / sqrt(a.real*b.real - a.imag*b.imag)
[Diagram: the expression drawn as a binary tree, with operator nodes =, /, sqrt, -, * and . and leaves such as x, a, b, real, and imag; part of the tree is elided with "…"]

31 Question
- What kind of traversal order is required for this expression?
  - In-order?
  - Pre-order?
  - Post-order?

32 Binary Trees in Compilers
- Used to represent the structure of the compiled program
- Optimizations
  - Common sub-expression detection
  - Code simplification
  - Loop unrolling
  - Parallelization
  - Reductions in strength, e.g., substituting additions for multiplications
  - Many others

33 Questions about Trees? (or about Homework 5?)

34 New Challenge
- What if we have a data structure that needs to be accessed by value in constant time?
  - I.e., O(log n) is not good enough!
- Need to be able to add or delete items
- Total number of items unknown
  - But an approximate maximum might be known

35 Examples
- Anti-virus scanner
- Symbol table of compiler
- Virtual memory tables in operating system
- Bank account for an individual

36 Observation
- Arrays provide constant time access …
- … but you have to know which element you want!
- Also
  - Not easy to grow or shrink
  - Not open-ended
- Can we do better?

37 Answer: Hash Table
Definition: Hash Table
- A data structure comprising
  - An array (for constant time access)
  - A set of linked lists (one for each array element)
  - A hashing function to convert value to array index
Definition: Hashing function (or simply hash function)
- A function that takes the value in question and “randomizes” it to produce an index
- So that non-randomness of values does not cause concentration of too many elements around a few indices in array
- See §6.6 in Kernighan & Ritchie

38 Hash Table Structure
[Diagram: an array of items, each heading a linked list of nodes with data and next fields]

39 Guidelines for Hash Tables
- Lists from each item should be short
  - I.e., with short search time (approximately constant)
- Size of array should be based on expected # of entries
  - Err on large side if possible
- Hashing function
  - Should “spread out” the values relatively uniformly
  - Multiplication and division by prime numbers usually works well

41 Example Hashing Function
- P. 144 of K & R
    #define HASHSIZE 101
    unsigned int hash(char *s) {
        unsigned int hashval;
        for (hashval = 0; *s != '\0'; s++)
            hashval = *s + 31 * hashval;
        return hashval % HASHSIZE;
    }
- Note choice of prime numbers to “mix it up”

42 Using a Hash Table
    struct item *lookup(char *s) {
        struct item *np;
        for (np = hashtab[hash(s)]; np != NULL; np = np->next)
            if (strcmp(s, np->data) == 0)
                return np;    /* found */
        return NULL;          /* not found */
    }
- Hash table is indexed by hash value of s
- Traverse the linked list to find item s

46 Using a Hash Table (continued)
    struct item *addItem(char *s, …) {
        struct item *np;
        unsigned int hv;
        if ((np = lookup(s)) == NULL) {
            np = malloc(sizeof(struct item));
            if (np == NULL)
                return NULL;      /* out of memory */
            /* fill in s and data */
            np->next = hashtab[hv = hash(s)];
            hashtab[hv] = np;
        }
        return np;
    }
- Inserts new item at head of the list indexed by hash value
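Putting hash, lookup, and addItem together gives a small working table. This is a sketch under assumptions, not the course's reference code: the item holds only a copy of the string in data, addItem takes just the string (the slide's remaining parameters are elided and not guessed at here), and each character is cast to unsigned char before hashing, which differs slightly from the K&R original.

```c
#include <stdlib.h>
#include <string.h>

#define HASHSIZE 101

struct item {
    char *data;               /* the string used as the key */
    struct item *next;        /* next item on the same chain */
};

static struct item *hashtab[HASHSIZE];   /* all chains start out empty */

unsigned int hash(const char *s)
{
    unsigned int hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = (unsigned char)*s + 31 * hashval;
    return hashval % HASHSIZE;
}

/* Return the item whose data equals s, or NULL if there is none. */
struct item *lookup(const char *s)
{
    struct item *np;

    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->data) == 0)
            return np;
    return NULL;
}

/* Return the existing item for s, or insert a new one at the head of
 * its chain. Returns NULL only if memory runs out. */
struct item *addItem(const char *s)
{
    struct item *np = lookup(s);
    unsigned int hv;

    if (np == NULL) {
        np = malloc(sizeof *np);
        if (np == NULL)
            return NULL;
        np->data = malloc(strlen(s) + 1);
        if (np->data == NULL) {
            free(np);
            return NULL;
        }
        strcpy(np->data, s);
        hv = hash(s);
        np->next = hashtab[hv];
        hashtab[hv] = np;
    }
    return np;
}
```

Calling addItem("alpha") twice returns the same pointer both times, and lookup on a string that was never added returns NULL. Lookup time depends on the length of one chain, not on the total number of items, which is why the array should be sized generously.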

47 Hash Table Summary
- Widely used for constant time access
- Easy to build and maintain
- There is an art and a science to the choice of hashing functions
  - Consult textbooks, web, etc.

48 Questions?

