12/8/2015 CS135601 Introduction to Information Engineering: Data Structure, Che-Rung Lee


1 Data Structure

2 Data abstraction
Main memory is organized as a sequence of addressable cells, but the data we want to model usually is not.
We use “model” and “simulation” to bridge the gap.

3 Pointers
What is a pointer?
–A special kind of data that records a memory address
Example in C:
int a = 3;
int *p = NULL;
p = &a;
*p = 5;
Memory contents (the addresses are illustrative):
variable | address | value
a | 0x03 | 3, then 5 after *p = 5
p | 0x04 | 0 (NULL), then 0x03 after p = &a
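The four statements above can be wrapped in a function to check the final value of a. This is only a small sketch; the function name pointer_demo is made up for illustration.

```c
#include <stddef.h>

/* Repeats the slide's four statements and returns the final value of a.
   The name pointer_demo is hypothetical. */
int pointer_demo(void)
{
    int a = 3;       /* a holds the value 3 */
    int *p = NULL;   /* p does not record any address yet */
    p = &a;          /* p now records the address of a */
    *p = 5;          /* writes 5 into the cell p points to, which is a */
    return a;
}
```

Calling pointer_demo() returns 5, because the write through p changed a itself.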

4 Outline
–Customized data type
–Array and list
–Stack and queue
–Trees
–Hash table

5 Customized Data Type

6 How to model a warrior?
A warrior in Diablo III has:
–Class
–Skills
–Equipment
–Life points
–Magic points
–Money
–…
But computers only have primitive data types: integer, real, character, and Boolean.

7 User-defined data types
A conglomerate of primitive data types collected under a single name
Example in C: struct
typedef struct {
    char class[10];   // Barbarian, Witch, Wizard or Monk
    int lifePoint;    // min is 0, max is 100
    int level;        // min is 1, max is 72
    …
} Warrior;                  // user-defined data type
Warrior player1;            // an instance of type Warrior
player1.lifePoint = 100;

8 Abstract data type
A full model of an abstract data type should also include the operations of the model
–Like + - * /, input, and output for primitive data types
Example in C++: class
–An instance of a class is called an object, which we will discuss further in the programming language lesson.
class Warrior {
    char className[10];   // Barbarian, Witch, Wizard or Monk ("class" itself is a reserved word in C++)
    …
    void fight(….);       // function that defines the action “fight”
};

9 Heterogeneous array
Storage that contains different types of data is called a heterogeneous array
–struct and class are heterogeneous arrays
–The items are called components.
–Storage that contains only one type of data is called a homogeneous array
Example:
struct {
    char Name[25];
    int Age;
    int SkillRating;
} Employee;

10 Storage of heterogeneous array
Static method:
–components are stored one after the other in a contiguous block
Dynamic method:
–components are stored in separate locations identified by pointers
[Figure: the employee record (Meredith W Linsmeyer, 23, 6.2) stored in a contiguous block and stored through pointers]

11 Array and List

12 When to use arrays?
Stock prices, student names, temperature readings
–One-dimensional array
Matrices, images, class grades, train schedules
–Two-dimensional array
Computed tomography (CT scans)
–Three-dimensional array

13 Storing arrays
Use a variable to denote the address of the first element
–Ex: int Readings[24];
The relative address of an element is called its “index”
In C, the index starts from 0
[Figure: array cells labeled with the indices 0, 1, 2, 3]

14 Two dimensional array
A two-dimensional array is stored in one-dimensional memory cells.
Example array with 4 rows and 3 columns:
a11 a12 a13
a21 a22 a23
a31 a32 a33
a41 a42 a43
Two ways to order the data
–Row major order: a11 a12 a13 a21 a22 a23 a31 a32 a33 a41 a42 a43
–Column major order: a11 a21 a31 a41 a12 a22 a32 a42 a13 a23 a33 a43
–What is the memory location of A[2][3] in the row (column) major order?
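The two orderings can be sketched as offset formulas. The function names below are hypothetical, the indices i and j are zero-based as in C, and the offset is counted in elements from the first cell of the array.

```c
#include <stddef.h>

/* Row major: whole rows are laid out one after another,
   so skipping i rows costs i * cols cells. */
size_t row_major_offset(size_t i, size_t j, size_t cols)
{
    return i * cols + j;
}

/* Column major: whole columns are laid out one after another,
   so skipping j columns costs j * rows cells. */
size_t col_major_offset(size_t i, size_t j, size_t rows)
{
    return j * rows + i;
}
```

For the 4-by-3 example, the entry a23 has zero-based indices (1, 2). row_major_offset(1, 2, 3) gives 5 and col_major_offset(1, 2, 4) gives 9, which match the positions of a23 in the two listings when counting from 0.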

15 High dimensional array
Consider the three-dimensional array A[m][n][k]
–What is the size of the array?
–What is the memory location of A[1][2][3] in the row major order? (The last index changes first.)
–What is the memory location of A[1][2][3] in the column major order? (The first index changes first.)
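One possible way to answer these questions in code, assuming zero-based indices (i, j, l) into an array declared as A[m][n][k]. The function names are made up for this sketch, and the offsets are counted in elements, not bytes.

```c
#include <stddef.h>

/* Total number of elements in A[m][n][k]. */
size_t array3_size(size_t m, size_t n, size_t k)
{
    return m * n * k;
}

/* Row major: the last index changes first. */
size_t row_major_offset3(size_t i, size_t j, size_t l, size_t n, size_t k)
{
    return (i * n + j) * k + l;
}

/* Column major: the first index changes first. */
size_t col_major_offset3(size_t i, size_t j, size_t l, size_t m, size_t n)
{
    return i + m * (j + n * l);
}
```

With m = 2, n = 3, k = 4 the array has 24 elements, and A[1][2][3] is the last element in both orders, so both functions return 23.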

16 When to use a list?
A list is a collection of data arranged sequentially.
–A one-dimensional array is a list of elements
–A two-dimensional array can be viewed as a list of rows/columns
–A string is a list of characters
–Music is a list of sounds
–Stacks and queues can be implemented using lists; we will talk about those later

17 Contiguous list
The list is stored in a contiguous block of memory cells (an array)
–Ex: a list of names, where each name occupies 8 bytes.

18 Linked list
A list in which the entries are linked by pointers
–Head pointer: pointer to the first entry in the list
–NIL pointer: a “non-pointer” value used to indicate the end of the list
Use a customized data type to define each entry

19 Static vs. dynamic data structures
Static data structures:
–Size and shape do not change
–Ex: contiguous list
–Easy to locate elements; no need to store addresses
Dynamic data structures:
–Size and shape can change
–Ex: linked list
–Easy to delete/insert elements

20 Linked list: delete/insert element
[Figures: deleting an entry from a linked list; inserting an entry into a linked list]
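Deleting and inserting only require redirecting pointers. The sketch below assumes each entry stores an int; the type Node and all function names are hypothetical, and NULL plays the role of the NIL pointer.

```c
#include <stdlib.h>

typedef struct Node {
    int value;
    struct Node *next;   /* NULL marks the end of the list */
} Node;

/* Insert a new entry in front of the list; returns the new head
   (or the unchanged head if memory allocation fails). */
Node *list_push_front(Node *head, int value)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL) return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Insert a new entry right after prev; returns 1 on success, 0 on failure. */
int list_insert_after(Node *prev, int value)
{
    if (prev == NULL) return 0;
    Node *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->value = value;
    n->next = prev->next;   /* the new entry points to the old successor */
    prev->next = n;         /* the predecessor points to the new entry */
    return 1;
}

/* Delete the first entry holding value (if any); returns the head. */
Node *list_delete(Node *head, int value)
{
    Node **link = &head;    /* the pointer that leads to the current entry */
    while (*link != NULL && (*link)->value != value)
        link = &(*link)->next;
    if (*link != NULL) {
        Node *victim = *link;
        *link = victim->next;   /* bypass the deleted entry */
        free(victim);
    }
    return head;
}

/* Count the entries by following the pointers to the end. */
size_t list_length(const Node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next) n++;
    return n;
}
```

No entry is moved in memory by either operation, which is why a linked list suits data whose size changes often.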

21 Stack and Queue

22 What is a stack?
A list in which entries are removed and inserted only at the head
–Top: the head of the stack
–Bottom or base: the tail of the stack
–Push: to insert an entry at the top
–Pop: to remove the entry at the top
–LIFO: last-in-first-out

23 When to use stacks?
When the algorithm needs data in LIFO order
–EX1: reverse a word, ABC → CBA
  Push A, Push B, Push C
  Pop C, Pop B, Pop A
–EX2: check matching parentheses in (3*[(1+1)*2]
  Push “(”, Push “[”, Push “(”
  Find “)”, pop “(”, matched
  Find “]”, pop “[”, matched
  No more “)”, but still one “(” in the stack: not matched
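EX2 can be sketched as follows. This is one possible version, assuming only round and square brackets are checked and the nesting never exceeds a fixed depth; the names brackets_match and MAX_DEPTH are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 256   /* assumed limit on nesting depth */

/* Returns true if every '(' and '[' in text is closed by the right
   bracket in the right order. */
bool brackets_match(const char *text)
{
    char stack[MAX_DEPTH];
    size_t top = 0;                 /* number of brackets on the stack */

    for (const char *c = text; *c != '\0'; ++c) {
        if (*c == '(' || *c == '[') {
            if (top == MAX_DEPTH) return false;   /* nesting too deep */
            stack[top++] = *c;                    /* push */
        } else if (*c == ')' || *c == ']') {
            if (top == 0) return false;           /* nothing to match */
            char open = stack[--top];             /* pop */
            if ((*c == ')' && open != '(') || (*c == ']' && open != '['))
                return false;                     /* wrong kind of bracket */
        }
    }
    return top == 0;   /* leftover brackets mean "not matched" */
}
```

For the slide's input (3*[(1+1)*2] the function returns false, because one “(” is still on the stack at the end.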

24 Stack implementation
Using a list + a pointer (head)
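A minimal sketch of this idea, assuming the list is a fixed-size array of characters and the head pointer is kept as an index. All names (Stack, stack_push, stack_pop, reverse_word, STACK_CAPACITY) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define STACK_CAPACITY 64   /* assumed fixed size of the underlying list */

typedef struct {
    char items[STACK_CAPACITY];
    size_t top;   /* number of entries; items[top - 1] is the top entry */
} Stack;

void stack_init(Stack *s) { s->top = 0; }

/* Push: insert an entry at the top; fails if the stack is full. */
bool stack_push(Stack *s, char c)
{
    if (s->top == STACK_CAPACITY) return false;
    s->items[s->top++] = c;
    return true;
}

/* Pop: remove the entry at the top; fails if the stack is empty. */
bool stack_pop(Stack *s, char *out)
{
    if (s->top == 0) return false;
    *out = s->items[--s->top];
    return true;
}

/* Reverse a word in place by pushing every letter and popping them back. */
bool reverse_word(char *word)
{
    Stack s;
    size_t len = strlen(word);
    if (len > STACK_CAPACITY) return false;
    stack_init(&s);
    for (size_t i = 0; i < len; ++i) stack_push(&s, word[i]);
    for (size_t i = 0; i < len; ++i) stack_pop(&s, &word[i]);
    return true;
}
```

Applying reverse_word to a buffer holding "ABC" leaves "CBA" in it, since the last letter pushed is the first one popped.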

25 Queue
A list in which entries are removed at the head and inserted at the tail.
–Enqueue: insert an entry at the tail
–Dequeue: remove an entry at the head
–FIFO: first-in-first-out

26 Examples of using queues
Ex1: the job queues in an operating system
Ex2: simulation of the Josephus problem, starting with 1 to 6 in the queue
–Dequeue 1
–Enqueue 1
–Dequeue 2
–Dequeue 3
–Enqueue 3
Operation count ≈ 2n
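The operation sequence in Ex2 can be simulated as in the sketch below. It assumes the people are numbered 1 to n and every second person is removed, which is what the dequeue/enqueue sequence on the slide suggests; the function name josephus_survivor is hypothetical.

```c
#include <stdlib.h>

/* Simulates the Josephus problem with a queue held in a circular buffer.
   Returns the number of the last remaining person, or 0 on invalid input
   or allocation failure. */
int josephus_survivor(int n)
{
    if (n <= 0) return 0;
    int *queue = malloc((size_t)n * sizeof *queue);
    if (queue == NULL) return 0;

    for (int i = 0; i < n; ++i) queue[i] = i + 1;   /* people 1..n */
    int head = 0;    /* index of the entry at the head */
    int tail = 0;    /* index of the next free slot (buffer starts full) */
    int count = n;   /* number of entries in the queue */

    while (count > 1) {
        int skipped = queue[head];   /* dequeue the person who is spared */
        head = (head + 1) % n;
        queue[tail] = skipped;       /* enqueue that person at the tail */
        tail = (tail + 1) % n;
        head = (head + 1) % n;       /* dequeue the next person for good */
        count--;
    }

    int survivor = queue[head];
    free(queue);
    return survivor;
}
```

For n = 6 the removals happen in the order 2, 4, 6, 3, 1, so josephus_survivor(6) returns 5.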

27 Queue implementation
A list + 2 pointers (head + tail)
–Enqueue A, B, C
–Dequeue A, enqueue D
–Dequeue B, enqueue E
If a static list is used, the queue crawls through memory as entries are inserted and removed.

28 Circular queue
A technique that uses a fixed region of memory to implement a queue: when the tail reaches the end of the region, it wraps around to the beginning.
–Enqueue A, B, C
–Dequeue A, enqueue D
–Dequeue B, enqueue E
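A circular queue might be sketched like this, assuming a region of four cells and storing the head index plus an entry count instead of a separate tail pointer (the tail position is computed from the two). The names CircularQueue, queue_enqueue, queue_dequeue and QUEUE_CAPACITY are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4   /* assumed size of the fixed memory region */

typedef struct {
    char items[QUEUE_CAPACITY];
    size_t head;    /* index of the oldest entry */
    size_t count;   /* number of entries currently stored */
} CircularQueue;

void queue_init(CircularQueue *q)
{
    q->head = 0;
    q->count = 0;
}

/* Enqueue: insert at the tail; the tail index wraps around with %. */
bool queue_enqueue(CircularQueue *q, char c)
{
    if (q->count == QUEUE_CAPACITY) return false;   /* full */
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    q->items[tail] = c;
    q->count++;
    return true;
}

/* Dequeue: remove from the head; the head index wraps around with %. */
bool queue_dequeue(CircularQueue *q, char *out)
{
    if (q->count == 0) return false;                /* empty */
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Running the slide's sequence on this sketch, E ends up in cell 0, the cell that A occupied before it was dequeued.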

29 Trees

30 What is a tree?
A collection of nodes linked in a hierarchical structure, in which every node except the root has exactly one parent.
–Node: an entry in a tree
–Parent: the node immediately above a specified node
–Root: the node at the top
–Terminal or leaf node: a node at the bottom

31 Hierarchical relations
Parent: the node immediately above a node
–The parent of F is B.
Child: a node immediately below a node
–The children of C are G and H.
Ancestor: parent, parent of parent, etc.
–The ancestors of K are F, B, and A.
Descendant: child, child of child, etc.
–The descendants of B are E, F, K, and L.
Siblings: nodes sharing a common parent
–The siblings of C are B and D.
[Figure: example tree with the levels A / B C D / E F G H I J / K L]

32 Depth and height
Textbook’s definition
–The depth of a tree is the length of the longest path from the root to a leaf node
–The length of a path is the number of nodes on the path
–Ex: the depth of the tree is 4
Conventional definition
–Use the word “height” instead of depth
–The length of a path is the number of links on the path
–Ex: the height of the tree is 3 (= 4 – 1)
[Figure: example tree with the levels A / B C D / E F G H I J / K L]

33 What are trees used for?
Representing hierarchical data
–Organization chart
Searching data
–Game tree

34 Binary tree
A tree in which each parent has at most two children
–Left child and right child
–Left subtree and right subtree

35 Storing a binary tree in a list
This is called a heap in some applications.

36 Advantages of using heap
Easy to find the index of the parent and children (the root has index 1, B has index 2, and the division is integer division)
–Parent(B) = [index of B] / 2 = 1
–LeftChild(B) = [index of B]*2 = 4
–RightChild(B) = [index of B]*2 + 1 = 5
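The index arithmetic can be written down directly. The sketch assumes the root is stored at index 1 (index 0 is left unused), and the function names are hypothetical.

```c
#include <stddef.h>

/* Index arithmetic for a binary tree stored in a list, root at index 1. */
size_t heap_parent(size_t i)      { return i / 2; }      /* integer division */
size_t heap_left_child(size_t i)  { return 2 * i; }
size_t heap_right_child(size_t i) { return 2 * i + 1; }
```

With B stored at index 2, these give 1, 4 and 5 for its parent, left child and right child, as on the slide.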

37 Problems for heap
A heap is inefficient for storing a binary tree that is sparse and unbalanced
–Sparse: most nodes have one child or none
–Unbalanced: the right subtree is much larger than the left subtree, or vice versa

38 Storing a binary tree using pointers
Each node stores its data together with a left child pointer and a right child pointer
Use a customized data type to define the node

39 Recursive structure
A tree is a recursive structure
–The subtrees of a tree are trees
A recursive algorithm for a binary tree may look like this
–It is a depth-first, in-order algorithm for the tree
procedure some_operation (root)
  if (root is not NULL) then (
    call some_operation(root.left_child)
    do some operations on root
    call some_operation(root.right_child))
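In C, the procedure above might look like the following sketch. The node type TreeNode, the one-character label, and the choice of "append the label to a buffer" as the operation are all assumptions made for illustration.

```c
#include <stddef.h>

typedef struct TreeNode {
    char label;
    struct TreeNode *left_child;    /* NULL if there is no left subtree */
    struct TreeNode *right_child;   /* NULL if there is no right subtree */
} TreeNode;

/* Depth-first, in-order traversal: left subtree, then the node itself,
   then the right subtree. Appends each label to out starting at position
   written and returns the new number of labels written.
   The caller must make out large enough for every node in the tree. */
size_t in_order(const TreeNode *root, char *out, size_t written)
{
    if (root == NULL) return written;
    written = in_order(root->left_child, out, written);
    out[written++] = root->label;   /* "do some operations on root" */
    return in_order(root->right_child, out, written);
}
```

For a tree with root A, children B and C, and B's children D and E, the labels are visited in the order D, B, E, A, C.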

40 Hash Table

41 Search
Search is a common task in daily life
–Phone book: given a name, find the phone number
–Dictionary: given a word, find its definition
–Map: given an address, find the location or route
–DNS: given a URL, find its IP address
Trees can be used to speed up searches.
–How? And what is the operation count?

42 Constant time search
Some things can be found in constant time
–EX: What is the fifth element of the array A? A[4]
An array is like a lookup table: one can use the index to query and get the value
Can we use this idea to organize data so that searches can be done in constant time?
–Hash table (or hash map)

43 Hash table
Each record of data has a key field
–The key is like the index of an array.
–A unique identification of the data (ideally)
The storage space is divided into buckets
–Each bucket is like an array cell.
–Each record is stored in the bucket corresponding to its key, so it can be retrieved in constant time

44 How to define the mapping?
The unique identification of a record is usually too large to be used as the index for storage
–For example, the ASCII codes of a string
We do not want to create such a large array!

45 Hash function
A hash function computes a bucket number for each key value
–EX: suppose there are only 41 buckets.
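One common choice, assumed here because the slide's own example is not shown in the transcript, is to take the remainder of the key divided by the number of buckets. The function names and the multiplier 31 used for strings are illustrative choices, not part of the lecture.

```c
#define BUCKET_COUNT 41   /* the slide's example uses 41 buckets */

/* Assumed hash for integer keys: the remainder after dividing by 41. */
unsigned bucket_for_key(unsigned key)
{
    return key % BUCKET_COUNT;
}

/* Assumed hash for strings: fold the character codes together one by one,
   keeping the running value below BUCKET_COUNT. */
unsigned bucket_for_string(const char *s)
{
    unsigned h = 0;
    for (; *s != '\0'; ++s)
        h = (h * 31u + (unsigned char)*s) % BUCKET_COUNT;
    return h;
}
```

The keys 25 and 66 both land in bucket 25 under this function, a small instance of two keys sharing one bucket.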

46 Problem
Collision: the case of two or more keys hashing to the same bucket
–A major problem when the table is over 75% full

47 Solutions
Use linked lists to store collided data
–The search time becomes linear in the number of collided records
Increase the number of buckets and rehash all data
–Time/space tradeoff
Design a better hash function/algorithm
–It’s a research problem
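The first solution, one linked list per bucket, can be sketched as follows. The sketch assumes unsigned integer keys, int values, 41 buckets and a remainder-based hash; the types Entry and HashTable and all function names are hypothetical.

```c
#include <stdlib.h>

#define BUCKET_COUNT 41   /* assumed number of buckets */

typedef struct Entry {
    unsigned key;
    int value;
    struct Entry *next;   /* next record that collided in this bucket */
} Entry;

typedef struct {
    Entry *buckets[BUCKET_COUNT];   /* head pointer of each bucket's list */
} HashTable;

void table_init(HashTable *t)
{
    for (size_t i = 0; i < BUCKET_COUNT; ++i) t->buckets[i] = NULL;
}

/* Store value under key; returns 1 on success, 0 if allocation fails. */
int table_put(HashTable *t, unsigned key, int value)
{
    unsigned b = key % BUCKET_COUNT;
    for (Entry *e = t->buckets[b]; e != NULL; e = e->next)
        if (e->key == key) { e->value = value; return 1; }   /* overwrite */
    Entry *e = malloc(sizeof *e);
    if (e == NULL) return 0;
    e->key = key;
    e->value = value;
    e->next = t->buckets[b];   /* link in front of the bucket's list */
    t->buckets[b] = e;
    return 1;
}

/* Returns a pointer to the stored value, or NULL if key is absent.
   The walk along the list is linear in the number of collided records. */
const int *table_get(const HashTable *t, unsigned key)
{
    for (const Entry *e = t->buckets[key % BUCKET_COUNT]; e != NULL; e = e->next)
        if (e->key == key) return &e->value;
    return NULL;
}

/* Release every record in every bucket. */
void table_free(HashTable *t)
{
    for (size_t i = 0; i < BUCKET_COUNT; ++i) {
        Entry *e = t->buckets[i];
        while (e != NULL) {
            Entry *next = e->next;
            free(e);
            e = next;
        }
        t->buckets[i] = NULL;
    }
}
```

Keys 25 and 66 collide in bucket 25 here, yet both can still be retrieved because the bucket's list keeps them as separate records.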

48 References
Textbook 8.1, 8.2, 8.3, 8.5, 9.5
Wikipedia
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, “Introduction to Algorithms”
Related courses: Data Structures, Algorithms, Programming Languages

