# C. Y. Tang and J. S. Roger Jang

## Presentation on theme: "C. Y. Tang and J. S. Roger Jang"— Presentation transcript:

C. Y. Tang and J. S. Roger Jang
Chapter 5 Trees Instructors: C. Y. Tang and J. S. Roger Jang All the material are integrated from the textbook "Fundamentals of Data Structures in C" and  some supplement from the slides of Prof. Hsin-Hsi Chen (NTU).

Outline (1) Introduction (5.1) Binary Trees (5.2)
Binary Tree Traversals (5.3) Additional Binary Tree Operations (5.4) Threaded Binary Trees (5.5) Heaps (5.6) & (Chapter 9) Binary Search Trees (5.7)

Outline (2) Selection Trees (5.8) Forests (5.9)
Set Representation (5.10) Counting Binary Trees (5.11) References & Exercises

5.1 Introduction What is a “Tree”? For Example : Figure 5.1 (a)
An ancestor binary tree Figure 5.1 (b) The ancestry of modern Europe languages

The Definition of Tree (1)
A tree is a finite set of one or more nodes such that : (1) There is a specially designated node called the root. (2) The remaining nodes are partitioned into n ≥ 0 disjoint sets T1, …, Tn, where each of these sets is a tree. We call T1, …, Tn, the sub-trees of the root. root T1 T2 Tn

The Definition of Tree (2)
The root of this tree is node A. (Fig. 5.2) Definitions: Parent (A) Children (E, F) Siblings (C, D) Root (A) Leaf / Leaves K, L, F, G, M, I, J…

The Definition of Tree (3)
The degree of a node is the number of sub-trees of the node. The level of a node: Initially letting the root be at level one For all other nodes, the level is the level of the node’s parent plus one. The height or depth of a tree is the maximum level of any node in the tree.

Representation of Trees (1)
List Representation The root comes first, followed by a list of sub-trees Example: (A(B(E(K,L),F),C(G),D(H(M),I, J))) data link 1 link 2 ... link n A node must have a varying number of link fields depending on the number of branches

Representation of Trees (2)
Left Child-Right Sibling Representation Fig.5.5 A Degree Two Tree Rotate clockwise by 45° A Binary Tree data left child right sibling

5.2 Binary Trees A binary tree is a finite set of nodes that is either empty or consists of a root and two disjoint binary trees called the left sub-tree and the right sub-tree. Any tree can be transformed into a binary tree. By using left child-right sibling representation The left and right subtrees are distinguished root The left sub-tree The right sub-tree

Abstract Data Type Binary_Tree (structure 5.1)
Structure Binary_Tree (abbreviated BinTree) is: Objects: a finite set of nodes either empty or consisting of a root node, left Binary_Tree, and right Binary_Tree. Functions: For all bt, bt1, bt2  BinTree, item  element Bintree Create()::= creates an empty binary tree Boolean IsEmpty(bt)::= if (bt==empty binary tree) return TRUE else return FALSE

BinTree MakeBT(bt1, item, bt2)::= return a binary tree
whose left subtree is bt1, whose right subtree is bt2, and whose root node contains the data item Bintree Lchild(bt)::= if (IsEmpty(bt)) return error else return the left subtree of bt element Data(bt)::= if (IsEmpty(bt)) return error else return the data in the root node of bt Bintree Rchild(bt)::= if (IsEmpty(bt)) return error else return the right subtree of bt

Special Binary Trees Skewed Binary Trees Complete Binary Trees
Fig.5.9 (a) Complete Binary Trees Fig.5.9 (b) This will be defined shortly

Properties of Binary Trees (1)
Lemma 5.1 [Maximum number of nodes] : (1) The maximum number of nodes on level i of a binary tree is 2i -1, i ≥ 1. (2) The maximum number of nodes in a binary tree of depth k is is 2k -1, k ≥ 1. The proof is by induction on i. Lemma 5.2 : For any nonempty binary tree, T, if n0 is the number of leaf nodes and n2 the number of nodes of degree 2, then n0 = n2 +1.

Properties of Binary Trees (2)
A full binary tree of depth k is a binary tree of depth k having 2k -1 nodes, k ≧ 0. A binary tree with n nodes and depth k is complete iff its nodes correspond to the nodes numbered from 1 to n in the full binary tree of depth k.

Binary Tree Representation
Array Representation (Fig. 5.11) Linked Representation (Fig. 5.13)

Array Representation Lemma 5.3 : If a complete binary tree with n nodes (depth = └log2n + 1┘) is represented sequentially, then for any node with index i, 1 ≦ i ≦ n, we have: (1) parent (i) is at └ i / 2 ┘, i ≠ 1. (2) left-child (i) is 2i, if 2i ≤ n. (3) right-child (i) is 2i+1, if 2i+1 ≤ n. For complete binary trees, this representation is ideal since it wastes no space. However, for the skewed tree, less than half of the array is utilized.

typedef struct node *tree_pointer; typedef struct node { int data; tree_pointer left_child, right_child; };

5.3 Binary Tree Traversals
Traversing order : L, V, R L : moving left V : visiting the node R : moving right Inorder Traversal : LVR Preorder Traversal : VLR Postorder Traversal : LRV

For Example Inorder Traversal : A / B * C * D + E
Preorder Traversal : + * * / A B C D E Postorder Traversal : A B / C * D * E +

Inorder Traversal (1) A recursive function starting from the root
Move left Visit node Move right

Inorder Traversal (2) In-order Traversal : A / B * C * D + E

Preorder Traversal A recursive function starting from the root
Visit node Move left  Move right

Postorder Traversal A recursive function starting from the root
Move left Move right Visit node

Other Traversals Iterative Inorder Traversal Level Order Traversal
Using a stack to simulate recursion Time Complexity: O(n), n is #num of node. Level Order Traversal Visiting at each new level from the left-most node to the right-most Using Data Structure : Queue

Iterative In-order Traversal (1)

Iterative In-order Traversal (2)
Delete “C” & Print Delete “*” & Print Add “D” Delete “D” & Print Delete “+” & Print Add “E” Delete “E” & Print Add “+” in stack Add “*” Add “/” Add “A” Delete “A” & Print Delete “/” & Print Add “B” Delete “B” & Print Delete “*” & Print Add “C” In-order Traversal : A / B * C * D + E

Level Order Traversal (1)

Level Order Traversal (2)
Add “+” in Queue Deleteq “+” Addq “*” Addq “E” Deleteq “*” Addq “D” Deleteq “E” Addq “/” Addq “C” Deleteq “D” Deleteq “/” Addq “A” Addq “B” Deleteq “C” Deleteq “A” Deleteq “B” Level-order Traversal : + * E * D / C A B

Copying Binary Trees Program 5.6 Testing for Equality of Binary Trees Program 5.7 The Satisfiability Problem (SAT)

Copying Binary Trees Modified from postorder traversal program

Testing for Equality of Binary Trees
Equality: 2 binary trees having identical topology and data are said to be equivalent.

SAT Problem (1) Formulas Variables : X1, X2, …, Xn
Two possible values: True or False Operators : And (︿), Or (﹀), Not (﹁) A variable is an expression. If x and y are expressions, then ﹁ x, x ︿ y, x ﹀y are expressions. Parentheses can be used to alter the normal order of evaluation, which is ﹁ before ︿ before ﹀.

SAT Problem (2)

SAT Problem (3) The SAT problem
Is there an assignment of values to the variables that causes the value of the expression to be true? For n variables, there are 2n possible combinations of true and false. The algorithm takes O(g 2n) time g is the time required to substitute the true and false values for variables and to evaluate the expression.

SAT Problem (4) Node Data Structure for SAT in C

SAT Problem (5) A Enumerated Algorithm Time Complexity : O (2n)

SAT Problem (6) void post_order_eval(tree_pointer node){ if (node){
post_order_eval(node->left_child); post_order_eval(node->right_child); switch(node->data){ case not: node->value=!node->right_child->value; break; case and: node->value=node->right_child->value && node->left_child->value; break; case or: node->value=node->right_child->value || case true: node->value=TRUE; break; case false: node->value=FALSE; break; } } }

Head node of the tree Actual tree

Inorder Traversal of a Threaded Binary Tree (1)
Threads simplify inorder traversal algorithm An easy O(n) algorithm (Program 5.11.) For any node, ptr, in a threaded binary tree If ptr -> right_thread = TRUE The inorder successor of ptr = ptr -> right_child Else (Otherwise, ptr -> right_thread = FALSE) Follow a path of left_child links from the right_child of ptr until finding a node with left_Thread = TRUE Function insucc (Program 5.10.) Finds the inorder successor of any node (without using a stack)

Inorder Traversal of a Threaded Binary Tree (2)

Inorder Traversal of a Threaded Binary Tree (2)

Inserting a Node into a Threaded Binary Tree
Insert a new node as a child of a parent node Insert as a left child (left as an exercise) Insert as a right child (see examples 1 and 2) Is the original child node an empty subtree? Empty child node (parent -> child_thread = TRUE) See example 1 Non-empty child node (parent -> child_thread = FALSE) See example 2

Inserting a node as the right child of the parent node (empty case)
parent(B) -> right_thread = FALSE child(D) -> left_thread & right_thread = TURE child -> left_child = parent child -> right_child = parent -> right_child parent -> right_child = child (1) (3) (2)

Inserting a node as the right child of the parent node (non-empty case)
(3) (2) (1) (4)

Right insertion in a threaded binary tree

5.6 Heaps An application of complete binary tree Definition
A max (or min) tree a tree in which the key value in each node is no smaller (or greater) than the key values in its children (if any). A max (or min) heap a max (or min) complete binary tree A max heap

Heap Operations Creation of an empty heap
PS. To build a Heap  O( n log n ) Insertion of a new element into the heap O (log2n) Deletion of the largest element from the (max) heap Application of Heap Priority Queues

Insertion into a Max Heap (1)
(Figure 5.28)

Insertion into a Max Heap (2)
void insert_max_heap(element item, int *n) { int i; if (HEAP_FULL(*n)){ fprintf(stderr, “the heap is full.\n); exit(1); } i = ++(*n); while ((i!=1) && (item.key>heap[i/2].key)) { heap[i] = heap[i/2]; i /= 2; heap[i] = item; the height of n node heap = ┌ log2(n+1) ┐ Time complexity = O (height) = O (log2n)

Deletion from a Max Heap
Delete the max (root) from a max heap Step 1 : Remove the root Step 2 : Replace the last element to the root Step 3 : Heapify (Reestablish the heap)

Delete_max_heap (1) element delete_max_heap(int *n) {
int parent, child; element item, temp; if (HEAP_EMPTY(*n)) { fprintf(stderr, “The heap is empty\n”); exit(1); } /* save value of the element with the highest key */ item = heap[1]; /* use last element in heap to adjust heap */ temp = heap[(*n)--];

Delete_max_heap (2) parent = 1; child = 2; while (child <= *n) {
/* find the larger child of the current parent */ if ((child < *n) && (heap[child].key<heap[child+1].key)) child++; if (temp.key >= heap[child].key) break; /* move to the next lower level */ heap[parent] = heap[child]; child *= 2; } heap[parent] = temp; return item;

5.7 Binary Search Trees Heap : search / delete arbitrary element
O(n) time Binary Search Trees (BST) Searching  O(h), h is the height of BST Insertion  O(h) Deletion  O(h) Can be done quickly by both key value and rank

Definition A binary search tree is a binary tree, that may be empty or satisfies the following properties : (1) every element has a unique key. (2&3) The keys in a nonempty left(/right) sub-tree must be smaller(/larger) than the key in the root of the sub-tree. (4) The left and right sub-trees are also binary search trees.

Searching a BST (1)

Searching a BST (2) Time Complexity
search  O(h), h is the height of BST. search2  O(h)

Inserting into a BST (1) Step 1 : Check if the inserting key is different from those of existing elements Run search function  O(h) Step 2 : Run insert_node function Program 5.17  O(h)

Inserting into a BST (2) void insert_node(tree_pointer *node, int num) { tree_pointer ptr, temp = modified_search(*node, num); if (temp || !(*node)) { ptr = (tree_pointer) malloc(sizeof(node)); if (IS_FULL(ptr)) { fprintf(stderr, “The memory is full\n”); exit(1); } ptr->data = num; ptr->left_child = ptr->right_child = NULL; if (*node) if (num<temp->data) temp->left_child=ptr; else temp->right_child = ptr; else *node = ptr;

Deletion from a BST Delete a non-leaf node with two children
Replace the largest element in its left sub-tree Or Replace the smallest element in its right sub-tree Recursively to the leaf  O(h)

Height of a BST The Height of the binary search tree is O(log2n), on the average. Worst case (skewed)  O(h) = O(n) Balanced Search Trees With a worst case height of O(log2n) AVL Trees, 2-3 Trees, Red-Black Trees Chapter 10

5.8 Selection Trees Application Problem
Merge k ordered sequences into a single ordered sequence Definition: A run is an ordered sequence Build a k-run Selection tree

Time Complexity Selection Tree’s Level  ┌ log2k ┐+ 1
Each time to restructure the tree O(log2k) Total time to merge n records O(n log2k)

For Example

Tree of losers The previous selection tree is called a winner tree
Each node records the winner of the two children Loser Tree Leaf nodes represent the first record in each run Each non-leaf node retains a pointer to the loser Overall winner is stored in the additional node, node 0 Each newly inserted record is now compared with its parent (not its sibling)  loser stays, winner goes up without storing. Slightly faster than winner trees

Loser tree example overall winner 8 6 10 8 9 20 6 11 12 13 90 14 17 15 4 5 7 2 3 1 9 9 15 15 Run 15 15 *Figure 5.36: Tree of losers corresponding to Figure 5.34 (p.235)

5.9 Forests A forest is a set of n ≧ 0 disjoint trees.
T1, …, Tn is a forest of trees Transforming a forest into a Binary Tree B(T1, …, Tn) (1) if n = 0, then return empty (2) a root (T1); Left sub-tree equal to B(T11,T12, …, T1m), where T11,T12, …, T1m are the sub-trees of root (T1); Right sub-tree B(T2, …, Tn)

Transforming a forest into a Binary Tree
Root(T1) T11,T12, T13 B(T2, T3)

Forest Traversals Pre-order : In-order : Post-order :

5.10 Set Representation Elements : 0, 1, …, n -1. Sets : S1, S2, …, Sm
pairwise disjoint If Si and Sj are two sets and i ≠ j, then there is no element that is in both Si and Sj. Operations Disjoint Set Union Ex: S1 ∪ S2 Find (i )

Union Operation Disjoint Set Union S1 ∪ S2 = {0, 6, 7, 8, 1, 4, 9}

Implement of Data Structure

Union & Find Operation Union(i, j) Find(i)
parent(i) = j  let i be the new root of j Find(i) While (parent[i]≧0) i = parent[i]  find the root of the set Return i;  return the root of the set

Performance Run a sequence of union-find operations
Total n-1 unions  n-1 times, O(n) Time of Finds  Σni=2 i = O(n 2)

Weighting rule for union(i, j)
If # of nodes in i < # of nodes in j Then j becomes the parent of i Else i becomes the parent of j

New Union Function Prevent the tree from growing too high
To avoid the creation of degenerate trees No node in T has level greater than log2n +1 void union2(int i, int j){ int temp = parent[i]+parent[j]; if (parent[i]>parent[j]) { parent[i]=j; parent[j]=temp; } else { parent[j]=i; parent[i]=temp;

Figure 5.45 Trees achieving worst case bound (p.245)

Collapsing Rule (for new find function)
Definition: If j is a node on the path from i to its root then make j a child of the root The new find function (see next slide): Roughly doubles the time for an individual find Reduces the worse case time over a sequence of finds.

New Find Function Collapse all nodes form i to root
To lower the height of tree

Performance of New Algorithm
Let T(m, n) be the maximum time required to process an intermixed sequence of m finds (m≧n) and n -1 unions, we have : k1mα(m, n) ≦ T(m, n) ≦ k2mα(m, n) k1, k2 : some positive constants α(m, n) is a very slowly growing function and is a functional inverse of Ackermann’s function A(p, q). Function A(p, q) is a very rapidly growing function.

Equivalence Classes Using union-find algorithms to processing the equivalence pairs of Section 4.6 (p.167) At most time : O(mα(2m, n)) Using less space

5.11 Counting Binary Trees Three disparate problems :
Having the same solution Determine the number of distinct binary trees having n nodes (problem 1) Determine the number of distinct permutations of the numbers from 1 to n obtainable by a stack (problem 2) Determine the number of distinct ways of multiply n + 1 matrices (problem 3)

Distinct binary trees N = 1 N = 2 N = 3 N = … only one binary tree

Stack Permutations (1) A binary tree traversals
Pre-order : A B C D E F G H I In-order : B C A E D G H F I Is this binary tree unique? Constructing this binary tree

Stack Permutations (2) For a given preorder permutation 1, 2, 3, what are the possible inorder permutations? Possible inorder permutation by a stack  (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 2, 1) (3, 1, 2) is impossible Each inorder permutation represents a distinct binary tree

Matrix Multiplication (1)
The product of n matrices M1 * M2 * … * Mn Matrix multiplication is associative Can be performed in any order N = 3, 2 ways to perform (M1 * M2) * M3 M1 * (M2 * M3) N = 4, 5 possibilities

Matrix Multiplication (2)
Let bn be the number of different ways to compute the product of n matrices. We have :

number of distinct binary trees
Approximation by solving the recurrence of the equation Solution : (when x →∞) Simplification : Approximation :

Heapsort—An optimal sorting algorithm
A heap : parent  son

output the maximum and restore:
Heapsort: construction output

Phase 1: construction restore the subtree rooted at A(2):
input data: 4, 37, 26, 15, 48 restore the subtree rooted at A(2): restore the tree rooted at A(1):

Phase 2: output

Implementation using a linear array not a binary tree.
The sons of A(h) are A(2h) and A(2h+1). time complexity: O(n log n)

Time complexity Phase 1: construction

Time complexity Phase 2: output

1 2 3 4 12 8 10

12 1 2 1 8 3 2 10 3 4

TSP是一個公認的難題 NP-Complete

2n相當可怕 10 30 50 N s s s N2 s s s 2n 0.001 s 17.9 min 35.7 year 像satisfiabilibility problem 目前只有exponential algorithm，還沒有人找到polynomial algorithm (你也不妨放棄！) 這一類問題是NP-Complete Problem Garey & Johnson “Computers & Intractability”

16種樹 n(n-2) Cayley’s Thm. Ref: Even, Graph Algorithms, PP26~28 2 1 1 4 3 12 4

Labeled tree  Number sequence One-to-One Mapping
N個nodes的labeled tree可以用一個 長度N-2的number sequence來表達。 Encoding: Data Compression.

Labeled treeNumber sequence

Number sequenceLabeled tree
Prune-sequence: 7,4,4,7,5 k 1 2 3 4 5 6 7 deg(k) Iteration 1 Iteration 2 Iteration 3 Iteration 4 Iteration 5 Iteration 6 每一個iteration裡，選擇degree為1且編號最小的node，連接prune-sequence中 相對的node，之後兩個nodes的degree均減1. Iteration Iteration 2 Iteration 3 Iteration Iteration 6 Iteration 5 1 7 1 7 2 4 1 7 2 4 3 3 1 7 4 3 2 1 7 4 2 3 5 1 7 4 6 5 2 6

Minimal spanning tree Kruskal’a Algorithm
B 70 50 300 C 75 80 200 65 E D 90 Begin T <- null While T contains less than n-1 edges, the smallest weight, choose an edge (v, w) form E of smallest weight 【 Using priority queue, heap O (log n) 】, delete (v, w) form E. If the adding of (v, w) to T does not create a cycle in T,【 Using union, find O (log m)】 then add (v, w) to T; else discard (v, w). Repeat. End. O (m log m) m = # of edge

1 2 4 O(log n) Initial O(n) Tarjan: Union & Find可以almost linear (Amortized) Correctness 如果不選最小edge做tree而得到minimal 加入最小edge會有cycle Delete cycle中最大的edge會得到更小cost之tree (矛盾！) 3 7 5 6

1. 加 edge(2,3) 不合法 2. 加 edge(1,4) 合法 另一種看法： S1={1,2,3} S2={4,5} Edge的端點要在不同set Set的 Find, Union O(log n) 1 4 2 3 5