# A.E. Csallner Department of Applied Informatics University of Szeged Hungary.


Algorithms and Data Structures I

## About algorithms

Algorithm: a finite sequence of finite steps that provides the solution to a given problem.

Properties:
- Finiteness
- Definiteness
- Executability

Communication:
- Input
- Output

Design strategies:
- Bottom-up: synthesize smaller algorithmic parts into bigger ones.
- Top-down: formulate the problem and repeatedly break it up into smaller and smaller parts.

## Structured programming

Example: shoe a horse (top-down breakdown)
- shoe a horse: a horse has four hooves, so the task breaks down into "shoe a hoof"
- shoe a hoof: we need a horseshoe (hammer a horseshoe) and we need to fasten the horseshoe to the hoof (drive a cog into a hoof)
- drive a cog into a hoof: we need cogs (hammer a cog)

Basic elements of structured programming:
- Sequence: series of actions
- Selection: branching on a decision
- Iteration: conditional repetition

All structured algorithms can be defined using only these three elements (E.W. Dijkstra, 1960s).

## Algorithm description

An algorithm description method defines an algorithm so that the description code is
- unambiguous;
- programming language independent;
- still easy to implement;
- state-of-the-art.

Some possible types of classification:
- Age (when the description method was invented)
- Purpose (e.g. structural or object-oriented)
- Formulation (graphical or text code, etc.)
- ...

Most popular and useful description methods: flow diagram
- old
- not definitely structured(!)
- graphical
- very intuitive and easy to use

A possible notation of flow diagrams:
- Circle: START and STOP
- Rectangle: any action execution can be given here
- Diamond: any yes/no question, with one outgoing branch labeled "yes" and one labeled "no"

An example: START, then the question "Need more horseshoes?". On "yes": Hammer a horseshoe, then Shoe a hoof, then back to the question. On "no": STOP. The question is a selection, the two actions form a sequence, and the jump back to the question makes the whole an iteration.

Most popular and useful description methods: pseudocode
- old
- definitely structured
- text based
- very easy to implement

Properties of a possible pseudocode:
- Assignment instruction: ←
- Looping constructs as in Pascal:
  - for-do instruction (counting loop): for variable ← initial value to / downto final value do body of the loop
  - while-do instruction (pre-test loop): while stay-in test do body of the loop
  - repeat-until instruction (post-test loop): repeat body of the loop until exit test
- Conditional constructs as in Pascal: if-then-else instruction (the else clause is optional): if test then test passed clause else test failed clause
- Blocks are denoted by indentation.
- Object identifiers are references.
- The separator between an object and its field is a dot: object.field, object.method, object.method(formal parameter list)
- The empty reference is NIL.
- Arrays are objects.
- Parameters are passed by value.

An example:

    ShoeAHorse(Hooves)
      hoof ← 1
      while hoof ≤ Hooves.Count
        do horseshoe ← HammerAHorseshoe
           Hooves[hoof] ← horseshoe
           hoof ← hoof + 1

The three instructions in the loop body form a sequence; the while loop is an iteration.

## Type algorithms

Algorithm classification on the I/O structure:
- Sequence → Value
- Sequence → Sequence
- More sequences → Sequence
- Sequence → More sequences

Sequence → Value:
- sequence calculations (e.g. summation, product of a series, linking elements together, etc.),
- decision (e.g. checking whether a sequence contains any element with a given property),
- selection (e.g. determining the first element in a sequence with a given property, provided we know that there exists at least one),
- search (e.g. finding a given element),
- counting (e.g. counting the elements having a given property),
- minimum or maximum search (e.g. finding the least or the largest element).

Sequence → Sequence:
- selection (e.g. collect the elements of a sequence with a given property),
- copying (e.g. copy the elements of a sequence to create a second sequence),
- sorting (e.g. arrange elements into increasing order).

More sequences → Sequence:
- union (e.g. set union of sequences),
- intersection (e.g. set intersection of sequences),
- difference (e.g. set difference of sequences),
- uniting sorted sequences (merging / combing two ordered sequences).

Sequence → More sequences:
- filtering (e.g. filtering out elements of a sequence having given properties).

## Special algorithms

Iterative algorithm. Consists of two parts:
- Initialization (usually initializing data)
- Iteration (repeated part)

Recursive algorithms. Basic types:
- direct (self-reference)
- indirect (mutual references)

Two alternative parts depending on the base criterion:
- Base case (if the problem is small enough)
- Recurrences (direct or indirect self-reference)

An example of recursive algorithms: Towers of Hanoi

Aim: move n disks from one rod to another, using a third one.

Rules:
- One disk is moved at a time.
- No disk may be placed on top of a smaller one.

Recursive solution of the problem:
- 1st step: move n - 1 disks (line 2)
- 2nd step: move 1 disk (line 3)
- 3rd step: move n - 1 disks (line 4)

Pseudocode of the recursive solution:

    TowersOfHanoi(n, FirstRod, SecondRod, ThirdRod)
    1  if n > 0
    2    then TowersOfHanoi(n - 1, FirstRod, ThirdRod, SecondRod)
    3         write "Move a disk from FirstRod to SecondRod"
    4         TowersOfHanoi(n - 1, ThirdRod, SecondRod, FirstRod)
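The recursive solution can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: instead of writing each move, the hypothetical `moves` list collects them as (from, to) pairs so that the result can be inspected.

```python
def towers_of_hanoi(n, first_rod, second_rod, third_rod, moves=None):
    """Move n disks from first_rod to second_rod, using third_rod as the spare."""
    if moves is None:
        moves = []  # assumption: moves are collected instead of printed
    if n > 0:
        # 1st step: move n - 1 disks out of the way, onto the spare rod
        towers_of_hanoi(n - 1, first_rod, third_rod, second_rod, moves)
        # 2nd step: move the largest disk to its destination
        moves.append((first_rod, second_rod))
        # 3rd step: move the n - 1 disks from the spare rod onto the largest disk
        towers_of_hanoi(n - 1, third_rod, second_rod, first_rod, moves)
    return moves


for source, target in towers_of_hanoi(2, "A", "B", "C"):
    print(f"Move a disk from {source} to {target}")
```

For n = 2 the sketch lists three moves: A to C, A to B, C to B.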

Backtracking algorithms. A backtracking algorithm:
- is a sequence of systematic trials,
- builds a tree of decision branches,
- steps back (backtracking) in the tree if no branch at a point is effective.

An example of backtracking algorithms, the Eight Queens Puzzle: eight chess queens are to be placed on a chessboard so that no two queens attack each other.

Pseudocode of the iterative solution:

    EightQueens
      column ← 1
      RowInColumn[column] ← 0
      repeat
        repeat inc(RowInColumn[column])
        until IsSafe(column, RowInColumn)
        if RowInColumn[column] > 8
          then column ← column - 1
          elseif column < 8
            then column ← column + 1
                 RowInColumn[column] ← 0
            else draw chessboard
      until column = 0
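A possible Python rendering of the iterative backtracking loop is sketched below. Two things are assumptions, since the slides do not define them: `is_safe` is written here so that it also reports "safe" once the row counter has run past the board (this is what lets the inner loop stop and the outer loop step back), and "draw chessboard" is replaced by appending the placement to a `solutions` list. The board size `n` is a parameter added for illustration.

```python
def is_safe(column, row_in_column, n=8):
    """True if the queen in `column` attacks no queen in the columns before it."""
    row = row_in_column[column]
    if row > n:
        return True  # rows exhausted: leave the inner loop so the caller steps back
    for c in range(1, column):
        r = row_in_column[c]
        if r == row or abs(r - row) == column - c:  # same row or same diagonal
            return False
    return True


def eight_queens(n=8):
    solutions = []
    row_in_column = [0] * (n + 1)  # index 0 is unused to keep the 1-based indexing
    column = 1
    while column > 0:  # plays the role of: repeat ... until column = 0
        row_in_column[column] += 1
        while not is_safe(column, row_in_column, n):
            row_in_column[column] += 1
        if row_in_column[column] > n:
            column -= 1                    # no row works here: backtrack
        elif column < n:
            column += 1                    # go on with the next column
            row_in_column[column] = 0
        else:
            solutions.append(tuple(row_in_column[1:]))  # stands in for "draw chessboard"
    return solutions


print(len(eight_queens()))
```

Each solution is a tuple giving the row of the queen in columns 1 to n.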

## Analysis of algorithms

Questions regarding an algorithm:
- Does it solve the problem?
- How fast does it solve the problem?
- How much storage space does it occupy to solve the problem?

The last two questions are the complexity issues of the algorithm.

Elementary storage or time: independent of the size of the input.

Example 1: If an algorithm needs 500 kilobytes to store some internal data, this can be considered elementary.

Example 2: If an algorithm contains a loop whose body is executed 1000 times, it counts as an elementary algorithmic step.

Hence a block of instructions counts as a single elementary step if none of the particular instructions depends on the size of the input. A looping construct counts as a single elementary step if the number of iterations it executes does not depend on the size of the input and its body is an elementary step.

For instance, shoeing a horse can be considered an elementary step: a horse always has four hooves, so it takes constant time (one step) to shoe a horse.

The time complexity of an algorithm is a function depending on the size of the input.

Notation: T(n), where n is the size of the input.

Function T can depend on more than one variable, e.g. T(n, m) if the input of the algorithm is an n × m matrix.

Example: find the minimum of an array.

    Minimum(A)
    1  min ← A[1]
    2  i ← 1
    3  repeat
    4    i ← i + 1
    5    if A[i] < min
    6      then min ← A[i]
    7  until i ≥ A.Length
    8  return min

Step counts: the initialization (lines 1-2) is one elementary step and the loop (lines 3-7) is executed n - 1 times, hence T(n) = 1 + (n - 1) = n, where n = A.Length.

Does this change if line 8 (return min) is considered as an extra step? In other words, is n ≈ n + 1? It does not change! Proof: n + 1 = (n - 1) + 2, and the constant 2 counts as a single elementary step, so we get (n - 1) + 1 = n again.

This so-called asymptotic behavior can be formulated rigorously in the following way. We say that $f(x) = O(g(x))$ (big O notation) if

$(\exists\, C, x_0 > 0)\ (\forall x \ge x_0)\colon\ 0 \le f(x) \le C \cdot g(x)$

This means that g is an asymptotic upper bound of f.

[Figure: the graph of f(x) stays below the graph of C·g(x) for all x ≥ x0.]

The O notation denotes an upper bound. If g is also a lower bound of f, then we say that $f(x) = \theta(g(x))$ if

$(\exists\, c, C, x_0 > 0)\ (\forall x \ge x_0)\colon\ 0 \le c \cdot g(x) \le f(x) \le C \cdot g(x)$

This means that f asymptotically equals g.

[Figure: the graph of f(x) runs between the graphs of c·g(x) and C·g(x) for all x ≥ x0.]

What does the asymptotic notation show us?

We have seen that T(n) = θ(n) for the procedure Minimum(A), where n = A.Length. However, due to the definition of the θ function, T(n) = θ(n), T(2n) = θ(n), T(3n) = θ(n), ... Does this mean that Minimum does not run slower on more data?

No. Asymptotic notation shows us the tendency:

| Amount of data | T(n) = θ(n), linear tendency | T(n) = θ(n²), quadratic tendency |
|---|---|---|
| n | a certain amount of time t | a certain amount of time t |
| 2n | 2t | 2²t = 4t |
| 3n | 3t | 3²t = 9t |

Recursive algorithm, recursive function T.

Example: Towers of Hanoi. In the pseudocode of TowersOfHanoi, each of the two recursive calls (lines 2 and 4) costs T(n - 1) and the write instruction (line 3) costs one step, hence

T(n) = T(n - 1) + T(n - 1) + 1 = 2T(n - 1) + 1

T(n) = 2T(n - 1) + 1 is a recursive function. In general it is very difficult (sometimes impossible) to determine the explicit form of an implicit (recursive) formula. If the algorithm is recursive, the solution can be found using recursion trees.

Recursion tree of TowersOfHanoi: the root handles a problem of size n, its two children problems of size n - 1, their children problems of size n - 2, and so on down to size 1. The levels contain 1, 2, 4, ..., 2^(n-1) vertices, and each vertex costs one step.

Time complexity: T(n) = 2^n - 1 = θ(2^n), exponential time (very slow).

Example: n = 64 (from the original legend), assuming one disk move per second:

T(n) = 2^n - 1 = 2^64 - 1 ≈ 1.8·10^19 seconds = 3·10^17 minutes = 5.1·10^15 hours = 2.1·10^14 days = 5.8·10^11 years > half a trillion years
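As a cross-check of the closed form, the recurrence can also be unrolled directly. The derivation below is a sketch that assumes the base value T(0) = 0 (no disks, no moves), which the slides do not state explicitly.

```latex
\begin{aligned}
T(n) &= 2\,T(n-1) + 1 \\
     &= 2\bigl(2\,T(n-2) + 1\bigr) + 1 = 4\,T(n-2) + 2 + 1 \\
     &= 8\,T(n-3) + 4 + 2 + 1 \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n-k) + \bigl(2^{k-1} + \dots + 2 + 1\bigr) = 2^{k}\,T(n-k) + 2^{k} - 1 \\
     &= 2^{n}\,T(0) + 2^{n} - 1 = 2^{n} - 1 \qquad (k = n,\ T(0) = 0)
\end{aligned}
```

The sum 1 + 2 + 4 + ... + 2^(n-1) in the fourth step is exactly the number of vertices on the levels of the recursion tree.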

Problem (example): search for a given element in a sequence (array).

    LinearSearch(A, w)
      i ← 0
      repeat i ← i + 1
      until A[i] = w or i = A.Length
      if A[i] = w then return i
                  else return NIL

Array: 8 1 3 9 5 6 2

Best case. Element wanted: 8. Time complexity: T(n) = 1 = θ(1)

Worst case. Element wanted: 2. Time complexity: T(n) = n = θ(n)

Average case. The mean value of the time complexities on all possible inputs:

T(n) = (1 + 2 + 3 + 4 + ... + n) / n = n(n + 1) / (2n) = (n + 1) / 2 = θ(n)

(The same as in the worst case.)
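The three cases can be checked with a small Python sketch. It mirrors LinearSearch but additionally returns the number of inspected elements; that second return value is an assumption added for illustration, and like the pseudocode the sketch expects a non-empty array.

```python
def linear_search(a, w):
    """Return (1-based position of w or None, number of elements inspected)."""
    i = 0
    while True:              # repeat ... until
        i += 1
        if a[i - 1] == w or i == len(a):
            break
    if a[i - 1] == w:
        return i, i
    return None, i


a = [8, 1, 3, 9, 5, 6, 2]
best = linear_search(a, 8)[1]        # element wanted is the first one
worst = linear_search(a, 2)[1]       # element wanted is the last one
# Mean over the inputs "w is the 1st, 2nd, ..., n-th element": (1 + 2 + ... + n) / n
average = sum(linear_search(a, w)[1] for w in a) / len(a)
print(best, worst, average)
```

With n = 7 the average equals (n + 1) / 2 = 4.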

## Arrays and linked lists

To store a set of data of the same type in a linear structure, two basic solutions exist:
- Arrays: physical sequence in the memory, e.g. 18 | 29 | 22
- Linked lists: the particular elements are linked together using links (pointers or indices), e.g. head → 18 → 29 → 22, where each element consists of a key and a link

Arrays vs. linked lists: time complexity of some operations on arrays and linked lists in the worst case

| | Search | Insert | Delete | Minimum | Maximum | Successor | Predecessor |
|---|---|---|---|---|---|---|---|
| Array | O(n) | O(n) | O(n) | O(n) | O(n) | O(n) | O(n) |
| Linked list | O(n) | O(1) | O(n) | O(n) | O(n) | O(n) | O(n) |

Variants of linked lists:
- Doubly linked lists: every element is linked to both of its neighbors (head ⇄ 18 ⇄ 29 ⇄ 22).
- Dummy head lists: the list starts with a dummy head element X that stores no key (dummy head → X → 18 → 29 → 22). To be continued...

Indirection (indirect reference): pointer.key

Double indirection: pointer.link.key

Array representation of linked lists. The dummy head list X → 18 → 29 → 22 stored in the arrays key and link (dummy head = 3, link 0 marks the end of the list):

| index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| key | | 22 | X | | 18 | | 29 | |
| link | | 0 | 5 | | 7 | | 2 | |

Problem: a lot of garbage.

Garbage collection for array-represented lists. The empty cells are linked to a separate garbage list using the link array (dummy head = 3, garbage = 6):

| index | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| key | | 22 | X | | 18 | | 29 | |
| link | 8 | 0 | 5 | 0 | 7 | 1 | 2 | 4 |

To allocate space for a new key and use it, the first element of the garbage list is linked out from the garbage and, if necessary, linked into the proper list with a new key (33 here): new = 6, garbage becomes link[6] = 1, and key[6] becomes 33.

Pseudocode for garbage management:

    Allocate(link)
      if link.garbage = 0
        then return 0
        else new ← link.garbage
             link.garbage ← link[link.garbage]
             return new

    Free(index, link)
      link[index] ← link.garbage
      link.garbage ← index
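The garbage list can be tried out with a short Python sketch. The class name `LinkArray` and the unused cell 0 (kept so that indices match the 1-based tables) are assumptions made for this illustration; the initial values are the link array and garbage pointer of the example above.

```python
class LinkArray:
    """Link array of an array-represented list together with its garbage pointer."""

    def __init__(self, link, garbage):
        self.link = link        # link[0] is unused so that indices are 1-based
        self.garbage = garbage  # index of the first free cell, 0 if there is none


def allocate(store):
    """Link the first cell out of the garbage list and return its index (0 if full)."""
    if store.garbage == 0:
        return 0
    new = store.garbage
    store.garbage = store.link[store.garbage]
    return new


def free(index, store):
    """Link the cell `index` back to the front of the garbage list."""
    store.link[index] = store.garbage
    store.garbage = index


store = LinkArray([0, 8, 0, 5, 0, 7, 1, 2, 4], garbage=6)
new = allocate(store)
print(new, store.garbage)
```

After the call, cell 6 is available for the new key and the garbage list starts at cell 1.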

Dummy head linked lists (...continued). FindAndDelete for dummy head linked lists:

    FindAndDeleteDummy(toFind, key, link)
      pointer ← link.dummyhead
      while link[pointer] ≠ 0 and key[link[pointer]] ≠ toFind
        do pointer ← link[pointer]
      if link[pointer] ≠ 0
        then toDelete ← link[pointer]
             link[pointer] ← link[toDelete]
             Free(toDelete, link)

## Stacks and queues

Common properties:
- only two operations are defined:
  - store a new key (called push and enqueue, resp.)
  - extract a key (called pop and dequeue, resp.)
- both operations work in constant time

Different properties:
- stacks are LIFO structures
- queues are FIFO (or pipeline) structures

Two erroneous cases:
- extraction is attempted from an empty data structure: underflow
- there is no more space but insertion is attempted: overflow

Stack management using arrays.

[Figure: a stack that can hold three keys. push(8), push(1) and push(3) fill it, top pointing to the last stored key; push(9) causes a stack overflow. The subsequent pops return 3, 1 and 8; one more pop causes a stack underflow.]

    Push(key, Stack)
      if Stack.top = Stack.Length
        then return Overflow error
        else Stack.top ← Stack.top + 1
             Stack[Stack.top] ← key

    Pop(Stack)
      if Stack.top = 0
        then return Underflow error
        else Stack.top ← Stack.top - 1
             return Stack[Stack.top + 1]
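Push and Pop translate almost line by line into Python. In the sketch below the class name `ArrayStack` and the use of exceptions for the two error cases are assumptions made for illustration; cell 0 of the array is left unused to keep the 1-based indexing of the pseudocode.

```python
class ArrayStack:
    def __init__(self, length):
        self.cells = [None] * (length + 1)  # cell 0 is unused (1-based indexing)
        self.length = length
        self.top = 0                        # 0 means the stack is empty

    def push(self, key):
        if self.top == self.length:
            raise OverflowError("stack overflow")
        self.top += 1
        self.cells[self.top] = key

    def pop(self):
        if self.top == 0:
            raise IndexError("stack underflow")
        self.top -= 1
        return self.cells[self.top + 1]


stack = ArrayStack(3)
for key in (8, 1, 3):
    stack.push(key)
popped = [stack.pop(), stack.pop(), stack.pop()]
print(popped)
```

The keys come back in reverse order of insertion (LIFO).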

Queue management using arrays.

[Figure: a queue stored in an array, with the indices beginning and end marking the two ends of the stored keys.]

Empty queue: beginning = n, end = 0

    Enqueue(key, Queue)
      if Queue.beginning = Queue.end
        then return Overflow error
        elseif Queue.end = Queue.Length
          then Queue.end ← 1
          else Queue.end ← Queue.end + 1
      Queue[Queue.end] ← key

    Dequeue(Queue)
      if Queue.end = 0
        then return Underflow error
        elseif Queue.beginning = Queue.Length
          then Queue.beginning ← 1
          else inc(Queue.beginning)
      key ← Queue[Queue.beginning]
      if Queue.beginning = Queue.end
        then Queue.beginning ← Queue.Length
             Queue.end ← 0
      return key
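The circular use of the array is easier to follow in running code. The Python sketch below follows Enqueue and Dequeue step by step; the class name `ArrayQueue` and the exceptions used for the error cases are assumptions of this illustration, and cell 0 is unused to keep the 1-based indexing.

```python
class ArrayQueue:
    def __init__(self, length):
        self.cells = [None] * (length + 1)  # cell 0 is unused (1-based indexing)
        self.length = length
        self.beginning = length             # empty queue: beginning = n, end = 0
        self.end = 0

    def enqueue(self, key):
        if self.beginning == self.end:      # only happens when the array is full
            raise OverflowError("queue overflow")
        if self.end == self.length:
            self.end = 1                    # wrap around to the first cell
        else:
            self.end += 1
        self.cells[self.end] = key

    def dequeue(self):
        if self.end == 0:
            raise IndexError("queue underflow")
        if self.beginning == self.length:
            self.beginning = 1              # wrap around to the first cell
        else:
            self.beginning += 1
        key = self.cells[self.beginning]
        if self.beginning == self.end:      # last key extracted: reset to empty
            self.beginning = self.length
            self.end = 0
        return key


queue = ArrayQueue(3)
for key in (1, 2, 3):
    queue.enqueue(key)
first = queue.dequeue()
queue.enqueue(4)                            # reuses cell 1
rest = [queue.dequeue(), queue.dequeue(), queue.dequeue()]
print(first, rest)
```

The keys leave the queue in the order they entered it (FIFO), even though key 4 is physically stored before keys 2 and 3.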

## Binary search trees

Linear data structures cannot provide better time complexity than n in some cases.

Idea: let us use another kind of structure.

Solution:
- rooted trees (especially binary trees)
- special order of keys (search trees)

A binary tree. Notions:
- vertex (node)
- edge
- root
- twins (siblings)
- parent - child
- leaf
- levels
- depth (height)

A binary search tree: for all vertices, all keys in the left subtree are smaller and all keys in the right subtree are greater.

[Figure: a binary search tree with root 28. The left child of 28 is 12, whose children are 7 and 21; the children of 21 are 14 and 26. The right child of 28 is 30, whose right child is 49, whose right child is 50.]

Implementation of binary search trees. Each vertex stores:
- key and other data
- link to the left child
- link to the right child
- link to the parent

Binary search tree operations: tree walk

inorder: 1. left, 2. root, 3. right

On the example tree the inorder walk visits 7 12 14 21 26 28 30 49 50, i.e. the keys in increasing order.

    InorderWalk(Tree)
    1  if Tree ≠ NIL
    2    then InorderWalk(Tree.Left)
    3         visit Tree, e.g. check it or list it
    4         InorderWalk(Tree.Right)

The so-called preorder and postorder tree walks only differ in the order of lines 2-4:
- preorder: root, left, right
- postorder: left, right, root

Binary search tree operations: tree search

On the example tree, TreeSearch(14) follows the path 28, 12, 21 and finds 14; TreeSearch(45) follows the path 28, 30, 49 and ends in an empty left link, so 45 is not in the tree.

    TreeSearch(toFind, Tree)
      while Tree ≠ NIL and Tree.key ≠ toFind
        do if toFind < Tree.key
             then Tree ← Tree.Left
             else Tree ← Tree.Right
      return Tree

Binary search tree operations: insert

The place of the new key is found as in a tree search (TreeInsert(14) in the figure follows the same path of comparisons). New vertices are always inserted as leaves.

Binary search tree operations: tree minimum, tree maximum

    TreeMinimum(Tree)
      while Tree.Left ≠ NIL
        do Tree ← Tree.Left
      return Tree

    TreeMaximum(Tree)
      while Tree.Right ≠ NIL
        do Tree ← Tree.Right
      return Tree

Binary search tree operations: successor of an element

- TreeSuccessor(12): the element has a right child, so the successor is the tree minimum of the right subtree (14).
- TreeSuccessor(26): if the element has no right child, we climb up until we arrive at a parent from its left child (a parent - left child relation); that parent is the successor (28).

    TreeSuccessor(Element)
      if Element.Right ≠ NIL
        then return TreeMinimum(Element.Right)
        else Above ← Element.Parent
             while Above ≠ NIL and Element = Above.Right
               do Element ← Above
                  Above ← Above.Parent
             return Above

Finding the predecessor is similar.
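The search, minimum and successor operations can be tried out on the example tree with the Python sketch below. The `Node` class and `tree_insert` are assumptions of this illustration: the slides only state that new vertices are inserted as leaves, and the insertion order used here is simply one that reproduces the example tree.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def tree_insert(root, key):
    """Insert key as a new leaf and return the (possibly new) root. Hypothetical helper."""
    new, parent, current = Node(key), None, root
    while current is not None:
        parent = current
        current = current.left if key < current.key else current.right
    new.parent = parent
    if parent is None:
        return new
    if key < parent.key:
        parent.left = new
    else:
        parent.right = new
    return root


def tree_search(to_find, tree):
    while tree is not None and tree.key != to_find:
        tree = tree.left if to_find < tree.key else tree.right
    return tree


def tree_minimum(tree):
    while tree.left is not None:
        tree = tree.left
    return tree


def tree_successor(element):
    if element.right is not None:
        return tree_minimum(element.right)
    above = element.parent
    while above is not None and element is above.right:
        element = above
        above = above.parent
    return above


root = None
for k in [28, 12, 30, 21, 14, 26, 49, 50, 7]:
    root = tree_insert(root, k)

# Walking from the minimum along successors lists the keys in increasing order.
keys, node = [], tree_minimum(root)
while node is not None:
    keys.append(node.key)
    node = tree_successor(node)
print(keys)
```

The successor of the maximum is `None`, which ends the walk.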

Binary search tree operations: delete

1. If the element has no children (TreeDelete(26) in the example tree): the element is simply linked out of its parent.
2. If the element has only one child (TreeDelete(30)): the child takes the place of the element.
3. If the element has two children (TreeDelete(12)): 12 is replaced by a close key, e.g. its successor, 14. The successor is the tree minimum of the right subtree, so it has at most one child.

The case if Element has no children:

    TreeDelete(Element, Tree)
      if Element.Left = NIL and Element.Right = NIL
        then if Element.Parent = NIL
               then Tree ← NIL
               elseif Element = (Element.Parent).Left
                 then (Element.Parent).Left ← NIL
                 else (Element.Parent).Right ← NIL
             Free(Element)
             return Tree

The case if Element has only a right child:

      if Element.Left = NIL and Element.Right ≠ NIL
        then if Element.Parent = NIL
               then Tree ← Element.Right
                    (Element.Right).Parent ← NIL
               else (Element.Right).Parent ← Element.Parent
                    if Element = (Element.Parent).Left
                      then (Element.Parent).Left ← Element.Right
                      else (Element.Parent).Right ← Element.Right
             Free(Element)
             return Tree

The case if Element has only a left child (very similar to the previous case):

      if Element.Left ≠ NIL and Element.Right = NIL
        then if Element.Parent = NIL
               then Tree ← Element.Left
                    (Element.Left).Parent ← NIL
               else (Element.Left).Parent ← Element.Parent
                    if Element = (Element.Parent).Left
                      then (Element.Parent).Left ← Element.Left
                      else (Element.Parent).Right ← Element.Left
             Free(Element)
             return Tree

The case if Element has two children. First Substitute is linked out from its place, then it is linked into Element's place:

      if Element.Left ≠ NIL and Element.Right ≠ NIL
        then Substitute ← TreeSuccessor(Element)
             if Substitute.Right ≠ NIL
               then (Substitute.Right).Parent ← Substitute.Parent
             if Substitute = (Substitute.Parent).Left
               then (Substitute.Parent).Left ← Substitute.Right
               else (Substitute.Parent).Right ← Substitute.Right
             Substitute.Parent ← Element.Parent
             if Element.Parent = NIL
               then Tree ← Substitute
               elseif Element = (Element.Parent).Left
                 then (Element.Parent).Left ← Substitute
                 else (Element.Parent).Right ← Substitute
             Substitute.Left ← Element.Left
             (Substitute.Left).Parent ← Substitute
             Substitute.Right ← Element.Right
             if Substitute.Right ≠ NIL
               then (Substitute.Right).Parent ← Substitute
             Free(Element)
             return Tree

The test before the last parent update is needed when Substitute was the right child of Element and had no right child itself: after Substitute is linked out, Element.Right is NIL.

Time complexity of binary search tree operations:
- T(n) = O(d) for all operations (except for the walk), where d denotes the depth of the tree.
- The expected depth of a randomly built binary search tree is d = O(log n).
- Hence the time complexity of the search tree operations in the average case is T(n) = O(log n).

## Binary search

If insert and delete are used rarely, then it is more convenient and faster to use an ordered array instead of a binary search tree.

Faster: the following operations have T(n) = O(1) constant time complexity: minimum, maximum, successor, predecessor.

Search has the same T(n) = O(log n) time complexity as on binary search trees.

Let us search for key 29 in the ordered array below:

2 3 7 12 29 31 45

1. Search in the whole array: the central element is 12, and 12 < 29, so the search continues in the part after 12.
2. Search in 29 31 45: the central element is 31, and 29 < 31, so the search continues in the part before 31.
3. Search in 29: the central element equals 29, found!

The O(log n) result can also be derived as follows: if we halve n elements k times and get 1 element, then n / 2^k = 1, hence k = log₂ n = O(log n).
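The halving steps can be traced with a small Python sketch. The function below uses 0-based indices and also returns the number of inspected central elements; both choices, as well as the names, are assumptions made for this illustration.

```python
def binary_search(a, w):
    """Search w in the ordered list a; return (0-based index or None, steps taken)."""
    low, high = 0, len(a) - 1
    steps = 0
    while low <= high:
        steps += 1
        middle = (low + high) // 2     # central element of the current part
        if a[middle] == w:
            return middle, steps
        if a[middle] < w:
            low = middle + 1           # continue in the part after the central element
        else:
            high = middle - 1          # continue in the part before the central element
    return None, steps


ordered = [2, 3, 7, 12, 29, 31, 45]
print(binary_search(ordered, 29))
```

For key 29 the central elements inspected are 12, 31 and 29, so the key is found in three steps.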

## Sorting

Problem: there is a set of data from a base set with a given order over it (e.g. numbers, texts). Arrange them according to the order of the base set.

[Figure: example of a short sequence of numbers before and after sorting.]

Sorting sequences. We sort sequences in lexicographical order: of two sequences, the smaller one is the one that has the smaller value at the first position where they differ.

Example (texts): "good" and "gone" first differ at the third position, and n < o in the alphabet, hence gone < good.

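## Insertion sort

Principle: [Figure: the keys 75, 69, 22, 14, 8 are taken one by one and each is inserted into its place among the keys already sorted.]

Implementation of insertion sort with arrays: [Figure: the array is split into a sorted part at the front and an unsorted part behind it; an insertion step moves the first key of the unsorted part into its place in the sorted part.]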

    InsertionSort(A)
      for i ← 2 to A.Length
        do ins ← A[i]
           j ← i - 1
           while j > 0 and ins < A[j]
             do A[j + 1] ← A[j]
                j ← j - 1
           A[j + 1] ← ins
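A direct Python transcription of InsertionSort might look as follows. It is a sketch with 0-based indices, so the loop bounds are shifted by one compared to the pseudocode; returning the list is an addition for convenient use.

```python
def insertion_sort(a):
    """Sort the list a in place into increasing order and return it."""
    for i in range(1, len(a)):
        ins = a[i]                      # first key of the unsorted part
        j = i - 1
        while j >= 0 and ins < a[j]:    # shift the greater keys one cell to the right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = ins                  # insert the key into the gap
    return a


print(insertion_sort([75, 69, 22, 14, 8]))
```

The decreasing input used here is the worst case: every new key travels to the beginning of the sorted part.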

Time complexity of insertion sort

Best case: in each step the new element is inserted at the end of the sorted part:

T(n) = 1 + 1 + 1 + ... + 1 = n - 1 = θ(n)

Worst case: in each step the new element is inserted at the beginning of the sorted part:

T(n) = 2 + 3 + 4 + ... + n = n(n + 1)/2 - 1 = θ(n²)

Average case: in each step the new element is inserted somewhere in the middle of the sorted part:

T(n) = 2/2 + 3/2 + 4/2 + ... + n/2 = (n(n + 1)/2 - 1) / 2 = θ(n²)

The same as in the worst case.

Another implementation of insertion sort:
- The input provides elements continually (e.g. from a file or the network).
- The sorted part is a linked list into which the elements are inserted one by one.
- The time complexity is the same in every case.

The linked list implementation delivers an on-line algorithm:
- after each step the subproblem is completely solved;
- the algorithm does not need the whole input to partially solve the problem.

Cf. off-line algorithm: the whole input has to be known prior to the substantive procedure.

## Merge sort

Principle:
1. Halve the array: 69 14 8 75 | 2 22 25 36
2. Sort the parts recursively: 8 14 69 75 | 2 22 25 36
3. Merge (comb) the parts: 2 8 14 22 25 36 69 75, ready.

Time complexity of merge sort

Merge sort is a recursive algorithm, and so is its time complexity function T(n). What it does:
- First it halves the actual (sub)array: O(1)
- Then it calls itself for the two halves: 2T(n/2)
- Last it merges the two ordered parts: O(n)

Hence T(n) = 2T(n/2) + O(n) = ?

Recursion tree of merge sort: the root costs n, the second level 2·(n/2) = n, the third level 4·(n/4) = n, and so on down to the subarrays of size 1. There are log n levels of cost n each, which gives n log n in total.

The time complexity of merge sort is T(n) = θ(n log n).

This worst case time complexity is optimal among comparison sorts (those using only pair comparisons), so merge sort is fast. Unfortunately, it does not sort in-place, i.e. it uses auxiliary storage of a size comparable with the input.
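The three steps (halve, sort recursively, merge) are sketched below in Python. The function names are chosen for this illustration; the sketch returns new lists rather than sorting in place, which also makes the auxiliary storage mentioned above visible.

```python
def merge(left, right):
    """Merge (comb) two ordered lists into one ordered list in O(n) time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of the two remainders is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(a):
    if len(a) <= 1:           # base case: nothing to sort
        return list(a)
    middle = len(a) // 2      # halve the (sub)array
    return merge(merge_sort(a[:middle]), merge_sort(a[middle:]))


print(merge_sort([69, 14, 8, 75, 2, 22, 25, 36]))
```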

## Heapsort

An array A is called a heap if for all its elements

A[i] ≥ A[2i] and A[i] ≥ A[2i + 1]

This property is called the heap property. Example: 45 27 34 20 23 31 18 19 3 14

It is easier to understand if a binary tree is built from the elements, filling the levels row by row:

| Level | Indices | Keys |
|---|---|---|
| 1 | 1 | 45 |
| 2 | 2, 3 | 27, 34 |
| 3 | 4, 5, 6, 7 | 20, 23, 31, 18 |
| 4 | 8, 9, 10 | 19, 3, 14 |

The heap property turns into a simple parent-child relation in the tree representation.

An important application of heaps is realizing priority queues: a data structure supporting the operations
- insert
- maximum (or minimum)
- extract maximum (or extract minimum)

First we have to build a heap from an array. Let us suppose that only the k-th element infringes the heap property. In this case it is sunk level by level to a place where it fits. In the example k = 1 (the root):

- k = 1, array 15 37 34 20 23 31 18 19 3 14: the key and its children are compared (15, 37, 34); it is exchanged for the greater child (37).
- k = 2, array 37 15 34 20 23 31 18 19 3 14: the key and its children are compared (15, 20, 23); it is exchanged for the greater child (23).
- k = 5, array 37 23 34 20 15 31 18 19 3 14: the key and its child are compared (15, 14); it is the greatest, ready.

    Sink(k, A)
      if 2*k ≤ A.HeapSize and A[2*k] > A[k]
        then greatest ← 2*k
        else greatest ← k
      if 2*k + 1 ≤ A.HeapSize and A[2*k + 1] > A[greatest]
        then greatest ← 2*k + 1
      if greatest ≠ k
        then Exchange(A[greatest], A[k])
             Sink(greatest, A)

To build a heap from an arbitrary array, all elements are mended by sinking them:

    BuildHeap(A)
      A.HeapSize ← A.Length
      for k ← ⌊A.Length / 2⌋ downto 1
        do Sink(k, A)

⌊A.Length / 2⌋ is the index of the array's last element that has any children. We are stepping backwards; this way every visited element has only descendants which already fulfill the heap property.

Time complexity of building a heap:
- To sink an element costs O(log n) in the worst case.
- Since n/2 elements have to be sunk, an upper bound for the BuildHeap procedure is T(n) = O(n log n).
- It can be proven that the sharp bound is T(n) = θ(n).

Time complexity of the priority queue operations if the queue is realized using heaps:

insert
- append the new element to the array: O(1)
- let it float up by repeatedly exchanging it for its parent while it is greater than the parent: O(log n)
- The time complexity is T(n) = O(log n)

maximum
- read out the key of the root: O(1)
- The time complexity is T(n) = O(1)

extract maximum
- exchange the root for the array's last element: O(1)
- extract the last element: O(1)
- sink the root: O(log n)
- The time complexity is T(n) = O(log n)

The heapsort algorithm:
- build a heap: θ(n)
- iterate the following n - 1 times, (n - 1)·O(log n) = O(n log n) in total:
  - exchange the root for the heap's last element: O(1)
  - exclude the heap's last element from the heap: O(1)
  - sink the root: O(log n)

The time complexity is T(n) = O(n log n).

    HeapSort(A)
      BuildHeap(A)
      for k ← A.Length downto 2
        do Exchange(A[1], A[A.HeapSize])
           A.HeapSize ← A.HeapSize - 1
           Sink(1, A)
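Sink, BuildHeap and HeapSort fit together as in the Python sketch below. To keep the index arithmetic of the pseudocode (children of k at 2k and 2k + 1), cell 0 of the list is left unused; passing the heap size as a parameter instead of storing it in the array object is an assumption of this illustration.

```python
def sink(k, a, heap_size):
    """Sink a[k] level by level until neither of its children is greater (1-based)."""
    greatest = k
    if 2 * k <= heap_size and a[2 * k] > a[k]:
        greatest = 2 * k
    if 2 * k + 1 <= heap_size and a[2 * k + 1] > a[greatest]:
        greatest = 2 * k + 1
    if greatest != k:
        a[greatest], a[k] = a[k], a[greatest]
        sink(greatest, a, heap_size)


def build_heap(a, heap_size):
    # heap_size // 2 is the last element that has any children; step backwards from it
    for k in range(heap_size // 2, 0, -1):
        sink(k, a, heap_size)


def heap_sort(values):
    a = [None] + list(values)       # cell 0 is unused (1-based indexing)
    heap_size = len(values)
    build_heap(a, heap_size)
    for _ in range(len(values), 1, -1):
        a[1], a[heap_size] = a[heap_size], a[1]   # move the maximum to the end
        heap_size -= 1                            # exclude it from the heap
        sink(1, a, heap_size)
    return a[1:]


example = [None, 15, 37, 34, 20, 23, 31, 18, 19, 3, 14]
sink(1, example, 10)
print(example[1:])
print(heap_sort([45, 27, 34, 20, 23, 31, 18, 19, 3, 14]))
```

The first printed list reproduces the sinking example with k = 1; the second is the example heap sorted into increasing order.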

## Quicksort

Principle:
1. Rearrange and part the elements so that every key in the first part is smaller than any in the second part: 69 22 8 75 12 14 25 36 becomes 12 14 8 | 75 69 22 25 36.
2. Sort each part recursively; this will result in the whole array being sorted: 8 12 14 22 25 36 69 75.

The partition algorithm

Choose any of the keys stored in the array; this will be the so-called pivot key. Exchange the large elements at the beginning of the array for the small ones at the end of it.

[Figure: the array 69 22 8 75 12 14 25 36 with pivot key 22; scanning from the left stops at a key not less than the pivot key, scanning from the right stops at a key not greater than the pivot key.]

    Partition(A, first, last)
      left ← first − 1
      right ← last + 1
      pivotKey ← A[RandomInteger(first, last − 1)]
      repeat
        repeat left ← left + 1
        until A[left] ≥ pivotKey
        repeat right ← right − 1
        until A[right] ≤ pivotKey
        if left < right
          then Exchange(A[left], A[right])
          else return right
      until false

The pivot is chosen from the positions first, ..., last − 1; this guarantees that the returned border is smaller than last, so both parts are non-empty.

The time complexity of the partition algorithm is $T(n) = \Theta(n)$ because each element is visited exactly once. The sorting is then:

    QuickSort(A, first, last)
      if first < last
        then border ← Partition(A, first, last)
             QuickSort(A, first, border)
             QuickSort(A, border + 1, last)
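A minimal Python sketch of the partition and sorting procedures is given below. It assumes 0-based, inclusive index bounds, and it draws the pivot from the positions `first` to `last - 1` only; that restriction is an assumption of this sketch, made so that the returned border is always smaller than `last` and the recursion is certain to terminate. The function names are illustrative.

```python
import random


def partition(a, first, last):
    """Partition a[first..last] (inclusive) and return the border index."""
    # Assumption of this sketch: the pivot never comes from position `last`.
    pivot_key = a[random.randint(first, last - 1)]
    left, right = first - 1, last + 1
    while True:
        left += 1
        while a[left] < pivot_key:      # stop at a key not less than the pivot
            left += 1
        right -= 1
        while a[right] > pivot_key:     # stop at a key not greater than the pivot
            right -= 1
        if left >= right:
            return right
        a[left], a[right] = a[right], a[left]


def quick_sort(a, first=0, last=None):
    """Sort a[first..last] in place and return the list."""
    if last is None:
        last = len(a) - 1
    if first < last:
        border = partition(a, first, last)
        quick_sort(a, first, border)
        quick_sort(a, border + 1, last)
    return a
```

Because the pivot is random, the intermediate partitions differ from run to run, but the final result is always the sorted array.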

Quicksort is a divide and conquer algorithm like merge sort; however, its partition may be unbalanced (merge sort always halves the subarray). The time complexity of a divide and conquer algorithm highly depends on the balance of the partition. In the best case the quicksort algorithm halves the subarrays at every step, and then $T(n) = \Theta(n \log n)$.

Recursion tree of the worst case

[Figure: every partition cuts off a single element, so the subarrays have sizes $n$, $n - 1$, $n - 2$, ..., 1; the work summed over all levels is about $n(n + 1)/2$.]

Thus, the worst-case time complexity of quicksort is $T(n) = \Theta(n^2)$. The average-case time complexity is $T(n) = \Theta(n \log n)$, the same as in the best case! The proof is difficult, but let's see a special case to understand quicksort better.

Let $\lambda$ be a positive number smaller than 1: $0 < \lambda < 1$. Assumption: the partition algorithm never provides a worse partition ratio than $(1 - \lambda) : \lambda$.

Example 1: Let $\lambda := 0.99$. The assumption demands that the partition algorithm does not leave less than 1% as the smaller part.

Example 2: Let $\lambda := 0.999\,999\,999$. If we have at most one billion(!) elements, then the assumption is fulfilled no matter how the partition algorithm behaves (even if it always cuts off only one element from the others).

In the following it is assumed for the sake of simplicity that $\lambda \ge 0.5$, i.e. the $\lambda$ part is always the bigger one.

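Recursion tree of the $\lambda$ ratio case

[Figure: the root of size $n$ splits into parts of size $(1 - \lambda)n$ and $\lambda n$; the biggest part at depth $d$ has size $\lambda^d n$; every level costs at most $n$, and the total over all levels is $n \log n$.]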

In the special case when none of the parts arising at the partitions is bigger than a given $\lambda$ ratio ($0.5 \le \lambda < 1$), the time complexity of quicksort is $T(n) = O(n \log n)$.

The time complexity of quicksort is practically optimal because the number of elements to be sorted is always bounded by a number $N$ (finite storage). Using the value $\lambda = 1 - 1/N$ it can be proven that quicksort finishes in $O(n \log n)$ time in every possible case.

Greedy algorithms: problem

Optimization problem: Let a function $f(x)$ be given. Find an $x$ where $f$ is optimal (minimal or maximal) under given circumstances.

Given circumstances: An optimization problem is constrained if functional constraints have to be fulfilled, such as $g(x) \le 0$.

Feasible set: the set of those $x$ values where the given constraints are fulfilled.

Constrained optimization problem: minimize $f(x)$ subject to $g(x) \le 0$.

Example

Problem: There is a city $A$ and other cities $B_1, B_2, \ldots, B_n$ which can be reached from $A$ by bus directly. Find the farthest of these cities where you can travel so that your money suffices.

[Figure: city A connected directly to each of the cities B1, B2, ..., Bn.]

Model: Let $x$ denote any of the cities, $x \in \{B_1, B_2, \ldots, B_n\}$; $f(x)$ the distance between $A$ and $x$; $t(x)$ the price of the bus ticket from $A$ to $x$; $m$ the money you have; and $g(x) = t(x) - m$ the constraint function. Since the farthest city is sought, the constrained optimization problem to solve is: minimize $-f(x)$ subject to $g(x) \le 0$.
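The model can be tried out with a few lines of Python. This is a sketch with made-up data: the city names, distances and ticket prices below are hypothetical, and the function name is chosen for illustration. It filters the feasible set $g(x) \le 0$ and then picks the feasible city with the largest distance.

```python
def farthest_affordable_city(cities, money):
    """cities maps a name to (distance, ticket_price); returns a name or None."""
    # Feasible set: cities whose constraint g(x) = t(x) - m is not positive.
    feasible = {name: distance
                for name, (distance, price) in cities.items()
                if price - money <= 0}
    if not feasible:
        return None
    # Maximizing f(x) is the same as minimizing -f(x).
    return max(feasible, key=feasible.get)


# Hypothetical data: name -> (distance in km, ticket price)
cities = {"B1": (30, 400), "B2": (120, 1500), "B3": (80, 900)}
print(farthest_affordable_city(cities, 1000))  # B3
```

With 1000 units of money, B2 is too expensive, so the farthest affordable city is B3.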

In general, optimization problems are much more difficult! However, there is a class of optimization problems which can be solved using a simple, straightforward step-by-step principle, the greedy algorithms: at each step the same kind of decision is made, striving for a local optimum, and decisions of the past are never revisited.

Question: Which problems can be solved using greedy algorithms?

Answer: Problems which obey the following two rules.

Greedy choice property: If a greedy choice is made first, it can always be completed to achieve an optimal solution to the problem.

Optimal substructure property: Any substructure of an optimal solution provides an optimal solution to the corresponding subproblem.

Counterexample

Find the shortest route from Szeged to Budapest. The greedy choice property is infringed: you cannot simply choose the closest town first. Deszk is the closest to Szeged but is situated in the opposite direction.

Proper example

Activity-selection problem: Let's spend a day watching TV.

Aim: Watch as many programs (each in full) as you can.

Greedy strategy: Watch the program ending first, then the one ending first among those you can still watch in full, etc.

The steps of the method: Let's sort the programs by their ending times. Include the first one. Exclude those which have already begun. Repeat with the remaining programs until no more programs are left: ready. In the example of the figure the optimum is 4 (TV programs).
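The greedy strategy can be sketched in Python as below. The schedule is hypothetical (the programs of the original figure are not recoverable from the transcript), and the function name is chosen for illustration; each program is a `(start, end)` pair.

```python
def select_programs(programs):
    """Return a largest list of pairwise non-overlapping (start, end) programs."""
    chosen = []
    current_end = float("-inf")
    # Sort by ending time; always take the program that ends first
    # among those that have not begun before the last chosen one ended.
    for start, end in sorted(programs, key=lambda p: p[1]):
        if start >= current_end:
            chosen.append((start, end))
            current_end = end
    return chosen


# Hypothetical TV schedule (hours of the day)
schedule = [(8, 10), (9, 11), (10, 12), (11, 14), (12, 13), (13, 15)]
print(select_programs(schedule))  # [(8, 10), (10, 12), (12, 13), (13, 15)]
```

Sorting dominates the running time, so the sketch runs in $O(n \log n)$ for $n$ programs.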

Check the greedy choice property: The first choice of any optimal solution can be exchanged for the greedy one.

Check the optimal substructure property: A part of an optimal solution is also optimal for the subproblem. If it was not optimal for the subproblem, the whole solution could be improved by improving the subproblem's solution.

Huffman codes: notions

$C$ is an alphabet if it is a set of symbols. $F$ is a file over $C$ if it is a text built up of the characters of $C$.

Assume we have the following alphabet: $C = \{a, b, c, d, e\}$. Code it with binary codewords of equal length. How many bits per codeword do we need at least? Two are not enough (only four codewords: 00, 01, 10, 11). Build codewords using 3-bit coding:

    a = 000   b = 001   c = 010   d = 011   e = 100

Build the binary tree $T$ of the coding.

[Figure: the binary tree of the 3-bit code a = 000, b = 001, c = 010, d = 011, e = 100; every edge is labeled 0 or 1, and each leaf a, b, c, d, e is reached at depth 3 by following the bits of its codeword.]

Further notation

For each character $c \in C$, its frequency in the file is denoted by $f(c)$. For each character $c \in C$, its code length is defined by its depth in the tree $T$ of the coding, $d_T(c)$. Hence the length of the file (in bits) equals

$B(T) = \sum_{c \in C} f(c)\, d_T(c)$

Problem

Let an alphabet $C$ and a file over it be given. Find a coding $T$ of the alphabet with minimal $B(T)$.

Example

Consider a file $F$ of 20,000 characters over the alphabet $C = \{a, b, c, d, e\}$. Assume the frequencies of the particular characters in the file are

    f(a) = 5,000   f(b) = 2,000   f(c) = 6,000   f(d) = 3,000   f(e) = 4,000

Using the 3-bit coding defined previously, the bit length of the file equals

$B(T) = \sum_{c \in C} f(c)\, d_T(c) = 5{,}000 \cdot 3 + 2{,}000 \cdot 3 + 6{,}000 \cdot 3 + 3{,}000 \cdot 3 + 4{,}000 \cdot 3 = (5{,}000 + 2{,}000 + 6{,}000 + 3{,}000 + 4{,}000) \cdot 3 = 20{,}000 \cdot 3 = 60{,}000$

This is a so-called fixed-length code since $d_T(x) = d_T(y)$ holds for all $x, y \in C$.

The fixed-length code is not always optimal.

[Figure: the same tree with the leaf e moved one level up, so that its codeword becomes one bit shorter; call this tree $T'$.]

$B(T') = B(T) - f(e) \cdot 1 = 60{,}000 - 4{,}000 \cdot 1 = 56{,}000$

Idea

Construct a variable-length code, i.e., one where the code lengths for different characters can differ from each other. We expect that if more frequent characters get shorter codewords, then the resulting file will become shorter.

Problem: How do we recognize where a codeword ends and a new one begins? Using delimiters is too expensive.

Solution: Use prefix codes, i.e., codewords none of which is also a prefix of some other codeword.

Result: The codewords can be decoded without using delimiters.

For instance, if a = 10, b = 010, c = 00, then the meaning of the following code is 1000010000010010 = acbccab.

However, what if a variable-length code was not prefix-free?

Then, if a = 10, b = 100, c = 0, is 100 = b or 100 = ac? An extra delimiter would be needed.
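Both observations can be checked with a short Python sketch. The function names are illustrative, and a code is assumed to be given as a dictionary from characters to bit strings. The decoder reads bits until the collected bits form a codeword, which is unambiguous only for a prefix code.

```python
def is_prefix_free(code):
    """True if no codeword is a prefix of another codeword."""
    words = list(code.values())
    return not any(i != j and words[j].startswith(words[i])
                   for i in range(len(words))
                   for j in range(len(words)))


def decode(bits, code):
    """Decode a bit string written with a prefix code; no delimiters are needed."""
    inverse = {word: ch for ch, word in code.items()}
    result, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:          # a complete codeword has been read
            result.append(inverse[current])
            current = ""
    if current:
        raise ValueError("input ends in the middle of a codeword")
    return "".join(result)


prefix_code = {"a": "10", "b": "010", "c": "00"}
ambiguous_code = {"a": "10", "b": "100", "c": "0"}
print(is_prefix_free(prefix_code))              # True
print(decode("1000010000010010", prefix_code))  # acbccab
print(is_prefix_free(ambiguous_code))           # False
```

The second code fails the check because the codeword of a (10) is a prefix of the codeword of b (100).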

Realize the original idea with prefix codes.

    f(a) = 5,000   f(b) = 2,000   f(c) = 6,000   f(d) = 3,000   f(e) = 4,000

Here b and d are rare, while a, c and e are frequent. The codewords of frequent characters should be shorter, e.g., a = 00, c = 01, e = 10. The codewords of rare characters can be longer, e.g., b = 110, d = 111.

Question: How can such a coding be done algorithmically?

Answer: The Huffman codes provide exactly this solution.

The bit length of the file using this prefix code $K$ is

$B(K) = \sum_{c \in C} f(c)\, d_K(c) = 5{,}000 \cdot 2 + 2{,}000 \cdot 3 + 6{,}000 \cdot 2 + 3{,}000 \cdot 3 + 4{,}000 \cdot 2 = (5{,}000 + 6{,}000 + 4{,}000) \cdot 2 + (2{,}000 + 3{,}000) \cdot 3 = 30{,}000 + 15{,}000 = 45{,}000$

(cf. the fixed-length code gave 60,000, the improved one 56,000).
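The bit-length formula is easy to evaluate mechanically. The following Python sketch (the function and variable names are illustrative) computes the sum of frequency times codeword length for the 3-bit fixed-length code and for the prefix code a = 00, c = 01, e = 10, b = 110, d = 111.

```python
def bit_length(frequencies, code):
    """Sum of f(c) * (codeword length of c) over all characters c."""
    return sum(frequencies[ch] * len(code[ch]) for ch in frequencies)


frequencies = {"a": 5000, "b": 2000, "c": 6000, "d": 3000, "e": 4000}
fixed_code = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100"}
prefix_code = {"a": "00", "b": "110", "c": "01", "d": "111", "e": "10"}

print(bit_length(frequencies, fixed_code))   # 60000
print(bit_length(frequencies, prefix_code))  # 45000
```

The codeword length plays the role of the depth in the coding tree, since a codeword spells out the path from the root to its leaf.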

The greedy method producing Huffman codes

1. Sort the characters of the alphabet $C$ in increasing order according to their frequency in the file and place them into a list.

2. Delete the two leading elements, some $x$ and $y$, from the list and connect them with a common parent node $z$. Let $f(z) = f(x) + f(y)$, insert $z$ into the list at the position given by its frequency, and repeat step 2 until a single node, the root of the coding tree, remains in the list.

Example (character : frequency in thousands)

    List:                      a : 5   b : 2   c : 6   d : 3   e : 4
    1. Sort:                   b : 2   d : 3   e : 4   a : 5   c : 6
    2. Merge and rearrange:    e : 4   (b, d) : 5   a : 5   c : 6
    2. Merge and rearrange:    a : 5   c : 6   (e, (b, d)) : 9
    2. Merge and rearrange:    (e, (b, d)) : 9   (a, c) : 11
    2. Merge and rearrange:    ((e, (b, d)), (a, c)) : 20

Ready. Labeling every left edge with 0 and every right edge with 1 gives the codewords

    a = 10   b = 010   c = 11   d = 011   e = 00

Example: length of the file in bits

With the Huffman code $H$ (a = 10, b = 010, c = 11, d = 011, e = 00) and the frequencies f(a) = 5,000, f(b) = 2,000, f(c) = 6,000, f(d) = 3,000, f(e) = 4,000:

$B(H) = \sum_{c \in C} f(c)\, d_H(c) = 5{,}000 \cdot 2 + 2{,}000 \cdot 3 + 6{,}000 \cdot 2 + 3{,}000 \cdot 3 + 4{,}000 \cdot 2 = (5{,}000 + 6{,}000 + 4{,}000) \cdot 2 + (2{,}000 + 3{,}000) \cdot 3 = 30{,}000 + 15{,}000 = 45{,}000$
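The greedy merging can be sketched in Python. Two things are assumptions of this sketch rather than part of the slides: a binary heap from the standard `heapq` module stands in for the sorted list, and ties between equal frequencies are broken by insertion order, so the individual codewords may differ from the ones in the example although the codeword lengths and the total bit length are the same. The alphabet is assumed to be non-empty.

```python
import heapq
from itertools import count


def huffman_code(frequencies):
    """Return a dict character -> codeword built by repeatedly merging the two rarest nodes."""
    tie = count()  # tie-breaker, so that heap entries never compare the trees themselves
    heap = [(f, next(tie), ch) for ch, f in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)   # the two leading (rarest) nodes
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))  # common parent z
    code = {}

    def assign(node, prefix):
        if isinstance(node, tuple):      # inner node: left edge 0, right edge 1
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                            # leaf: a character
            code[node] = prefix or "0"   # a one-letter alphabet still needs one bit

    assign(heap[0][2], "")
    return code


frequencies = {"a": 5000, "b": 2000, "c": 6000, "d": 3000, "e": 4000}
code = huffman_code(frequencies)
print(sum(frequencies[ch] * len(code[ch]) for ch in frequencies))  # 45000
```

Using a heap keeps every extraction and insertion at $O(\log n)$, which avoids re-sorting the list after each merge.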

Optimality of the Huffman codes

Assertion 1. There exists an optimal solution where the two rarest characters are deepest twins in the tree of the coding.

Assertion 2. Merging two (twin) characters leads to a problem similar to the original one.

Corollary. The Huffman codes provide an optimal character coding.

Proof of Assertion 1 (there exists an optimal solution where the two rarest characters are deepest twins in the tree of the coding).

[Figure: the two rarest characters are exchanged with a pair of deepest twin leaves.] Changing nodes this way, the total length does not increase.

Proof of Assertion 2 (merging two (twin) characters leads to a problem similar to the original one).

[Figure: two twin characters are replaced by their common parent.] The new problem is smaller than the original one but similar to it.

Graphs

Graphs can represent different structures, connections and relations. Weighted graphs can represent capacities or actual flow rates.

[Figure: a graph on the vertices 1, 2, 3, 4 with the edges 1–2, 1–4, 2–4 and 3–4, shown both unweighted and with the edge weights 2, 7, 4 and 5, respectively.]

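Adjacency-matrix

1: there is an edge leading from row to column; 0: there is no such edge.

        1  2  3  4
    1   0  1  0  1
    2   1  0  0  1
    3   0  0  0  1
    4   1  1  1  0

For a weighted graph the entries are the edge weights:

        1  2  3  4
    1   0  2  0  7
    2   2  0  0  4
    3   0  0  0  5
    4   7  4  5  0

Drawback 1: redundant elements. The matrix of an undirected graph is symmetric, so the part above the main diagonal would suffice:

        1  2  3  4
    1      2  0  7
    2         0  4
    3            5
    4

Drawback 2: superfluous elements. The zeros standing for missing edges are stored as well.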

Adjacency-list

    1: 2, 4
    2: 4, 1
    3: 4
    4: 1, 3, 2

Optimal storage usage. Drawback: slow search operations.
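The two representations can be converted into each other. The following Python sketch (the function name is illustrative; vertices are numbered from 1 as in the figures, while Python lists are indexed from 0) derives the adjacency lists from the weighted adjacency matrix of the four-vertex example.

```python
def matrix_to_adjacency_list(matrix):
    """Map every vertex (numbered from 1) to the list of its neighbors."""
    n = len(matrix)
    return {v + 1: [j + 1 for j in range(n) if matrix[v][j] > 0]
            for v in range(n)}


weights = [
    [0, 2, 0, 7],
    [2, 0, 0, 4],
    [0, 0, 0, 5],
    [7, 4, 5, 0],
]
print(matrix_to_adjacency_list(weights))
# {1: [2, 4], 2: [1, 4], 3: [4], 4: [1, 2, 3]}
```

The matrix needs $n^2$ cells regardless of the number of edges, whereas the lists store one entry per edge endpoint; in exchange, testing whether a given edge exists requires scanning a list.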

Problem: find the shortest path between two vertices in a graph.

Source: the starting point (vertex).

Single-source shortest path method: an algorithm to find the shortest paths running out from the source to all vertices of the graph.

Graph walk

Walk a graph: choose an initial vertex as the source, then visit all vertices starting from the source. Graph walk methods: depth-first search and breadth-first search.

Depth-first search: a backtrack algorithm. It goes as far as it can from the source without revisiting any vertex, then backtracks.

Breadth-first search: like an explosion in a mine. The shockwave reaches the adjacent vertices first, and starts over from them.

The breadth-first search is not only simpler to implement but it is also the basis for several important graph algorithms (e.g. Dijkstra's).

Notation in the following pseudocode: A is the adjacency-matrix of the graph; s is the source; D is an array containing the distances from the source; P is an array containing the predecessor along a path; Q is the queue containing the unprocessed vertices already reached.

    BreadthFirstSearch(A, s, D, P)
      for i ← 1 to A.CountRows
        do P[i] ← 0
           D[i] ← ∞
      D[s] ← 0
      Q.Enqueue(s)
      repeat
        v ← Q.Dequeue
        for j ← 1 to A.CountColumns
          do if A[v, j] > 0 and D[j] = ∞
               then D[j] ← D[v] + 1
                    P[j] ← v
                    Q.Enqueue(j)
      until Q.IsEmpty
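A Python sketch of the breadth-first search follows. It deviates from the pseudocode in a few labeled ways: vertices are numbered from 0, `None` marks a missing predecessor instead of 0, `D` and `P` are returned rather than passed in, and `collections.deque` serves as the queue. The test graph is the four-vertex example graph with vertex 1 renumbered to 0.

```python
from collections import deque
import math


def breadth_first_search(a, s):
    """Return (d, p): distances from s in edges, and predecessors on shortest paths."""
    n = len(a)
    d = [math.inf] * n
    p = [None] * n
    d[s] = 0
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for j in range(n):
            # An edge to a vertex not reached so far
            if a[v][j] > 0 and d[j] == math.inf:
                d[j] = d[v] + 1
                p[j] = v
                queue.append(j)
    return d, p


adjacency = [
    [0, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
]
print(breadth_first_search(adjacency, 0))  # ([0, 1, 2, 1], [None, 0, 3, 0])
```

Following the predecessors back from a vertex reconstructs a shortest path to the source; here vertex 2 is reached via vertex 3.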

The D,P pairs are displayed in the figure.

[Figure: a graph on the vertices 1 to 10; every vertex is labeled with its D,P pair, such as 0,0 at the source and 1,4, 2,6 and 3,9 at vertices farther away.]

D is the shortest distance from the source. The shortest paths can be reconstructed using P.

Dijkstra's algorithm

Problem: find the shortest path between two vertices in a weighted graph.

Idea: extend the breadth-first search for graphs having integer weights.

[Figure: an edge of weight 3 is replaced by a chain of unweighted edges through virtual vertices (total weight = 3 · 1 = 3).]

    Dijkstra(A, s, D, P)
      for i ← 1 to A.CountRows
        do P[i] ← 0
           D[i] ← ∞
      D[s] ← 0
      for i ← 1 to A.CountRows
        do M.Enqueue(i)
      repeat
        v ← M.ExtractMinimum
        for j ← 1 to A.CountColumns
          do if A[v, j] > 0
               then if D[j] > D[v] + A[v, j]
                      then D[j] ← D[v] + A[v, j]
                           P[j] ← v
      until M.IsEmpty

Here M is a minimum priority queue of the vertices, ordered by their current distance D.

Time complexity of Dijkstra's algorithm

Initialization of D and P: $O(n)$. Building a heap for the priority queue: $O(n)$. Search: $n \cdot O(\log n + n) = O(n(\log n + n)) = O(n^2)$, where $n$ is the number of loop executions, $\log n$ is the cost of extracting the minimum, and $n$ is the cost of checking all neighbors. Grand total: $T(n) = O(n^2)$.
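The algorithm can be sketched in Python as follows. One substitution is made and should be read as an assumption of this sketch: instead of a heap-based priority queue, the minimum is found by a linear scan over the unprocessed vertices, which keeps the code short and still fits the $O(n^2)$ bound. Vertices are numbered from 0, weights are assumed to be positive, and `None` marks a missing predecessor.

```python
import math


def dijkstra(a, s):
    """Return (d, p): shortest distances from s and predecessors, for weight matrix a."""
    n = len(a)
    d = [math.inf] * n
    p = [None] * n
    d[s] = 0
    unprocessed = set(range(n))
    while unprocessed:
        # Extract the unprocessed vertex with minimal distance (linear scan).
        v = min(unprocessed, key=lambda i: d[i])
        unprocessed.remove(v)
        if d[v] == math.inf:
            break  # the remaining vertices cannot be reached from s
        for j in range(n):
            if a[v][j] > 0 and d[j] > d[v] + a[v][j]:
                d[j] = d[v] + a[v][j]
                p[j] = v
    return d, p


weights = [
    [0, 2, 0, 7],
    [2, 0, 0, 4],
    [0, 0, 0, 5],
    [7, 4, 5, 0],
]
print(dijkstra(weights, 0))  # ([0, 2, 11, 6], [None, 0, 3, 1])
```

In the example the direct edge of weight 7 to the last vertex is beaten by the detour of total weight 2 + 4 = 6.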
