Download presentation

Presentation is loading. Please wait.

Published byTucker Maultsby Modified about 1 year ago

1
Recursion Breaking down problems into solvable subproblems Chapter 7

2
The good and the bad Pluses –Code easy to understand –May give different view of problem –Sometimes, efficient Bad –overhead of functions calls (20-30%) –may be hard to analyze –cost not apparent –can be very inefficient: exponential worse!

3
Finite Induction Finite Induction principle –Let S n be a proposition depending on n –Base Case: Show that S 1 is true –Inductive case: Assume S k and prove S k+1 –Conclude: S n is true for all n >= 1. Variant: –Base Case: Show that S 1 is true –Inductive case: Assume S 1,S 2 … S k and prove S k+1 –Conclude: S n is true for all n >= 1.

4
Example Proofs involve two tricks –recognize subproblem (make take rearranging) –follow your nose S n : 1+2…+n = n(n+1)/2 –Base case: S 1 is true 1 = 1(1+1)/2 check –Inductive Case Given S k, show S k+1, i.e. 1+2+…k+1 = something k+ k+1 = k(k+1)/2 + k+1 = (k*k+k+ 2k+2)/2 = (k+1)(k+2)/2. QED.

5
Example: Complete k-ary tree Sn: A full m-ary tree of depth n has (m n+1 -1)/(m-1) nodes S0: (m 1 -1)/(m-1) = 1. Assume S k (visualize it)... m m^2 … m^k at depth k

6
The Math Visualization provides the recurrence Nodes in a S k+1 trees is # in S k tree + m*number of leaves in S k = (m k+1 -1)/(m-1) +m*m k = (m k+1 -1)/(m-1) + m^(k+2)/(m-1) -m^(k+1)/(m-1) = (m^(k+2) -1)/(m-1)

7
Fibonacci Fib(n) if (n <= 1) return 1 else return fib(n-1)+ fib(n-2) What is the cost? Let c(n) be number of calls Note c(n) =c(n-1)+c(n-2)+1 >= 2*c(n-2) –therefore c(n) >= 2 n/2 *c(1), ie. EXPONENTIAL Also c(n)<= 2*c(n-1) so –c(n) <= 2 n *c(1). This bounds growth rate. Is 3 n in O(2 n )?

8
Dynamic Fibonacci Recursion Fibonacci is bad Special fix Compute fib(1), fib(2)….up to fib(n) Reordering computation so subproblems computed and stored: example of dynamic programming. Easiest approach: use array –fib(n) –array[1] =array[2]=1 –for i = 3…n array[i] = array[i-1]+array[i-2] –return array[n] –note: you don’t need an array. This works for functions dependent on lesser values.

9
Combinations (k of n) How many different groups of size k and you form from n elements. Ans: n!/[(n-k)! * k!] Why? # of permutations of n people is n! # of permutations of those invited is k! # of permutations of those not invited is (n-k)! Compute it. DOES not compute! Dynamic does. Note: C(k,n) = C(k-1,n-1) +C(k,n-1) C(0,n) = 1 Use a 2-dim array. That’s dynamic programming

10
General Approach (caching) Recursion + hashing to store solution of subproblems Idea: –fib(n) –if (in hash table problem fib(n)) return value –else fib(n) = fib(n-1) + fib(n-2). How many calls will there be? How many distinct subproblems are there. Good: –don’t need to compute unneeded subproblems –simpler to understand Bad: overhead of hashing Not needed here, since subproblems obvious.

11
Recursive Binary Search Suppose T is a sorted binary tree –values stored at nodes Node –node less, more –int key –Object value Object find(int k) // assume k is in tree –if ( k == key) return value; –else if (k < key) return less.find(k) –else if (k >key) return more.find(k) What is the running time of this?

12
Analysis If maximum depth of tree is d, then O(d). What is maximum depth? –Worst case: n –What does tree look like? –Best case: log(n) –What will tree look like. So analysis tells us what we should achieve, i.e. a balanced tree. It doesn’t tell us how. Chapter 18 shows us how to create balanced trees.

13
Binary Search Solving f(x) = 0 where a 0 Solve( f, a, b) (abstract algorithm) –mid = (a+b)/2 –if -epsilon< f(mid) < epsilon return mid (*) –else if f(mid) <0 return f(mid,b); –else return f(a,mid) Note: epsilon is very small; real numbers are not dealt with exactly by computers Analysis: –Each call divides the domain in half. –Max calls is number of bits for real number –Note algorithm may not work on real computer -- this depends on how fast f changes.

14
Divide and Conquer: Merge Sort Let A be array of integers of length n define Sort (A)recursively via auxSort(A,0,N) where Define array[] Sort(A,low, high) –if (low == high) return –Else mid = (low+high)/2 temp1 = sort(A,low,mid) temp2 = sort(A,mid,high) temp3 = merge(temp1,temp2)

15
Merge Int[] Merge(int[] temp1, int[] temp2) –int[] temp = new int[ temp1.length+temp2.length] –int i,j,k –repeat if (temp1[i]

16
Analysis of Merge Sort Time –Let N be number of elements –Number of levels is O(logN) –At each level, O(N) work –Total is O(N*logN) –This is best possible for sorting. Space –At each level, O(N) temporary space –Space can be freed, but calls to new costly –Needs O(N) space –Bad - better to have an in place sort –Quick Sort (chapter 8) is the sort of choice.

17
General Recurrences Suppose T(N) satisfies – Assume N is a power of 2 (wlog) –T(N) = 2T(N/2)+N –T(1) = 1 Then T(N) = NLogN+N or T is O(NlogN) Proof –T(N)/N = T(N/2)/(N/2) +1 –T(N/2)/ (N/2) = T(N/4)/ (N/4) + 1 etc. –… –T(2)/2 = T(1)/1 + 1

18
Telescoping Sum Sum up all the equations and subtract like terms: T(N)/N = T(1)/1+ 1*(number of equations) Therefore T(N) = N +N*logN or O(N*log(N)) Note: The same proof shows that the recurrence S(N) = 2*S(N/2) is O(N). What happens if the recurrence is T(N) =2*T(N/2)+k? What happens if the recurrence is T(N)=3*T(N/3)..? General formula

19
General Recurrence Formulas Suppose –Divide into A subproblems –Each of relative size B (problem size/subproblem size) –k overhead is const*N k Then –if A>B k T(N) is O(N logB(A) ) –if A= B k T(N) is O(N k logN) –if A**
**

20
Dynamic Programming Smith-Waterman Algorithm has many variants Global SW algorithm finds the similarity between two strings. Goal: given costs for deleting and changing a character, determine the similarity of two strings. Main application: given two proteins (sequence of amino acids), how far apart are they? –Evolutionary trees –similar sequences define proteins with similar functions –can compare across species –other version deal with finding prototype/consensus for family of sequences

21
Smith - Waterman Example See Handout from Biological Sequence Analysis In this example, A,R,…V are the 20 amino acids Fig 2.2 defines how similar various amino acids are to one another Fig 2.5 shows the computation of the similarity of two sequences The deletion cost is 8. The bottom right hand corner gives the value of the similarity Tracing backwards computes the alignment.

22
SW subproblem merging c(i-1,j-1) c(i-1,j) c(i,j-1) c(i,j) Down: extend matching by deleting char si Right: extend match by deleting char tj Diagonal: extend match by substitution, if necessary

23
Smith-Waterman Algorithm Recursive definition: -- find the subproblems –Let s and t be the strings of lengths n and m. –Define f(i,j) as max similarity score of matching the initial i characters of s with initial j characters of t. – f(0,0) = 0 –For i>0 or j>0 let sim(s i,t j ) be similarity of char si and char tj. Let d be cost of deletion. f(i,j) = max {f(i-1,j) -d, f(i,j-1)-d, f(i-1,j-1) +sim(s i,t j )} –Analysis: O(n*m)

24
Matrix Review A matrix is 2-dim array of number, e.g. Int[2][3] Matrices represent linear transformations Every linear transformation is a matrix product of scale changes and rotations The determinant of a matrix is the change in volume of the unit cube. Matrices are used for data analysis and graphics If M1 has r1 rows and c1 columns and –M2 has r2 rows and c2 columns, and r2= c1 then M3 = M1*M2 has r1 rows and c2 columns where –M3[i][j]= (row i of M1) * (col j of M2) (dot product) In general M1*M2 =/= M2*M1 (not commutative) but (M1*M2)*M3 = M1*(M2*M3) (associative)

25
Multiplying Many Matrices Let Mi i=1..n be matrices with r i-1 rows and r i columns How many ways to multiply M1*M2…M*n? –exponential (involves Catalan numbers) But O(N 3 ) dynamic programming algorithm! Let mij = min cost of multiplying Mi*….*Mj Subproblem Idea: M i *… *M k * M k+1 *…*M j if i = j then mij = 0 if j>i then mij = min{ i<=k

26
Summary Recursion begat divide and conquer and dynamic programming General Flow –Subdivide problem –Store subsolutions –Merge subsolutions Beware of the costs: –cost of subdivision –cost of storage –cost of combining subsolution Can be very effective

Similar presentations

© 2016 SlidePlayer.com Inc.

All rights reserved.

Ads by Google