 Chapter 2: Algorithm Analysis

Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Chapter 2: Algorithm Analysis

Big-Oh and Other Notations in Algorithm Analysis
Classifying Functions by Their Asymptotic Growth
Theta, Little oh, Little omega
Big Oh, Big Omega
Rules to manipulate Big-Oh expressions
Typical Growth Rates

Classifying Functions by Their Asymptotic Growth
Asymptotic growth: the rate of growth of a function. Given a particular differentiable function f(n), all other differentiable functions fall into three classes: those growing at the same rate, those growing faster, and those growing slower. Asymptotic analysis is a method of describing limiting behavior.

Theta f(n) and g(n) have the same rate of growth if
lim( f(n) / g(n) ) = c, 0 < c < ∞, as n → ∞. Notation: f(n) = Θ( g(n) ), pronounced "theta".

Little oh f(n) grows slower than g(n) (or g(n) grows faster than f(n))
if lim( f(n) / g(n) ) = 0 as n → ∞. Notation: f(n) = o( g(n) ), pronounced "little oh".

Little omega f(n) grows faster than g(n)
(or g(n) grows slower than f(n)) if lim( f(n) / g(n) ) = ∞ as n → ∞. Notation: f(n) = ω( g(n) ), pronounced "little omega".

Little omega and Little oh
If g(n) = o( f(n) ) then f(n) = ω( g(n) ).
Example: compare n and n^2.
lim( n / n^2 ) = 0 as n → ∞, so n = o(n^2).
lim( n^2 / n ) = ∞ as n → ∞, so n^2 = ω(n).

Theta: Relation of Equivalence
R: "having the same rate of growth" is an equivalence relation. It partitions the set of all differentiable functions into equivalence classes. Functions in the same class are equivalent with respect to their growth.

Algorithms with Same Complexity
Two algorithms have the same complexity if the functions representing their numbers of operations have the same rate of growth. Among all functions with the same rate of growth we choose the simplest one to represent the complexity.

Examples
Compare n and (n+1)/2: lim( n / ((n+1)/2) ) = 2,
same rate of growth. (n+1)/2 = Θ(n): rate of growth of a linear function.

Examples
Compare n^2 and n^2 + 6n: lim( n^2 / (n^2 + 6n) ) = 1,
same rate of growth. n^2 + 6n = Θ(n^2): rate of growth of a quadratic function.

Examples
Compare log n and log n^2: lim( log n / log n^2 ) = 1/2,
same rate of growth. log n^2 = Θ(log n): logarithmic rate of growth.

Examples
Θ(n^3): n^3, 5n^3 + 4n, 105n^3 + 4n^2 + 6n
Θ(n^2): n^2, 5n^2 + 4n + 6
Θ(log n): log n, log n^2, log(n + n^3)
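
The limits in the examples above can be checked numerically. The following is a minimal sketch, assuming a hypothetical helper class GrowthRatio (not part of the slides) that evaluates the ratio f(n)/g(n) for growing n. A numeric check only suggests the limit; it does not prove it.

```java
// Hypothetical helper, for illustration only: evaluates f(n)/g(n)
// for the three "same rate of growth" examples.
class GrowthRatio {
    // n / ((n+1)/2): should approach 2
    static double linear(double n) {
        return n / ((n + 1) / 2);
    }

    // n^2 / (n^2 + 6n): should approach 1
    static double quadratic(double n) {
        return (n * n) / (n * n + 6 * n);
    }

    // log n / log n^2: equals 1/2 for every n > 1
    static double logarithmic(double n) {
        return Math.log(n) / Math.log(n * n);
    }

    public static void main(String[] args) {
        for (double n = 10; n <= 1e6; n *= 10) {
            System.out.printf("n=%.0f  %.6f  %.6f  %.6f%n",
                    n, linear(n), quadratic(n), logarithmic(n));
        }
    }
}
```

Each column settles at a finite nonzero constant, which is the condition for Θ in the definition used here.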

Comparing Functions
Same rate of growth: g(n) = Θ(f(n)).
Different rate of growth: either g(n) = o(f(n)), i.e. g(n) grows slower than f(n), and hence f(n) = ω(g(n)); or g(n) = ω(f(n)), i.e. g(n) grows faster than f(n), and hence f(n) = o(g(n)).

The Big-Oh Notation
f(n) = O(g(n)) if f(n) grows at the same rate as g(n) or slower: f(n) = Θ(g(n)) or f(n) = o(g(n)).

Example
n+5 = Θ(n), and therefore also n+5 = O(n), O(n^2), O(n^3), O(n^5).
The closest estimate is n+5 = Θ(n); the general practice is to use the Big-Oh notation: n+5 = O(n).

The Big-Omega Notation
The inverse of Big-Oh is Ω: if g(n) = O(f(n)), then f(n) = Ω(g(n)). f(n) = Ω(g(n)) means that f(n) grows faster than g(n) or at the same rate.

Rules to manipulate Big-Oh expressions
Rule 1: If T1(N) = O( f(N) ) and T2(N) = O( g(N) ), then
a. T1(N) + T2(N) = O( f(N) + g(N) ), i.e. O( max( f(N), g(N) ) )
b. T1(N) * T2(N) = O( f(N) * g(N) )
Rule 2: If T(N) is a polynomial of degree k, then T(N) = Θ( N^k ).
Rule 3: log^k N = O(N) for any constant k.

Examples
n^2 + n = O(n^2): we disregard any lower-order term.
n log(n) = O(n log(n))
n^2 + n log(n) = O(n^2)

Typical Growth Rates
C          constant, we write O(1)
log N      logarithmic
log^2 N    log-squared
N          linear
N log N
N^2        quadratic
N^3        cubic
2^N        exponential
N!         factorial

Exercise: True or False
N^2 = O(N^2), 2N = O(N^2), N = O(N^2), N^2 = O(N)
T, T, T, F, T, T

Exercise: True or False
N^2 = Θ(N^2), 2N = Θ(N^2), N = Θ(N^2), N^2 = Θ(N)
T, F, F, F, T, T

Running Time Calculations
The work done by an algorithm, i.e. its complexity, is determined by the number of the basic operations necessary to solve the problem.

The Task
Determine how the number of operations depends on the size of the input:
N - size of input
F(N) - number of operations

Basic operations in an algorithm
Problem: Find x in an array
Operation: Comparison of x with an entry in the array
Size of input: The number of elements in the array

Basic operations ...
Problem: Multiplying two matrices with real entries
Operation: Multiplication of two real numbers
Size of input: The dimensions of the matrices

Basic operations ...
Problem: Sort an array of numbers
Operation: Comparison of two array entries plus moving elements in the array
Size of input: The number of elements in the array

Counting the number of operations
A. for loops: O(n)
The running time of a for loop is at most the running time of the statements inside the loop times the number of iterations.

for loops
    sum = 0;
    for( i = 0; i < n; i++ )
        sum = sum + i;
The running time is O(n).

Counting the number of operations
B. Nested loops
The total running time is the running time of the inside statements times the product of the sizes of all the loops.

Nested loops
    sum = 0;
    for( i = 0; i < n; i++ )
        for( j = 0; j < n; j++ )
            sum++;
The running time is O(n^2).

Counting the number of operations
C. Consecutive program fragments
Total running time: the maximum of the running times of the individual fragments.

Consecutive program fragments
    sum = 0;                              // this fragment is O(n)
    for( i = 0; i < n; i++ )
        sum = sum + i;

    sum = 0;                              // this fragment is O(n^2)
    for( i = 0; i < n; i++ )
        for( j = 0; j < 2 * n; j++ )
            sum++;
The maximum is O(n^2).
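
To see why the larger fragment dominates, one can count how often each loop body runs. The sketch below assumes a hypothetical class FragmentCount (an illustrative name, not from the slides) that returns both counts for a given n.

```java
// Hypothetical counter, for illustration only: counts loop-body executions
// of the two consecutive fragments.
class FragmentCount {
    // Returns { executions of first fragment, executions of second fragment }.
    static long[] count(int n) {
        long first = 0;
        for (int i = 0; i < n; i++) {
            first++;                 // body of the single loop: n times
        }
        long second = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < 2 * n; j++) {
                second++;            // body of the nested loops: 2 * n * n times
            }
        }
        return new long[] { first, second };
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 1000; n *= 10) {
            long[] c = count(n);
            System.out.println("n=" + n + "  first=" + c[0] + "  second=" + c[1]);
        }
    }
}
```

The total is n + 2n^2, and the constant factor 2 and the lower-order term n are both disregarded, leaving O(n^2).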

Counting the number of operations
D. If statement
    if( C ) S1; else S2;
The running time is at most the running time of the test C plus the larger of the running times of S1 and S2.

EXAMPLES: what is the number of operations?
    sum = 0;
    for( i = 0; i < n; i++ )
        for( j = 0; j < n * n; j++ )
            sum++;
O(n^3)

EXAMPLES: what is the number of operations?
    sum = 0;
    for( i = 0; i < n; i++ )
        for( j = 0; j < i; j++ )
            sum++;
O(n^2): the inner loop body runs 0 + 1 + ... + (n-1) = n(n-1)/2 times.
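
The count for this dependent inner loop can be confirmed by running it. This is a small sketch, assuming a hypothetical class TriangularLoop (the name is illustrative), that compares the measured count with the closed form n(n-1)/2.

```java
// Hypothetical counter, for illustration only.
class TriangularLoop {
    // Number of times sum++ executes when the inner loop runs j = 0 .. i-1.
    static long count(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                sum++;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 1000; n *= 10) {
            long closedForm = (long) n * (n - 1) / 2;
            System.out.println("n=" + n + "  counted=" + count(n) + "  n(n-1)/2=" + closedForm);
        }
    }
}
```

The count is about half of n^2, which is still Θ(n^2) because constant factors do not change the rate of growth.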

EXAMPLES: what is the number of operations?
    for( j = 0; j < n * n; j++ )
        compute_val( j );
The complexity of compute_val(x) is given to be O(n log n), so the total is O(n^3 log n).

Search in an unordered array of elements
    for( i = 0; i < n; i++ )
        if( a[ i ] == x )
            return 1;
    return -1;
O(n)

Search in a table n x m
    for( i = 0; i < n; i++ )
        for( j = 0; j < m; j++ )
            if( a[ i ][ j ] == x )
                return 1;
    return -1;
O(n*m)

Max Subsequence Problem
Given a sequence of integers A1, A2, …, An, find the maximum possible value of the sum of a contiguous subsequence Ai, …, Aj. Numbers can be negative. You want a contiguous chunk with the largest sum.
Example: -2, 11, -4, 13, -5, -2. The answer is 20 (subsequence A2 through A4).
We will discuss 4 different algorithms, with time complexities O(n^3), O(n^2), O(n log n), and O(n). With n = 10^6, algorithm 1 may take > 10 years; algorithm 4 will take a fraction of a second!

Algorithm 1 for Max Subsequence Sum
Given A1, …, An, find the maximum value of Ai + Ai+1 + ··· + Aj (0 if the max value is negative).
    int maxSum = 0;
    for( int i = 0; i < a.size( ); i++ )
        for( int j = i; j < a.size( ); j++ )
        {
            int thisSum = 0;
            for( int k = i; k <= j; k++ )
                thisSum += a[ k ];
            if( thisSum > maxSum )
                maxSum = thisSum;
        }
    return maxSum;
Time complexity: O(n^3)

Algorithm 2
Idea: Given the sum from i to j-1, we can compute the sum from i to j in constant time. This eliminates one nested loop and reduces the running time to O(n^2).
    int maxSum = 0;
    for( int i = 0; i < a.size( ); i++ )
    {
        int thisSum = 0;
        for( int j = i; j < a.size( ); j++ )
        {
            thisSum += a[ j ];
            if( thisSum > maxSum )
                maxSum = thisSum;
        }
    }
    return maxSum;

Algorithm 3 This algorithm uses the divide-and-conquer paradigm.
Suppose we split the input sequence at the midpoint. The max subsequence is entirely in the left half, entirely in the right half, or it straddles the midpoint. Example: left half | right half. Max in left is 6 (A1 through A3); max in right is 8 (A6 through A7). But the straddling max is 11 (A1 through A7).

Algorithm 3 (cont.) Example: left half | right half
Max subsequences in each half are found by recursion. How do we find the straddling max subsequence? Key observation: the left half of the straddling sequence is the max subsequence ending with -2 (the last element of the left half), and the right half is the max subsequence beginning with -1 (the first element of the right half). A linear scan lets us compute these in O(n) time.
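
The slides give no code for Algorithm 3, so the following is only a sketch of the idea described above, not the author's implementation. The class name MaxSubDivide is hypothetical. The test sequence 4, -3, 5, -2 | -1, 2, 6, -2 is an assumption, chosen because it is consistent with the numbers quoted on the slide (left max 6, right max 8, straddling max 11).

```java
// Hypothetical sketch of the divide-and-conquer approach, for illustration only.
class MaxSubDivide {
    // Maximum contiguous subsequence sum; 0 if every element is negative.
    static int maxSubSum(int[] a) {
        return a.length == 0 ? 0 : maxSubSum(a, 0, a.length - 1);
    }

    private static int maxSubSum(int[] a, int left, int right) {
        if (left == right) {
            return Math.max(a[left], 0);          // single element, or empty sum
        }
        int mid = left + (right - left) / 2;
        int maxLeft = maxSubSum(a, left, mid);        // entirely in the left half
        int maxRight = maxSubSum(a, mid + 1, right);  // entirely in the right half

        // Best sum of a run that ends at the last element of the left half.
        int leftBorder = 0;
        int sum = 0;
        for (int i = mid; i >= left; i--) {
            sum += a[i];
            leftBorder = Math.max(leftBorder, sum);
        }

        // Best sum of a run that begins at the first element of the right half.
        int rightBorder = 0;
        sum = 0;
        for (int i = mid + 1; i <= right; i++) {
            sum += a[i];
            rightBorder = Math.max(rightBorder, sum);
        }

        // The straddling candidate joins the two border runs.
        return Math.max(Math.max(maxLeft, maxRight), leftBorder + rightBorder);
    }

    public static void main(String[] args) {
        int[] example = { 4, -3, 5, -2, -1, 2, 6, -2 };  // assumed example sequence
        System.out.println(maxSubSum(example));
    }
}
```

The two linear scans are the O(n) work done at each level of the recursion, on top of the two recursive calls on halves.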

Algorithm 3: Analysis
The divide-and-conquer algorithm is best analyzed through a recurrence:
T(1) = 1
T(n) = 2T(n/2) + O(n)
This recurrence solves to T(n) = O(n log n).
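
One way to see where the n log n comes from is to unroll the recurrence. The derivation below is a sketch under two simplifying assumptions that the slide does not state: n is a power of two, and the O(n) term is written as cn for some constant c.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^k\,T(n/2^k) + k\,cn .
\end{align*}
Setting $n = 2^k$, i.e. $k = \log_2 n$, and using $T(1) = 1$:
\[
T(n) = n\,T(1) + cn\log_2 n = n + cn\log_2 n = O(n \log n).
\]
```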

Algorithm 4
Example sequence: 2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2
    int maxSum = 0, thisSum = 0;
    for( int j = 0; j < a.size( ); j++ )
    {
        thisSum += a[ j ];
        if( thisSum > maxSum )
            maxSum = thisSum;
        else if( thisSum < 0 )
            thisSum = 0;
    }
    return maxSum;
Time complexity clearly O(n). But why does it work? I.e., what is the proof of correctness?

Proof of Correctness
The max subsequence cannot start or end at a negative Ai. More generally, the max subsequence cannot have a prefix with a negative sum. Thus, if we ever find that Ai through Aj sums to < 0, then we can advance i to j+1.
Proof. Suppose j is the first index after i at which the sum becomes < 0. The max subsequence cannot start at any p between i and j, because the sum of Ai through Ap-1 is nonnegative, so starting at i would have been at least as good.

Algorithm 4
    int maxSum = 0, thisSum = 0;
    for( int j = 0; j < a.size( ); j++ )
    {
        thisSum += a[ j ];
        if( thisSum > maxSum )
            maxSum = thisSum;
        else if( thisSum < 0 )
            thisSum = 0;
    }
    return maxSum;
The algorithm resets whenever the prefix sum is < 0. Otherwise, it forms new sums and updates maxSum in one pass.
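
A proof is the real argument, but a quick cross-check against the quadratic algorithm can build confidence. The sketch below assumes a hypothetical class MaxSubLinear (an illustrative name) with Java array versions of Algorithm 4 and Algorithm 2, and runs both on the example sequence from the Algorithm 4 slide.

```java
// Hypothetical cross-check, for illustration only.
class MaxSubLinear {
    // Algorithm 4: one pass, resetting the running sum when it drops below 0.
    static int linear(int[] a) {
        int maxSum = 0;
        int thisSum = 0;
        for (int j = 0; j < a.length; j++) {
            thisSum += a[j];
            if (thisSum > maxSum) {
                maxSum = thisSum;
            } else if (thisSum < 0) {
                thisSum = 0;
            }
        }
        return maxSum;
    }

    // Algorithm 2: try every start index and extend the sum to the right.
    static int quadratic(int[] a) {
        int maxSum = 0;
        for (int i = 0; i < a.length; i++) {
            int thisSum = 0;
            for (int j = i; j < a.length; j++) {
                thisSum += a[j];
                if (thisSum > maxSum) {
                    maxSum = thisSum;
                }
            }
        }
        return maxSum;
    }

    public static void main(String[] args) {
        int[] example = { 2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2 };
        System.out.println(linear(example) + " " + quadratic(example));
    }
}
```

On the example the running sum is reset once, after the element -5, and the best sum is the run 4, 1, -3, 4, -1, 2 that follows it.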

Why Efficient Algorithms Matter
Suppose N = 10^6. A PC can read/process N records in 1 sec. But if some algorithm does N*N computations, then it takes 1M seconds, about 11.6 days!
100-city Traveling Salesman Problem: a supercomputer checking 100 billion tours/sec still requires years!
Fast factoring algorithms can break encryption schemes. Algorithms research determines what is a safe code length (> 100 digits).

How to Measure Algorithm Performance
What metric should be used to judge algorithms?
Length of the program (lines of code)
Ease of programming (bugs, maintenance)
Memory required
Running time
Running time is the dominant standard:
Quantifiable and easy to compare
Often the critical bottleneck

Logarithms in Running Time
Binary search
Euclid’s algorithm
Exponentials
Rules to count operations

Divide-and-conquer algorithms
Repeatedly reducing the problem size by a factor of two requires O(log N) steps.

Why log N? A complete binary tree with N leaves has log N levels.
Each level corresponds to one step of the divide-and-conquer algorithm. Hence the number of steps is O(log N).

Example: 8 leaves, 3 levels

Binary Search Solution 1:
Scan all elements from left to right, each time comparing with X. O(N) operations.

Binary Search Solution 2: O(log N)
The list must be sorted. Find the middle element Amid in the list and compare it with X.
If they are equal, stop.
If X < Amid, consider the left part.
If X > Amid, consider the right part.
Repeat until X is found or the list is reduced to one element.
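
The steps of Solution 2 can be sketched as follows. This is a minimal illustration, assuming a sorted int array and a hypothetical class BinarySearchSketch. Returning the index, or -1 when X is absent, is a convention chosen here, not something the slide specifies.

```java
// Hypothetical sketch of Solution 2, for illustration only.
class BinarySearchSketch {
    // Returns an index of x in the sorted array a, or -1 if x is not present.
    static int search(int[] a, int x) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;  // middle of the remaining part
            if (a[mid] == x) {
                return mid;                    // equal: stop
            } else if (x < a[mid]) {
                high = mid - 1;                // consider the left part
            } else {
                low = mid + 1;                 // consider the right part
            }
        }
        return -1;                             // remaining part is empty
    }

    public static void main(String[] args) {
        int[] sorted = { 1, 3, 5, 7, 9, 11 };
        System.out.println(search(sorted, 7));
        System.out.println(search(sorted, 4));
    }
}
```

Each iteration halves the remaining range, which is where the O(log N) bound comes from.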

Euclid's algorithm Finding the greatest common divisor (GCD)
For M > N, the GCD of M and N equals the GCD of N and M % N.

GCD and recursion
Recursion: if M % N = 0, return N;
else return GCD(N, M % N). The answer is the last nonzero remainder.
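
The recursion can be written directly in Java. This is a sketch under assumptions: the class name GcdRecursive is hypothetical, the inputs are non-negative, and the guard for n == 0 is an addition that the slide's recursion does not mention.

```java
// Hypothetical recursive version, for illustration only.
class GcdRecursive {
    static long gcd(long m, long n) {
        if (n == 0) {
            return m;              // added guard: avoids division by zero
        }
        if (m % n == 0) {
            return n;              // n divides m, so n is the GCD
        }
        return gcd(n, m % n);      // GCD(M, N) = GCD(N, M % N)
    }

    public static void main(String[] args) {
        System.out.println(gcd(1989, 1590));
    }
}
```

For 1989 and 1590 the remainders are 399, 393, 6, 3, 0, so the last nonzero remainder, 3, is the answer. If M < N, the first call simply swaps the arguments.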

M N rem

Euclid’s Algorithm (non-recursive implementation)
    long gcd( long m, long n )
    {
        long rem;
        while( n != 0 )
        {
            rem = m % n;
            m = n;
            n = rem;
        }
        return m;
    }

Why O(log N)
If M > N, then M % N < M / 2.
After the 1st iteration N appears as the first argument, and the remainder M % N, which is less than M / 2, as the second. After the 2nd iteration that remainder appears as the first argument, so every two iterations the first argument is reduced by at least a factor of two. Hence O(log N).

Computing X^N
X^N = X * (X^2)^⌊N/2⌋, if N is odd
X^N = (X^2)^(N/2), if N is even

    long pow( long x, int n )
    {
        if( n == 0 )
            return 1;
        if( n % 2 == 0 )                    // n is even
            return pow( x * x, n / 2 );
        else                                // n is odd; n / 2 rounds down
            return x * pow( x * x, n / 2 );
    }

Why O(log N) If N is odd, the call performs two multiplications; if N is even, one. N is halved at each call.
The operations are at most about 2 log N: O(log N)
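
The multiplication count can be observed by instrumenting the recursion. The sketch below assumes a hypothetical class FastPow with a static counter added purely for illustration. Because the base case here is n == 0, as in the slide's code, the count can slightly exceed 2 log N for very small N.

```java
// Hypothetical instrumented version, for illustration only.
class FastPow {
    static int multiplications;   // counter added for this sketch

    static long pow(long x, int n) {
        if (n == 0) {
            return 1;
        }
        multiplications++;                 // counts x * x
        long rest = pow(x * x, n / 2);
        if (n % 2 == 0) {
            return rest;                   // even n: one multiplication
        }
        multiplications++;                 // counts x * rest
        return x * rest;                   // odd n: two multiplications
    }

    public static void main(String[] args) {
        multiplications = 0;
        long result = pow(2, 10);
        System.out.println(result + " using " + multiplications + " multiplications");
    }
}
```

For 2^10 this uses 6 multiplications, compared with 9 for the recursion that lowers the exponent by one each time.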

Another recursion for X^N
Another recursive definition reduces the power by just 1: X^N = X * X^(N-1). Here the number of multiplications is N-1, i.e. O(N), so this algorithm is less efficient than the divide-and-conquer algorithm.

How to count operations
Single statements (not function calls): constant, O(1).
Sequential fragments: the maximum of the operations of each fragment.

How to count operations
Single loop running up to N, with single statements in its body: O(N).
Single loop running up to N, with O(f(N)) operations in the body: O( N * f(N) ).

How to count operations
Two nested loops each running up to N, with single statements: O(N^2).
Divide-and-conquer algorithms with input size N: O(log N), or O(N * log N) if each step requires additional processing of N elements.

Example: What is the probability that two numbers are relatively prime?
    tot = 0; rel = 0;
    for( i = 1; i <= n; i++ )
        for( j = i + 1; j <= n; j++ )
        {
            tot++;
            if( gcd( i, j ) == 1 )
                rel++;
        }
    return (double) rel / tot;
Running time = ?
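
A runnable version of this example is sketched below, under stated assumptions: the class name RelativelyPrime is hypothetical, the outer loop starts at 1 so that only positive pairs i < j are counted, and the ratio is computed in floating point. Applying the counting rules from this chapter, there are n(n-1)/2 pairs and each gcd call costs O(log n), which suggests a running time of O(n^2 log n).

```java
// Hypothetical runnable version, for illustration only.
class RelativelyPrime {
    // Iterative Euclid's algorithm.
    static long gcd(long m, long n) {
        while (n != 0) {
            long rem = m % n;
            m = n;
            n = rem;
        }
        return m;
    }

    // Fraction of pairs 1 <= i < j <= n that are relatively prime.
    static double probability(int n) {
        long tot = 0;
        long rel = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = i + 1; j <= n; j++) {
                tot++;
                if (gcd(i, j) == 1) {
                    rel++;
                }
            }
        }
        return tot == 0 ? 0.0 : (double) rel / tot;  // avoid 0/0 when n < 2
    }

    public static void main(String[] args) {
        System.out.println(probability(5));
        System.out.println(probability(1000));
    }
}
```

For n = 5, nine of the ten pairs are relatively prime (only the pair 2, 4 is not). For large n the fraction is known to approach 6/π^2, about 0.608.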