
Slide 1: CSC 231 Data Structures
Devon M. Simmonds, University of North Carolina, Wilmington

Slide 2: Outline
- Analysis
- Running time
- Worst-case and average-case analysis
- Asymptotic notation (big-O, etc.)

Slide 3:
Computer science = solving problems using computer programs.
Program = data structures + algorithms.
Expressing good solutions to complex software-related problems requires a good grasp of algorithms and data structures.

Slide 4: What Are Algorithms?
Instructions for accessing and manipulating our data structures:
- order of access
- how data should be modified
- efficiency of access
- etc.

Slide 5: Properties of Algorithms
- Finite input.
- Finite output.
- Correctness: a correct algorithm halts with the correct output for every input.
- Efficiency/complexity: how many resources the algorithm requires (time, space, bandwidth).

Slide 6: Algorithm Analysis
We only analyze correct algorithms:
- an incorrect algorithm might not halt at all on some input instances;
- an incorrect algorithm might halt with other than the desired answer.
Analyzing an algorithm involves predicting the resources that the algorithm requires:
- time: computational time (usually the most important)
- space: memory
- bandwidth

Slide 7: Running Time?
The running time of an algorithm on a particular input is the number of primitive operations executed.

Slide 8: Algorithm Analysis…
Factors affecting the running time: computer, compiler, algorithm used, and input to the algorithm.
The content of the input affects the running time; typically, the input size (number of items in the input) is the main consideration.
- E.g., for the sorting problem: the number of items to be sorted.
- E.g., for multiplying two matrices: the total number of elements in the two matrices.
Machine model assumed: instructions are executed one after another, with no concurrent operations (not parallel computers).

Slide 9: Example
    def sum(N):
        partialSum = 0               # line 1
        for i in range(1, N + 1):    # line 2
            partialSum += i * i * i  # line 3
        return partialSum            # line 4

    def main():
        print(sum(3))

    main()
Counting the cost of sum(N):
- Lines 1 and 4 count for one unit each.
- Line 3: executed N times, four units each time (2 multiplications, 1 addition, 1 assignment): 4N.
- Line 2: 1 for initialization, N + 1 for all the tests, N for all the increments: total 2N + 2.
Total cost: 6N + 4.

Slide 10: Worst-/Average-/Best-Case
Worst-case running time of an algorithm:
- the longest running time for any input of size n;
- an upper bound on the running time for any input, a guarantee that the algorithm will never take longer;
- the worst case can occur fairly often, e.g., searching a database for a particular piece of information that sits at the end of the list.
Best-case running time: e.g., searching a database for a particular piece of information that sits at the beginning of the list.
Average-case running time: it may be difficult to define what "average" means.
A search sketch illustrating the best and worst cases appears below.
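A minimal Python sketch (not from the slides) of linear search: the best case finds the target at the front in one comparison; the worst case scans all n items.

    def linear_search(items, target):
        for index, value in enumerate(items):
            if value == target:
                return index          # best case: found immediately
        return -1                     # worst case: scanned all n items

    data = [7, 3, 9, 4, 1]
    print(linear_search(data, 7))  # 0: best case, 1 comparison
    print(linear_search(data, 1))  # 4: worst case, n comparisons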

Slide 11: Why We Do Not Use Experimental Studies
- Write a program implementing the algorithm.
- Run the program with inputs of varying size and composition.
- Use a method like time.clock() to get an accurate measure of the actual running time.
- Plot the results.
A sketch of such an experiment follows.
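A minimal sketch of this procedure. Note that time.clock() was removed in Python 3.8; the sketch substitutes time.perf_counter(), and sorted() stands in for the algorithm under test.

    import time

    def experiment(algorithm, sizes):
        for n in sizes:
            data = list(range(n))        # one input of size n; composition fixed here
            start = time.perf_counter()
            algorithm(data)
            elapsed = time.perf_counter() - start
            print(f"n = {n:>8}: {elapsed:.6f} s")

    experiment(sorted, [1000, 10000, 100000])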

Slide 12: Limitations of Experiments
- It is necessary to implement the algorithm, which may be difficult.
- Results may not be indicative of the running time on inputs not included in the experiment.
- To compare two algorithms, the same hardware and software environments must be used.
- Many factors may affect the results: poor memory use (e.g., cache misses, page faults) can give misleading results; platform dependence.

Slide 13: Theoretical Analysis
- Uses a high-level description of the algorithm instead of an implementation.
- Characterizes running time as a function of the input size, n.
- Takes into account all possible inputs.
- Allows us to evaluate the speed of an algorithm independent of the hardware/software environment.

Slide 14: Pseudocode
High-level description of an algorithm: more structured than English prose, less detailed than a program, the preferred notation for describing algorithms; it hides many program design issues.
Example: find the maximum element of an array.

    Algorithm arrayMax(A, n)
      Input: array A of n integers
      Output: maximum element of A
      currentMax <- A[0]
      for i <- 1 to n - 1 do
        if A[i] > currentMax then
          currentMax <- A[i]
      return currentMax
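A direct Python rendering of the pseudocode, as a sketch:

    def array_max(A, n):
        current_max = A[0]
        for i in range(1, n):        # i <- 1 to n - 1
            if A[i] > current_max:
                current_max = A[i]
        return current_max

    print(array_max([3, 1, 4, 1, 5, 9, 2, 6], 8))  # 9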

Slide 15: Primitive Operations
Basic computations performed by an algorithm; identifiable in pseudocode; largely independent of the programming language; assumed to take a constant amount of time in the RAM model.
Examples:
- evaluating an expression
- assigning a value to a variable
- indexing into an array
- calling a method
- returning from a method

Slide 16: The Random Access Machine (RAM) Model
- A CPU.
- A potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character.
- Memory cells are numbered (0, 1, 2, ...) and accessing any cell in memory takes unit time.
- Instructions are executed one after another; no concurrent operations.

Slide 17: Running Time? (recap)
The running time of an algorithm on a particular input is the number of primitive operations executed.

Slide 18: Counting Primitive Operations
By inspecting the pseudocode/program, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size.

    Algorithm arrayMax(A, n)             # operations
      currentMax <- A[0]                 2
      for (i <- 1; i <= n - 1; i++)      loop: 2n ops in total (see next slide)
        if A[i] > currentMax then        2 ops (index, comp)
          currentMax <- A[i]             2 ops (index, asmt)
      return currentMax                  1

    4n + 1 <= Total <= 6n - 1

Slide 19: Counting Primitive Operations (continued)
For the same arrayMax pseudocode:
- 1st line = 2 ops; last line = 1 op.
- 2nd line = 2n ops (init = 1, increments = n - 1, tests = n).
- The if statement costs either 2 ops or 4 ops and runs (n - 1) times.
So 2 + 2n + 2(n - 1) + 1 <= Total <= 2 + 2n + 4(n - 1) + 1, i.e.,
4n + 1 <= Total <= 6n - 1.
The sketch below tallies these operations empirically.
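A sketch (not from the slides) that instruments arrayMax in Python to tally primitive operations under this slide's accounting; the best- and worst-case tallies land exactly on 4n + 1 and 6n - 1.

    def array_max_counted(A, n):
        ops = 2                          # currentMax <- A[0]: index + assignment
        current_max = A[0]
        ops += 1                         # loop initialization: i <- 1
        for i in range(1, n):
            ops += 1                     # loop test: i <= n - 1 (true)
            ops += 2                     # if A[i] > currentMax: index + comparison
            if A[i] > current_max:
                ops += 2                 # currentMax <- A[i]: index + assignment
                current_max = A[i]
            ops += 1                     # increment i
        ops += 1                         # final failing loop test
        ops += 1                         # return
        return current_max, ops

    for A in ([5, 4, 3, 2, 1], [1, 2, 3, 4, 5]):  # branch never / always taken
        n = len(A)
        _, ops = array_max_counted(A, n)
        print(f"n = {n}: {ops} ops (bounds {4*n + 1} to {6*n - 1})")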

Slide 20: Estimating Running Time
Algorithm arrayMax executes 6n - 1 primitive operations in the worst case. Define:
- a = time taken by the fastest primitive operation
- b = time taken by the slowest primitive operation
Let T(n) be the worst-case time of arrayMax. Then T(n) <= b(6n - 1).
Hence, the running time T(n) is bounded by a linear function.
The actual running time can be between a(4n + 1) and b(6n - 1).
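For a hypothetical illustration: if the slowest primitive operation takes b = 10 ns, then for n = 1000 the worst case is at most 10 ns * (6 * 1000 - 1) = 59,990 ns, about 60 microseconds.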

Slide 21: Rules for Counting Primitive Operations
Rule 1 - loops: running time of a loop <= (#iterations) * (running time of the statements in the loop).

    Algorithm arrayMax(A, n)             # operations
      currentMax <- A[0]                 2
      for i <- 1 to n - 1 do             loop n - 1 times
        if A[i] > currentMax then        2 ops (index, comp)
          currentMax <- A[i]             2 ops (index, asmt)
        { increment counter i }          2 ops (cond, incr)?
      return currentMax                  1 op

    loop <= 4(n - 1)

Slide 22: Rules for Counting Primitive Operations
Rule 2 - nested loops: evaluate nested loops inside out. Running time of a nested loop <= (product of the sizes of the loops) * (running time of the statements).

    for i <- 1 to n - 1 do               loop n - 1 times
      for j <- 1 to m - 1 do             loop m - 1 times
        if A[i] > currentMax then        2 ops (index, comp)
          currentMax <- A[i]             2 ops (index, asmt)

    worst-case running time = 4(n - 1)(m - 1)
A small sketch verifying the iteration count follows.
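A minimal sketch (not from the slides) confirming Rule 2: the nested-loop body runs exactly (n - 1)(m - 1) times.

    def nested_iterations(n, m):
        count = 0
        for i in range(1, n):        # outer loop runs n - 1 times
            for j in range(1, m):    # inner loop runs m - 1 times per outer pass
                count += 1           # stands in for the loop body's statements
        return count

    n, m = 6, 4
    print(nested_iterations(n, m), "==", (n - 1) * (m - 1))  # 15 == 15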

Slide 23: Rules for Counting Primitive Operations
Rule 3 - consecutive statements: add the costs of consecutive statements.

    for i <- 1 to n - 1 do               loop n - 1 times
      for j <- 1 to m - 1 do             loop m - 1 times
        A[i] <- currentMax               2 ops (index, asmt)
        currentMax++                     1 op (incr)

    worst-case running time of the two statements = 2 + 1
    worst-case running time overall = (n - 1)(m - 1)(2 + 1)

Slide 24: Rules for Counting Primitive Operations
Rule 4 - if/else statements: running time <= running time of the condition + the larger of the two branches.

    for i <- 1 to n - 1 do                     loop n - 1 times
      for j <- 1 to m - 1 do                   loop m - 1 times
        if (A[i] > currentMax && i > n) then   4 ops (index, 3 comps)
          currentMax <- A[i]                   2 ops (index, asmt)
        else
          currentMax <- A[i] + 2               3 ops (index, add, asmt)

    worst-case running time <= (4 + 3)(n - 1)(m - 1)

Slide 25: Rules for Counting Primitive Operations (Summary)
- Consecutive statements: add their costs.
- If/else statements: running time <= running time of the condition + the larger of the two branches.
- Loops: running time <= (#iterations) * (running time of the statements in the loop).
- Nested loops: evaluate inside out; running time <= (product of the sizes of the loops) * (running time of the statements).

Slide 26: Running Time of Algorithms
Bounds are for algorithms, rather than programs: programs are just implementations of an algorithm, and the details of the program almost never affect the bounds.
Bounds are for algorithms, rather than problems: a problem can be solved with several algorithms, some more efficient than others.

Slide 27: Growth Rate of Running Time
How does the running time T(n) change as the input size increases?
Changing the hardware/software environment affects T(n) by a constant factor but does not alter the growth rate of T(n).

Slide 28: Growth Rates
Growth rates of functions: linear (n), quadratic (n^2), cubic (n^3).
In a log-log chart, the slope of the line corresponds to the growth rate of the function.

Slide 29: Constant Factors
The growth rate is not affected by constant factors or lower-order terms. Examples:
- 10^2 n + 10^5 is a linear function.
- 10^5 n^2 + 10^8 n is a quadratic function.

Slide 30: Asymptotic Algorithm Analysis
Analysis for input sizes large enough to make only the order of growth of running times relevant. The asymptotic analysis of an algorithm determines the running time in big-O notation. To perform the asymptotic analysis:
- find the worst-case number of primitive operations executed as a function of the input size;
- express this function with big-O notation.

Slide 31: Asymptotic Notation
- big-O: f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n).
- big-Omega: f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n).
- big-Theta: f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n).
- little-o: f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n).
- little-omega: f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n).

Slide 32: Asymptotic Notation: Big-O
f(N) = O(g(N)) iff there are positive constants c and n0 such that f(N) <= c g(N) when N >= n0.
The growth rate of f(N) is less than or equal to the growth rate of g(N); g(N) is an upper bound on f(N).
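A small sketch (not from the slides) that checks the definition numerically for f(N) = 2N^2 + 10 with g(N) = N^2; the witnesses c = 3 and n0 = 4 are choices made for this example.

    # f(N) <= c * g(N) for all N >= n0 certifies f(N) = O(g(N)).
    f = lambda N: 2 * N**2 + 10
    g = lambda N: N**2
    c, n0 = 3, 4

    assert all(f(N) <= c * g(N) for N in range(n0, 10000))
    print("f(N) <= 3 * g(N) holds for every tested N >= 4")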

Slide 33: Big-O Growth Rate
f(N) = O(g(N)) iff there are positive constants c and n0 such that f(N) <= c g(N) when N >= n0.
The growth rate of f(N) is less than or equal to the growth rate of g(N); g(N) is an upper bound on f(N); f(N) grows no faster than g(N) for "large" N.

Slide 34: Big-O: Example
Let f(N) = 2N^2. Then:
- f(N) = O(N^4)
- f(N) = O(N^3)
- f(N) = O(N^2)   (best answer, asymptotically tight)
O(N^2) reads "order N-squared" or "big-O N-squared".

Slide 35: Big-O: More Examples
- N^2 / 2 - 3N = O(N^2)
- 1 + 4N = O(N)
- 7N^2 + 10N + 3 = O(N^2) = O(N^3)
- log_10 N = log_2 N / log_2 10 = O(log_2 N) = O(log N)
- sin N = O(1); 10 = O(1); 10^10 = O(1)
- log N + N = O(?)
- log^k N = O(N) for any constant k
- N = O(2^N), but 2^N is not O(N)
- 2^(10N) is not O(2^N)

Slide 36: Big-O and Growth Rate
The big-O notation gives an upper bound on the growth rate of a function. The statement "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the growth rate of g(n). We can use big-O notation to rank functions according to their growth rate:

                       f(n) is O(g(n))    g(n) is O(f(n))
    g(n) grows more    Yes                No
    f(n) grows more    No                 Yes
    Same growth        Yes                Yes

Slide 37: Big-O Rules
If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e.:
1. Drop lower-order terms.
2. Drop constant factors.
Use the smallest possible class of functions: say "2n is O(n)" instead of "2n is O(n^2)".
Use the simplest expression of the class: say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)".

Slide 38: Some Rules
- Ignore the lower-order terms and the coefficient of the highest-order term.
- No need to specify the base of a logarithm: changing the base from one constant to another changes the value of the logarithm by only a constant factor.
- If T1(N) = O(f(N)) and T2(N) = O(g(N)), then
  T1(N) + T2(N) = max(O(f(N)), O(g(N))), and
  T1(N) * T2(N) = O(f(N) * g(N)).
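For example: a sort costing T1(N) = O(N log N) followed by a scan costing T2(N) = O(N) totals O(N log N) by the sum rule, while a loop that does O(N) work on each of N iterations costs O(N) * O(N) = O(N^2) by the product rule.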

Slide 39: Some Typical Growth Rates
- c: constant time
- log n: logarithmic
- log^2 n: log-squared
- n: linear
- n log n
- n^2: quadratic
- n^3: cubic
- a^n: exponential

Slide 40: Math You Need to Review
- Summations
- Logarithms and exponents
- Proof techniques: proof by contradiction, proof by induction

Slide 41: Math Review: Logarithmic Functions

Slide 42: Math You Need to Review (continued)
Properties of exponentials:
- a^(b+c) = a^b a^c
- a^(bc) = (a^b)^c
- a^b / a^c = a^(b-c)
- b = a^(log_a b)
- b^c = a^(c log_a b)
Properties of logarithms:
- log_b(xy) = log_b x + log_b y
- log_b(x/y) = log_b x - log_b y
- log_b(x^a) = a log_b x
- log_b a = log_x a / log_x b
A quick numeric spot-check of these identities follows.
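A numeric spot-check of the identities above (a sketch, not part of the slides), using math.log and arbitrary sample values:

    import math

    x, y, a, b = 8.0, 32.0, 3.0, 2.0
    assert math.isclose(math.log(x * y, b), math.log(x, b) + math.log(y, b))
    assert math.isclose(math.log(x / y, b), math.log(x, b) - math.log(y, b))
    assert math.isclose(math.log(x**a, b), a * math.log(x, b))
    assert math.isclose(math.log(a, b), math.log(a, 10) / math.log(b, 10))
    assert math.isclose(a ** math.log(x, a), x)   # b = a^(log_a b)
    print("All identities check out numerically.")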

Slide 43: Computing Prefix Averages
We illustrate asymptotic analysis with two algorithms for prefix averages. The i-th prefix average of an array X is the average of the first (i + 1) elements of X:
    A[i] = (X[0] + X[1] + ... + X[i]) / (i + 1)
Computing the array A of prefix averages of another array X has applications to financial analysis.

Slide 44: A Prefix Averages Algorithm

    Algorithm prefixAverages1(X, n)
      Input: array X of n integers
      Output: array A of prefix averages of X    approx. #operations
      A <- new array of n integers               n
      for i <- 0 to n - 1 do                     n
        s <- X[0]                                n
        for j <- 1 to i do                       1 + 2 + ... + (n - 1)
          s <- s + X[j]                          1 + 2 + ... + (n - 1)
        A[i] <- s / (i + 1)                      n
      return A                                   1

Algorithm prefixAverages1 runs in O(?) time.
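A direct Python rendering of prefixAverages1, as a sketch; the nested loop is what drives the cost.

    def prefix_averages1(X, n):
        A = [0.0] * n
        for i in range(n):
            s = X[0]
            for j in range(1, i + 1):   # j <- 1 to i
                s = s + X[j]
            A[i] = s / (i + 1)
        return A

    print(prefix_averages1([10, 20, 30, 40], 4))  # [10.0, 15.0, 20.0, 25.0]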

Slide 45: One Way to Think About This
You go through the outer loop n times. Each time through the loop you have to perform an assignment, and then another loop. The inner loop first does no iterations, then 1, then 2, etc. Each iteration of the inner loop costs a constant amount. So the real cost of this is like 0 + 1 + 2 + ... + (n - 1) = n(n - 1)/2.

Slide 46: Another Prefix Averages Algorithm

    Algorithm prefixAverages2(X, n)
      Input: array X of n integers
      Output: array A of prefix averages of X    #operations
      A <- new array of n integers               n
      s <- 0                                     1
      for i <- 0 to n - 1 do                     n
        s <- s + X[i]                            n
        A[i] <- s / (i + 1)                      n
      return A                                   1

Algorithm prefixAverages2 runs in O(?) time.
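A direct Python rendering of prefixAverages2, as a sketch; a single pass with a running sum replaces the inner loop of prefixAverages1.

    def prefix_averages2(X, n):
        A = [0.0] * n
        s = 0
        for i in range(n):
            s = s + X[i]          # running sum of X[0..i]
            A[i] = s / (i + 1)
        return A

    print(prefix_averages2([10, 20, 30, 40], 4))  # [10.0, 15.0, 20.0, 25.0]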

Slide 47: Summary
- big-O: f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n).
- big-Omega: f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n).
- big-Theta: f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n).
- little-oh: f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n).
- little-omega: f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n).

Slide 48: Questions?
Devon M. Simmonds
Computer Science Department
University of North Carolina Wilmington
Reading from course text:

Slide 49: Relatives of Big-O
- big-Omega: f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 >= 1 such that f(n) >= c g(n) for n >= n0.
- big-Theta: f(n) is Θ(g(n)) if there are constants c' > 0 and c'' > 0 and an integer constant n0 >= 1 such that c' g(n) <= f(n) <= c'' g(n) for n >= n0.
- little-oh: f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n0 >= 0 such that f(n) <= c g(n) for n >= n0.
- little-omega: f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n0 >= 0 such that f(n) >= c g(n) for n >= n0.

Slide 50: Course Overview
Devon M. Simmonds, University of North Carolina, Wilmington
Time: Tuesday/Thursday 11-11:50am in 1012, and Thursday 3:30-5:10pm in 2006.
Office hours: TR 1-2pm or by appointment.
Office location: CI2046.
Email: simmondsd[@]uncw.edu

