 # CS201: PART 1 Data Structures & Algorithms S. Kondakcı.



Analysis of Algorithms2 An algorithm is a step-by-step procedure for solving a problem in a finite amount of time; it takes an input and produces an output. Theoretical analysis of algorithms: uses a high-level description of the algorithm instead of an implementation; characterizes running time as a function of the input size n; takes into account all possible inputs; and allows us to evaluate the speed of any design independently of its implementation.

Analysis of Algorithms3 Program Efficiency Program efficiency is a measure of the amount of resources required to produce desired results. Efficiency aspects: 1) What are the important resources we should try to optimize? 2) Where are the important efficiency gains to be made? 3) How important is efficiency in the first place?

Analysis of Algorithms4 Efficiency Today

- User efficiency: the amount of time and effort users will spend to learn how to use the program, how to prepare the data, how to configure and customize the program, and how to interpret and use the output.
- Maintenance efficiency: the amount of time and effort the maintenance group will spend reading a program and its technical documentation in order to understand it well enough to make any necessary modifications.
- Algorithmic complexity: the inherent efficiency of the method itself, regardless of which machine we run it on or how we code it.
- Coding efficiency: the traditional efficiency measure. Here we are concerned with how much processor time and memory space a computer program requires to produce desired results. Coding efficiency is the key step towards optimal usage of machine resources.

Analysis of Algorithms5 Programmer’s Duty Programmers should keep these goals in mind:

1. Correct, robust, and reliable.
2. Easy to use for its intended end-user group.
3. Easy to understand and easy to modify.
4. Portable.
5. Consistent in input/output behavior.
6. Accompanied by user documentation.

Analysis of Algorithms6 Optimization

- Optimization on CPU time: Consider a network security assessment tool as a real-time application. The application works like a security scanner protocol designed to audit, monitor, and correct all aspects of network security. Real-time processing of the intercepted network packets containing inspection information requires fast data processing; besides, such a process should generate some auditing information.
- Optimization on memory: Developing programs that do not fit into the memory space available on your systems is demanding. Kernel-level processing of network packets requires kernel memory optimization and a powerful, failsafe memory management capability.
- Providing run-time continuity: Extensive machine-level optimization is a major requirement for continuously running programs, such as security scanner daemons.
- Reliability and correctness: One of the inevitable efficiency requirements is absolute reliability. The second important efficiency factor is correctness: your program should do exactly what it is supposed to do. Choosing and implementing a reliable inspection methodology should be done with precision.
- Optimization on programmer’s time: How efficiently a programmer works depends on the choice of team policy and development tool selection.

Analysis of Algorithms7 Coding Efficiency: Unstructured Code /Efficient Programming/S. Kondakci-1999

Analysis of Algorithms8 Coding Efficiency: Structured Code /Efficient Programming/S. Kondakci-1999

Analysis of Algorithms9 Protecting Against Run-time Errors

- Illegal pointer operations.
- Array subscript out of bounds.
- Endless loops, which may cause the stack to grow into the heap area.
- Representation errors, such as network byte order, number conversions, division by zero, and undefined results, e.g., tan(90°) is undefined.
- Trying to write over the kernel’s text area or data area.
- Referencing objects declared as prototypes but never defined.
- Performing operations on a pointer pointing at NULL.
- Operating system weaknesses.

Analysis of Algorithms10 Assertions A general pitfall: making assumptions that turn out not to be justified. Most mistakes arise from simply misunderstanding the interaction between various pieces of code. The assertion rule states that you should state explicitly, in the code, any facts your program relies on but does not yet verify. Any assumptions you make in writing your programs should be documented somewhere in the code itself, particularly if you know or expect the assumption to be false in other environments.

Analysis of Algorithms11 Does the Machine Understand Your Assumptions? Remember that those assumptions are yours: they must be presented to the machine explicitly in your code, because the machine cannot check your assumptions by itself. This is simply a matter of including explicit checks in your code, even for things that “cannot happen”:

```c
if (p == NULL)
    panic("Driver routine: p is NULL\n");
if (p->p_flags & BUSY) {
    /* Safe to continue */
    ...
}
```

Or, using an assertion macro:

```c
ASSERT(p != NULL);
if (p->p_flags & BUSY) {
    /* Safe to continue */
    ...
}
```
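One way such an assertion macro might be written is sketched below; the macro body and the helper `checked_div` are illustrative assumptions, not part of the slides, but they show how a documented assumption becomes a check the machine actually performs:

```c
#include <stdio.h>
#include <stdlib.h>

/* A minimal ASSERT macro: if the stated assumption does not hold,
 * print a diagnostic naming the condition and location, then abort. */
#define ASSERT(cond)                                                \
    do {                                                            \
        if (!(cond)) {                                              \
            fprintf(stderr, "Assertion failed: %s (%s:%d)\n",       \
                    #cond, __FILE__, __LINE__);                     \
            abort();                                                \
        }                                                           \
    } while (0)

/* Hypothetical example: division with its "cannot happen" case
 * (b == 0) stated explicitly to the machine. */
int checked_div(int a, int b)
{
    ASSERT(b != 0);   /* the documented assumption, now checked */
    return a / b;
}
```

The `do { ... } while (0)` wrapper makes the macro behave like a single statement, so it can safely appear in an unbraced `if`/`else`.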

Analysis of Algorithms12 Guidelines for the Implementation

1. Protect input parameters using call-by-value.
2. Avoid global variables and functions with side effects.
3. Make all temporary variables local to the functions where they are used.
4. Never halt or sleep in a function; spawn a dedicated function if necessary.
5. Avoid producing output within a function unless the sole purpose of the function is output.
6. Where appropriate, use return values to return the status of function calls.
7. Avoid confusing programming tricks.
8. Always strive for simplicity and clarity; never sacrifice clarity of expression for cleverness of expression.
9. Keep your assertions local to your code.
10. Never sacrifice clarity of expression for minor reductions in execution time.

Analysis of Algorithms13 Debugging and Tracing Making use of the preprocessor allows you to incorporate many debugging aids in your module, for instance a driver module; later, in the production version, these debugging aids can be removed. Each TRACE_x macro tests one bit of the `debugging` mask:

```c
#ifdef DEBUG
#define TRACE_OPEN  (debugging & 0x01)
#define TRACE_CLOSE (debugging & 0x02)
#define TRACE_READ  (debugging & 0x04)
#define TRACE_WRITE (debugging & 0x08)
int debugging = -1;    /* enable all trace output */
#else
#define TRACE_OPEN  0
#define TRACE_CLOSE 0
#define TRACE_READ  0
#define TRACE_WRITE 0
#endif
...
```

Analysis of Algorithms14 Tracing: Later in the Program Later in the code, the trace output can be produced in a manner similar to this:

```c
if (TRACE_READ)
    printf("Device driver read, packet number (%d)\n", pack_no);
...
```

Analysis of Algorithms15 Checking Programs With lint (Unix) The lint utility is intended to verify some facets of a C program, such as its potential portability. lint derives from the idea of picking the “fluff” out of a C program. It does this by advising on C constructs (including functions) and usage which might turn out to be ‘bugs’, portability problems, inconsistent declarations, bad function and argument types, or dead code. See the manual section lint(1) for further explanations.

Analysis of Algorithms16 Now, Lint’ing

```
$ lint -hxa mytest.c
(8) warning: loop not entered at top
(8) warning: constant in conditional context
variable unused in function
    (3) z in main
implicitly declared to return int
    (10) printf
declaration unused in block
    (5) duble
function returns value which is always ignored
    printf
```

Analysis of Algorithms17 Test Coverage Analysis Yet another tool, tcov, was born for execution tracing and analysis of programs: it can be used to trace a source file and report a coverage test, analysing the source code step by step. The extra bookkeeping code is generated by giving the -xa option to the compiler command, i.e.,

```
$ cc -xa -o src src.c
```

The -xa option invokes a runtime recording mechanism that creates a .d file for every .c file. The .d file accumulates execution data for the corresponding source file. The tcov utility can then be run on the source file to generate statistics about the program. The following example source file, getmygid.c, is analysed as:

```
$ cc -xa -o getmygid getmygid.c
$ tcov -a getmygid.c
$ ls -l getmy???*
-rwxr-xr-x  1 staff  25120 Feb 11 12:07 getmygid
-rw-------  1 staff    519 Sep  9  1994 getmygid.c
-rw-r--r--  1 staff      9 Feb 11 12:07 getmygid.d
-rw-r--r--  1 staff   1025 Feb 11 12:08 getmygid.tcov
```

Analysis of Algorithms18 Example: getmygid.c

```c
$ cat getmygid.c
#include <stdio.h>
#include <unistd.h>

char *msg = "I am sorry I cannot tell you everything";
int gid, egid;
int uid, euid, pid, ppid, i;

int main()
{
    gid = getgid();
    if (gid >= 0) printf("1- My GID is: %d\n", gid);
    egid = getegid();
    if (egid >= 0) printf("2- My EGID is: %d\n", egid);
    uid = getuid();
    if (uid >= 0) printf("3- My uid is: %d\n", uid);
    euid = geteuid();
    if (euid >= 0) printf("4- My Euid is: %d\n", euid);
    pid = getpid();
    if (pid >= 0) printf("5- My pid is: %d\n", pid);
    ppid = getppid();
    if (ppid >= 0) printf("6- My ppid is: %d\n", ppid);
    prt_msg("We came to end!!!");
    return 0;
    prt_msg(msg);    /* unreachable: follows the return */
}

prt_msg(char *mesg)
{
    printf("%s \n", mesg);
}
```

Analysis of Algorithms19 Tcov’ing getmygid.c

```
$ cat getmygid.tcov
##### -> #include <stdio.h>
##### -> char *msg = "I am sorry I cannot tell you everything";
##### ->
##### -> int gid, egid;
##### -> int uid, euid, pid, ppid, i;
##### -> int main()
##### -> {
    2 ->     gid = getgid();
    2 ->     if (gid >= 0) printf("1- My GID is: %d\n", gid);
    2 ->     egid = getegid();
    2 ->     if (egid >= 0) printf("2- My EGID is: %d\n", egid);
    2 ->     uid = getuid();
    2 ->     if (uid >= 0) printf("3- My uid is: %d\n", uid);
    2 ->     euid = geteuid();
    2 ->     if (euid >= 0) printf("4- My Euid is: %d\n", euid);
    2 ->     pid = getpid();
    2 ->     if (pid >= 0) printf("5- My pid is: %d\n", pid);
    2 ->     ppid = getppid();
    2 ->     if (ppid >= 0) printf("6- My ppid is: %d\n", ppid);
    2 ->     prt_msg("We came to end!!!");
    2 ->     return 0;
    2 ->     prt_msg(msg);
    2 -> }
    2 -> prt_msg(mesg)
    2 -> char *mesg;
    2 -> {
    2 ->     printf("%s \n", mesg);
    2 -> }
```

Analysis of Algorithms20 Tcov’ing getmygid.c

```
Top 10 Blocks

   Line    Count
      9        2
     11        2
     13        2
     15        2
     17        2
     19        2
     21        2
     29        2

      8  Basic blocks in this file
      8  Basic blocks executed
 100.00  Percent of the file executed
     16  Total basic block executions
   2.00  Average executions per basic block
```

As shown, tcov(1) generates an annotated listing of the source file (getmygid.tcov), where each line is prefixed with a number indicating how many times the statements on that line were executed. Finally, per-line and per-block statistics are shown.

Analysis of Algorithms21 Have a nice break!

Analysis of Algorithms An algorithm is a step-by-step procedure for solving a problem in a finite amount of time, transforming an input into an output.

Analysis of Algorithms23 Running Time Most algorithms transform input objects into output objects. The running time of an algorithm typically grows with the input size. Average-case time is often difficult to determine, so we focus on the worst-case running time: it is easier to analyze, and it is crucial to applications such as games, finance, and robotics.

Analysis of Algorithms24 Experimental Studies Write a program implementing the algorithm Run the program with inputs of varying size and composition Use a function, like the built-in clock() function, to get an accurate measure of the actual running time Plot the results

Analysis of Algorithms25 Limitations of Experiments It is necessary to implement the algorithm, which may be difficult Results may not be indicative of the running time on other inputs not included in the experiment. In order to compare two algorithms, the same hardware and software environments must be used

Analysis of Algorithms26 Theoretical Analysis Uses a high-level description of the algorithm instead of an implementation Characterizes running time as a function of the input size, n. Takes into account all possible inputs Allows us to evaluate the speed of an algorithm independent of the hardware/software environment

Analysis of Algorithms27 Pseudocode High-level description of an algorithm: more structured than English prose, less detailed than a program, and the preferred notation for describing algorithms; it hides program design issues. Example: find the max element of an array.

```
Algorithm arrayMax(A, n)
  Input: array A of n integers
  Output: maximum element of A
  currentMax ← A[0]
  for i ← 1 to n − 1 do
    if A[i] > currentMax then
      currentMax ← A[i]
  return currentMax
```
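As a sketch, the arrayMax pseudocode above translates almost line for line into C (the function name and signature are my own rendering, assuming n ≥ 1):

```c
/* C rendering of the arrayMax pseudocode: scan the array once,
 * keeping the largest element seen so far. Assumes n >= 1. */
int arrayMax(const int A[], int n)
{
    int currentMax = A[0];
    for (int i = 1; i <= n - 1; i++)   /* mirrors "for i <- 1 to n-1" */
        if (A[i] > currentMax)
            currentMax = A[i];
    return currentMax;
}
```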

Analysis of Algorithms28 Pseudocode Details Control flow: if … then … [else …]; while … do …; repeat … until …; for … do …; indentation replaces braces. Method declaration: Algorithm method(arg [, arg…]) with Input … and Output … clauses. Method/function call: method(arg [, arg…]). Return value: return expression. Expressions: ← assignment (like = in C++); = equality testing (like == in C++); n² superscripts and other mathematical formatting allowed.

Analysis of Algorithms29 The Random Access Machine (RAM) Model A CPU and a potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character. Memory cells are numbered (0, 1, 2, …), and accessing any cell in memory takes unit time.

Analysis of Algorithms30 Primitive Operations Basic computations performed by an algorithm Identifiable in pseudocode Largely independent from the programming language Exact definition not important Assumed to take a constant amount of time in the RAM model Examples: Evaluating an expression Assigning a value to a variable Indexing into an array Calling a method Returning from a method

Analysis of Algorithms31 Counting Primitive Operations By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size:

```
Algorithm arrayMax(A, n)                 # operations
  currentMax ← A[0]                      2
  for i ← 1 to n − 1 do                  2 + n
    if A[i] > currentMax then            2(n − 1)
      currentMax ← A[i]                  2(n − 1)
    { increment counter i }              2(n − 1)
  return currentMax                      1
                                 Total   7n − 1
```

Analysis of Algorithms32 Estimating Running Time Algorithm arrayMax executes 7n − 1 primitive operations in the worst case. Define: a = time taken by the fastest primitive operation, b = time taken by the slowest primitive operation. Let T(n) be the worst-case time of arrayMax. Then a(7n − 1) ≤ T(n) ≤ b(7n − 1). Hence, the running time T(n) is bounded by two linear functions.

Analysis of Algorithms33 Growth Rate of Running Time Changing the hardware/ software environment Affects T(n) by a constant factor, but Does not alter the growth rate of T(n) The linear growth rate of the running time T(n) is an intrinsic property of algorithm arrayMax

Analysis of Algorithms34 Growth Rates Growth rates of functions: linear ≈ n, quadratic ≈ n², cubic ≈ n³. In a log-log chart, the slope of the line corresponds to the growth rate of the function.

Analysis of Algorithms35 Constant Factors The growth rate is not affected by constant factors or lower-order terms. Examples: 10²n + 10⁵ is a linear function; 10⁵n² + 10⁸n is a quadratic function.

Analysis of Algorithms36 Big-Oh Notation Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n₀ such that f(n) ≤ c·g(n) for n ≥ n₀. Example: 2n + 10 is O(n), since 2n + 10 ≤ c·n gives (c − 2)n ≥ 10, i.e., n ≥ 10/(c − 2); pick c = 3 and n₀ = 10.
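The witness pair (c, n₀) from the example above can be spot-checked mechanically; the sketch below (the function name is my own) tests the inequality f(n) ≤ c·g(n) for f(n) = 2n + 10, g(n) = n, c = 3:

```c
/* Checks the Big-Oh witness c = 3 for f(n) = 2n + 10 and g(n) = n:
 * returns nonzero exactly when 2n + 10 <= 3n, i.e. when n >= 10. */
int witness_holds(long n)
{
    return 2 * n + 10 <= 3 * n;
}
```

A check like this does not prove the bound (only the algebra does), but it catches a mis-chosen n₀: the inequality fails for n = 9 and holds from n₀ = 10 onwards.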

Analysis of Algorithms37 Big-Oh Example Example: the function n² is not O(n). The inequality n² ≤ c·n would require n ≤ c, which cannot be satisfied for all n ≥ n₀ since c must be a constant.

Analysis of Algorithms38 More Big-Oh Examples

- 7n − 2 is O(n): need c > 0 and n₀ ≥ 1 such that 7n − 2 ≤ c·n for n ≥ n₀; this is true for c = 7 and n₀ = 1.
- 3n³ + 20n² + 5 is O(n³): need c > 0 and n₀ ≥ 1 such that 3n³ + 20n² + 5 ≤ c·n³ for n ≥ n₀; this is true for c = 4 and n₀ = 21.
- 3 log n + log log n is O(log n): need c > 0 and n₀ ≥ 1 such that 3 log n + log log n ≤ c·log n for n ≥ n₀; this is true for c = 4 and n₀ = 2.

Analysis of Algorithms39 Big-Oh and Growth Rate The big-Oh notation gives an upper bound on the growth rate of a function. The statement “f(n) is O(g(n))” means that the growth rate of f(n) is no more than the growth rate of g(n). We can use the big-Oh notation to rank functions according to their growth rate:

|                 | f(n) is O(g(n)) | g(n) is O(f(n)) |
|-----------------|-----------------|-----------------|
| g(n) grows more | Yes             | No              |
| f(n) grows more | No              | Yes             |
| Same growth     | Yes             | Yes             |

Analysis of Algorithms40 Big-Oh Rules If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e., 1) drop lower-order terms, and 2) drop constant factors. Use the smallest possible class of functions: say “2n is O(n)” instead of “2n is O(n²)”. Use the simplest expression of the class: say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”.

Analysis of Algorithms41 Asymptotic Algorithm Analysis The asymptotic analysis of an algorithm determines the running time in big-Oh notation. To perform the asymptotic analysis, we find the worst-case number of primitive operations executed as a function of the input size, and we express this function with big-Oh notation. Example: we determine that algorithm arrayMax executes at most 7n − 1 primitive operations, so we say that algorithm arrayMax “runs in O(n) time”. Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations.

Analysis of Algorithms42 Computing Prefix Averages We further illustrate asymptotic analysis with two algorithms for prefix averages. The i-th prefix average of an array X is the average of the first (i + 1) elements of X: A[i] = (X[0] + X[1] + … + X[i]) / (i + 1).

Analysis of Algorithms43 Prefix Averages (Quadratic) The following algorithm computes prefix averages in quadratic time by applying the definition:

```
Algorithm prefixAverages1(X, n)          # operations
  Input: array X of n integers
  Output: array A of prefix averages of X
  A ← new array of n integers            n
  for i ← 0 to n − 1 do                  n
    s ← X[0]                             n
    for j ← 1 to i do                    1 + 2 + … + (n − 1)
      s ← s + X[j]                       1 + 2 + … + (n − 1)
    A[i] ← s / (i + 1)                   n
  return A                               1
```
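A C sketch of the quadratic algorithm above (the signature and the use of double for the averages are my own choices; the caller frees the result):

```c
#include <stdlib.h>

/* Quadratic prefix averages, a direct rendering of prefixAverages1:
 * A[i] = (X[0] + ... + X[i]) / (i + 1). The inner loop re-sums the
 * whole prefix for every i, which is what makes it O(n^2). */
double *prefixAverages1(const int X[], int n)
{
    double *A = malloc(n * sizeof *A);
    for (int i = 0; i < n; i++) {
        long s = 0;
        for (int j = 0; j <= i; j++)   /* sum the prefix X[0..i] */
            s += X[j];
        A[i] = (double)s / (i + 1);
    }
    return A;
}
```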

Analysis of Algorithms44 Arithmetic Progression The running time of prefixAverages1 is O(1 + 2 + … + n). The sum of the first n integers is n(n + 1)/2 (there is a simple visual proof of this fact). Thus, algorithm prefixAverages1 runs in O(n²) time.

Analysis of Algorithms45 Prefix Averages (Linear) The following algorithm computes prefix averages in linear time by keeping a running sum:

```
Algorithm prefixAverages2(X, n)          # operations
  Input: array X of n integers
  Output: array A of prefix averages of X
  A ← new array of n integers            n
  s ← 0                                  1
  for i ← 0 to n − 1 do                  n
    s ← s + X[i]                         n
    A[i] ← s / (i + 1)                   n
  return A                               1
```

Algorithm prefixAverages2 runs in O(n) time.
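The linear version above can be sketched in C with the same contract as the quadratic one (signature and double result are my own choices; caller frees):

```c
#include <stdlib.h>

/* Linear prefix averages: a running sum s means each element of X
 * is added exactly once, so the whole loop is O(n). */
double *prefixAverages2(const int X[], int n)
{
    double *A = malloc(n * sizeof *A);
    long s = 0;
    for (int i = 0; i < n; i++) {
        s += X[i];                     /* s now holds X[0] + ... + X[i] */
        A[i] = (double)s / (i + 1);
    }
    return A;
}
```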

Analysis of Algorithms46 Computing Spans We show how to use a stack as an auxiliary data structure in an algorithm. Given an array X, the span S[i] of X[i] is the maximum number of consecutive elements X[j] immediately preceding X[i] (counting X[i] itself) such that X[j] ≤ X[i]. Spans have applications to financial analysis, e.g., a stock at a 52-week high. Example: X = 6 3 4 5 2 gives S = 1 1 2 3 1.

Analysis of Algorithms47 Quadratic Algorithm

```
Algorithm spans1(X, n)                   # operations
  Input: array X of n integers
  Output: array S of spans of X
  S ← new array of n integers            n
  for i ← 0 to n − 1 do                  n
    s ← 1                                n
    while s ≤ i ∧ X[i − s] ≤ X[i] do     1 + 2 + … + (n − 1)
      s ← s + 1                          1 + 2 + … + (n − 1)
    S[i] ← s                             n
  return S                               1
```

Algorithm spans1 runs in O(n²) time.
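A C sketch of the quadratic span algorithm (the signature is my own; caller frees the result), using the slide's example X = 6 3 4 5 2:

```c
#include <stdlib.h>

/* Quadratic span computation: for each i, walk backwards while the
 * preceding elements are <= X[i]; the count s is the span. */
int *spans1(const int X[], int n)
{
    int *S = malloc(n * sizeof *S);
    for (int i = 0; i < n; i++) {
        int s = 1;                       /* X[i] itself */
        while (s <= i && X[i - s] <= X[i])
            s++;
        S[i] = s;
    }
    return S;
}
```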

Analysis of Algorithms48 Have a nice break!

Analysis of Algorithms49 Recursion Recursion: a function calls itself; each such self-invocation is a recursive call. Compare the iterative loop with the recursive version of computing 1 + 2 + … + n:

```c
/* iterative */
for (i = 1; i <= n; i++)
    sum = sum + i;

/* recursive */
int sum(int n)
{
    if (n <= 1)
        return 1;
    else
        return n + sum(n - 1);
}
```

Analysis of Algorithms50 Recursive function

```c
int f(int x)
{
    if (x == 0)
        return 0;
    else
        return 2 * f(x - 1) + x * x;
}
```

Analysis of Algorithms51 Recursion Calculate the factorial (n!) of a positive integer: n! = n(n − 1)(n − 2)⋯2·1, with 0! = 1! = 1.

```c
int factorial(int n)
{
    if (n <= 1)
        return 1;
    else
        return n * factorial(n - 1);
}
```

Analysis of Algorithms52 Fibonacci numbers: a bad algorithm for n > 40!

```c
long fib(int n)
{
    if (n <= 1)
        return 1;
    else
        return fib(n - 1) + fib(n - 2);
}
```

Analysis of Algorithms53 Algorithm IterativeLinearSum(A, n)

```
Algorithm IterativeLinearSum(A, n):
  Input: an integer array A and an integer n (size)
  Output: the sum of the first n integers in A
  if n = 1 then
    return A[0]
  else
    sum ← 0
    while n > 0 do
      sum ← sum + A[n − 1]
      n ← n − 1
    return sum
```

Analysis of Algorithms54 Algorithm LinearSum(A, n)

```
Algorithm LinearSum(A, n):
  Input: an integer array A and an integer n (size)
  Output: the sum of the first n integers in A
  if n = 1 then
    return A[0]
  else
    return LinearSum(A, n − 1) + A[n − 1]
```

Analysis of Algorithms55 Iterative Approach: Algorithm IterativeReverseArray(A, i, n)

```
Algorithm IterativeReverseArray(A, i, n):
  Input: an integer array A and integers i and n
  Output: the reversal of the n integers in A starting at index i
  while n > 1 do
    swap A[i] and A[i + n − 1]
    i ← i + 1
    n ← n − 2
  return
```

Analysis of Algorithms56 Algorithm ReverseArray(A, i, n)

```
Algorithm ReverseArray(A, i, n):
  Input: an integer array A and integers i and n
  Output: the reversal of the n integers in A starting at index i
  if n > 1 then
    swap A[i] and A[i + n − 1]
    call ReverseArray(A, i + 1, n − 2)
  return
```
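The recursive ReverseArray above can be sketched in C (the function name mirrors the pseudocode; the in-place swap is spelled out with a temporary):

```c
/* Recursive in-place reversal of the n elements starting at index i:
 * swap the two ends, then recurse on the interior of the range. */
void reverseArray(int A[], int i, int n)
{
    if (n > 1) {
        int tmp = A[i];
        A[i] = A[i + n - 1];
        A[i + n - 1] = tmp;
        reverseArray(A, i + 1, n - 2);   /* range shrinks by 2 */
    }
}
```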

Analysis of Algorithms57 Higher-Order Recursion Making more than a single recursive call at a time:

```
Algorithm BinarySum(A, i, n):
  Input: an integer array A and integers i and n
  Output: the sum of the n integers in A starting at index i
  if n = 1 then
    return A[i]
  return BinarySum(A, i, ⌈n/2⌉) + BinarySum(A, i + ⌈n/2⌉, ⌊n/2⌋)
```
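A C sketch of BinarySum (the variable `half` computes ⌈n/2⌉ with integer arithmetic, so the split also works for odd n):

```c
/* Binary recursion: split the range of n elements starting at i into
 * a left half of ceil(n/2) elements and a right half of floor(n/2),
 * sum each recursively, and add the results. */
int binarySum(const int A[], int i, int n)
{
    if (n == 1)
        return A[i];
    int half = (n + 1) / 2;            /* ceil(n/2) */
    return binarySum(A, i, half) + binarySum(A, i + half, n - half);
}
```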

Analysis of Algorithms58 kth Fibonacci Numbers The kth Fibonacci number is defined by the recurrence F(0) = 0, F(1) = 1, and F(k) = F(k − 1) + F(k − 2) for k ≥ 2.

Analysis of Algorithms59 kth Fibonacci Numbers — Linear recursion

```
Algorithm LinearFibonacci(k):
  Input: an integer k
  Output: a pair (F_k, F_k−1) such that F_k is the kth Fibonacci
          number and F_k−1 is the (k−1)st Fibonacci number
  if k ≤ 1 then
    return (k, 0)
  else
    (i, j) ← LinearFibonacci(k − 1)
    return (i + j, i)
```
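The pair-returning linear recursion above might be sketched in C with a small struct (the `FibPair` type and function name are my own):

```c
/* Linear-recursion Fibonacci: one recursive call per level, returning
 * the pair (F(k), F(k-1)) so each number is computed exactly once. */
typedef struct { long fk, fk1; } FibPair;

FibPair linearFib(int k)
{
    if (k <= 1) {
        FibPair base = { k, 0 };       /* F(0) = 0, F(1) = 1 */
        return base;
    }
    FibPair p = linearFib(k - 1);
    FibPair r = { p.fk + p.fk1, p.fk };
    return r;
}
```

Because each level makes only one recursive call, this takes O(k) time, versus the exponential time of the binary-recursive version on the next slide.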

Analysis of Algorithms60 kth Fibonacci Numbers — Binary recursion

```
Algorithm BinaryFib(k):
  Input: an integer k
  Output: the kth Fibonacci number
  if k ≤ 1 then
    return k
  else
    return BinaryFib(k − 1) + BinaryFib(k − 2)
```

Analysis of Algorithms61 Math You Need to Review Summations; logarithms and exponents; proof techniques; basic probability.

Properties of logarithms:
- log_b(xy) = log_b x + log_b y
- log_b(x/y) = log_b x − log_b y
- log_b(x^a) = a·log_b x
- log_b a = log_x a / log_x b

Properties of exponentials:
- a^(b+c) = a^b · a^c
- a^(bc) = (a^b)^c
- a^b / a^c = a^(b−c)
- b = a^(log_a b)
- b^c = a^(c·log_a b)

Analysis of Algorithms62 Relatives of Big-Oh

- big-Omega: f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n₀ ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n₀.
- big-Theta: f(n) is Θ(g(n)) if there are constants c′ > 0 and c″ > 0 and an integer constant n₀ ≥ 1 such that c′·g(n) ≤ f(n) ≤ c″·g(n) for n ≥ n₀.
- little-oh: f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n₀ ≥ 0 such that f(n) ≤ c·g(n) for n ≥ n₀.
- little-omega: f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n₀ ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n₀.

Analysis of Algorithms63 Intuition for Asymptotic Notation

- Big-Oh: f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n).
- big-Omega: f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n).
- big-Theta: f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n).
- little-oh: f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n).
- little-omega: f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n).

Analysis of Algorithms64 Example Uses of the Relatives of Big-Oh

- 5n² is Ω(n²): there must be a constant c > 0 and an integer constant n₀ ≥ 1 such that 5n² ≥ c·n² for n ≥ n₀; let c = 5 and n₀ = 1.
- 5n² is Ω(n): there must be c > 0 and n₀ ≥ 1 such that 5n² ≥ c·n for n ≥ n₀; let c = 1 and n₀ = 1.
- 5n² is ω(n): for any constant c > 0, there must be an integer constant n₀ ≥ 0 such that 5n₀² ≥ c·n₀; given c, the n₀ that satisfies this is n₀ ≥ c/5.
