COSO 1030 Section 2 Software Engineering Concepts and Computational Complexity
What it is about The Art of Programming Software Engineering Structural Programming Correctness Testing & Verification Efficiency & Its Measurement
The Art of Computer Programming
The Era of The Art Small Program Small Team (mainly one person) Limited Resources Limited Applications Programmers = Mathematicians Masterpieces It was the 1950s-60s
Software Engineering What is software engineering? – Discipline – Methodology – Tools Why do we need SE? – Increased Demand – Larger Applications – Software Firms
Software Engineering Requirement and Specification Structural Programming Methodology Correctness and Validation Efficiency Measurement Development and Maintenance Management Formal Methods and CASE tools
Life cycle of software development Requirement Acquisition – Requirement Docs Architectural Design – Software Specification Component Design – Detail Specification Coding, Debugging and Testing – Code and test cases Integration and Testing – Deployable Software Deployment – Put into production Maintenance – Make sure it runs healthily
Structural Programming Top-Down programming – High level abstraction – Pseudo code Describe ideas Use as comment in Java – Stepwise refinement Introduce helper functions Insert method call to implement pseudo code Modularity – Hide implementation detail – Maintain a simple interface – Incremental compilation or build
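The stepwise refinement described above can be sketched in Java as follows. This is only an illustration under assumptions: the task (averaging the valid scores), the class name `TopDownDemo`, and the helpers `keepValid` and `average` are hypothetical and do not come from the slides. The top-level method keeps the pseudo code as comments, and each pseudo code step is implemented by a call to a helper method.

```java
import java.util.Arrays;

// Hypothetical example of top-down design: "average the valid scores".
class TopDownDemo {
    // Step 1: the top-level method reads like the pseudo code.
    static double averageOfValidScores(int[] scores) {
        // pseudo code: keep only the scores between 0 and 100
        int[] valid = keepValid(scores);
        // pseudo code: compute the average of what is left
        return average(valid);
    }

    // Step 2: each pseudo code line is refined into a helper method.
    static int[] keepValid(int[] scores) {
        return Arrays.stream(scores).filter(s -> s >= 0 && s <= 100).toArray();
    }

    static double average(int[] values) {
        if (values.length == 0) {
            return 0.0; // assumption: an empty input averages to 0
        }
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    public static void main(String[] args) {
        // -5 and 120 are filtered out; (90 + 70 + 80) / 3 = 80.0
        System.out.println(averageOfValidScores(new int[] {90, -5, 70, 120, 80}));
    }
}
```

Callers only see `averageOfValidScores`; the helpers hide the implementation detail, which is the modularity point made on the slide.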
Requirement and Specification Requirement – What the user wants – Functional Requirement Component Functionality, Coordination, UI, … – Non-functional Requirements Deadline, Budget, Response Time, … Specification – What programmers should know – Interface between components – Pre-post conditions of a function – User Interface specification – Performance Specification
Program Aspects Correctness Validity Efficiency Usability Extendibility Readability Reusability Modularity
Correctness Meets the functional specification – All valid inputs produce output that meets the spec – All invalid inputs generate output that reports the error Only with respect to the specification – Does not mean valid or acceptable
Formal Specification Formal vs. informal specification – Use math language vs. natural language Advantages – Precisely defined, not ambiguous – Formal method to prove correctness Disadvantages – Not easy to understand – Not easy to describe common sense
Formal Spec (cont.) Mathematical logic – Propositional Logic – First-order Logic For all x P(x), There exists x such that P(x) – Temporal Logic Functional specification – Pre and post-conditions – Post-condition must be met at the end of a program, provided that the input satisfies the pre-condition
Proving Correctness Assertion – Claims the condition specified in an assertion must be satisfied at the time the program runs through the assertion. – Pre and post condition – Loop invariant {Precondition} P {Postcondition} if(precondition) { P; assert(postcondition); }
Informal specification – Given width and height, calculate the area of a rectangle. Formal specification – Pre condition: {width > 0 and height > 0} – double area(double width, double height) – Post condition: {area = width * height} Assertion
double area(double width, double height) {
    assert(width > 0 && height > 0); // pre condition
    double area = ……
    assert(area == width * height); // post condition
    return area;
}
double area(double width, double height) {
    assert(width > 0 && height > 0); // pre condition
    double area = height * width;
    assert(area == width * height); // post condition
    return area;
}
Assuming width > 0 and height > 0, we need to prove area = width * height. Since area = height * width, we only need to prove height * width = width * height. This is true because multiplication is commutative.
Loop Invariant A condition that is always true at a given point within the loop Helps to prove the postcondition of the loop
{ pre: a is of int[0..n-1] and n > 0 and i = 0 and m = a[0] }
while (i < n) {
    if (a[i] > m) m = a[i];
    { invariant: m >= a[0..i] }
    i = i + 1;
}
{ post: m >= a[0..n-1] }
Precondition: a is of int[0:n-1] and n > 0 and i = 0 and m = a[0]
1. Foundation: i = 0 and m = a[0], thus m >= a[0:i] = a[0]
2. Induction: Assume m >= a[0:i-1] where 0 < i < n
   1. If a[i] > m then a[i] > a[0:i-1], that is, a[i] >= a[0:i]. In this case, m = a[i] is executed. Thus m >= a[0:i]
   2. If a[i] <= m then m is unchanged, and m >= a[0:i-1] together with m >= a[i] gives m >= a[0:i]
   Thus m >= a[0:i] for any i < n
3. Conclusion: for any i < n, m >= a[0:i] holds at the invariant. The loop exits with i = n, so the last iteration ran with i = n-1 and the post condition m >= a[0:n-1] is true
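The invariant used in the proof can also be checked at run time. The sketch below is an illustration, not the course's code: the class name `MaxInvariantDemo` and the helper `dominates` are hypothetical, and explicit checks are used instead of `assert` so that they run even when assertions are disabled. Re-checking the invariant on every pass makes the method quadratic, so this form is only suitable for demonstration and debugging.

```java
// Hypothetical demo: the max loop with its invariant checked on every iteration.
class MaxInvariantDemo {
    // True when m >= a[k] for every k in 0..last, i.e. m >= a[0..last].
    static boolean dominates(int m, int[] a, int last) {
        for (int k = 0; k <= last; k++) {
            if (a[k] > m) {
                return false;
            }
        }
        return true;
    }

    static int max(int[] a) {
        // pre: a is of int[0..n-1] and n > 0
        if (a == null || a.length == 0) {
            throw new IllegalArgumentException("precondition violated: n > 0 required");
        }
        int n = a.length;
        int i = 0;
        int m = a[0];
        while (i < n) {
            if (a[i] > m) {
                m = a[i];
            }
            // invariant: m >= a[0..i]
            if (!dominates(m, a, i)) {
                throw new IllegalStateException("invariant violated at i=" + i);
            }
            i = i + 1;
        }
        // post: m >= a[0..n-1]
        if (!dominates(m, a, n - 1)) {
            throw new IllegalStateException("postcondition violated");
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(max(new int[] {3, 9, 2, 9, -1})); // prints 9
    }
}
```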
Validity The program meets the user's intended requirements How to describe the intention? – Requirement documents - informal – Can requirement documents fully describe the intention? – impossible – What if the intention itself was wrong? How to verify? – By testing
Testing Verify a program by running it and analyzing its input and output Can find bugs, but can’t assure there are no bugs Tests done by developers – Unit test – Integration test – Regression test Tests done by users – User satisfaction test – Regression test
White box testing Developers run tests while developing the program Bottom-up testing – Test basic components first – Build up tested layers for higher layer testing – Mutually recursive methods Check points – Print trace information at critical spots – Make sure check points cover all execution paths – Provide input and check that the intended execution path is executed
White Box Testing (cont.) Checking boundary & invalid input – Array boundary – 0, null, not a positive number, … Checking initialization – Local variables have no default value; javac forces you to initialize them – Field variables are set to null, 0, false by default – Most common exception: NullPointerException Checking re-entering – Does the object or method hold the assumed value?
Black Box Testing Without knowing implementation, test against requirement or specification Provide sample data, check result – test cases Sample data – Positive samples Typical, special case – Negative samples Invalid input, unreasonable data Sequences of input – Does sequence make any difference?
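The positive and negative samples above can be written down as a small set of test cases. The sketch below is an assumption-laden illustration: it reuses the rectangle `area` example from the earlier slides, but the decision to reject invalid input with `IllegalArgumentException`, the class name `AreaBlackBoxTest`, and the helpers `rejects` and `runAll` are hypothetical. The test cases only look at inputs and outputs, not at how `area` is implemented.

```java
// Hypothetical black box tests for area(width, height).
class AreaBlackBoxTest {
    // Code under test. Assumption: invalid input is reported with an exception.
    static double area(double width, double height) {
        if (!(width > 0 && height > 0)) {
            throw new IllegalArgumentException("width and height must be positive");
        }
        return width * height;
    }

    // True when area(w, h) reports the input as invalid.
    static boolean rejects(double w, double h) {
        try {
            area(w, h);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Runs every test case and returns the number of failures.
    static int runAll() {
        int failures = 0;
        // Positive samples: a typical case and a special case (unit square).
        if (area(3.0, 4.0) != 12.0) failures++;
        if (area(1.0, 1.0) != 1.0) failures++;
        // Negative samples: invalid or unreasonable input must be rejected.
        if (!rejects(0.0, 5.0)) failures++;
        if (!rejects(-2.0, 5.0)) failures++;
        if (!rejects(Double.NaN, 5.0)) failures++;
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + runAll());
    }
}
```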
Efficiency Only use reasonable resources – CPU time – Memory & disk space – Network connection or database connection In a reasonable period – Allocate memory only when it is needed – Close connection when no longer needed Trade off between time and other resources
Efficiency measurement Problem size n Number of instructions The curve or function of time against n
Curves of time vs. size
Functions and Analysis F1(n) = a1*n^2 + b1*n F2(n) = a2*n^2 + b2*n Dominant terms – As n gets bigger, the a*n^2 term contributes 98% of the value Simplified functions – F1’(n) = a1*n^2 – F2’(n) = a2*n^2 – F1’(n)/F2’(n) = a1/a2 – measures the difference between the two machines O(n^2)
Complexity Classes
Adjective Name   O-Notation
Constant         O(1)
Logarithmic      O(log n)
Linear           O(n)
n log n          O(n log n)
Quadratic        O(n^2)
Cubic            O(n^3)
Exponential      O(2^n)
Running Time (µsec)
f(n)      n=2   n=16     n=256     n=1024     n=1048576
log n     1     4        8         1.0e1      2.0e1
n         2     1.6e1    2.56e2    1.02e3     1.05e6
n log n   2     6.4e1    2.05e3    1.02e4     2.10e7
n^2       4     2.56e2   6.55e4    1.05e6     1.10e12
n^3       8     4.10e3   1.68e7    1.07e9     1.15e18
2^n       4     6.55e4   1.16e77   1.80e308   6.74e315652
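The entries in the 2^n row are far too large for a `double` or `long`, but their size in scientific notation can be estimated with logarithms, since 2^n = 10^(n * log10 2). The sketch below shows one way to do this; the class name `GrowthDemo` and its methods are hypothetical, and the mantissa is only approximate because it is computed in floating point.

```java
// Hypothetical helper: write 2^n as m * 10^e without computing 2^n itself.
class GrowthDemo {
    // Decimal exponent e such that 2^n = m * 10^e with 1 <= m < 10.
    static long decimalExponent(long n) {
        return (long) Math.floor(n * Math.log10(2.0));
    }

    // Approximate mantissa m for the same representation.
    static double mantissa(long n) {
        double log = n * Math.log10(2.0);
        return Math.pow(10.0, log - Math.floor(log));
    }

    public static void main(String[] args) {
        long[] sizes = {2, 16, 256, 1024, 1048576};
        for (long n : sizes) {
            System.out.println("2^" + n + " is about " + mantissa(n) + " * 10^" + decimalExponent(n));
        }
    }
}
```

For n = 1024 this gives an exponent of 308, and for n = 1048576 an exponent of 315652, which is why no hardware speed-up makes an O(2^n) algorithm practical at that size.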
Hardware vs. Software Moore's Law – CPUs run 2 times faster each year Wait at least ¼ million years for a computer that can solve problem sized in 100 years with an O(2^n) algorithm Never think that algorithms are unimportant just because computers run faster and faster
Definition of O-Notation f(n) is of O(g(n)) if there exist two positive constants K and n0 such that |f(n)| <= K*|g(n)| for all n >= n0 – Where g(n) can be one of the complexity class functions – K can be the ratio of the coefficients of the two functions – n0 is the turning point from which the O-notation takes effect O(1)<O(log n)<O(n)<O(n log n)<O(n^2)<O(n^3)<O(2^n) A problem P is of O(g(n)) if there is an algorithm A of O(g(n)) that can solve the problem. Sorting is of O(n log n)
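As a worked instance of the definition, take the function f(n) = 3n^2 + 5n, which is chosen here only for illustration and does not appear in the slides. Choosing K = 4 and n0 = 5 shows that it is of O(n^2):

```latex
\begin{align*}
f(n) &= 3n^2 + 5n, \qquad g(n) = n^2 \\
3n^2 + 5n \le 4n^2 &\iff 5n \le n^2 \iff n \ge 5 \\
\text{hence } |f(n)| &\le K\,|g(n)| \text{ for all } n \ge n_0, \text{ with } K = 4,\ n_0 = 5
\end{align*}
```

Other choices work as well, for example K = 8 and n0 = 1; the definition only requires that some pair of constants exists.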
O-Arithmetic O(O(F)) = O(F) O(F + G) = O (H) where H = max(F, G) O(F*G) = O(F*O(G)) = O(O(F)*G) = O(O(F) * O(G))
Evaluate Program Complexity If no loop, no recursion O(1) One level of loop for (int i = 0; i < n; i++) { F(n, i); } is of O(n) where F(n, i) is of O(1) A nested loop is of O(n*g(n,i)) where g(n,i) is the complexity class of F(n, i)
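One way to see these rules in practice is to count how many times the loop body runs. The sketch below is a minimal illustration with hypothetical names (`LoopCountDemo`, `singleLoop`, `nestedLoop`); the loop body is a constant-time step, standing in for an F(n, i) of O(1).

```java
// Hypothetical demo: count loop body executions for a single and a nested loop.
class LoopCountDemo {
    // One level of loop with an O(1) body: the body runs n times, so O(n).
    static long singleLoop(int n) {
        long steps = 0;
        for (int i = 0; i < n; i++) {
            steps++; // stands in for an O(1) call F(n, i)
        }
        return steps;
    }

    // The body is itself an O(n) loop, so the whole is O(n * n) = O(n^2).
    static long nestedLoop(int n) {
        long steps = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                steps++;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(singleLoop(1000)); // prints 1000
        System.out.println(nestedLoop(1000)); // prints 1000000
    }
}
```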
Recursion Complexity Analysis Fibonacci function – F(0) = 1 – F(1) = 1 – F(n) = F(n-1) + F(n-2) where n >= 2
F(n)
>> F(n-1), F(n-2)
>> F(n-3), F(n-2), F(n-4), F(n-3)
>> F(n-5), F(n-4), F(n-4), F(n-3), F(n-6), F(n-5), F(n-5), F(n-4)
Each level of expansion doubles the number of calls, so F(n) is of O(2^n)
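The growth of the call tree can be observed by counting calls in a direct recursive implementation. The sketch below follows the definition on the slide (F(0) = F(1) = 1); the class name `FibCallsDemo`, the static counter, and `countCalls` are hypothetical additions for illustration.

```java
// Hypothetical demo: count the calls made by the naive recursive Fibonacci.
class FibCallsDemo {
    static long calls = 0; // number of fib invocations since the last reset

    // Direct translation of F(0) = 1, F(1) = 1, F(n) = F(n-1) + F(n-2).
    static long fib(int n) {
        calls++;
        if (n < 2) {
            return 1;
        }
        return fib(n - 1) + fib(n - 2);
    }

    // Resets the counter, evaluates fib(n), and returns the number of calls made.
    static long countCalls(int n) {
        calls = 0;
        fib(n);
        return calls;
    }

    public static void main(String[] args) {
        for (int n = 5; n <= 25; n += 5) {
            System.out.println("n=" + n + " calls=" + countCalls(n));
        }
    }
}
```

With this definition, computing F(10) takes 177 calls and F(20) takes 21891 calls: adding 10 to n multiplies the work by more than 100, which is exponential rather than polynomial growth.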
Summary of the section Top-down development breaks down the complexity of problems Pre & post conditions and loop invariants Proving correctness of simple programs White box and black box testing Testing can’t guarantee correctness or validity
Summary (cont) Measure complexity using O-notation F(n) is of O(g(n)) if |F(n)| <= K*|g(n)| for all n >= n0 Complexity classes Complexity of a loop Complexity of recursion