
1 Model-Driven Test Design
Jeff Offutt
Professor, Software Engineering
George Mason University, Fairfax, VA, USA
www.cs.gmu.edu/~offutt/
offutt@gmu.edu

2 OUTLINE
1. Why Do We Test?
2. Poor Testing Has Real Costs
3. Model-Driven Test Design
4. Industrial Software Problems
5. Testing Solutions that Work
   – Web Applications are Different
   – Input Validation Testing
   – Bypass Testing
6. Coming Changes in How We Test Software

3 Here! Test This!
My first "professional" job: "MicroSteff" V.1.5.1, a big software system for the Mac, handed over as a floppy disk and a stack of computer printouts, with no documentation.

4 Cost of Testing
– In the real world, testing is the principal post-design activity
– Restricting early testing usually increases cost
– Extensive hardware-software integration requires more testing
You're going to spend at least half of your development budget on testing, whether you want to or not

5 Part 1: Why Test?
– Written test objectives and requirements are rare
– What are your planned coverage levels? How much testing is enough?
– A common objective: spend the budget ...
If you don't know why you're conducting a test, it won't be very helpful

6 Why Test?
– 1980: "The software shall be easily maintainable"
– Threshold reliability requirements?
– What fact is each test trying to verify?
– Requirements definition teams should include testers!
If you don't start planning for each test when the functional requirements are formed, you'll never know why you're conducting the test

7 Cost of Not Testing
Program managers often say: "Testing is too expensive." But:
– Not testing is even more expensive
– Planning for testing after development is prohibitively expensive
– A test station for circuit boards costs half a million dollars ...
– Software test tools cost less than $10,000 !!!

8 Testing & Government Procurement
A common government model is to develop systems by:
– Procuring components
– Integrating them into a fully working system
– Treating software as merely one component
This model has many advantages ... but some disadvantages:
– Little control over the quality of components
– Lots and lots of really bad software
21st century software is more than a "component":
– Software is the brains that defines a system's core behavior
– Government must impose testing requirements

9 OUTLINE
1. Why Do We Test?
2. Poor Testing Has Real Costs
3. Model-Driven Test Design
4. Industrial Software Problems
5. Testing Solutions that Work
   – Web Applications are Different
   – Input Validation Testing
   – Bypass Testing
6. Coming Changes in How We Test Software

10 Software Testing: The Academic View
1970s and 1980s: academics looked almost exclusively at unit testing
– Meanwhile, industry & government focused almost exclusively on system testing
1990s: some academics looked at system testing, some at integration testing
– The growth of OO put complexity in the interconnections
2000s: academics are trying to move our rich collection of ideas into practice
– Reliability requirements in industry & government are increasing exponentially

11 Academics and Practitioners
Academics focus on coverage criteria with strong bases in theory: quantitative techniques
– Industry has focused on human-driven, domain-knowledge based, qualitative techniques
Practitioners say "criteria-based coverage is too expensive"
– Academics say "human-based testing is more expensive and ineffective"
Practice is going through a revolution in what testing means to the success of software products

12 My Personal Evolution
In the '80s, my PhD work was on unit testing
– That was all I saw in testing
In the '90s, I recognized that OO put most of the complexity into software connections
– Integration testing
I woke up in the '00s ...
– Web apps need to be very reliable, and system testing is often appropriate
– Agitar and Certess convinced me that we must relax theory to put our ideas into practice
Testers ain't mathematicians!

13 Tech Transition in the 1990s
Cartoon: "They're teaching a new way of plowing over at the Grange tonight - you going?" "Naw - I already don't plow as good as I know how ..."
"Knowing is not enough, we must apply. Willing is not enough, we must do." — Goethe

14 Failures in Production Software
– NASA's Mars Climate Orbiter was lost in September 1999 due to a units integration fault: over $50 million US!
– Huge losses due to web application failures
   – Financial services: $6.5 million per hour
   – Credit card sales applications: $2.4 million per hour
– In Dec 2006, amazon.com's BOGO offer turned into a double discount
– 2007: Symantec says that most security vulnerabilities are due to faulty software
Stronger testing could solve most of these problems
The world-wide monetary loss due to poor software is staggering

15 Testing in the 21st Century
We are going through a dramatic time of change
– Software defines behavior: network routers, finance, switching networks, other infrastructure
Today's software market:
– is much bigger
– is more competitive
– has more users
The way we use software continues to expand
Software testing theory is very advanced, yet practice continues to lag
Industry & government organizations are going through a revolution in what testing means to the success of software products

16 Testing in the 21st Century
– The web offers a new deployment platform: very competitive, and available to many more users
– Enterprise applications mean bigger programs and more users
– Embedded software is ubiquitous ... check your pockets!
– Paradoxically, free software increases our expectations!
– Security is now all about software faults: secure software is reliable software
– Agile processes put enormous pressure on testers and programmers to test better
There is no theory, no research, and little practical knowledge about how to test integrated software systems

17 How to Improve Testing?
We need more and better software tools
– There has been a stunning increase in available tools in the last 10 years!
We need to adopt practices and techniques that lead to more efficient and effective testing
– More education
– Different management and organizational strategies
Testing / QA teams need to specialize more
– The same trend happened for development in the 1990s
Testing / QA teams need more technical expertise
– Developer expertise has been increasing dramatically

18 OUTLINE
1. Why Do We Test?
2. Poor Testing Has Real Costs
3. Model-Driven Test Design
4. Industrial Software Problems
5. Testing Solutions that Work
   – Web Applications are Different
   – Input Validation Testing
   – Bypass Testing
6. Coming Changes in How We Test Software

19 Test Design in Context
Test design is the process of designing input values that will effectively test software
Test design is one of several activities for testing software
– It is the most mathematical
– It is the most technically challenging
This process is based on my textbook with Ammann, Introduction to Software Testing
http://www.cs.gmu.edu/~offutt/softwaretest/

20 Types of Test Activities
Testing can be broken up into four general types of activities:
1. Test Design
   1a. Criteria-based
   1b. Human-based
2. Test Automation
3. Test Execution
4. Test Evaluation
Each type of activity requires different skills, background knowledge, education and training
No reasonable software development organization uses the same people for requirements, design, implementation, integration and configuration control
Why do test organizations still use the same people for all four test activities? This clearly wastes resources

21 Summary of Test Activities
1a. Design (criteria-based): design test values to satisfy engineering goals; requires knowledge of discrete math, programming and testing
1b. Design (human-based): design test values from domain knowledge and intuition; requires knowledge of the domain, the UI and testing
2. Automation: embed test values into executable scripts; requires knowledge of scripting
3. Execution: run tests on the software and record the results; requires very little knowledge
4. Evaluation: evaluate the results of testing and report to developers; requires domain knowledge
These four general test activities are quite different; it is a poor use of resources to use people inappropriately
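
As a hedged illustration of how these activities hand off to each other, the sketch below shows activity 2 (automation) for a hypothetical AgeClassifier class: the input values and expected results are assumed to come from the design activities (1a or 1b), and the script merely embeds them so that execution (3) and evaluation (4) can be mechanical. The class, the boundary at 18, and the expected strings are invented for illustration; the test API is standard JUnit 4.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    // Activity 2 (automation): embed previously designed test values and
    // expected results in an executable script. The values themselves were
    // chosen during activity 1a or 1b; AgeClassifier is hypothetical.
    public class AgeClassifierTest {

        // Value chosen by human-based design (1b): a typical customer age.
        @Test
        public void typicalAdult() {
            assertEquals("adult", new AgeClassifier().classify(35));
        }

        // Value chosen by criteria-based design (1a): the assumed boundary
        // between the "minor" and "adult" partitions of the input domain.
        @Test
        public void boundaryBetweenMinorAndAdult() {
            assertEquals("adult", new AgeClassifier().classify(18));
        }
    }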

22 Model-Driven Test Design – Steps
[Process diagram]
Design abstraction level: software artifacts are analyzed mathematically to build a model / structure; a coverage criterion turns the model into test requirements, which are refined into refined requirements / test specs.
Implementation abstraction level: domain analysis generates input values from the test specs; adding prefix, postfix and expected values turns them into test cases; test cases are automated into test scripts, executed to produce test results, and evaluated as pass / fail, with feedback flowing back to the test requirements.

23 MDTD – Activities
[The same process diagram, grouped by activity]
Test design covers the design abstraction level (software artifact, model / structure, test requirements, refined requirements / test specs); this is where the math lives ("here be math"). The remaining steps (input values, test cases, test scripts, test results, pass / fail) belong to test execution and test evaluation at the implementation abstraction level.
Raising our abstraction level makes test design MUCH easier
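
To make the design abstraction level concrete, here is a minimal sketch (not from the original slides) in which the software artifact is a small Java method, the model is its control flow graph, the criterion is edge coverage, and the refined test specs are input values that together tour every edge. The method and the node numbering are invented for illustration.

    // Software artifact: a small Java method.
    public class Price {
        // Applies a 10% discount to bulk orders and floors the result at zero.
        public static double discounted(double price, int quantity) {
            double result = price;          // node 1
            if (quantity >= 10) {           // node 2 (decision)
                result = price * 0.9;       // node 3
            }
            if (result < 0) {               // node 4 (decision)
                result = 0;                 // node 5
            }
            return result;                  // node 6
        }
    }

    // Model / structure: the control flow graph with edges
    //   1-2, 2-3, 2-4, 3-4, 4-5, 4-6, 5-6
    // Test requirements (edge coverage criterion): every edge must be toured.
    // Refined requirements / test specs (one possible choice of input values):
    //   (price = 100, quantity = 12) tours 1-2, 2-3, 3-4, 4-6
    //   (price = 50,  quantity = 1)  tours 1-2, 2-4, 4-6
    //   (price = -5,  quantity = 1)  tours 1-2, 2-4, 4-5, 5-6
    // These specs are then handed off to be automated, executed and evaluated.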

24 Using MDTD in Practice
This approach lets one test designer do the math
Then traditional testers and programmers can do their parts:
– Find values
– Automate the tests
– Run the tests
– Evaluate the tests
Testers ain't mathematicians!

25 OUTLINE
1. Why Do We Test?
2. Poor Testing Has Real Costs
3. Model-Driven Test Design
4. Industrial Software Problems
5. Testing Solutions that Work
   – Web Applications are Different
   – Input Validation Testing
   – Bypass Testing
6. Coming Changes in How We Test Software

26 GMU’s Software Engineering Lab At GMU’s Software Engineering Lab, we are committed to finding practical solutions to real problems IDGA Military Test & Evaluation Summit© Jeff Offutt26 Useful research in making better software We are inventing new testing solutions Web applications and web services Critical software Investigating current state of the practice and comparing with available techniques

27 Mismatch in Needs and Goals
Industry & contractors want simple and easy testing
– Testers with no background in computing or math
Universities are graduating scientists
– Industry needs engineers
Testing needs to be done more rigorously
Agile processes put lots of demands on testing
– Programmers have to do unit testing, with no training, education or tools!
– Tests are key components of functional requirements, but who builds those tests?
Bottom-line result: lots of crappy software

28 How to Improve Testing?
Testers need more and better software tools
Testers need to adopt practices and techniques that lead to more efficient and effective testing
– More education
– Different management and organizational strategies
Testing / QA teams need more technical expertise
– Developer expertise has been increasing dramatically
Testing / QA teams need to specialize more
– The same trend happened for development in the 1990s

29 Quality of Industry Tools
A recent evaluation of three industrial automatic unit test data generators:
– JCrasher, TestGen, JUB
– The tools generate tests for Java classes
– Evaluated on the basis of mutants killed
Compared with two test criteria:
– Random test generation (special-purpose tool)
– Edge coverage criterion (satisfied by hand)
Subjects: eight Java classes
– 61 methods, 534 LOC, 1070 faults (seeded by mutation)
— Shuang Wang and Jeff Offutt, Comparison of Unit-Level Automated Test Generation Tools, Mutation 2009
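
For readers unfamiliar with the measure, "mutants killed" works roughly as in the hedged sketch below: a mutant is a small syntactic change to the program, and a test kills the mutant when the mutant's result differs from the original program's result on that test. The method here is invented for illustration; it is not one of the study's subject classes.

    // Original method (illustrative only).
    public static int abs(int x) {
        if (x < 0) {
            return -x;
        }
        return x;
    }

    // One mutant changes "return -x;" to "return x;".
    // A test kills the mutant when its observed result differs from the
    // original's result on the same input:
    //   abs(7)  -> 7 under both versions, so this test leaves the mutant alive
    //   abs(-7) -> 7 originally but -7 under the mutant, so this test kills it
    // The study scores each tool's tests by the fraction of seeded mutants killed.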

30 Unit-Level ATDG Results
These tools essentially generate random values!

31 Quality of Criteria-Based Tests
In another study, we compared four test criteria:
– Edge-pair, All-uses, Prime path, Mutation
– Tests were generated for Java classes
– Evaluated on the basis of finding hand-seeded faults
Subjects: twenty-nine Java packages
– 51 classes, 174 methods, 2909 LOC
– Eighty-eight faults
— Nan Li, Upsorn Praphamontripong and Jeff Offutt, An Experimental Comparison of Four Unit Test Criteria: Mutation, Edge-Pair, All-uses and Prime Path Coverage, Mutation 2009
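
As a rough sketch of what one of these criteria asks for: a prime path is a simple path in the control flow graph that is not a subpath of any other simple path, and prime path coverage requires tests that tour each one. The method and node numbering below are invented for illustration, and only some of its prime paths are listed.

    // Illustrative method with one loop.
    public static int sumPositive(int[] a) {
        int sum = 0;                   // node 1
        int i = 0;
        while (i < a.length) {         // node 2 (loop decision)
            if (a[i] > 0) {            // node 3
                sum = sum + a[i];      // node 4
            }
            i = i + 1;                 // node 5
        }
        return sum;                    // node 6
    }

    // Edges: 1-2, 2-3, 2-6, 3-4, 3-5, 4-5, 5-2.
    // Prime paths include (among others):
    //   [1,2,6]        loop never entered (empty array)
    //   [1,2,3,4,5]    first iteration processes a positive element
    //   [1,2,3,5]      first iteration skips a non-positive element
    //   [2,3,4,5,2]    the loop repeats after a positive element
    //   [2,3,5,2]      the loop repeats after a non-positive element
    //   [3,4,5,2,6]    last iteration processes a positive element, then exits
    //   [3,5,2,6]      last iteration skips a non-positive element, then exits
    // Touring all of them forces tests with empty, single-element and
    // multi-element arrays containing both positive and non-positive values.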

32 Criteria-Based Test Results
Researchers have invented very powerful techniques

33 Industry and Research Tool Gap
We cannot compare these two studies directly, but we can compare the conclusions:
– Industrial test data generators are ineffective
– Edge coverage is much better than the tests the tools generated, and edge coverage is by far the weakest criterion
– The biggest challenge was hand generation of tests
Software companies need to test better, and luckily, we have lots of room for improvement!

34 Four Roadblocks to Adoption
1. Lack of test education
   – Bill Gates says half of Microsoft's engineers are testers, and programmers spend half their time testing
   – Number of UG CS programs in the US that require testing? 0
   – Number of MS CS programs in the US that require testing? 0
   – Number of UG testing classes in the US? ~10
2. Necessity to change process
   – Adoption of many test techniques and tools requires changes in the development process
   – This is very expensive for most software companies
3. Usability of tools
   – Many testing tools require the user to know the underlying theory to use them
   – Do we need to understand an internal combustion engine to drive? Do we need to understand parsing and code generation to use a compiler?
4. Weak and ineffective tools
   – Most test tools don't do much, but most users do not realize they could be better
   – Few tools solve the key technical problem: generating test values automatically

35 OUTLINE
1. Why Do We Test?
2. Poor Testing Has Real Costs
3. Model-Driven Test Design
4. Industrial Software Problems
5. Testing Solutions that Work
   – Web Applications are Different
   – Input Validation Testing
   – Bypass Testing
6. Coming Changes in How We Test Software

36 General Problems with Web Apps
The web offers a new deployment platform
– Very competitive, and available to more users
– Web apps are distributed
– Web apps must be highly reliable
Web applications are heterogeneous, dynamic, and must satisfy very high quality attributes
Use of the web is hindered by low-quality web sites and applications
Web applications need to be built better and tested more
– Most software faults are introduced during maintenance
– Testing is difficult because the web uses new and novel technologies

37 Technical Web App Issues
1. Software components are extremely loosely coupled
   – Coupled through the Internet, separated by space
   – Coupled to diverse hardware and software applications
2. Potential control flows change dynamically
   – User control: back buttons, URL rewriting, refresh, caching
   – Server control: redirect, forward, include, event listeners
3. State management is completely different
   – HTTP is stateless and the software is distributed
   – Traditional object-oriented scopes are not available
   – New scopes: page, request, session, application ...
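
As a minimal sketch of point 3, assuming the standard javax.servlet API: because HTTP is stateless, per-user state has to live in one of the web scopes (here, session scope) instead of ordinary object fields, and user actions such as refresh, the back button and caching determine whether the servlet even runs again. The servlet and the attribute name are invented for illustration.

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import javax.servlet.http.HttpSession;

    // Keeps a per-user counter in session scope because HTTP itself is stateless.
    public class VisitCounterServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            HttpSession session = request.getSession();             // session scope
            Integer visits = (Integer) session.getAttribute("visitCount");
            visits = (visits == null) ? 1 : visits + 1;
            session.setAttribute("visitCount", visits);
            response.setContentType("text/plain");
            response.getWriter().println("Visits this session: " + visits);
        }
    }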

38 Input Space Grammars
Input space: the set of allowable inputs to the software
The input space can be described in many ways:
– User manuals
– Unix man pages
– Method signatures / collections of method preconditions
– A language
Most input spaces can be described as grammars
Grammars are usually not provided, but creating them is a valuable service by the tester
– Errors will often be found simply by creating the grammar
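
For instance, the input space of a single date field could be written as a small grammar and checked mechanically. The sketch below is purely illustrative (the grammar, class and field are not from the slides): the grammar appears as comments and is encoded as a regular expression.

    import java.util.regex.Pattern;

    // A toy input space grammar for a date field (illustrative only):
    //   date  ::= year "-" month "-" day
    //   year  ::= digit digit digit digit
    //   month ::= "01" | "02" | ... | "12"
    //   day   ::= "01" | "02" | ... | "31"
    public class DateInputGrammar {

        private static final Pattern DATE =
            Pattern.compile("\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])");

        // Returns true if the raw input string is in the described input space.
        public static boolean isValid(String input) {
            return input != null && DATE.matcher(input).matches();
        }
    }

Writing even this tiny grammar raises questions (should "2019-02-31" be allowed? are one-digit months acceptable?), which is exactly how creating the grammar exposes errors.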

39 Using Input Grammars
Software should reject or handle invalid data
– Programs often do this incorrectly
– Some programs (rashly) assume all input data is correct
Even if it works today ...
– What about after the program goes through some maintenance changes?
– What about if the component is reused in a new program?
Consequences can be severe ...
– The database can be corrupted
– Users are not satisfied
– Most security vulnerabilities are due to unhandled exceptions ... from invalid data

40 Validating Web App Inputs
Input validation: deciding if input values can be processed by the software
Before starting to process inputs, wisely written programs check that the inputs are valid
– How should a program recognize invalid inputs?
– What should a program do with invalid inputs?
If the input space is described as a grammar, a parser can check for validity automatically
– This is very rare
– It is easy to write input checkers, but also easy to make mistakes

41 Representing Input Domains
[Diagram of three nested input domains]
– Goal domain: the desired inputs
– Specified domain: the described inputs
– Implemented domain: the accepted inputs

42 Representing Input Domains
Goal domains are often irregular
Goal domain for credit cards †:
– The first digit is the Major Industry Identifier (bank, government, ...)
– The first 6 digits and the length specify the issuer
– The final digit is a "check digit"
– The other digits identify a specific account
Common specified domain:
– The first digit is in { 3, 4, 5, 6 } (travel and banking)
– The length is between 13 and 16
Common implemented domain:
– All digits are numeric
† More details: http://www.merriampark.com/anatomycc.htm
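
The "check digit" above is conventionally verified with the Luhn algorithm, so a tester can compute whether a candidate number is even in the goal domain. A minimal sketch (illustrative code, not from the slides; it assumes the string already passed the "all digits are numeric" check):

    // Luhn check: the conventional algorithm behind the credit card check digit.
    public static boolean luhnValid(String number) {
        int sum = 0;
        boolean doubleIt = false;          // double every second digit, from the right
        for (int i = number.length() - 1; i >= 0; i--) {
            int d = number.charAt(i) - '0';
            if (doubleIt) {
                d = d * 2;
                if (d > 9) {
                    d = d - 9;
                }
            }
            sum = sum + d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }
    // A full goal-domain check would combine this with the issuer-prefix and
    // length rules; the common implemented domain above checks none of them.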

43 Representing Input Domains
[The nested-domain diagram again: goal domain, specified domain, implemented domain, with the mismatch between them highlighted]
This region is a rich source of software errors ...

44 Web Application Input Validation
[Diagram: a client submits data to a server that guards sensitive data]
– The client checks data before sending it to the server
– Bad data can corrupt the database, crash the server, or cause security violations
– Malicious data can "bypass" the client-side data checking and reach the server directly
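
Bypass testing exploits exactly this gap: a test sends the server a request that the client-side checks would normally prevent. A minimal sketch using Java's built-in HttpClient; the URL, form fields and offending values are invented for illustration.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Submits a form directly to the server, skipping the browser and therefore
    // skipping any client-side (JavaScript or HTML attribute) input checking.
    public class BypassTestExample {
        public static void main(String[] args) throws Exception {
            // Hypothetical target and parameters; "age=-1" and the script tag are
            // values the client-side checks would normally reject.
            String form = "name=%3Cscript%3Ealert(1)%3C%2Fscript%3E&age=-1";

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8080/app/createAccount"))
                    .header("Content-Type", "application/x-www-form-urlencoded")
                    .POST(HttpRequest.BodyPublishers.ofString(form))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());

            // Evaluation: the server should answer with a controlled error page,
            // not a stack trace, a corrupted record, or an echoed script.
            System.out.println(response.statusCode());
            System.out.println(response.body());
        }
    }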

45 Bypass Testing Results
— Vasileios Papadimitriou, Automating Bypass Testing for Web Applications, Master's thesis, GMU, 2006

46 Theory to Practice: Bypass Testing
Inventions from scientists are slow to move into practice
We wanted to investigate whether the obstacles are:
1. Technical difficulties of applying the ideas in practice
2. Social barriers
3. Business constraints
We tried to transition bypass testing technology to Avaya Labs, the research arm of Avaya
— Offutt, Wang and Ordille, An Industrial Case Study of Bypass Testing on Web Applications, ICST 2008

47 Avaya Bypass Testing Results
Six screens were tested
Tests are invalid inputs, so exceptions are expected
Effects on the back end were not checked
– Failure analysis was based on the response screens
Web Screen | Tests | Failing Tests | Unique Failures
Points of Contact | 42 | 23 | 12
Time Profile | 53 | 23 | 23
Notification Profile | 34 | 12 | 6
Notification Filter | 26 | 16 | 7
Change PIN | 5 | 1 | 1
Create Account | 24 | 17 | 14
TOTAL | 184 | 92 | 63
A 33% "efficiency" rate is spectacular!

48 OUTLINE
1. Why Do We Test?
2. Poor Testing Has Real Costs
3. Model-Driven Test Design
4. Industrial Software Problems
5. Testing Solutions that Work
   – Web Applications are Different
   – Input Validation Testing
   – Bypass Testing
6. Coming Changes in How We Test Software

49 Needs From Researchers
1. Isolate: invent processes and techniques that isolate the theory from most test practitioners
2. Disguise: discover engineering techniques, standards and frameworks that disguise the theory
3. Embed: put theoretical ideas into tools
4. Experiment: demonstrate the economic value of criteria-based testing and ATDG
   – Which criteria should be used, and when?
   – When does the extra effort pay off?
5. Integrate: integrate high-end testing with development

50 Needs From Educators
1. Disguise the theory from engineers in classes
2. Omit the theory when it is not needed
3. Restructure the curriculum to teach more than test design and theory:
   – Test automation
   – Test evaluation
   – Human-based testing
   – Test-driven development

51 Changes in Practice
1. Reorganize test and QA teams to make effective use of individual abilities
   – One math-head can support many testers
2. Retrain test and QA teams
   – Use a process like MDTD
   – Learn more of the concepts in testing
3. Encourage researchers to embed and isolate
   – We are very responsive to research grants
4. Get involved in curricular design efforts through industrial advisory boards

52 Future of Software Testing
1. Increased specialization in testing teams will lead to more efficient and effective testing
2. Testing and QA teams will have more technical expertise
3. Developers will have more knowledge about testing and more motivation to test better
4. Agile processes put testing first, which puts pressure on both testers and developers to test better
5. Testing and security are starting to merge
6. We will develop new ways to test connections within software-based systems

53 Contact
Jeff Offutt
offutt@gmu.edu
http://cs.gmu.edu/~offutt/

