Black Box Software Testing Domain Testing


1 Black Box Software Testing Domain Testing

2 Introductory Notes Domain testing is the most commonly taught (and perhaps the most commonly used) software testing technique. We start in the way it’s traditionally introduced to testers (for example, by Myers and by Kaner, Falk & Nguyen). That is, we develop the concept of equivalence classes and boundaries through careful analysis of a simple example, developing the idea all the way through the test documentation (boundary charts) traditionally recommended for this style of testing. In practice, domain testing isn’t nearly this simple. Simplistic descriptions may do more harm than good by misleading testers into a belief that testing can be handled by a routine set of clearly defined procedures. We’ll study some of the interesting complexities of domain testing.

3 Let's work a simple example
Here is a program’s specification:
This program is designed to add two numbers, which you will enter.
Each number should be one or two digits.
The program will print the sum.
Press Enter after each number.
To start the program, type ADDER.
Before you start testing, do you have any questions about the spec?

4 Working through the example
Here’s my basic strategy for dealing with new code:
1. Start with obvious and simple tests. Test the program with easy-to-pass values that will be taken as serious issues if the program fails.
2. Test each function sympathetically. Learn why this feature is valuable before you criticize it.
3. Test broadly before deeply. Check all parts of the program quickly before focusing.
4. Look for more powerful tests. Try boundary conditions. Once the program can survive the easy tests, you need a strategy for choosing powerful tests from all of the candidates.
5. Expand your scope. Put on your thinking cap; look for challenges. Do some freestyle exploratory testing. Run new tests every week, from the first week to the last week of the project.

5 1. The simple, mainstream tests
For the first test, try a pair of easy values, such as 3 plus 7. Here is the screen display that results from that test. Are there any bug reports that you would file from this?
? 3
? 7
10
? _

6 2. Test each function sympathetically
Why is this function here? What will the customer want to do with it? What is it about this function that, once it is working, will make the customer happy? Knowing what the customer will want to do with the feature gives you a much stronger context for discovering and explaining what is wrong with the function, or with the function's interaction with the rest of the program.

7 3. Test broadly before deeply
The objective of early testing is to flush out the big problems as quickly as possible. You will explore the program in more depth as it gets more stable. There is no point hammering a design into oblivion if it is going to change. Report as many problems as you think it will take to force a change, and then move on.

8 4. Classical equivalence & boundary analysis
There are 199 values for each variable: 1 to 99 (99 values), 0 (1 value), and -1 to -99 (99 values). There are 199 x 199 = 39,601 combination tests. Should we test them all?
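
The count on this slide can be checked with a few lines of arithmetic. This is only a sketch: it assumes that "one or two digits" means the integers from -99 to 99, which is the reading the 199 figure implies, and the variable names are illustrative.

```python
# Assumption: "one or two digits" is read as the integers -99..99.
positives = len(range(1, 100))    # 1 to 99
zero = 1                          # the single value 0
negatives = len(range(-99, 0))    # -99 to -1

values_per_variable = positives + zero + negatives
combination_tests = values_per_variable ** 2   # two independent inputs

print(values_per_variable, combination_tests)
```

Even for this tiny program the full cross product is far more than anyone would run by hand, which is the motivation for sampling.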

9 4. Classical equivalence class & boundary analysis
We tested 3 + 7. Should we also test 4 + 7? 2 + 7? 2 + 8? 3 + 8? 3 + 3? 7 + 7? Why? What would you learn from these? What error would you expect one of these other tests to expose that 3 + 7 would not already have exposed?

10 4. Classical equivalence class & boundary analysis
What about the values not in the spec? 100 and above; -100 and below. Should we run these tests? Why or why not?

11 4. Classical equivalence class & boundary analysis
Some people want to automate these tests. How would you automate them all? How will you tell whether the program passed or failed? We cannot afford to run every possible test. We need a method for choosing a few powerful tests that will represent the rest. Equivalence analysis is the most widely used approach.

12 4. Classical equivalence class & boundary analysis
To avoid unnecessary testing, partition (divide) the range of inputs into groups of equivalent tests. We treat two tests as equivalent if they are so similar to each other that it seems pointless to test both. Select an input value from the equivalence class as representative of the full group. If you can map the input space to a number line, boundaries mark the point or zone of transition from one equivalence class to another. These are good members of equivalence classes to use because the program is more likely to fail at a boundary. These are fuzzy definitions of equivalence and boundary.
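
One way to picture the partition-then-pick-boundaries idea for a single ADDER input is sketched below. The three classes and the outer limits of plus or minus one million are assumptions made for illustration; the spec itself does not say what happens outside one or two digits.

```python
from typing import List, Tuple

# Hypothetical partitions for one ADDER input, ordered along the number line.
# Each tuple is (label, lowest value, highest value).
PARTITIONS: List[Tuple[str, int, int]] = [
    ("too small", -10**6, -100),   # outer limit is an arbitrary stand-in
    ("in range", -99, 99),
    ("too large", 100, 10**6),     # outer limit is an arbitrary stand-in
]

def boundary_values(partitions: List[Tuple[str, int, int]]) -> List[int]:
    """Return the values on each side of every transition between adjacent classes."""
    values = []
    for (_, _, high), (_, low, _) in zip(partitions, partitions[1:]):
        values.extend([high, low])
    return values

print(boundary_values(PARTITIONS))
```

Each transition contributes one value from each neighbouring class, so a single pair of tests probes both sides of the boundary.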

13 Myers’ boundary table The traditional analysis looks at the potential numeric entries and partitions them the way the specification would partition them.

14 The classical boundary table
Combination tests of N variables create N-rows with the boundary values of each of the component variables.

15 Boundary table as a test plan component
Makes the reasoning obvious. Makes the relationships between test cases fairly obvious. Expected results are pretty obvious. Several tests on one page. Can delegate it and have tester check off what was done. Provides some limited opportunity for tracking. Not much room for status. Question, now that we have the table, must we do all the tests? What about doing them all each time (each cycle of testing)?

16 Building the table (in practice)
Relatively few programs will come to you with all fields fully specified. Therefore, you should expect to learn what variables exist and their definitions over time. To build an equivalence class analysis over time, put the information into a spreadsheet. Start by listing variables. Add information about them as you obtain it. The table should eventually contain all variables. This means, all input variables, all output variables, and any intermediate variables that you can observe. In practice, most tables that I’ve seen are incomplete. The best ones that I’ve seen list all the variables and add detail for critical variables.

17 Scope of the analysis Several books stop here, or continue in the same direction, but look for ways to reduce the number of combination tests. See, for example: Ilene Burnstein’s Practical Software Testing (2004); Paul Jorgensen’s Software Testing: A Craftsman’s Approach (2nd Ed., 2002); Robert Binder’s Testing Object-Oriented Systems: Models, Patterns & Tools (2000); Boris Beizer’s Black Box Testing: Techniques for Functional Testing of Software & Systems (1995).

18 What does this approach achieve?
This is a systematic sampling approach to test design. We can’t afford to run all tests, so we divide the population of tests into subpopulations and test one or a few representatives of each subgroup. This keeps the number of tests manageable. Using boundary values for the tests offers a few benefits: They will expose any errors that affect an entire equivalence class. They will expose errors that misspecify a boundary. These can be coding errors (off-by-one errors such as saying “less than” instead of “less than or equal”) or typing mistakes (such as entering 57 instead of 75 as the constant that defines the boundary). Misspecification can also result from ambiguity or confusion about the decision rule that defines the boundary. Non-boundary values are less likely to expose these errors.

19 Domain analysis on floating point
Do a domain analysis on page width. What's the difference between this and analysis of an integer?

20 Domain analysis on these variables?
Would you do a domain analysis on these variables? What benefit would you gain from it?

21 Examples of ordered sets
So many examples of domain analysis involve databases or simple data input fields that some testers don't generalize. Here's a sample of other variables that fit the traditional equivalence class / boundary analysis mold: ranges of numbers; character codes; how many times something is done (e.g. shareware limit on number of uses of a product) (e.g. how many times you can do it before you run out of memory); how many names in a mailing list, records in a database, variables in a spreadsheet, bookmarks, abbreviations; size of the sum of variables, or of some other computed value (think binary and think digits); size of a number that you enter (number of digits) or size of a character string; size of a concatenated string; size of a path specification; size of a file name; size (in characters) of a document.
Further prompts: the notion of binary state as a boundary; between algorithms (e.g. chi-square approximation); numerosity (number of usages, number of records); size; sum; digits entered; time (race conditions? timeouts; speed of data entry? year 2000; 12:00 midnight; special cases such as 24:00 rolling over to 00:01 or 24:01); most recent, first; space (on page, memory < 640K, on disk, 64K); file system resources; date; intensity; what else?

22 Examples of ordered sets
size of a file (note special values such as exactly 64K, exactly 512 bytes, etc.); size of the document on the page (compared to page margins) (across different page margins, page sizes); size of a document on a page, in terms of the memory requirements for the page (this might just be in terms of resolution x page size, but it may be more complex if we have compression); equivalent output events (such as printing documents); amount of available memory (> 128 meg, > 640K, etc.); visual resolution, size of screen, number of colors; operating system version; variations within a group of “compatible” printers, sound cards, modems, etc.; equivalent event times (when something happens); timing: how long between event A and event B (and in which order--races); length of time after a timeout (from JUST before to way after) -- what events are important?

23 Examples of ordered sets
speed of data entry (time between keystrokes, menus, etc.); speed of input--handling of concurrent events; number of devices connected / active; system resources consumed / available (also, handles, stack space, etc.); date and time; transitions between algorithms (optimizations) (different ways to compute a function); most recent event, first event; input or output intensity (voltage); speed / extent of voltage transition (e.g. from very soft to very loud sound)

24 Domain analysis of results
This is the print dialog in Open Office. Suppose that the largest number of copies you could enter in the Number of Copies field is 999, OR that your printer will manage multiple copies, for up to 99 copies. For each case, how would you do a traditional domain analysis?

25 Review Question Gerald Weinberg’s Triangle Problem has been in use since about Glen Myers published it in the first book on software testing, The Art of Software Testing, in 1979: The triangle program reads three numbers from a punch card (yes, that’s right, a punch card, so don’t talk about what you’d do with some GUI) and interprets them as the sides of a triangle. The program then states whether the triangle is scalene, equilateral, or isosceles. How would you test this program? (List or describe your tests.) If this program was life-critical, what tests would you add? Why?

26 Myers’ answer to the triangle problem
1. Test case for a valid scalene triangle
2. Test case for a valid equilateral triangle
3. Three test cases for valid isosceles triangles (a=b, b=c, a=c)
4. One, two or three sides has zero value (5 cases)
5. One side has a negative value
6. Sum of two numbers equals the third (e.g. 1,2,3), which is invalid because it is not a triangle (tried with 3 permutations: a+b=c, a+c=b, b+c=a)
7. Sum of two numbers is less than the third (e.g. 1,2,4) (3 permutations)
8. Non-integer
9. Wrong number of values (too many, too few)

27 Examples of Myers' categories
1. {5,6,7} 2. {15,15,15} 3. {3,3,4; 5,6,6; 7,8,7} 4. {0,1,1; 2,0,2; 3,2,0; 0,0,9; 0,8,0; 11,0,0; 0,0,0} 5. {3,4,-6} 6. {1,2,3; 2,5,3; 7,4,3} 7. {1,2,4; 2,6,2; 8,4,2} 8. {Q,2,3} 9. {2,4; 4,5,5,6}
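
To make these categories concrete, here is a minimal sketch of a triangle classifier run against the example values listed above. The function name `classify_triangle`, the "invalid" label, and the choice to reject degenerate triangles are assumptions for illustration; Myers does not prescribe an implementation.

```python
def classify_triangle(fields):
    """Classify a list of raw input fields as a triangle type or 'invalid'."""
    if len(fields) != 3:
        return "invalid"                 # wrong number of values
    try:
        a, b, c = (int(f) for f in fields)
    except ValueError:
        return "invalid"                 # non-numeric input such as "Q"
    if min(a, b, c) <= 0:
        return "invalid"                 # zero or negative side
    x, y, z = sorted((a, b, c))
    if x + y <= z:
        return "invalid"                 # two sides do not exceed the third
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

# Example values from the nine categories, grouped by expected result.
MYERS_CASES = {
    "scalene": [(5, 6, 7)],
    "equilateral": [(15, 15, 15)],
    "isosceles": [(3, 3, 4), (5, 6, 6), (7, 8, 7)],
    "invalid": [
        (0, 1, 1), (2, 0, 2), (3, 2, 0), (0, 0, 9), (0, 8, 0), (11, 0, 0), (0, 0, 0),
        (3, 4, -6),
        (1, 2, 3), (2, 5, 3), (7, 4, 3),
        (1, 2, 4), (2, 6, 2), (8, 4, 2),
        ("Q", 2, 3),
        (2, 4), (4, 5, 5, 6),
    ],
}

failures = [
    (case, expected)
    for expected, cases in MYERS_CASES.items()
    for case in cases
    if classify_triangle(list(case)) != expected
]
print("failures:", failures)
```

Keeping the cases in a table keyed by expected result makes it easy to see which category a failing test came from.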

28 Extending the analysis
Myers included other classes of examples: non-numeric values; too few inputs or too many; values that fit within the individual field constraints but that combine into an invalid result. These are different in kind from tests that go after the wrong-boundary-specified error. Can we do boundary analysis on these? Let’s try it . . .

29 Potential error: Non-numeric values
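| Character | ASCII code | Note |
|---|---|---|
| / | 47 | |
| 0 | 48 | lower bound |
| 1 | 49 | |
| 2 | 50 | |
| 3 | 51 | |
| 4 | 52 | |
| 5 | 53 | |
| 6 | 54 | |
| 7 | 55 | |
| 8 | 56 | |
| 9 | 57 | upper bound |
| : | 58 | |
| A | 65 | |
| a | 97 | |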
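
The character codes make the boundary analysis mechanical: the digits occupy codes 48 through 57, so the interesting non-digits are the neighbours at 47 and 58. The sketch below uses a hypothetical `is_ascii_digit` check written as a range comparison, the kind of code in which an off-by-one slip would show up at exactly these characters.

```python
def is_ascii_digit(ch: str) -> bool:
    """Hypothetical digit check written as a range comparison on character codes."""
    return ord("0") <= ord(ch) <= ord("9")

# Boundary characters just outside and just inside the digit range,
# plus two letters that sit well away from the boundary.
EXPECTED = {"/": False, "0": True, "9": True, ":": False, "A": False, "a": False}

for ch, expected in EXPECTED.items():
    print(repr(ch), ord(ch), is_ascii_digit(ch), "ok" if is_ascii_digit(ch) == expected else "MISMATCH")
```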

30 Potential error: Wrong number of inputs
In the triangle example, the program wanted three inputs. The valid class [of integers] is {3}. The invalid classes [of integers] are: any number less than 3 (boundary is 2); any number more than 3 (boundary is 4).

31 Potential error: Invalid combination
Consider these cases. Are these paired tests equivalent? If you tested Would you test

32 Potential error: Invalid combination
The hockey game example: Earn 0 points for loss, 1 for tie, 2 for win. Sum of points stored in an unsigned integer. Top teams go to the playoffs. Up to 80 games: what if you win them all?

33 Potential error: Invalid combination
Consider these cases. Are these paired tests equivalent? If you tested Would you test The hockey game example Once you’ve been burned by this integer overflow bug, will never look the same as again.

34 Another example of non-obvious boundaries
Still in the program: Enter the first value. Wait N seconds. Enter the second value. Suppose our client application will time out on input delays greater than 600 seconds. Does this affect how you would test? Suppose our client passes data that it receives to a server, the client has no timeout, and the server times out on delays greater than 300 seconds. Would you discover this timeout from a path analysis of your application? What boundary values should you test? In whose domains?
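
As a starting point for the boundary question, the sketch below lists delays that straddle each quoted limit. The one-second step and the helper name `straddling_delays` are assumptions; the point is only that each timeout, wherever it lives, contributes its own boundary to test.

```python
# Timeout limits quoted in the two scenarios above, in seconds.
SCENARIOS = {
    "client times out (first scenario)": 600,
    "server times out (second scenario)": 300,
}

def straddling_delays(limit_seconds: int, step: int = 1):
    """Delays just below, at, and just above a 'greater than limit' rule.

    The one-second step is an assumption; a real test would use whatever
    resolution the timer under test actually has.
    """
    return [limit_seconds - step, limit_seconds, limit_seconds + step]

for label, limit in SCENARIOS.items():
    print(label, straddling_delays(limit))
```

Note that the 300-second boundary belongs to the server's domain, so nothing in the client's own code would suggest testing it.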

35 More examples of risks for the add-two-numbers example
Memory corruption caused by input value too large. Failure on non-numeric input. Mishandles leading zeroes or leading spaces. Mishandles non-numbers inside number strings. Recovers poorly from its own error handling. Memory leaks.

36 5. Expand the scope What test? Why? (What error are you looking for?)
______________ _______________________________ ______________ _______________________________

37 5. Expand the scope Brainstorming Rules:
The goal is to get lots of ideas. You are brainstorming together to discover categories of possible tests. There are more great ideas out there than you think. Don’t criticize others’ contributions. Jokes are OK, and are often valuable. Eliminate redundancy, cut bad ideas, and refine and optimize the specific tests.

38 Risk-based equivalence
Given the following potential error: _____________________________________________ These cases would not trigger the error, even if it was there. These cases would trigger the error.

39 Extending the analysis
You can use Myers’ table with an extended scope of errors and tests, but this hides the risks and the natural reasoning triggered by the risks.

40 A new boundary / equivalence table
| Variable | Risk (potential failure) | Classes that should not trigger the failure | Classes that might trigger the failure | Test cases (best representatives) | Notes |
|---|---|---|---|---|---|
| First input | Fail on out-of-range values | -99 to 99 | MinInt to -100; 100 to MaxInt | -100, 100 | |
| | Doesn't correctly discriminate in-range from out-of-range | | | -100, -99, 100, 99 | |
| | Misclassify digits | Non-digits | 0 to 9 | 0 (ASCII 48), 9 (ASCII 57) | |
| | Misclassify non-digits | Digits 0 - 9 | ASCII codes other than the digits | / (ASCII 47), : (ASCII 58) | |

Note that we’ve dropped the issue of “valid” and “invalid.” This lets us generalize to partitioning strategies that don’t have the concept of “valid” -- for example, printer equivalence classes.

41 Sample Test For each of the following,
List the variable(s) of interest. List the valid and invalid classes. List the boundary value test cases. Lay out the results in a boundary table.
1. FoodVan delivers groceries to customers who order food over the Net. To decide whether to buy more vans, FV tracks the number of customers who call for a van. A clerk enters the number of calls into a database each day. Based on previous experience, the database is set to challenge (ask, “Are you sure?”) any number greater than 400 calls.
2. FoodVan schedules drivers one day in advance. To be eligible for an assignment, a driver must have special permission or she must have driven within 30 days of the shift she will be assigned to.

42 Notes on this Test Even these simple specifications are ambiguous:
Does “within 30 days” mean “less than 30” or “less than or equal to 30” ? When does the special permission have to have been issued? If you can work tomorrow morning on the basis of permission, can you work tomorrow afternoon on the basis of experience? Is tomorrow morning within 30 days of tomorrow afternoon? Do we compute 30 days in days or hours (minutes / seconds)? What result if the last day you worked was 28 days ago? 29 days ago? 30 days ago? Even if you are clear on the answers to these, do you believe that the programmer and the specification writer will come to the same answers?
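
The first ambiguity can be made concrete with two hypothetical eligibility rules, one per reading of "within 30 days". Both functions are illustrative assumptions; the sketch ignores special permission and counts whole days only.

```python
def eligible_strict(days_since_last_shift: int) -> bool:
    """One reading: 'within 30 days' means fewer than 30 days."""
    return days_since_last_shift < 30

def eligible_inclusive(days_since_last_shift: int) -> bool:
    """Another reading: 'within 30 days' means 30 days or fewer."""
    return days_since_last_shift <= 30

for days in (28, 29, 30, 31):
    print(days, eligible_strict(days), eligible_inclusive(days))

# The two readings disagree at exactly one value in the first two months.
disagreements = [d for d in range(0, 61) if eligible_strict(d) != eligible_inclusive(d)]
print(disagreements)
```

A test at 28 or 29 days passes under either reading, so only the boundary value reveals which interpretation the programmer chose.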

43 Understanding domain testing
As you just saw in the last example, one of the underlying risks addressed by domain testing is ambiguity. Interpretation of the specification is often most difficult for the boundary cases. This is one of the key reasons that we test equivalence classes at their boundaries rather than at random “equivalent” points inside the set.

44 A new class of example to consider: Non-ordered sets
Let’s discard the notion that a domain must be linear and consider domains that can’t be ordered from small to large. Boundary analysis depends on the existence of boundaries. Theorists often say that domain (boundary) analysis assumes that variables are linearizable (can be mapped to the number line). All we actually need, though, is ordinality--a variable is ordinally scaled if its values can be ordered from smallest to largest. A problem: There are about 2000 Windows-compatible printers, plus multiple drivers for each. We can’t test them all. These are not ordered, and so we can never do a boundary analysis of them. However, we might be able to form equivalence classes and choose best representatives.

45 Non-ordered sets Primary groups of printers at that time:
HP - Original; HP - LJ II; PostScript Level I; PostScript Level II; Epson 9-pin, etc. LaserJet II compatible printers are a huge class (maybe 300 printers, depending on how we define it).
1. Should the class include LJII, LJII+, LJIIP, and LJIID-compatible subclasses?
2. What is the best representative of the class?

46 Non-ordered sets Example: graphic complexity error handling
HP II original was the weak case. Example: special forms. HP II original was strong in paper-handling. We worked with printers that were weaker in paper-handling. We pick different best representatives from the same equivalence class, depending on which error we are trying to detect. Examples of additional queries for almost-equivalent printers: Same margins, offsets on new printer as on HP II original? Same printable area? Same handling of hairlines? (PostScript printers differ.)

47 More examples of non-ordered sets
Here are more examples of variables that don't fit the traditional mold for equivalence classes but which have enough values that we will have to sample from them. What are the boundary cases here? Membership in a common group, such as employees vs. non-employees, or workers who are full-time, part-time or contract. Equivalent hardware, such as compatible modems, video cards, routers. Equivalent output events: perhaps any report will do to answer the simple question, will the program print reports? Equivalent operating environments, such as French & English versions of Windows 3.1.

48 Understanding domain testing
People were treating values as equivalent long before anyone proposed a theoretical description of domain testing. The most important idea in domain testing is that it provides a sensible basis for sampling from a domain. Definition: Domain. In mathematics, the domain of a function is the set of all input values over which the function is defined. The range (or output domain) of the function is the set of all values that the function can produce. Early descriptions of domain testing focused on inputs, but we routinely applied the analysis to outputs.

49 Understanding domain testing
In domain testing, we partition a domain into sub-domains (equivalence classes) and then test using values from each sub-domain.

50 Understanding domain testing
1. What is equivalence? 4 views of what makes values equivalent. Each has practical implications.
Intuitive Similarity: two test values are equivalent if they are so similar to each other that it seems pointless to test both. This is the earliest view and the easiest to teach. Little guidance for subtle cases or multiple variables.
Specified As Equivalent: two test values are equivalent if the specification says that the program handles them in the same way. Testers complain about missing specifications and may spend enormous time writing specifications. Focus is on things that were specified, but there might be more bugs in the features that were underspecified.

51 Understanding domain testing
What is equivalence? Equivalent Paths: two test values are equivalent if they would drive the program down the same path (e.g. execute the same branch of an IF). Tester should be a programmer. Tester should design tests from the code. Some authors claim that a complete domain test will yield complete branch coverage. No basis for picking one member of the class over another. Two values might take the program down the same path but have very different subsequent effects (e.g. timeout or not timeout a subsequent program; or e.g. a word processor's interpretation and output may be the same but may yield different interpretations / results from different printers).

52 Understanding domain testing
What is equivalence? Risk-Based: two test values are equivalent if, given your theory of possible error, you expect the same result from each. Subjective analysis, differs from person to person. It depends on what you expect (and thus, what you can anticipate). Two values may be equivalent relative to one potential error but non-equivalent relative to another.

53 Understanding domain testing
Test which values from the equivalence class? Most discussions of domain testing start from several assumptions: the domain is continuous; the domain is linearizable (members of the domain can be mapped to the number line) or, at least, the domain is an ordered set (given two elements, one is larger than the other or they are equal); the comparisons that cause the program to branch are simple, linear inequalities.

54 Understanding domain testing
Test which values from the equivalence class? Is the program more likely to fail at a boundary?
Suppose program design:
INPUT < 10 result: Error message
10 <= INPUT < 25 result: Print "hello"
25 <= INPUT result: Error message
Some error types:
Program doesn't like numbers: any number will do.
Inequalities misspecified (e.g. INPUT <= 25 instead of < 25): detect only at boundary.
Boundary value mistyped (e.g. INPUT < 52, transposition error): detect at boundary and any other value that will be handled incorrectly.
Boundary values (here, test at 25) catch all 3 errors. Non-boundary values (consider 53) may catch only 1 of the 3 errors.
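
The comparison between testing at 25 and testing at 53 can be simulated directly. The sketch below models the intended design and one faulty variant per error type. It assumes the lower limit is 10, as the middle rule implies, and it models "doesn't like numbers" as a crash on every input. All function names are illustrative.

```python
def correct(x: int) -> str:
    """Intended design: print 'hello' for 10 <= x < 25, otherwise an error message."""
    return "hello" if 10 <= x < 25 else "error"

def dislikes_numbers(x: int) -> str:
    return "crash"                                   # fails on any number at all

def wrong_inequality(x: int) -> str:
    return "hello" if 10 <= x <= 25 else "error"     # <= written instead of <

def mistyped_boundary(x: int) -> str:
    return "hello" if 10 <= x < 52 else "error"      # 52 typed instead of 25

BUGGY = {
    "dislikes numbers": dislikes_numbers,
    "wrong inequality": wrong_inequality,
    "mistyped boundary": mistyped_boundary,
}

def caught_by(test_value: int):
    """Names of the faulty variants whose output differs from the intended one."""
    return [name for name, impl in BUGGY.items() if impl(test_value) != correct(test_value)]

for value in (25, 53):
    print(value, caught_by(value))
```

Under these assumptions the boundary value exposes all three faults, while 53 exposes only the one that fails on every input.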

55 Understanding domain testing
Test which values from the equivalence class? The emphasis on boundaries is inherently risk-based. But the explicitly risk-based approach goes further: consider many different risks; partitioning driven by risk; selection of values driven by risk. A member of an equivalence class is a best representative (relative to a potential error) if no other member of the class is more likely to expose that error than the best representative. Boundary values are often best representatives. We can have best representatives that are not boundary values. We can have best representatives in non-ordered domains.

56 In sum: equivalence classes and representative values
Two tests belong to the same equivalence class if you expect the same result (pass / fail) of each. Testing multiple members of the same equivalence class is, by definition, redundant testing. In an ordered set, boundaries mark the point or zone of transition from one equivalence class to another. The program is more likely to fail at a boundary, so these are the best members of (simple, numeric) equivalence classes to use. More generally, you look to subdivide a space of possible tests into relatively few classes and to run a few cases of each. You’d like to pick the most powerful tests from each class. We call those most powerful tests the best representatives of the class.

57 Interactions among variables
Rather than thinking about a single variable with a single range of values, consider that a variable might have different ranges, such as the day of the month in a date: 1-28, 1-29, 1-30, 1-31. We analyze the range of dates by partitioning the month field for the date into different sets: {February}; {April, June, September, November}; {Jan, March, May, July, August, October, December}. For testing, you want to pick one of each. There might or might not be a “boundary” on months. The boundaries on the days are sometimes 1-28, sometimes 1-29, etc.
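
The month partition can be derived rather than memorized. The sketch below groups months by their last valid day using Python's standard `calendar` module, then pairs one representative month per class with the days on either side of its upper boundary. The helper names, and the choice of the first month in each class as representative, are assumptions for illustration.

```python
import calendar

def month_classes(year: int) -> dict:
    """Map each possible last day of the month to the month numbers that share it."""
    classes = {}
    for month in range(1, 13):
        last_day = calendar.monthrange(year, month)[1]   # number of days in the month
        classes.setdefault(last_day, []).append(month)
    return classes

def day_boundary_tests(year: int) -> list:
    """For one representative month per class, list the days around its upper boundary."""
    tests = []
    for last_day, months in sorted(month_classes(year).items()):
        representative = months[0]
        tests.append((representative, last_day))         # last valid day
        tests.append((representative, last_day + 1))     # first invalid day
    return tests

print(month_classes(2023))
print(day_boundary_tests(2024))
```

Because February's class changes with the year, a leap year and a non-leap year each need their own February tests.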

58 Domain Testing Summary
AKA partitioning, equivalence analysis, boundary analysis.
Fundamental question or goal: This confronts the problem that there are too many test cases for anyone to run. This is a sampling strategy that provides a rationale for selecting a few test cases from a huge population.
General approach: Divide the set of possible values of a field into subsets, pick values to represent each subset. The goal is to find a “best representative” for each subset, and to run tests with these representatives. Best representatives of ordered fields will typically be boundary values. Multiple variables: combine tests of several “best representatives” and find a defensible way to sample from the set of combinations.
Paradigmatic case(s): Equivalence analysis of a simple numeric field. Printer compatibility testing (multidimensional variable, doesn’t map to a simple numeric field, but stratified sampling is essential).
Was called Equivalence Class analysis. Quantify huge possible test space; Dick Bender’s Softtool generates cause-effect graph based test generation from complete requirements; his process said that possible test cases in a financial application where Hawkings estimates there are molecules in the universe! Segment the test universe into sub-domains of [likely] equivalent values, then select the one best representative of the class. Used Laser Jet II as best [worst] representative printer for the class. Trick in non-trivial cases is to select the best representative; see Jorgensen’s “Software Testing: A Craftsman’s Approach.” Also, Beizer’s “Black Box Software Testing” is OK.

59 Domain Testing Summary
Strengths: Find highest probability errors with a relatively small set of tests. Intuitively clear approach, easy to teach and understand. Extends well to multi-variable situations.
Blind spots or weaknesses: Errors that are not at boundaries or in obvious special cases. The "competent programmer hypothesis" can be misleading. Also, the actual domains are often unknowable. Reliance on best representatives for regression testing leads us to over-test these cases and under-test other values that were as, or almost as, good. One reason that oversimplified, mechanical views of domain testing have lasted so long is that courses often consider the simple cases and stop, moving on to something else.

60 Domain Testing Summary
Domain analysis is a sampling strategy to cope with the problem of too many possible tests. Traditional domain analysis considers numeric input and output fields. Boundary analysis is optimized to expose a few types of errors such as miscoding of boundaries or ambiguity in definition of the valid/invalid sets. However, there are other possible errors that boundary tests are insensitive to. Domain analysis often appears mechanical and routine. Given a numeric input field and its specified boundaries, we know what to do. But as we consider new risks, we have to add a new analysis and new tests. Rather than thinking we can pre-specify all the tests (after predicting all the risks), we should train testers in the application of equivalence classes to risk-based tests in general. As they discover new risks associated with a field (or with anything else) while testing, they can apply the analysis to come up with optimized new tests as needed.

