LAB OF SOFTWARE ARCHITECTURES
AND INFORMATION SYSTEMS
FACULTY OF INFORMATICS
MASARYK UNIVERSITY, BRNO
PV260 - SOFTWARE QUALITY
LECT 7. Requirements and Test Cases. From Unit Testing
to Integration Testing
Bruno Rossi
brossi@mail.muni.cz
2-106
●
Software Testing
→ Introduction
→ Basic Principles
●
From Requirements to Test Cases
→ Functional testing
→ Translating specifications into test cases
●
From Unit Testing to Integration Testing
→ comparison with unit testing
→ Strategies
●
Specific Issues in Testing Object Oriented Software
Outline
"Discovering the unexpected is more important than confirming the known."
George Box
4-106
●
In Eclipse and Mozilla, 30–40% of all changes are fixes
(Śliwerski et al., 2005)
●
Fixes are 2–3 times smaller than other changes (Mockus
+Votta, 2000)
●
4% of all one-line changes introduce new errors
(Purushothaman + Perry, 2004)
Introduction
A. Zeller, Why Programs Fail: A Guide to Systematic Debugging,
2nd ed. Amsterdam; Boston: Morgan Kaufmann, 2009.
5-106
Motivating Examples
6-106
●
“Testing is the process of exercising or evaluating a system
or system component by manual or automated means to
verify that it satisfies specified requirements.” IEEE
standards definition
What is Software Testing
7-106
Reminder for some important terms:
●
Defect: “An imperfection or deficiency in a work product where that work
product does not meet its requirements or specifications and needs to be
either repaired or replaced.”
●
Error: “A human action that produces an incorrect result”
●
Failure: “(A) Termination of the ability of a product to perform a required
function or its inability to perform within previously specified limits.
(B) An event in which a system or system component does not perform a
required function within specified limits.
→ A failure may be produced when a fault is encountered.”
●
Fault: “A manifestation of an error in software.”
●
Problem: “(A) Difficulty or uncertainty experienced by one or more
persons, resulting from an unsatisfactory encounter with a system in use.
(B) A negative situation to overcome”
What is Software Testing
Definitions according to IEEE Std 1044-2009 “IEEE Standard Classification for Software Anomalies“
8-106
Hopefully you haven't seen some of these
9-106
Maybe some of these...
10-106
And defects are everywhere...
This is one failure I encountered when preparing this
presentation on LibreOffice 4.2.7.2
A formula in ppt that got converted
into image – looks good when
editing
The slides preview on the left, looks
a bit strange...
When converted to pdf...
11-106
Where is the term “bug”?
●
Very often a synonym of “defect”, so that “debugging” is the
activity of removing defects from code
However:
→ it may lead to confusion: in natural language, “bug” is often
used to refer to different levels:
“this line is buggy” - “this pointer being null is a bug” - “the
program crashed: it's a bug”
→ starting from Dijkstra, there has been a search for terms that could
increase the responsibility of developers: the term “bug” might give
the impression of something that magically appears in software
What about the term “Bug”?
12-106
Who's to blame?
image from http://blog.smartbear.com/sqc/when-bad-software-requirements-happen-to-good-people
13-106
●
Sensitivity: better to fail every time than sometimes
●
Redundancy: making intentions explicit
●
Restrictions: making the problem easier
●
Partition: divide and conquer
●
Visibility: making information accessible
●
Feedback: applying lessons from experience in process
and techniques
Basic Principles of Testing
(c) 2007 Mauro Pezzè & Michal Young
14-106
●
Consistency helps:
– a test selection criterion works better if every selected test
provides the same result, i.e., if the program fails with one
of the selected tests, it fails with all of them (reliable
criteria)
– run time deadlock analysis works better if it is machine
independent, i.e., if the program deadlocks when analyzed on
one machine, it deadlocks on every machine
Sensitivity: better to fail every time than sometimes
15-106
●
Look at the following code fragment
Sensitivity Example
#include <string.h>

char before[] = "=Before=";
char middle[] = "Middle";
char after[] = "=After=";

int main(int argc, char *argv[]) {
    strcpy(middle, "Muddled"); /* fault, may not fail: writes past the end of middle */
    strncpy(middle, "Muddled", sizeof(middle)); /* fault, may not fail: no terminating '\0' */
    return 0;
}
What's the problem?
16-106
●
Let's make the following adjustment
Sensitivity Example
#include <assert.h>
#include <string.h>

char before[] = "=Before=";
char middle[] = "Middle";
char after[] = "=After=";

/* Copies source into target, failing loudly if it does not fit in size bytes. */
void stringcpy(char *target, const char *source, size_t size) {
    assert(strlen(source) < size);
    strcpy(target, source);
}

int main(int argc, char *argv[]) {
    strcpy(middle, "Muddled"); /* fault, may not fail */
    strncpy(middle, "Muddled", sizeof(middle)); /* fault, may not fail */
    stringcpy(middle, "Muddled", sizeof(middle)); /* guaranteed to fail */
    return 0;
}
This adds sensitivity to a
non-sensitive solution
17-106
●
Let's look at the following Java code fragment. We use the ArrayList as a
sort of queue and we remove one item after printing the results
Sensitivity Example
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class TestIterator {
    public static void main(String[] args) {
        List<String> myList = new ArrayList<>();
        myList.add("PV260");
        myList.add("SW");
        myList.add("Quality");
        Iterator<String> it = myList.iterator();
        while (it.hasNext()) {
            String value = it.next();
            System.out.println(value);
            myList.remove(value);
        }
    }
}
Will this output
“PV260
SW
Quality”?
18-106
Actually, the code on the previous slide throws
java.util.ConcurrentModificationException
19-106
●
From Java SE documentation:
●
“[...] Some Iterator implementations (including those of all the general
purpose collection implementations provided by the JRE) may choose to
throw this exception if this behavior is detected. Iterators that do this are
known as fail-fast iterators, as they fail quickly and cleanly, rather than
risking arbitrary, non-deterministic behavior at an undetermined time in
the future.”
●
“Note that fail-fast behavior cannot be guaranteed as it is, generally
speaking, impossible to make any hard guarantees in the presence of
unsynchronized concurrent modification. Fail-fast operations throw
ConcurrentModificationException on a best-effort basis. Therefore, it
would be wrong to write a program that depended on this exception for
its correctness: ConcurrentModificationException should be used only
to detect bugs.”
Sensitivity Example
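A minimal sketch of how the list could be drained without tripping the fail-fast check, assuming the intent of the example is to print each item and then remove it. The class name DrainWithIterator and the helper drain are illustrative and not part of the original slide; Iterator.remove() is the standard java.util call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class DrainWithIterator {

    // Visits every element in order and removes it through the iterator,
    // so the iterator's own bookkeeping stays consistent with the list.
    static List<String> drain(List<String> items) {
        List<String> seen = new ArrayList<>();
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            String value = it.next();
            seen.add(value);
            it.remove(); // removes the element last returned by next()
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> myList = new ArrayList<>(Arrays.asList("PV260", "SW", "Quality"));
        for (String value : drain(myList)) {
            System.out.println(value);
        }
        System.out.println("remaining: " + myList.size());
    }
}
```

The "best-effort" wording in the documentation matters in practice: with the ArrayList iterator found in common OpenJDK builds, the original loop run on a two-element list would likely end silently after the first element instead of throwing, which is exactly the kind of non-sensitive behavior the principle warns about.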
20-106
• Redundant checks can increase the capabilities of catching
specific faults early or more efficiently.
– Static type checking is redundant with respect to dynamic
type checking, but it can reveal many type mismatches
earlier and more efficiently.
– Validation of requirement specifications is redundant
with respect to validation of the final software, but can reveal
errors earlier and more efficiently.
– Testing and proof of properties are redundant, but are
often used together to increase confidence
Redundancy: making intentions explicit
21-106
• Adding redundancy by asserting that a condition must
always be true for the correct execution of the program
Redundancy Example
void save(File *file, const char *dest){
assert(this.isInitialized());
...
}
• From a language (e.g. Java) point of view, why are we obliged to
declare the exception we throw from a method - isn't this
redundant?
public void throwException() throws FileNotFoundException{
throw new FileNotFoundException();
}
Consider what would happen if a method could throw any exception
without declaring it in the method signature
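A small sketch of both kinds of redundancy in Java, under the assumption that a file-opening routine is a fair stand-in for the slide's example. RedundancyDemo, open, and openOrDefault are hypothetical names, and the ".cfg" check only simulates a missing file.

```java
import java.io.FileNotFoundException;

class RedundancyDemo {

    // The throws clause repeats what the body already implies; that
    // redundancy lets the compiler check every caller.
    static String open(String path) throws FileNotFoundException {
        // Explicit precondition check: redundant for correct callers,
        // but it turns a silent misuse into an immediate failure.
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("path must be non-empty");
        }
        if (!path.endsWith(".cfg")) { // stand-in for "file does not exist"
            throw new FileNotFoundException(path);
        }
        return "opened " + path;
    }

    // A caller cannot ignore the declared exception: it must catch it
    // or declare it in turn, otherwise the code does not compile.
    static String openOrDefault(String path) {
        try {
            return open(path);
        } catch (FileNotFoundException e) {
            return "default";
        }
    }

    public static void main(String[] args) {
        System.out.println(openOrDefault("app.cfg"));
        System.out.println(openOrDefault("missing.txt"));
    }
}
```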
22-106
• Suitable restrictions can reduce hard (unsolvable) problems to
simpler (solvable) problems
– A weaker spec may be easier to check: it is impossible (in general) to
show that pointers are used correctly, but the simple Java requirement
that pointers are initialized before use is simple to enforce.
– A stronger spec may be easier to check: it is impossible (in general) to
show that type errors do not occur at run-time in a dynamically typed
language, but statically typed languages impose stronger restrictions that
are easily checkable.
Restriction: making the problem easier
23-106
●
Will the following compile in Java?
Restriction Example
public static void questionable(){
int k;
for (int i=0; i<10;++i){
if (someCondition(i)){
k = 0;
} else {
k+=i;
}
}
}
Java ALWAYS enforces variable initialization before usage
(a case of restriction), as the following example shows:
int k;
if (true == false){
    k+=i;
}
But restrictions can be applied at different levels, e.g. at the
architectural level the decision of making the HTTP protocol
stateless hugely simplified testing (and as such made the
protocol more robust)
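For the questionable() method, javac reports that k might not have been initialized at the k+=i statement. A minimal sketch of a variant the compiler accepts is given here; someCondition is a hypothetical stand-in for the call in the slide, and returning k is added only so the result can be observed.

```java
class DefiniteAssignment {

    // Hypothetical stand-in for the someCondition(i) call in the slide.
    static boolean someCondition(int i) {
        return i == 5;
    }

    // Same loop as the slide, but k is initialized at its declaration,
    // so every path that reads k has a value.
    static int accepted() {
        int k = 0;
        for (int i = 0; i < 10; ++i) {
            if (someCondition(i)) {
                k = 0;
            } else {
                k += i;
            }
        }
        return k;
    }

    public static void main(String[] args) {
        System.out.println(accepted()); // prints 30
    }
}
```

The restriction costs one explicit initialization, and in exchange a whole class of "uninitialized read" faults is ruled out before the program ever runs.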
24-106
• Hard testing and verification problems can be handled by
suitably partitioning the input space:
– both structural (white box) and functional test (black
box) selection criteria identify suitable partitions of code or
specifications (partitions drive the sampling of the input space)
– verification techniques fold the input space according to
specific characteristics, grouping homogeneous data together
and determining partitions
→ Examples of structural (white box) techniques: unit
testing, integration testing, performance testing
→ Examples of functional (black box) techniques: system
testing, acceptance testing, regression testing
Partition: divide and conquer
25-106
●
Non-uniform distribution of faults
●
Example: Java class “roots” applies quadratic equation
●
Incomplete implementation logic: Program does not properly handle the
case in which b² - 4ac = 0 and a = 0
→ Failing values are sparse in the input space — needles in a very big
haystack. Random sampling is unlikely to choose a=0.0 and b=0.0
Partition - Example
These would make good input values for test cases
$ax^2 + bx + c = 0$
$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
26-106
Partition - Example
Failure (valuable test case)
No failure
Failures are sparse
in the space of
possible inputs ...
... but dense in some
parts of the space
If we systematically test some
cases from each part, we will
include the dense parts
Functional testing is one way of
drawing pink lines to isolate
regions with likely failures
The space of possible input values
(the haystack)
27-106
●
The ability to measure progress or status
against goals
●
X visibility = ability to judge how we are doing on X, e.g.,
schedule visibility = “Are we ahead or behind schedule,”
quality visibility = “Does quality meet our objectives?”
– Involves setting goals that can be assessed at each
stage of development
●
The biggest challenge is early assessment, e.g., assessing
specifications and design with respect to product quality
●
Related to observability
– Example: Choosing a simple or standard internal data
format to facilitate unit testing
Visibility: Judging status
28-106
●
The HTTP Protocol
Visibility - Example
GET /index.html HTTP/1.1
Host: www.google.com
Why wasn't a more efficient binary format selected?
Note that HTTP/2 uses a binary format
(from https://http2.github.io/faq):
“Binary protocols are more efficient to parse, more compact “on
the wire”, and most importantly, they are much less error-prone,
compared to textual protocols like HTTP/1.x, because they often
have a number of affordances to “help” with things like whitespace
handling, capitalization, line endings, blank lines and so on.”
In fact, reduction of visibility is confirmed by
“It’s true that HTTP/2 isn’t usable through telnet, but we already
have some tool support, such as a Wireshark plugin.”
29-106
• Learning from experience: Each project provides
information to improve the next
• Examples
– Checklists are built on the basis of errors revealed in the past
– Error taxonomies can help in building better test selection
criteria
– Design guidelines can avoid common pitfalls
Feedback: tuning the development process
Using a software reliability model fitting past project data
Looking for problematic modules based on prior knowledge
30-106
From Requirements to Test Cases
31-106
According to ISO/IEC/IEEE 29148-2011 standard:
●
Correctness: requirements represent the client’s view
●
Completeness: all possible scenarios through the system are
described, including exceptional behavior by the user
●
Consistency: There are no functional or nonfunctional requirements
that contradict each other
●
Clarity: There are no ambiguities in the requirements
●
Realism: Requirements can be implemented and delivered
●
Traceability: Each system function can be traced to a
corresponding set of functional requirements
Characteristics of Requirements
32-106
According to IEEE Std 829-1998:
●
Test Case Specification: “A document specifying inputs, predicted
results, and a set of execution conditions for a test item”
Test Cases Definition
33-106
• Functional testing: Deriving test cases from
program specifications
• Functional refers to the source of information used in test case
design, not to what is tested
• Also known as:
– specification-based testing (from specifications)
– black-box testing (no view of the code)
• Functional specification = description of
intended program behavior
– either formal or informal
Functional Testing
34-106
• Functional testing uses the specification (formal
or informal) to partition the input space
– E.g., specification of “roots” program suggests division
between cases with zero, one, and two real roots
• Test each category, and boundaries between
categories
– No guarantees, but experience suggests failures often lie at the
boundaries (as in the “roots” program)
Functional testing: exploiting the specification
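A sketch of how the specification of the "roots" program could drive the partition, assuming the categories are "two, one, or no real roots" plus the a = 0 boundary where the formula no longer applies. RootsPartition, Category, and classify are illustrative names, not part of the original program.

```java
import java.util.Arrays;

class RootsPartition {

    enum Category { NOT_QUADRATIC, NO_REAL_ROOTS, ONE_REAL_ROOT, TWO_REAL_ROOTS }

    // Maps an input (a, b, c) to the partition suggested by the specification.
    static Category classify(double a, double b, double c) {
        if (a == 0.0) {
            return Category.NOT_QUADRATIC; // the formula divides by 2a
        }
        double discriminant = b * b - 4 * a * c;
        if (discriminant > 0) {
            return Category.TWO_REAL_ROOTS;
        } else if (discriminant == 0) {
            return Category.ONE_REAL_ROOT;
        }
        return Category.NO_REAL_ROOTS;
    }

    public static void main(String[] args) {
        // One representative per category, plus the a = 0, b = 0 boundary.
        double[][] inputs = {
            {1, -3, 2}, // two roots
            {1, 2, 1},  // one root
            {1, 0, 1},  // no real roots
            {0, 2, 1},  // a = 0: linear equation
            {0, 0, 1},  // a = 0 and b = 0
        };
        for (double[] in : inputs) {
            System.out.println(Arrays.toString(in) + " -> " + classify(in[0], in[1], in[2]));
        }
    }
}
```

Picking at least one input from every category, and from the boundaries between them, is what makes the sparse a = 0 failures likely to be hit, where random sampling would almost never choose them.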
35-106
• The base-line technique for designing test cases
– Timely
• Often useful in refining specifications and assessing testability
before code is written
– Effective
• finds some classes of fault (e.g., missing logic) that can elude
other approaches
– Widely applicable
• to any description of program behavior serving as spec
• at any level of granularity from module to system testing.
– Economical
• typically less expensive to design and execute than structural
(code-based) test cases
Why functional Tests?
36-106
• Program code is not necessary
– Only a description of intended behavior is needed
– Even incomplete and informal specifications can be used
• Although precise, complete specifications lead to better test
suites
• Early functional test design has side benefits
– Often reveals ambiguities and inconsistency in spec
– Useful for assessing testability
• And improving test schedule and budget by improving spec
– Useful explanation of specification
• or in the extreme case (as in XP), test cases are the spec
Early Functional Test Design
37-106
• Functional test applies at all granularity levels:
– Unit (from module interface spec)
– Integration (from API or subsystem spec)
– System (from system requirements spec)
– Regression (from system requirements + bug history)
• Structural (code-based) test design applies to
relatively small parts of a system:
– Unit
– Integration
• Functional testing is best for missing logic faults
– A common problem: Some program logic was simply forgotten
– Structural (code-based) testing will never focus on code that isn’t
there!
Functional vs structural test: granularity levels
38-106
1. Decompose the specification
– If the specification is large, break it into
independently testable features to be
considered in testing
2. Select representatives
– Representative values of each input, or
Representative behaviors of a model
– Often simple input/output transformations
don’t describe a system. We use models in
program specification, in program design, and
in test design
3. Form test specifications
– Typically: combinations of input values, or
model behaviors
4. Produce and execute actual
tests
Steps: from specifications to test cases
39-106
Steps: from specifications to test cases: example
Derive Independently Testable Features: identify
features that can be tested separately
Examples: a search functionality on a web application
or the addition of new users → this may map to different
levels at the design and code level
NOTE: this also helps in determining whether there are requirements
that are not testable or need to be rewritten or clarified!
Derive Representative values OR a model that can
be used to derive test cases. Note that this phase is
mostly enumeration of values in isolation. Example:
considering an empty list or a one-element list as
representative cases
Generate test case specifications based on the
previous step, usually from the Cartesian product
of the enumerated values (considering only feasible
cases). Example: for the search functionality,
representative values might be 0, 1, or many characters
and 0, 1, or many special characters, but the combination
{0 characters, many special characters} is clearly impossible
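The search example can be sketched as a Cartesian product with a feasibility filter. This is only an illustration: SearchTestSpecs, Count, and the rule "no more special characters than characters" are assumptions made for the sketch, with the enum order used as a coarse size ordering.

```java
import java.util.ArrayList;
import java.util.List;

class SearchTestSpecs {

    enum Count { ZERO, ONE, MANY }

    // A query cannot contain more special characters than characters;
    // the enum order ZERO < ONE < MANY is used as a coarse size ordering.
    static boolean feasible(Count chars, Count specialChars) {
        return specialChars.ordinal() <= chars.ordinal();
    }

    // Cartesian product of the representative values, minus infeasible pairs.
    static List<String> specifications() {
        List<String> specs = new ArrayList<>();
        for (Count chars : Count.values()) {
            for (Count special : Count.values()) {
                if (feasible(chars, special)) {
                    specs.add(chars + " chars / " + special + " special");
                }
            }
        }
        return specs;
    }

    public static void main(String[] args) {
        for (String spec : specifications()) {
            System.out.println(spec);
        }
    }
}
```

Of the nine raw combinations, six survive the filter; each surviving line is a test case specification that still has to be turned into a concrete query string.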
40-106
Example One: using category partitioning
Using combinatorial testing (category partition) from the
specifications
• We are building a catalogue of computer components in which
customers can select the different parts and assemble their PC for
delivery
• A model identifies a specific product and determines a set of
constraints on available components
• A set of (slot, component) pairs, corresponding to the required and
optional slots of the model. A component might be empty for
optional slots
41-106
Parameter Model
– Model number
– Number of required slots for selected model (#SMRS)
– Number of optional slots for selected model (#SMOS)
Parameter Components
– Correspondence of selection with model slots
– Number of required components with selection ≠ empty
– Required component selection
– Number of optional components with selection ≠ empty
– Optional component selection
Environment element: Product database
– Number of models in database (#DBM)
– Number of components in database (#DBC)
Step 1: Identify independently testable units
42-106
Model number
Malformed
Not in database
Valid
Number of required slots for selected model (#SMRS)
0
1
Many
Number of optional slots for selected model (#SMOS)
0
1
Many
Step 2: Identify relevant values: Component (1/3)
43-106
Correspondence of selection with model
slots
Omitted slots
Extra slots
Mismatched slots
Complete correspondence
Number of required components with
non empty selection
0
< number required slots
= number required slots
Required component selection
Some defaults
All valid
≥ 1 incompatible with slots
≥ 1 incompatible with another selection
≥ 1 incompatible with model
≥ 1 not in database
Number of optional
components with non empty
selection
0
< #SMOS
= #SMOS
Optional component selection
Some defaults
All valid
≥ 1 incompatible with slots
≥ 1 incompatible with another
selection
≥ 1 incompatible with model
≥ 1 not in database
Step 2: Identify relevant values: Component (2/3)
44-106
Number of models in database (#DBM)
0
1
Many
Number of components in database (#DBC)
0
1
Many
Note 0 and 1 are unusual (special) values. They might cause
unanticipated behavior alone or in combination with particular
values of other parameters.
Step 2: Identify relevant values: Component (3/3)
45-106
●
A combination of values for each category
corresponds to a test case specification
– in the example we have 314,928 test cases
– most of which are impossible!
●
example
zero slots and at least one incompatible slot
●
Introduce constraints to
– rule out impossible combinations
– reduce the size of the test suite if too large
Step 3: Introduce constraints
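The size of the unconstrained test suite is simply the product of the number of value classes in each category. The sketch below recomputes it from the Step 2 lists; the class name and the shortened category labels are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CategoryPartitionCount {

    // Number of value classes per category, read off the Step 2 lists.
    static Map<String, Integer> valueClasses() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("Model number", 3);
        m.put("#SMRS", 3);
        m.put("#SMOS", 3);
        m.put("Correspondence of selection with model slots", 4);
        m.put("Number of required components (non-empty)", 3);
        m.put("Required component selection", 6);
        m.put("Number of optional components (non-empty)", 3);
        m.put("Optional component selection", 6);
        m.put("#DBM", 3);
        m.put("#DBC", 3);
        return m;
    }

    // Every combination of one value class per category is a candidate test.
    static long unconstrainedCombinations() {
        long total = 1;
        for (int n : valueClasses().values()) {
            total *= n;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(unconstrainedCombinations()); // prints 314928
    }
}
```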
46-106
[error] indicates a value class that
– corresponds to erroneous values
– needs to be tried only once
Example
Model number: Malformed and Not in database are
error value classes
– No need to test all possible combinations of errors
– One test is enough (we assume that handling an error case
bypasses other program logic)
Step 3: error constraint
47-106
Model number
Malformed [error]
Not in database [error]
Valid
Correspondence of selection with model slots
Omitted slots [error]
Extra slots [error]
Mismatched slots [error]
Complete correspondence
Number of required comp. with non empty selection
0 [error]
< number of required slots [error]
Required comp. selection
≥ 1 not in database [error]
Number of models in database (#DBM)
0 [error]
Number of components in database (#DBC)
0 [error]
Error constraints
reduce test suite
from 314,928 to
2,711 test cases
Example - Step 3: error constraint
48-106
constraint [property] [if-property] rule out invalid combinations
of values
[property] groups values of a single parameter to identify
subsets of values with common properties
[if-property] bounds the choices of values for a category that
can be combined with a particular value selected for a
different category
Example
combine
Number of required comp. with non empty selection = number required slots
[if RSMANY]
only with
Number of required slots for selected model (#SMRS) = Many [Many]
Step 3: property constraints
49-106
Number of required slots for selected model (#SMRS)
1 [property RSNE]
Many [property RSNE] [property RSMANY]
Number of optional slots for selected model (#SMOS)
1 [property OSNE]
Many [property OSNE] [property OSMANY]
Number of required comp. with non empty selection
0 [if RSNE] [error]
< number required slots [if RSNE] [error]
= number required slots [if RSMANY]
Number of optional comp. with non empty selection
< number optional slots [if OSNE]
= number optional slots [if OSMANY]
from 2,711 to
908 test cases
Example - Step 3: property constraints
50-106
[single] indicates a value class that test designers
choose to test only once to reduce the number
of test cases
Example
The value “some default” for required component selection
and optional component selection may be tested only
once despite not being an erroneous condition
Note: [single]
and [error] have the same effect but differ in
rationale. Keeping them distinct is important for
documentation and regression testing
Step 3 (cont): single constraints
51-106
from 908 to
69 test
cases
Number of required slots for selected model (#SMRS)
0 [single]
1 [property RSNE] [single]
Number of optional slots for selected model (#SMOS)
0 [single]
1 [single] [property OSNE]
Required component selection
Some default [single]
Optional component selection
Some default [single]
Number of models in database (#DBM)
1 [single]
Number of components in database (#DBC)
1 [single]
Example - Step 3: single constraints
52-106
Parameter Model
●
Model number
– Malformed [error]
– Not in database [error]
– Valid
●
Number of required slots for selected model (#SMRS)
– 0 [single]
– 1 [property RSNE] [single]
– Many [property RSNE] [property RSMANY]
●
Number of optional slots for selected model (#SMOS)
– 0 [single]
– 1 [property OSNE] [single]
– Many [property OSNE] [property OSMANY]
Environment Product data base
●
Number of models in database (#DBM)
– 0 [error]
– 1 [single]
– Many
●
Number of components in database (#DBC)
– 0 [error]
– 1 [single]
– Many
Parameter Component
●
Correspondence of selection with model slots
– Omitted slots [error]
– Extra slots [error]
– Mismatched slots [error]
– Complete correspondence
●
# of required components (selection ≠ empty)
– 0 [if RSNE] [error]
– < number required slots [if RSNE] [error]
– = number required slots [if RSMANY]
●
Required component selection
– Some defaults [single]
– All valid
≥ 1 incompatible with slots
≥ 1 incompatible with another selection
≥ 1 incompatible with model
≥ 1 not in database [error]
●
# of optional components (selection ≠ empty)
– 0
– < #SMOS [if OSNE]
– = #SMOS [if OSMANY]
●
Optional component selection
– Some defaults [single]
– All valid
≥ 1 incompatible with slots
≥ 1 incompatible with another selection
≥ 1 incompatible with model
≥ 1 not in database [error]
Example - Summary
53-106
Example Two: Deriving a model
Maintenance: The Maintenance function records the history of items undergoing
maintenance.
• If the product is covered by warranty or maintenance contract, maintenance can be
requested either by calling the maintenance toll free number, or through the web site, or
by bringing the item to a designated maintenance station.
• If the maintenance is requested by phone or web site and the customer is a US or EU
resident, the item is picked up at the customer site, otherwise, the customer shall ship the
item with an express courier.
• If the maintenance contract number provided by the customer is not valid, the item follows
the procedure for items not covered by warranty.
• If the product is not covered by warranty or maintenance contract, maintenance can be
requested only by bringing the item to a maintenance station. The maintenance station
informs the customer of the estimated costs for repair. Maintenance starts only when the
customer accepts the estimate.
• If the customer does not accept the estimate, the product is returned to the customer.
• Small problems can be repaired directly at the maintenance station. If the maintenance
station cannot solve the problem, the product is sent to the maintenance regional
headquarters (if in US or EU) or to the maintenance main headquarters (otherwise).
• If the maintenance regional headquarters cannot solve the problem, the product is sent to
the maintenance main headquarters.
• Maintenance is suspended if some components are not available.
• Once repaired, the product is returned to the customer.
Multiple choices in the first
step ...
... determine the possibilities
for the next step ...
... and so on ...
From an informal specification:
54-106
Example Two: Deriving a model
To a finite state machine:
55-106
Example Two: Deriving a model
To a test suite:
56-106
Example Two: Deriving a model
Using transition coverage: every
transition between states should be
traversed by at least one test case
Does history matter? That is, does the
order in which we traverse the nodes
influence the functionality? (e.g., see
“wait for completion”)
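Transition coverage can be checked mechanically once the model and the test paths are written down. The sketch below uses a hypothetical four-transition fragment of a maintenance model, not the full state machine from the slides; the state names and the class TransitionCoverage are assumptions.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TransitionCoverage {

    // Collects the transitions (consecutive state pairs) exercised by the test paths.
    static Set<String> covered(List<List<String>> testPaths) {
        Set<String> seen = new LinkedHashSet<>();
        for (List<String> path : testPaths) {
            for (int i = 0; i + 1 < path.size(); i++) {
                seen.add(path.get(i) + "->" + path.get(i + 1));
            }
        }
        return seen;
    }

    // Transitions of the model that no test path traverses.
    static Set<String> uncovered(Set<String> model, List<List<String>> testPaths) {
        Set<String> missing = new LinkedHashSet<>(model);
        missing.removeAll(covered(testPaths));
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical fragment of a maintenance model.
        Set<String> model = new LinkedHashSet<>(Arrays.asList(
                "NoMaintenance->WaitForEstimate",
                "WaitForEstimate->Repair",
                "WaitForEstimate->Returned",
                "Repair->Returned"));
        List<List<String>> tests = Arrays.asList(
                Arrays.asList("NoMaintenance", "WaitForEstimate", "Repair", "Returned"));
        System.out.println("uncovered: " + uncovered(model, tests));
    }
}
```

Because this criterion only looks at pairs of consecutive states, it says nothing about whether the path taken to reach a state matters, which is the history question raised above.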
57-106
In the Agile context, the problem of functional testing has been addressed
by having user stories and acceptance tests in collaboration with
customers, constantly updated and runnable
A complementary point of view (1/5)
[Figure: eXtreme Programming (XP) process. Phases: Exploration, Planning, Iterations to Release, Productionizing. Activities: User Stories, Architectural Spike, Spike, Release Planning, Iteration, Acceptance Tests, Small Releases. Flow labels: requirements, test scenarios, system metaphor, uncertain estimates, confident estimates, release plan, bugs, next iteration, latest version, customer approval.]
58-106
A complementary point of view (2/5)
Using FitNesse to write acceptance tests so that the
customer can actually write the acceptance conditions
for the software
Looking at our previous example, the “roots” case
$ax^2 + bx + c = 0$
that we solve by means of
$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
59-106
A complementary point of view (3/5)
public class Root {
    double rootOne, rootTwo;
    int numRoots;

    public Root(double a, double b, double c) {
        double q;
        double r;
        q = b * b - 4 * a * c;
        if (q > 0 && a != 0) {
            // if b^2 > 4ac there are two distinct roots
            numRoots = 2;
            r = Math.sqrt(q);
            rootOne = ((0 - b) + r) / (2 * a);
            rootTwo = ((0 - b) - r) / (2 * a);
        } else if (q == 0) { // DEFECT HERE
            numRoots = 1;
            rootOne = (0 - b) / (2 * a);
            rootTwo = rootOne;
        } else {
            // equation has no roots if b^2 < 4ac
            numRoots = 0;
            rootOne = -1;
            rootTwo = -1;
        }
    }
}
Source code from Mauro Pezzè & Michal Young
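One way to expose the marked defect is an oracle that substitutes every reported root back into the equation. The sketch below is an assumption about how such a check might look: it embeds a condensed copy of the Root class so that it is self-contained, and RootOracleDemo and reportedRootsSolveEquation are hypothetical names.

```java
class RootOracleDemo {

    // Condensed copy of the slide's Root class, defect included.
    static class Root {
        double rootOne, rootTwo;
        int numRoots;

        Root(double a, double b, double c) {
            double q = b * b - 4 * a * c;
            if (q > 0 && a != 0) {
                numRoots = 2;
                double r = Math.sqrt(q);
                rootOne = ((0 - b) + r) / (2 * a);
                rootTwo = ((0 - b) - r) / (2 * a);
            } else if (q == 0) { // defect: a == 0 is not excluded here
                numRoots = 1;
                rootOne = (0 - b) / (2 * a);
                rootTwo = rootOne;
            } else {
                numRoots = 0;
                rootOne = -1;
                rootTwo = -1;
            }
        }
    }

    // Oracle: every reported root must actually solve ax^2 + bx + c = 0.
    static boolean reportedRootsSolveEquation(double a, double b, double c) {
        Root r = new Root(a, b, c);
        double[] roots = {r.rootOne, r.rootTwo};
        for (int i = 0; i < r.numRoots; i++) {
            double residual = a * roots[i] * roots[i] + b * roots[i] + c;
            if (!(Math.abs(residual) < 1e-9)) { // also rejects NaN
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(reportedRootsSolveEquation(1, -3, 2)); // true
        System.out.println(reportedRootsSolveEquation(1, 2, 1));  // true
        System.out.println(reportedRootsSolveEquation(0, 0, 1));  // false: 0/0 yields NaN
    }
}
```

This oracle only checks that reported roots are valid; it would not notice a root that is missing altogether, for example when a = 0 and b is non-zero, so it complements rather than replaces the category-based test cases.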
60-106
A complementary point of view (4/5)
Our first attempt returns the number of solutions, but the customer did not
want only this – so this is a mistake we would not have captured with unit
tests
The customer also wanted the solutions to the equation; however, this
opens other discussions → how should we deal with no solutions? What
about imaginary numbers?
61-106
A complementary point of view (5/5)
Running with a=0 reports the mistake and also opens up a discussion about
the format for returning the solutions and what were the original
requirements in these cases
62-106
From Unit Testing to Integration Testing
63-106
●
In cars, children should not sit in
the front passenger seat if the passenger
airbag has not been disabled
●
In most cars there is a lever to turn
to disable it
Motivating Example
●
However: one car manufacturer had trouble with the following scenario
– Airbag turned off by the user
– Car sent for check-up → central unit replaced
– Complete reset of the system reactivated the airbags even though the lever was
OFF
●
How could this have been detected by testing, and by which type of
tests?
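One possible answer is an integration-level scenario that exercises the lever and the central unit together across a reset. The sketch below is purely illustrative: Lever, ControlUnit, and the resyncAfterReset flag are invented for the example and do not describe any real vehicle software.

```java
class AirbagIntegrationSketch {

    // Hypothetical model of the physical lever; its position survives any reset.
    static class Lever {
        boolean off;
    }

    // Hypothetical central unit that caches the airbag state.
    static class ControlUnit {
        private final boolean resyncAfterReset;
        private boolean airbagActive = true;

        ControlUnit(boolean resyncAfterReset) {
            this.resyncAfterReset = resyncAfterReset;
        }

        void readLever(Lever lever) {
            airbagActive = !lever.off;
        }

        // A complete reset restores factory defaults (airbag active).
        void reset(Lever lever) {
            airbagActive = true;
            if (resyncAfterReset) {
                readLever(lever);
            }
        }

        boolean isAirbagActive() {
            return airbagActive;
        }
    }

    // Integration scenario: lever OFF, unit reset, airbag must stay inactive.
    static boolean airbagStaysOffAfterReset(ControlUnit unit) {
        Lever lever = new Lever();
        lever.off = true;
        unit.readLever(lever);
        unit.reset(lever);
        return !unit.isAirbagActive();
    }

    public static void main(String[] args) {
        System.out.println(airbagStaysOffAfterReset(new ControlUnit(false))); // false: fault exposed
        System.out.println(airbagStaysOffAfterReset(new ControlUnit(true)));  // true
    }
}
```

Each class passes its own unit tests in isolation; the fault only shows up when the reset of one component is checked against the persistent state of the other.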
64-106
                        | Module test      | Integration test                          | System test
Specification:          | Module interface | Interface specs, module breakdown         | Requirements specification
Visible structure:      | Coding details   | Modular structure (software architecture) | none
Scaffolding required:   | Some             | Often extensive                           | Some
Looking for faults in:  | Modules          | Interactions, compatibility               | System functionality
What is integration testing?
65-106
• Unit (module) testing is a necessary foundation
– Unit level has maximum controllability and visibility
– Integration testing can never compensate for
inadequate unit testing
• Integration testing may serve as a process check
– If module faults are revealed in integration testing,
they signal inadequate unit testing
– If integration faults occur in interfaces between
correctly implemented modules, the errors can be
traced to module breakdown and interface
specifications
Integration versus Unit Testing
66-106
• Inconsistent interpretation of parameters or values
– Example: Mixed units (meters/yards) in Martian Lander
• Violations of value domains, capacity, or size limits
– Example: Buffer overflow
• Side effects on parameters or resources
– Example: Conflict on (unspecified) temporary file
• Omitted or misunderstood functionality
– Example: Inconsistent interpretation of web hits
• Nonfunctional properties
– Example: Unanticipated performance issues
• Dynamic mismatches
– Example: Incompatible polymorphic method calls
Integration Faults
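The first fault class, inconsistent interpretation of values, can be sketched with two modules that are each correct on their own. Everything here is hypothetical: the module roles, the 95-meter threshold, and the class name MixedUnitsSketch are invented for illustration.

```java
class MixedUnitsSketch {

    static final double METERS_PER_YARD = 0.9144;

    // Module A (hypothetical): reports a distance, implicitly in yards.
    static double measuredDistance() {
        return 100.0;
    }

    // Module B (hypothetical): expects meters and decides whether to brake.
    static boolean shouldBrake(double distanceMeters) {
        return distanceMeters < 95.0;
    }

    // Integration without conversion: the raw number crosses the interface.
    static boolean naiveIntegration() {
        return shouldBrake(measuredDistance());
    }

    // Integration with the unit made explicit at the interface.
    static boolean convertedIntegration() {
        return shouldBrake(measuredDistance() * METERS_PER_YARD);
    }

    public static void main(String[] args) {
        System.out.println(naiveIntegration());     // false: 100 is read as meters
        System.out.println(convertedIntegration()); // true: 100 yd = 91.44 m
    }
}
```

Unit tests of either module would pass, since each behaves correctly under its own assumption about units; only a test that drives both through their shared interface reveals the disagreement.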
67-106
static void ssl_io_filter_disable(ap_filter_t *f)
{
    bio_filter_in_ctx_t *inctx = f->ctx;
    inctx->ssl = NULL;
    inctx->filter_ctx->pssl = NULL;
}
Apache web server, version 2.0.48
Response to normal page request on secure (https) port
Example: A Memory Leak
No obvious error, but Apache leaked memory slowly (in normal use) or quickly (if exploited for a DoS attack)
68-106
static void ssl_io_filter_disable(ap_filter_t *f)
{
    bio_filter_in_ctx_t *inctx = f->ctx;
    SSL_free(inctx->ssl);  /* the missing call: release the SSL structure first */
    inctx->ssl = NULL;
    inctx->filter_ctx->pssl = NULL;
}
Apache web server, version 2.0.48
Response to normal page request on secure (https) port
The missing code is for a structure
defined and created elsewhere,
accessed through an opaque pointer.
Example: A Memory Leak
69-106
Almost impossible to find with unit testing. (Inspection and some dynamic techniques could have found it.)
Example: A Memory Leak
70-106
• Yes, I implemented module ⟨A⟩, but I didn’t test it thoroughly yet. It will be tested along with module ⟨B⟩ when that’s ready.
Maybe you have heard...
71-106
• Yes, I implemented module ⟨A⟩, but I didn’t test it thoroughly yet. It will be tested along with module ⟨B⟩ when that’s ready.
• I didn’t think at all about the strategy for testing. I didn’t design module ⟨A⟩ for testability and I didn’t think about the best order to build and test modules ⟨A⟩ and ⟨B⟩.
Translation...
72-106
An extreme and desperate approach: test only after integrating all modules
+ Does not require scaffolding
  • The only excuse, and a bad one
- Minimum observability, diagnosability, efficacy, feedback
- High cost of repair
  • Recall: the cost of repairing a fault rises as a function of the time between error and repair
Big Bang Integration Test
73-106
• Structural orientation:
Modules constructed, integrated and tested
based on a hierarchical project structure
– Top-down, Bottom-up, Sandwich, Backbone
• Functional orientation:
Modules integrated according to application
characteristics or features
– Threads, Critical module
Structural and Functional Strategies
74-106
[Figure: three top-down integration steps, each run under a testing framework: (1) Class A with Stub B, Stub X, Stub Y; (2) Class A and Class B with Stub X, Stub Y; (3) Class A, Class B, Class X, Class Y]
● Working from the top level (in terms of “use” or “include” relation) toward the bottom.
● No drivers required if program tested from top-level interface (e.g. GUI, CLI, web app, etc.)
● Write stubs of called or used modules at each step in construction
● As modules replace stubs, more functionality is testable
● ...until the program is complete, and all functionality can be tested
Top-Down
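One top-down step can be sketched in Java as follows. This is only an illustration: OrderService, PriceCatalog and the canned price are hypothetical names and values, not taken from the slides. The top-level class is real code, while the module it uses is replaced by a stub that returns canned answers.

```java
import java.util.HashMap;
import java.util.Map;

// Interface of the lower-level module used by the top-level class (assumed).
interface PriceCatalog {
    int priceInCents(String item);
}

// Top-level module under test: real code.
final class OrderService {
    private final PriceCatalog catalog;

    OrderService(PriceCatalog catalog) {
        this.catalog = catalog;
    }

    int total(Map<String, Integer> quantities) {
        int sum = 0;
        for (Map.Entry<String, Integer> e : quantities.entrySet()) {
            sum += catalog.priceInCents(e.getKey()) * e.getValue();
        }
        return sum;
    }
}

// Stub for the not-yet-integrated catalog: every item costs 100 cents.
final class PriceCatalogStub implements PriceCatalog {
    public int priceInCents(String item) {
        return 100;
    }
}

final class TopDownDemo {
    static int totalWithStub() {
        Map<String, Integer> order = new HashMap<>();
        order.put("pen", 3);
        order.put("ink", 2);
        return new OrderService(new PriceCatalogStub()).total(order);
    }

    public static void main(String[] args) {
        System.out.println(totalWithStub()); // 3 * 100 + 2 * 100
    }
}
```

When the real catalog is ready it replaces PriceCatalogStub in the constructor call, and the same test can be rerun against more real functionality.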
75-106
[Figure: three bottom-up integration steps, each run under a testing framework: (1) a driver exercising Class X; (2) two drivers exercising Class X and Class Y; (3) Class A and Class B on top of Class X and Class Y]
● Starting at the leaves of the “uses” hierarchy, we never need stubs
● ... but we must construct drivers for each module (as in unit testing) ...
● ... an intermediate module replaces a driver, and needs its own driver
● so we may have several working subsystems that are eventually integrated into a single system.
Bottom-Up
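A bottom-up step can be sketched the other way round: the leaf module is real and a driver plays the role of the callers that do not exist yet. TaxTable and its rounding rule are hypothetical, chosen only to keep the sketch small.

```java
// Leaf module (real code in this sketch): it uses nothing else, so no stubs are needed.
final class TaxTable {
    static int taxInCents(int netCents, int ratePercent) {
        if (netCents < 0 || ratePercent < 0) {
            throw new IllegalArgumentException("negative input");
        }
        return netCents * ratePercent / 100; // integer division truncates
    }
}

// Driver: calls the leaf module the way a future caller would and counts failed checks.
final class TaxTableDriver {
    private static int failures;

    private static void check(String name, boolean ok) {
        if (!ok) {
            failures++;
            System.out.println("FAILED: " + name);
        }
    }

    static int run() {
        failures = 0;
        check("typical value", TaxTable.taxInCents(1000, 20) == 200);
        check("zero amount", TaxTable.taxInCents(0, 20) == 0);
        check("truncation", TaxTable.taxInCents(999, 10) == 99);
        boolean rejected = false;
        try {
            TaxTable.taxInCents(-1, 20);
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        check("negative amount rejected", rejected);
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + run());
    }
}
```

Once an intermediate module that calls TaxTable is written, it takes the place of this driver and needs a driver of its own.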
76-106
● Working from the extremes (top and bottom) toward center, we may use fewer drivers and stubs
● Sandwich integration is flexible and adaptable, but complex to plan
[Figure: two sandwich integration steps: (1) Top (parts) over Stub A and Class B, with Class Y below; (2) Top (parts) over Class A and Class B, with Class Y below]
Sandwich
77-106
● A “thread” is a portion of several modules that together provide a user-visible program feature.
● Integrating one thread, then another, etc., we maximize visibility for the user
● As in sandwich integration testing, we can minimize stubs and drivers, but the integration plan may be complex
[Figure: two thread integration steps: (1) a driver over Class A, Class B, Class Y; (2) a driver over Class A, Class B, Class Y, Class C, Class Z]
Thread
78-106
• Strategy: Start with riskiest modules
– Risk assessment is necessary first step
– May include technical risks (is X feasible?), process
risks (is schedule for X realistic?), other risks
• May resemble thread or sandwich process in
tactics for flexible build order
– E.g., constructing parts of one module to test
functionality in another
• Key point is risk-oriented process
– Integration testing as a risk-reduction activity,
designed to deliver any bad news as early as possible
Critical Modules
79-106
• Functional strategies require more planning
– Structural strategies (bottom up, top down,
sandwich) are simpler
– But thread and critical modules testing provide
better process visibility, especially in complex
systems
• Possible to combine
– Top-down, bottom-up, or sandwich are reasonable
for relatively small components and subsystems
– Combinations of thread and critical modules
integration testing are often preferred for larger
subsystems
Choosing a Strategy
80-106
Specific Issues in Testing Object
Oriented Software
81-106
● Procedural software
– unit = single program, function, or procedure; more often, a unit of work that may correspond to one or more intertwined functions or programs
● Object oriented software
– unit = class or (small) cluster of strongly related classes (e.g., sets of Java classes that correspond to exceptions)
– unit testing = intra-class testing
– integration testing = inter-class testing (cluster of classes)
→ dealing with single methods separately is usually too expensive (complex scaffolding), so methods are usually tested in the context of the class they belong to
OO definitions of unit and integration testing
82-106
• The unit in unit testing is usually a class; however, there are specific issues that need to be taken into account when considering OO:
– State dependent behavior
– Encapsulation
– Inheritance
– Polymorphism and dynamic binding
– Abstract and generic classes
– Exception handling
“Unit” in Unit Testing
83-106
 abstract class Credit { 
...
   abstract boolean validateCredit( Account a, int amt, CreditCard c); 
...
}
Account: USAccount, UKAccount, EUAccount, JPAccount, OtherAccount
Credit: EduCredit, BizCredit, IndividualCredit
CreditCard: VISACard, AmExpCard, StoreCard
The combinatorial problem: 3 x 5 x 3 = 45 possible combinations of dynamic bindings (just for this one method!)
“Isolated” calls: the combinatorial explosion problem
84-106
Account        Credit             CreditCard
USAccount      EduCredit          VISACard
USAccount      BizCredit          AmExpCard
USAccount      IndividualCredit   ChipmunkCard
UKAccount      EduCredit          AmExpCard
UKAccount      BizCredit          VISACard
UKAccount      IndividualCredit   ChipmunkCard
EUAccount      EduCredit          ChipmunkCard
EUAccount      BizCredit          AmExpCard
EUAccount      IndividualCredit   VISACard
JPAccount      EduCredit          VISACard
JPAccount      BizCredit          ChipmunkCard
JPAccount      IndividualCredit   AmExpCard
OtherAccount   EduCredit          ChipmunkCard
OtherAccount   BizCredit          VISACard
OtherAccount   IndividualCredit   AmExpCard
Identify a set of combinations that cover all pairwise combinations of dynamic bindings
The combinatorial approach
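Whether a reduced set of combinations really covers every pair can be checked mechanically. The sketch below is an assumption about how one might do that, not part of the original material: it encodes the 15 rows of the table (using the table's card names) and counts the value pairs, over every two of the three parameters, that no row covers.

```java
import java.util.HashSet;
import java.util.Set;

final class PairwiseCheck {
    static final String[] ACCOUNTS =
        {"USAccount", "UKAccount", "EUAccount", "JPAccount", "OtherAccount"};
    static final String[] CREDITS = {"EduCredit", "BizCredit", "IndividualCredit"};
    static final String[] CARDS = {"VISACard", "AmExpCard", "ChipmunkCard"};

    // The 15 rows of the table: {account, credit, card}.
    static final String[][] SUITE = {
        {"USAccount", "EduCredit", "VISACard"},
        {"USAccount", "BizCredit", "AmExpCard"},
        {"USAccount", "IndividualCredit", "ChipmunkCard"},
        {"UKAccount", "EduCredit", "AmExpCard"},
        {"UKAccount", "BizCredit", "VISACard"},
        {"UKAccount", "IndividualCredit", "ChipmunkCard"},
        {"EUAccount", "EduCredit", "ChipmunkCard"},
        {"EUAccount", "BizCredit", "AmExpCard"},
        {"EUAccount", "IndividualCredit", "VISACard"},
        {"JPAccount", "EduCredit", "VISACard"},
        {"JPAccount", "BizCredit", "ChipmunkCard"},
        {"JPAccount", "IndividualCredit", "AmExpCard"},
        {"OtherAccount", "EduCredit", "ChipmunkCard"},
        {"OtherAccount", "BizCredit", "VISACard"},
        {"OtherAccount", "IndividualCredit", "AmExpCard"},
    };

    // Counts the pairs from xs x ys that are absent from the covered set.
    private static int missing(Set<String> covered, String tag, String[] xs, String[] ys) {
        int count = 0;
        for (String x : xs) {
            for (String y : ys) {
                if (!covered.contains(tag + x + "|" + y)) {
                    count++;
                }
            }
        }
        return count;
    }

    // Number of value pairs that no row of the suite covers.
    static int uncoveredPairs(String[][] suite) {
        Set<String> covered = new HashSet<>();
        for (String[] row : suite) {
            covered.add("account-credit:" + row[0] + "|" + row[1]);
            covered.add("account-card:" + row[0] + "|" + row[2]);
            covered.add("credit-card:" + row[1] + "|" + row[2]);
        }
        return missing(covered, "account-credit:", ACCOUNTS, CREDITS)
             + missing(covered, "account-card:", ACCOUNTS, CARDS)
             + missing(covered, "credit-card:", CREDITS, CARDS);
    }

    public static void main(String[] args) {
        System.out.println("uncovered pairs: " + uncoveredPairs(SUITE));
    }
}
```

The check confirms that 15 test cases are enough for pairwise coverage here, compared with 45 for all combinations; dropping any row that is the only one covering some pair makes the count non-zero.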
85-106
public abstract class Account { ...
    public int getYTDPurchased() {
        if (ytdPurchasedValid) { return ytdPurchased; }
        int totalPurchased = 0;
        for (Enumeration e = subsidiaries.elements(); e.hasMoreElements(); ) {
            Account subsidiary = (Account) e.nextElement();
            totalPurchased += subsidiary.getYTDPurchased();
        }
        for (Enumeration e = customers.elements(); e.hasMoreElements(); ) {
            Customer aCust = (Customer) e.nextElement();
            totalPurchased += aCust.getYearlyPurchase();
        }
        ytdPurchased = totalPurchased;
        ytdPurchasedValid = true;
        return totalPurchased;
    } … }
Problem: different implementations of method getYTDPurchased refer to different currencies.
Combined calls: undesired effects
86-106
public abstract class Account {
    ...
    public int getYTDPurchased() {
        if (ytdPurchasedValid) { return ytdPurchased; }
        int totalPurchased = 0;                               // totalPurchased defined
        for (Enumeration e = subsidiaries.elements(); e.hasMoreElements(); ) {
            Account subsidiary = (Account) e.nextElement();
            totalPurchased += subsidiary.getYTDPurchased();   // totalPurchased used and defined
        }
        for (Enumeration e = customers.elements(); e.hasMoreElements(); ) {
            Customer aCust = (Customer) e.nextElement();
            totalPurchased += aCust.getYearlyPurchase();      // totalPurchased used and defined
        }
        ytdPurchased = totalPurchased;                        // totalPurchased used
        ytdPurchasedValid = true;
        return totalPurchased;                                // totalPurchased used
    }
    …
}
step 1: identify polymorphic calls, binding sets, defs and uses
A Data Flow Approach
87-106
● Derive a test case for each possible polymorphic <def,use> pair
– Each binding must be considered individually
– Pairwise combinatorial selection may help in reducing the set of test cases
● Example: Dynamic binding of currency
– We need test cases that bind the different calls to different methods in the same run
– We can reveal faults due to the use of different currencies in different methods
Def-Use (dataflow) testing of polymorphic calls
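The currency example can be made concrete with a simplified, hypothetical model of the account hierarchy (the real classes have more state and a cache). The point of the sketch is the test design: a run in which every call binds to the same subclass cannot reveal the fault, while a run that mixes bindings can.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the Account hierarchy; all names and fields are assumptions.
abstract class Account {
    final List<Account> subsidiaries = new ArrayList<>();
    final int ownPurchases; // expressed in this account's own currency

    Account(int ownPurchases) {
        this.ownPurchases = ownPurchases;
    }

    abstract String currency();

    // The recursive call is polymorphic: it may bind to any Account subclass.
    int getYTDPurchased() {
        int totalPurchased = ownPurchases;                  // def
        for (Account subsidiary : subsidiaries) {
            totalPurchased += subsidiary.getYTDPurchased(); // use and def
        }
        return totalPurchased;                              // use
    }
}

final class USAccount extends Account {
    USAccount(int usd) { super(usd); }
    String currency() { return "USD"; }
}

final class EUAccount extends Account {
    EUAccount(int eur) { super(eur); }
    String currency() { return "EUR"; }
}

final class MixedBindingCheck {
    // Oracle: a total is only meaningful if the whole tree uses one currency.
    static boolean singleCurrency(Account root) {
        for (Account s : root.subsidiaries) {
            if (!s.currency().equals(root.currency()) || !singleCurrency(s)) {
                return false;
            }
        }
        return true;
    }

    // Helper that builds a small account tree for a test case.
    static Account tree(Account root, Account... subs) {
        for (Account s : subs) {
            root.subsidiaries.add(s);
        }
        return root;
    }
}
```

A test case built with three USAccount objects yields a total of 200 and passes the oracle. Replacing one subsidiary with an EUAccount yields the same number, but the oracle flags it: 200 is a sum of dollars and euros.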
88-106
● When testing a subclass ...
– We would like to re-test only what has not been thoroughly tested in the parent class
  ● for example, no need to test hashCode and getClass methods inherited from class Object in Java
– But we should test any method whose behavior may have changed
  ● even accidentally!
Inheritance
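How an inherited method can change behavior "accidentally" is easiest to see in a small Java sketch. Invoice, ExportInvoice and the 20% rate are hypothetical. The subclass redefines only taxCents(), yet the inherited grossCents() now returns a different result because it calls taxCents() through dynamic binding.

```java
// Parent class: grossCents() depends on taxCents().
class Invoice {
    protected final int netCents;

    Invoice(int netCents) {
        this.netCents = netCents;
    }

    int taxCents() {
        return netCents * 20 / 100; // assumed flat 20% rate
    }

    int grossCents() {
        return netCents + taxCents(); // dynamically bound call
    }
}

// Subclass: only taxCents() is redefined; grossCents() is inherited unchanged.
class ExportInvoice extends Invoice {
    ExportInvoice(int netCents) {
        super(netCents);
    }

    @Override
    int taxCents() {
        return 0;
    }
}

final class InheritanceDemo {
    public static void main(String[] args) {
        System.out.println(new Invoice(1000).grossCents());
        System.out.println(new ExportInvoice(1000).grossCents());
    }
}
```

The text of grossCents() is identical in both classes, so a purely syntactic notion of "unchanged" would skip it when testing ExportInvoice, even though its result differs.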
89-106
● Track test suites and test executions
– determine which new tests are needed
– determine which old tests must be re-executed
● New and changed behavior ...
– new methods must be tested
– redefined methods must be tested, but we can partially reuse test suites defined for the ancestor
– other inherited methods do not have to be retested
Reusing Tests with the Testing History Approach
90-106
Testing history
91-106
Inherited, unchanged
92-106
Newly introduced methods
93-106
Overridden methods
94-106
● Abstract methods (and classes)
– Design test cases when abstract method is introduced (even if it can’t be executed yet)
● Behavior changes
– Should we consider a method “redefined” if another new or redefined method changes its behavior?
  ● The standard “testing history” approach does not do this
  ● A combination of data flow (structural) OO testing with the (functional) testing history approach might be reasonable
Testing history – some details
95-106
Testing History - Summary
96-106
● Executing test cases should (usually) be cheap
– It may be simpler to re-execute the full test suite of the parent class
– ... but still add to it for the same reasons
● But sometimes execution is not cheap ...
– Example: Control of physical devices
– Or very large test suites
  ● Ex: Some Microsoft product test suites require more than one night (so daily build cannot be fully tested)
– Then some use of testing history is profitable
Does Testing History help?
97-106
A generic class
class PriorityQueue<Elem implements Comparable> {...}
is designed to be instantiated with many different parameter types
PriorityQueue<Customers>
PriorityQueue<Tasks>
A generic class is typically designed to behave consistently over some set of permitted parameter types.
Testing can be broken into two parts
– Showing that some instantiation is correct
– Showing that all permitted instantiations behave consistently
Testing Generic Classes
98-106
● Design tests as if the parameter were copied textually into the body of the generic class.
– We need source code for both the generic class and the parameter class
Show that some instantiation is correct
99-106
● Identify potential interactions between generic and its parameters
– Identify potential interactions by inspection or analysis, not testing
– Look for: method calls on parameter object, access to parameter fields, possible indirect dependence
– Easy case is no interactions at all (e.g., a simple container class)
● Where interactions are possible, they will need to be tested
Identify (possible) interactions
100-106
class PriorityQueue<Elem implements Comparable> {...}
● Priority queue uses the “Comparable” interface of Elem to make method calls on the generic parameter
● We need to establish that it does so consistently
– So that if priority queue works for one kind of Comparable element, we can have some confidence it does so for others
Example Interaction
101-106
● We can’t test every possible instantiation
– Just as we can’t test every possible program input
● ... but there is a contract (a specification) between the generic class and its parameters
– Example: “implements Comparable” is a specification of possible instantiations
– Other contracts may be written only as comments
● Functional (specification-based) testing techniques are appropriate
– Identify and then systematically test properties implied by the specification
Testing variation in instantiation
102-106
Most but not all classes that implement Comparable also satisfy the rule
(x.compareTo(y) == 0) == (x.equals(y))
(from java.lang.Comparable)
So test cases for PriorityQueue should include
● instantiations with classes that do obey this rule: class String
● instantiations that violate the rule: class BigDecimal with values 4.0 and 4.00
Example: Testing variation in instantiation
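The BigDecimal case can be tried directly. The sketch below uses java.util.PriorityQueue and java.util.TreeSet from the standard library as stand-ins for the generic class under test (an assumption; the PriorityQueue on the slides is the course's own example class). It shows that two containers can disagree about membership of the same value once compareTo and equals are inconsistent.

```java
import java.math.BigDecimal;
import java.util.PriorityQueue;
import java.util.TreeSet;

final class ComparableConsistencyDemo {
    static final BigDecimal A = new BigDecimal("4.0");
    static final BigDecimal B = new BigDecimal("4.00");

    // compareTo ignores scale, so 4.0 and 4.00 compare as equal.
    static boolean compareToSaysEqual() {
        return A.compareTo(B) == 0;
    }

    // equals takes scale into account, so 4.0 and 4.00 are not equal.
    static boolean equalsSaysEqual() {
        return A.equals(B);
    }

    // java.util.PriorityQueue.contains relies on equals.
    static boolean queueContains() {
        PriorityQueue<BigDecimal> queue = new PriorityQueue<>();
        queue.add(A);
        return queue.contains(B);
    }

    // java.util.TreeSet.contains relies on compareTo.
    static boolean treeSetContains() {
        TreeSet<BigDecimal> set = new TreeSet<>();
        set.add(A);
        return set.contains(B);
    }

    public static void main(String[] args) {
        System.out.println("compareTo equal: " + compareToSaysEqual());
        System.out.println("equals equal:    " + equalsSaysEqual());
        System.out.println("queue contains:  " + queueContains());
        System.out.println("set contains:    " + treeSetContains());
    }
}
```

An instantiation with String would never show this disagreement, which is why a test suite for a generic container should include at least one parameter type that breaks the rule.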
103-106
void addCustomer(Customer theCust) {
    customers.add(theCust);
}
public static Account newAccount(...)
        throws InvalidRegionException
{
    Account thisAccount = null;
    String regionAbbrev = Regions.regionOfCountry(
            mailAddress.getCountry());
    if (regionAbbrev == Regions.US) {
        thisAccount = new USAccount();
    } else if (regionAbbrev == Regions.UK) {
        ....
    } else if (regionAbbrev == Regions.Invalid) {
        throw new InvalidRegionException(mailAddress.getCountry());
    }
    ...
}
exceptions create implicit control flows and may be handled by different handlers
Exception handling
104-106
● Impractical to treat exceptions like normal flow
  – too many flows: every array subscript reference, every memory allocation, every cast, ...
  – multiplied by matching them to every handler that could appear immediately above them on the call stack.
  – many actually impossible
● So we separate testing exceptions
  – and ignore program error exceptions (test to prevent them, not to handle them)
● What we do test: each exception handler, and each explicit throw or re-throw of an exception
Testing Exception Handling
105-106
● Local exception handlers
– test the exception handler (consider a subset of points bound to the handler)
● Non-local exception handlers
– Difficult to determine all pairings of <points, handlers>
– So enforce (and test for) a design rule: if a method propagates an exception, the method call should have no other effect
Testing program exception handlers
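The design rule for non-local handlers ("if a method propagates an exception, the method call should have no other effect") can itself be tested. The sketch below is a hypothetical example: CustomerRegistry and its region check are invented for illustration, and InvalidRegionException is redefined locally so the block is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for the exception type used in the example above.
final class InvalidRegionException extends Exception {
    InvalidRegionException(String region) {
        super("invalid region: " + region);
    }
}

// Hypothetical class that follows the rule: it validates before it mutates.
final class CustomerRegistry {
    private final List<String> customers = new ArrayList<>();

    void addCustomer(String name, String region) throws InvalidRegionException {
        if (!region.equals("US") && !region.equals("UK")) {
            throw new InvalidRegionException(region); // nothing has been changed yet
        }
        customers.add(name);
    }

    int size() {
        return customers.size();
    }
}

final class NoEffectOnExceptionCheck {
    // Returns true if a call that throws leaves the observable state untouched.
    static boolean holds() {
        CustomerRegistry registry = new CustomerRegistry();
        try {
            registry.addCustomer("Ann", "US");
        } catch (InvalidRegionException e) {
            return false; // a valid region must not be rejected
        }
        int before = registry.size();
        try {
            registry.addCustomer("Bob", "Atlantis");
            return false; // the exception was expected
        } catch (InvalidRegionException e) {
            return registry.size() == before;
        }
    }

    public static void main(String[] args) {
        System.out.println("rule holds: " + holds());
    }
}
```

With the rule enforced per method, handlers further up the call stack do not need to be paired with every possible throw point; it is enough to test that each throwing method leaves state as it found it.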
106-106
Most of the source code examples, class diagrams, etc. are taken from [2] unless otherwise stated
[1] A. Zeller, Why Programs Fail: A Guide to Systematic Debugging, 2nd ed. Amsterdam; Boston: Morgan Kaufmann, 2009.
[2] M. Pezzè and M. Young, Software Testing and Analysis: Process, Principles and Techniques. Hoboken, N.J.: John Wiley & Sons Inc, 2007.
Acceptance Testing example using Fitnesse (www.fitnesse.org)
References