Jagan Mohan

TESTING TYPES AND THEIR METHODOLOGIES



Index

1. Unit Testing
2. Integration Testing
3. System Testing
4. Regression Testing
5. Load Testing
6. Performance Testing
7. Stress Testing
8. Model Based Testing
9. Installation Testing
10. Usability Testing
11. User Acceptance Testing
12. Black Box Testing
13. White Box Testing
14. Glass Box Testing
15. Fuzz Testing

Unit Testing
In computer programming, a unit test is a procedure used to verify that a particular module of source code is working properly. The idea behind unit tests is to write test cases for all functions and methods so that whenever a change causes a regression, it can be quickly identified and fixed. Ideally, each test case is separate from the others; constructs such as mock objects can assist in separating unit tests. This type of testing is mostly done by the developers and not by end-users.
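
As an illustrative sketch only (the add() function and its tests are invented for this example), a unit test in the xUnit style exercises one small unit with independent test cases, so a later regression in that unit is caught as soon as the suite is re-run:

# Hypothetical example: a minimal unit test for an assumed add() function.
import unittest

def add(a, b):
    """Unit under test (assumed for illustration)."""
    return a + b

class AddTests(unittest.TestCase):
    def test_positive_numbers(self):
        self.assertEqual(add(2, 3), 5)

    def test_negative_numbers(self):
        self.assertEqual(add(-2, -3), -5)

if __name__ == "__main__":
    unittest.main()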

Benefits

The goal of unit testing is to isolate each part of the program and show that the individual parts
are correct. Unit testing provides a strict, written contract that the piece of code must satisfy. As a
result, it affords several benefits.

Facilitates change

Unit testing allows the programmer to refactor code at a later date, and make sure the module still
works correctly (i.e. regression testing). This provides the benefit of encouraging programmers to
make changes to the code since it is easy for the programmer to check if the piece is still working
properly.

Simplifies integration

Unit testing helps eliminate uncertainty in the pieces themselves and can be used in a bottom-up
testing style approach. By testing the parts of a program first and then testing the sum of its parts,
integration testing becomes much easier.

Documentation

Unit testing provides a sort of "living document". Clients and other developers looking to learn
how to use the class can look at the unit tests to determine how to use the class to fit their needs
and gain a basic understanding of the API.

Separation of interface from implementation

Because some classes may have references to other classes, testing a class can frequently spill
over into testing another class. A common example of this is classes that depend on a database:
in order to test the class, the tester often writes code that interacts with the database. This is a
mistake, because a unit test should never go outside of its own class boundary. As a result, the
software developer abstracts an interface around the database connection, and then implements
that interface with their own mock object. This results in loosely coupled code, minimizing
dependencies in the system.
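
A minimal sketch of that idea, with hypothetical names: a UserService depends only on a repository interface, and the unit test substitutes a hand-written mock so the test never touches a real database.

# Hypothetical sketch: the database is abstracted behind an interface,
# and the unit test injects a mock implementation of that interface.
import unittest

class UserRepository:
    """Interface the production code depends on."""
    def find_name(self, user_id):
        raise NotImplementedError

class UserService:
    def __init__(self, repository):
        self.repository = repository          # dependency injected, not created here

    def greeting(self, user_id):
        return "Hello, " + self.repository.find_name(user_id)

class FakeUserRepository(UserRepository):
    """Mock implementation used only by the test; no database involved."""
    def find_name(self, user_id):
        return "Alice"

class UserServiceTest(unittest.TestCase):
    def test_greeting_uses_repository(self):
        service = UserService(FakeUserRepository())
        self.assertEqual(service.greeting(42), "Hello, Alice")

if __name__ == "__main__":
    unittest.main()

Because UserService receives its repository from the outside, the production code stays loosely coupled to the database and the test remains fast and isolated.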

Limitations

Unit testing will not catch every error in the program. By definition, it only tests the functionality of the units themselves. Therefore, it will not catch integration errors, performance problems, or other system-wide issues. In addition, it may not be easy to anticipate all the special cases of input that the program unit under study may receive in reality. Unit testing is only effective if it is used in conjunction with other software testing activities.

It is unrealistic to test all possible input combinations for any non-trivial piece of software. A unit
test can only show the presence of errors; it cannot show the absence of errors.

Applications

Extreme Programming

While Unit Testing is often associated with Extreme Programming, it existed a long time before
Extreme Programming was invented. The unit testing concept is part of the Extreme
Programming method of software engineering. Various unit testing frameworks, based on a
design by Kent Beck, have come to be known collectively as xUnit, and are available for many
programming languages and development platforms. Unit testing is the building block of test-driven development (TDD). Extreme Programming and most other methods use unit tests to perform black box testing.

Note that many in the Extreme Programming community favor the terms "developer testing" or
"programmer testing" over the term "unit testing," since many other test activities (like function or
acceptance test) can now be done at "developer-time."

Techniques

Conventionally, and as a well-accepted industry practice, unit testing is conducted in an automated environment through the use of a third-party component or framework. However, one reputable organization, the IEEE, prescribes neither an automated nor a manual approach; this leaves open the option of conducting unit testing by following a step-by-step instructional document. Nevertheless, the objective of unit testing is to isolate a unit and verify its correctness. Automation is much more efficient and enables the many benefits listed in this article. In fact, manual unit testing is arguably a form of integration testing and thus precludes the achievement of most (if not all) of the goals established for unit testing.

To fully realize the effect of isolation, the unit or code body subjected to the unit test is executed within a framework outside of its natural environment, that is, outside of the product or calling context for which it was originally created. Testing the unit outside of its product context ensures that testing is done without any dependencies between the code being tested and other units or data spaces in the product that might affect test results.

Consequently, unit testing is traditionally a motivator for programmers to create decoupled and cohesive code bodies. This practice promotes healthy habits in software development and, in turn, supports the quality attributes (the "-ilities") relevant to systems engineering. Design patterns and unit testing often go hand in hand.

Integration testing

Integration testing (sometimes called Integration and testing and abbreviated I&T) is the
phase of software testing in which individual software modules are combined and tested as a
group. It follows unit testing and precedes system testing.

Integration testing takes as its input modules that have been checked out by unit testing, groups
them in larger aggregates, applies tests defined in an Integration test plan to those aggregates,
and delivers as its output the integrated system ready for system testing.

Purpose

The purpose of Integration testing is to verify the functional, performance and reliability requirements placed on major design items. These "design items", i.e. assemblages (or groups of units), are exercised through their interfaces using Black box testing, with success and error cases being simulated via appropriate parameter and data inputs. Simulated usage of shared data areas and inter-process communication is tested, and individual subsystems are exercised through their input interfaces. All test cases are constructed to test that all components within assemblages interact correctly, for example across procedure calls or process activations.

The overall idea is a "building block" approach, in which verified assemblages are added to a
verified base which is then used to support the Integration testing of further assemblages.

The different types of integration testing are Big Bang, Top-Down, Bottom-Up and Backbone.
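
Whichever of these strategies is used, the essential step is the same. As a hedged illustration (both modules and their names are invented), an integration test exercises two already unit-tested modules together through their interfaces rather than in isolation:

# Hypothetical integration test: two unit-tested modules combined as an aggregate.
import unittest

def parse_numbers(text):
    """Module A (assumed): turns '1,2,3' into [1, 2, 3]."""
    return [int(part) for part in text.split(",")]

def total(numbers):
    """Module B (assumed): sums a list of numbers."""
    return sum(numbers)

class ParserCalculatorIntegrationTest(unittest.TestCase):
    def test_parse_then_total(self):
        # The aggregate is exercised end to end, across the module boundary.
        self.assertEqual(total(parse_numbers("1,2,3")), 6)

if __name__ == "__main__":
    unittest.main()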

System testing

System testing is testing conducted on a complete, integrated system to evaluate the system's
compliance with its specified requirements. System testing falls within the scope of Black box
testing, and as such, should require no knowledge of the inner design of the code or logic.

Alpha testing and Beta testing are sub-categories of System testing.

As a rule, System testing takes, as its input, all of the "integrated" software components that have successfully passed Integration testing, together with the software system itself integrated with any applicable hardware system(s). The purpose of Integration testing is to detect any inconsistencies between the software units that are integrated together (called assemblages) or between any of the assemblages and the hardware. System testing is a more limited type of testing; it seeks to detect defects both within the "inter-assemblages" and within the system as a whole.

Testing the whole system

Generally speaking, System testing is the first time that the entire system can be tested against the Functional Requirement Specification(s) (FRS) and/or the System Requirement Specification (SRS). These are the rules that describe the functionality that the vendor (the entity developing the software) and the customer have agreed upon. System testing tends to be more of an investigatory testing phase, where the focus is to have an almost destructive attitude and to test not only the design, but also the behavior and even the believed expectations of the customer. System testing is intended to test up to and beyond the bounds defined in the software/hardware requirements specification(s).

One could view System testing as the final destructive testing phase before Acceptance testing.


Types of testing

The following are examples of different types of testing that should be considered during System testing:

• Functional testing
• User interface testing
• Model based testing
• Error exit testing
• User help testing
• Security Testing
• Capacity testing
• Performance testing
• Sanity testing
• Regression testing
• Reliability testing
• Recovery testing
• Installation testing
• Maintenance testing
• Documentation testing

Although different testing organizations may prescribe different tests, within System testing, this
list serves as a general framework, or foundation to begin with.

Regression testing

Regression testing is any type of software testing which seeks to uncover regression bugs.
Regression bugs occur whenever software functionality that previously worked as desired stops
working or no longer works in the same way that was previously planned. Typically regression
bugs occur as an unintended consequence of program changes.

Common methods of regression testing include re-running previously run tests and checking
whether previously-fixed faults have reemerged.

Experience has shown that as software is developed, this kind of reemergence of faults is quite common. Sometimes it occurs because a fix gets lost through poor revision control practices (or simple human error in revision control), but just as often a fix for a problem will be "fragile" - if some other change is made to the program, the fix no longer works. Finally, it has often been the case that when some feature is redesigned, the same mistakes are made in the redesign that were made in the original implementation of the feature.

Therefore, in most software development situations it is considered good practice that when a bug is located and fixed, a test that exposes the bug is recorded and regularly re-run after subsequent changes to the program. Although this may be done through manual testing procedures, it is often done using automated testing tools. Such a 'test suite' contains software tools that allow the testing environment to execute all the regression test cases automatically; some projects even set up automated systems to re-run all regression tests at specified intervals and report any regressions. Common strategies are to run such a system after every successful compile (for small projects), every night, or once a week.
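
A minimal sketch of this practice, with hypothetical names: when a defect is fixed, a test that reproduces it is recorded in the suite so that any reappearance is flagged on the next automated run.

# Hypothetical regression test recorded for a previously fixed defect.
import unittest

def parse_price(text):
    """Unit under test (assumed). The hypothetical bug: the function used to
    crash on input with surrounding whitespace; the fix strips the input."""
    return float(text.strip())

class PriceRegressionTests(unittest.TestCase):
    def test_whitespace_input_regression(self):
        # Recorded when the bug was fixed; re-run after every subsequent change.
        self.assertEqual(parse_price("  19.99 "), 19.99)

if __name__ == "__main__":
    unittest.main()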

Regression testing is an integral part of the extreme programming software development methodology. In this methodology, design documents are replaced by extensive, repeatable, and automated testing of the entire software package at every stage in the software development cycle.

Load testing

Load testing is the act of testing a system under load.

In mechanical systems it refers to the testing of a system to certify it under the appropriate regulations (in the UK, LOLER - the Lifting Operations and Lifting Equipment Regulations). Load testing is usually carried out to a load of 1.5 times the SWL (Safe Working Load); periodic recertification is required.

In software engineering it is a blanket term that is used in many different ways across the
professional software testing community.

Load testing generally refers to the practice of modeling the expected usage of a software program by simulating multiple users accessing the program's services concurrently. As such, this testing is most relevant for multi-user systems, often ones built using a client/server model, such as web servers. However, other types of software systems can be load-tested as well. For example, a word processor or graphics editor can be forced to read an extremely large document, or a financial package can be forced to generate a report based on several years' worth of data.
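
As a rough, hedged sketch of this practice (the target URL, the number of virtual users and the request count below are arbitrary assumptions, and real load tests normally rely on dedicated tools), concurrent simulated users can be driven with threads while their response times are collected for analysis:

# Hypothetical load-test sketch: N simulated users hit a service concurrently
# and response times are collected for later analysis.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

TARGET_URL = "http://localhost:8000/"    # assumed test endpoint
VIRTUAL_USERS = 20                       # assumed concurrency level
REQUESTS_PER_USER = 10

def simulate_user(user_id):
    timings = []
    for _ in range(REQUESTS_PER_USER):
        start = time.time()
        with urllib.request.urlopen(TARGET_URL) as response:
            response.read()
        timings.append(time.time() - start)
    return timings

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=VIRTUAL_USERS) as pool:
        results = pool.map(simulate_user, range(VIRTUAL_USERS))
        all_timings = [t for user_timings in results for t in user_timings]
    print("requests:", len(all_timings), "average seconds:", sum(all_timings) / len(all_timings))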

When the load placed on the system is raised beyond normal usage patterns, in order to test the
system's response at unusually high or peak loads, it is known as stress testing. The load is
usually so great that error conditions are the expected result, although there is a gray area
between the two domains and no clear boundary exists when an activity ceases to be a load test
and becomes a stress test.

There is little agreement on what the specific goals of load testing are. The term is often used
synonymously with performance testing, reliability testing, and volume testing.

Performance testing

In software engineering, performance testing is testing that is performed to determine how fast
some aspect of a system performs under a particular workload.

Performance testing is a subset of performance engineering, an emerging computer science practice which strives to build performance into the design and architecture of a system, prior to the onset of actual coding effort.

Performance testing can serve different purposes. It can demonstrate that the system meets performance criteria. It can compare two systems to find which performs better. Or it can measure which parts of the system or workload cause the system to perform badly. In the diagnostic case, software engineers use tools such as profilers to measure what parts of a device or software contribute most to the poor performance, or to establish throughput levels (and thresholds) required to maintain acceptable response times. It is critical to the cost-performance of a new system that performance test efforts begin at the inception of the development project and extend through to deployment. The later a performance defect is detected, the higher the cost of remediation. This is true for functional testing, but even more so for performance testing, due to the end-to-end nature of its scope.

It is important (and often difficult to arrange) for the performance test conditions to be similar to
the expected actual use.

Testing technology

Performance testing technology employs one or more PCs to act as injectors – each emulating the presence of a number of users and each running an automated sequence of interactions
(recorded as a script, or as a series of scripts to emulate different types of user interaction) with
the host whose performance is being tested. Usually, a separate PC acts as a test conductor,
coordinating and gathering metrics from each of the injectors and collating performance data for
reporting purposes. The usual sequence is to ramp up the load – starting with a small number of
virtual users and increasing the number over a period to some maximum. The test result shows
how the performance varies with the load, given as number of users vs response time. Various
tools are available to perform such tests. Tools in this category usually execute a suite of tests
which will emulate real users against the system. Sometimes the results can reveal oddities, e.g.,
that while the average response time might be acceptable, there are outliers of a few key
transactions that take considerably longer to complete – something that might be caused by
inefficient database queries, etc.

Performance testing can be combined with stress testing, in order to see what happens when an
acceptable load is exceeded — does the system crash? How long does it take to recover after a
large load is reduced? Does it fail in a way that causes collateral damage?

Performance specifications

It is critical to detail performance specifications (requirements) and document them in any performance test plan. Ideally, this is done during the requirements development phase of any system development project, prior to any design effort.

Except in real-time computing, performance testing is frequently not performed against a specification; i.e. no one has expressed the maximum acceptable response time for a given population of users.

Performance testing is often used as part of the process of performance profile tuning. The goal
is to identify the "weakest links" — there are often a small number of parts in the system which, if
they are made faster, will result in the overall system running noticeably faster. It is sometimes
difficult to identify which parts of the system represent the critical paths. To help identify critical
paths some test tools include (or have as add-ons) instrumentation agents that run on the server
and report transaction times, database access times, network overhead, and other server
monitors. Without such instrumentation the use of primitive system tools may be required (e.g.
Task Manager in Microsoft Windows).

Performance testing usually concludes that it is the software (rather than the hardware) that contributes most to delays (bottlenecks) in processing data.

Performance testing can be performed across the web, and even in different parts of the country, since the response times of the internet itself vary regionally. It can also be done in-house, although routers would then need to be configured to introduce the lag that would typically occur on public networks. Loads should be introduced to the system from realistic points. For example, if 50% of a system's user base will be accessing the system via a 56K modem connection and the other half over a T1, then the load injectors (computers that simulate real users) should either inject load over the same connections (ideal) or simulate the network latency of such connections, following the same user profile.

It is always helpful to have a statement of the likely peak number of users that might be expected to use the system at peak times. If there can also be a statement of what constitutes the maximum allowable 95th percentile response time, then an injector configuration can be used to test whether the proposed system meets that specification.
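
For illustration only (the response times and the 500 ms threshold below are invented), such a 95th percentile check reduces to a simple calculation over the measured samples:

# Hypothetical sketch: compare measured response times against an assumed
# 95th-percentile requirement, using the nearest-rank percentile method.
import math

def percentile(samples, fraction):
    ordered = sorted(samples)
    rank = max(1, math.ceil(fraction * len(ordered)))   # nearest-rank position
    return ordered[rank - 1]

response_times_ms = [120, 135, 150, 160, 180, 200, 210, 240, 300, 950]  # invented data
SPEC_95TH_MS = 500                                                      # assumed requirement

p95 = percentile(response_times_ms, 0.95)
print("95th percentile:", p95, "ms ->", "PASS" if p95 <= SPEC_95TH_MS else "FAIL")

In this made-up data set a single slow outlier pushes the 95th percentile above the threshold, which is exactly the kind of result the earlier remarks on outliers describe.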

Performance specifications (requirements) should address the following questions, at a minimum:

• In detail, what is the performance test scope? What subsystems, interfaces, components, etc. are in and out of scope for this test?
• For the user interfaces (UIs) involved, how many concurrent users are expected for each (specify peak vs. nominal)?
• What does the target system (hardware) look like (specify all server and network appliance configurations)?
• What is the Application Workload Mix of each application component? (For example: 20% login, 40% search, 30% item select, 10% checkout.)
• What is the System Workload Mix? [Multiple workloads may be simulated in a single performance test.] (For example: 30% Workload A, 20% Workload B, 50% Workload C.)
• What are the time requirements for any/all backend batch processes (specify peak vs. nominal)?

Tasks to undertake

Tasks to perform such a test would include:

• Decide whether to use internal or external resources to perform the tests, depending on in-house expertise (or lack thereof)
• Gather or elicit performance requirements (specifications) from users and/or business analysts
• Develop a high-level plan (or project charter), including requirements, resources, timelines and milestones
• Develop a detailed performance test plan (including detailed scenarios and test cases, workloads, environment info, etc.)
• Choose test tool(s)
• Specify the test data needed and charter that effort (often overlooked, but often the death of a valid performance test)
• Develop proof-of-concept scripts for each application/component under test, using the chosen test tools and strategies
• Develop a detailed performance test project plan, including all dependencies and associated timelines
• Install and configure the injectors/controller
• Configure the test environment (ideally identical hardware to the production platform): router configuration, a quiet network (so results are not upset by other users), deployment of server instrumentation, development of database test sets, etc.
• Execute tests - probably repeatedly (iteratively) - in order to see whether any unaccounted-for factor might affect the results
• Analyze the results - either pass/fail, or investigation of critical path and recommendation of corrective action

Stress testing

Stress testing is a form of testing that is used to determine the stability of a given system or
entity. It involves testing beyond normal operational capacity, often to a breaking point, in order to
observe the results. For example, a web server may be stress tested using scripts, bots, and
various denial of service tools to observe the performance of a web site during peak loads. Stress
testing is a subset of load testing.

Model-based testing

Model-based testing refers to software testing where test cases are derived in whole or in part
from a model that describes some (usually functional) aspects of the system under test (SUT).

General model-based testing setting

The model is usually an abstract, partial representation of the system under test's desired behavior.
The test cases derived from this model are functional tests on the same level of abstraction as
the model. These test cases are collectively known as the abstract test suite. The abstract test
suite cannot be directly executed against the system under test because it is on the wrong level
of abstraction. Therefore an executable test suite must be derived from the abstract test suite that
can communicate with the system under test. This is done by mapping the abstract test cases to
concrete test cases suitable for execution.

There are many different ways to "derive" tests from a model. Because testing is usually
experimental and based on heuristics, there is no one best way to do this. It is common to
consolidate all test derivation related design decisions into a package that is often known as "test
requirements", "test purpose" or even "use case". This package can contain e.g. information
about the part of the model that should be the focus for testing, or about the conditions where it is
correct to stop testing (test stopping criteria).

An example of a model-based testing workflow: IXIT refers to "implementation extra information" and denotes here the total package of information that is needed when the abstract test suite is converted into an executable one. Typically it includes information about the test harness, data mappings and SUT configuration.

Because test suites are derived from models and not from source code, model-based testing is
usually seen as one form of black-box testing. In some aspects, this is not completely accurate.
Model-based testing can be combined with source-code level test coverage measurement, and
functional models can be based on existing source code in the first place.

Model-based testing for complex software systems is still an evolving field.

Models

Especially in model-driven architecture, the model is built before or in parallel with the development process of the system under test. The model can also be constructed from the completed system. Currently the model is mostly created manually, but there are also attempts to create the model automatically, for instance from the source code.

Deriving Tests Algorithmically

The effectiveness of model-based testing is primarily due to the potential for automation it offers.
If the model is machine-readable and formal to the extent that it has a well-defined behavioral
interpretation, test cases can in principle be derived mechanically.

Often the model is translated to or interpreted as a finite state automaton or a state transition
system. This automaton represents the possible configurations of the system under test. To find
test cases, the automaton is searched for executable paths. A possible execution path can serve
as a test case. This method works if the model is deterministic or can be transformed into a
deterministic one.

Depending on the complexity of the system under test and the corresponding model, the number of paths can be very large because of the huge number of possible configurations of the system. To find appropriate test cases, i.e. paths that relate to a certain requirement to be proved, the search of the paths has to be guided. Multiple techniques are applied for test case selection.
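
As a toy, hedged sketch of this path-search idea (the two-state login model below is invented), every event sequence up to a bounded length through the automaton is enumerated, and each sequence becomes an abstract test case:

# Hypothetical model-based testing sketch: derive abstract test cases as
# bounded paths through a small finite state machine (the model is invented).
MODEL = {
    "logged_out": {"login_ok": "logged_in", "login_fail": "logged_out"},
    "logged_in":  {"logout": "logged_out"},
}

def derive_paths(state, depth, path=()):
    """Enumerate all event sequences of at most `depth` steps from `state`."""
    paths = [path]
    if depth == 0:
        return paths
    for event, next_state in MODEL[state].items():
        paths.extend(derive_paths(next_state, depth - 1, path + (event,)))
    return paths

if __name__ == "__main__":
    for test_case in derive_paths("logged_out", 2):
        print(test_case)

Even this tiny model makes the path-explosion point visible: raising the depth bound quickly multiplies the number of candidate test cases, which is why the search normally has to be guided.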

Test Case Generation by Theorem Proving

Theorem proving was originally used for the automated proving of logical formulas. For model-based testing approaches, the system is modelled by a set of logical expressions (predicates) specifying the system's behavior. To select test cases, the model is partitioned into equivalence classes over the valid interpretation of the set of logical expressions describing the system under test. Each class represents a certain system behavior and can therefore serve as a test case.

The simplest partitioning is done by the disjunctive normal form approach: the logical expressions describing the system's behavior are transformed into disjunctive normal form.

The classification tree method provides a more sophisticated hierarchical partitioning. Partitioning heuristics, e.g. heuristics based on boundary value analysis, are also used to support the partitioning algorithms.

Test Case Generation by Constraint Logic Programming

Constraint programming can be used to select test cases satisfying specific constraints by solving a set of constraints over a set of variables. The system is described by means of constraints. Solving the set of constraints can be done by Boolean solvers (e.g. SAT solvers based on the Boolean satisfiability problem) or by numerical analysis, such as Gaussian elimination. A solution found by solving the set of constraint formulas can serve as a test case for the corresponding system.
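
A toy sketch of the idea follows (the variables and constraints are invented, and brute-force enumeration stands in for a real constraint, SAT or CLP solver): a test input is any assignment of values that satisfies all constraints describing the system.

# Hypothetical sketch: find a test input satisfying a set of constraints by
# brute force; a production approach would hand the constraints to a solver.
from itertools import product

constraints = [
    lambda age, items: age >= 18,          # only adult customers (invented rule)
    lambda age, items: 1 <= items <= 5,    # order size within allowed range
    lambda age, items: age + items < 30,   # invented cross-variable condition
]

def find_test_case():
    for age, items in product(range(0, 100), range(0, 10)):
        if all(constraint(age, items) for constraint in constraints):
            return {"age": age, "items": items}
    return None

print(find_test_case())   # e.g. {'age': 18, 'items': 1}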

Test Case Generation by Model Checking

Originally, model checking was developed as a technique to check whether a property of a specification is valid in a model. A model of the system under test and a property we want to test are provided to the model checker. Within the procedure of proving whether this property is valid in the model, the model checker detects witnesses and counterexamples. A witness is a path where the property is satisfied; a counterexample is a path in the execution of the model where the property is violated. These paths can again be used as test cases.

Test Case Generation by Symbolic Execution

Symbolic execution is often used in frameworks for model-based testing. It can be a means of searching for execution traces in an abstract model. In principle, the program execution is simulated using symbols for variables rather than actual values, so the program can be executed in a symbolic way. Each execution path represents one possible program execution and can be used as a test case. For that, the symbols have to be instantiated by assigning values to them.

Installation testing

Installation testing (in software engineering) can simply be defined as any testing that occurs
outside of the development environment. Such testing will frequently occur on the computer
system the software product will eventually be installed on.

While the ideal installation might simply appear to be to run a setup program, the generation of
that setup program itself and its efficacy in a variety of machine and operating system
environments can require extensive testing before it can be used with confidence.

In distributed systems, particularly where software is to be released into an already live target environment (such as an operational web site), installation (or deployment, as it is sometimes called) can involve database schema changes as well as the installation of new software.
Deployment plans in such circumstances may include back-out procedures whose use is
intended to roll the target environment back in the event that the deployment is unsuccessful.
Ideally, the deployment plan itself should be tested in an environment that is a replica of the live
environment. A factor that can increase the organisational requirements of such an exercise is
the need to synchronize the data in the test deployment environment with that in the live
environment with minimum disruption to live operation.

Usability testing

Usability testing is a means for measuring how well people can use some human-made object
(such as a web page, a computer interface, a document, or a device) for its intended purpose, i.e.
usability testing measures the usability of the object. Usability testing focuses on a particular
object or a small set of objects, whereas general human-computer interaction studies attempt to
formulate universal principles.

If usability testing uncovers difficulties, such as people having difficulty understanding instructions, manipulating parts, or interpreting feedback, then developers should improve the design and test it again. During usability testing, the aim is to observe people using the product in as realistic a situation as possible, to discover errors and areas of improvement. Designers commonly focus excessively on creating designs that look "cool", compromising usability and functionality. This is often caused by pressure from the people in charge, forcing designers to develop systems based on management expectations instead of people's needs. A designer's primary function should be more than appearance; it should include making things work with people.

Caution: simply gathering opinions is not usability testing -- you must arrange an experiment that
measures a subject's ability to use your document. 1

Rather than showing users a rough draft and asking, "Do you understand this?", usability testing
involves watching people trying to use something for its intended purpose. For example, when
testing instructions for assembling a toy, the test subjects should be given the instructions and a
box of parts. Instruction phrasing, illustration quality, and the toy's design all affect the assembly
process.

Setting up a usability test involves carefully creating a scenario, or realistic situation, wherein the
person performs a list of tasks using the product being tested while observers watch and take
notes. Several other test instruments such as scripted instructions, paper prototypes, and pre-
and post-test questionnaires are also used to gather feedback on the product being tested. For
example, to test the attachment function of an e-mail program, a scenario would describe a
situation where a person needs to send an e-mail attachment, and ask him or her to undertake
this task. The aim is to observe how people function in a realistic manner, so that developers can
see problem areas, and what people like. The technique popularly used to gather data during a
usability test is called a think aloud protocol.

What to measure

Usability testing generally involves measuring how well test subjects respond in four areas: time,
accuracy, recall, and emotional response. The results of the first test are the baseline or control
measurement; all subsequent tests are compared to the baseline.

• Time on Task -- How long does it take people to complete basic tasks? (For example,
find something to buy, create a new account, and order the item.)
• Accuracy -- How many mistakes did people make? (And were they fatal or recoverable
with the right information?)
• Recall -- How much does the person remember afterwards?
• Emotional Response -- How does the person feel about the tasks completed?
(Confident? Stressed? Would the user recommend this system to a friend?)

In the early 1990s, Jakob Nielsen, at that time a researcher at Sun Microsystems, popularized the concept of using numerous small usability tests -- typically with only five test subjects each -- at various stages of the development process. His argument is that, once it is found that two or three people are totally confused by the home page, little is gained by watching more people suffer through the same flawed design. "Elaborate usability tests are a waste of resources. The best results come from testing no more than 5 users and running as many small tests as you can afford." 2. Nielsen subsequently published his research and coined the term heuristic evaluation.

The claim of "Five users is enough" was later described by a mathematical model (Virzi, R.A.,
Refining the Test Phase of Usability Evaluation: How Many Subjects is Enough? Human Factors,
1992. 34(4): p. 457-468.) which states for the proportion of uncovered problems U

n
U = 1 − (1 − p)

where p is the probability of one subject identifying a specific problem and n the number of
subjects (or test sessions). This model shows up as an asymptotic graph towards the number of
real existing problems (see figure below).
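
For a concrete feel of this model, the short sketch below computes U for a few sample sizes; the per-subject detection probability p = 0.31 is the figure commonly quoted from Nielsen's research and is used here only as an assumption.

# Proportion of usability problems expected to be found with n test subjects,
# U = 1 - (1 - p)^n; p = 0.31 per subject is an assumed (Nielsen-style) value.
p = 0.31
for n in (1, 3, 5, 10, 15):
    U = 1 - (1 - p) ** n
    print(n, "subjects ->", round(U * 100), "% of problems found")

With these numbers, five subjects already uncover roughly 84% of the problems, which is the intuition behind running many small tests rather than one large one.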

User acceptance testing

User Acceptance Testing (UAT) is a process to obtain confirmation by an SME (subject matter
expert), preferably the owner of the object under test, through trial or review, that the modification
or addition meets mutually agreed-upon requirements.

In software development, UAT is one of the final stages of a project and will often occur before a
customer accepts a new system.

Users of the system will perform these tests which, ideally, developers have derived from the
User Requirements Specification, to which the system should conform.

Test designers will draw up a formal test plan and devise a range of severity levels. These tests are not focused on catching simple problems (spelling mistakes, cosmetic problems) or show stoppers (major problems like the software crashing or failing to run). Developers should have worked out these issues during unit testing and integration testing. Rather, the focus is on a final verification of the required business function and flow of the system. The test scripts will emulate real-world usage of the system. The idea is that if the software works as intended and without issues during a simulation of normal use, it will work just the same in production.

Black box testing

Black box testing, concrete box or functional testing is used in computer programming,
software engineering and software testing to check that the outputs of a program, given certain
inputs, conform to the functional specification of the program.

The term black box indicates that the internal implementation of the program being executed is
not examined by the tester. For this reason black box testing is not normally carried out by the
programmer. In most real-world engineering firms, one group does design work while a separate
group does the testing.

A complementary technique, white box testing or structural testing, uses information about the
structure of the program to check that it performs correctly.

Equivalence partitioning

A technique in black box testing is equivalence partitioning. Equivalence partitioning is designed to minimize the number of test cases by dividing tests in such a way that the system is expected to act the same way for all tests of each equivalence partition. Test inputs would be selected from each partition.

Equivalence partitions are designed so that every possible input belongs to one and only one
equivalence partition.
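
As a small, hedged example (the 18-to-65 age rule and the function name are invented), the input space of an is_eligible_age() check falls into three partitions, and one representative value from each is tested:

# Hypothetical equivalence partitioning example for an age check that should
# accept ages 18 to 65 inclusive (the requirement is assumed for illustration).
import unittest

def is_eligible_age(age):
    return 18 <= age <= 65

class AgePartitionTests(unittest.TestCase):
    def test_below_valid_partition(self):
        self.assertFalse(is_eligible_age(10))   # representative of ages < 18

    def test_valid_partition(self):
        self.assertTrue(is_eligible_age(40))    # representative of 18-65

    def test_above_valid_partition(self):
        self.assertFalse(is_eligible_age(70))   # representative of ages > 65

if __name__ == "__main__":
    unittest.main()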

Disadvantages

• Doesn't test every input
• No guidelines for choosing inputs
• Heuristic based
• Very limited focus

Boundary value analysis

Boundary value analysis is a technique of black box testing in which input values at the
boundaries of the input domain are tested. It has been widely recognized that input values at the
extreme ends of, and just outside of, input domains tend to cause errors in system functionality.

In boundary value analysis, values at and just beyond the boundaries of the input domain are
used to generate test cases to ensure proper functionality of the system.

As an example, for a system that accepts as input a number between one and ten, boundary
value analysis would indicate that test cases should be created for the lower and upper bounds of
the input domain (1, 10), and values just outside these bounds (0, 11) to ensure proper
functionality.
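
Following the one-to-ten example above, a sketch of the resulting test cases might look as follows (the accepts_value() function is assumed purely for illustration):

# Hypothetical boundary value analysis tests for an input domain of 1..10.
import unittest

def accepts_value(n):
    """Unit under test (assumed): valid inputs are 1 through 10 inclusive."""
    return 1 <= n <= 10

class BoundaryValueTests(unittest.TestCase):
    def test_boundaries_and_their_neighbours(self):
        self.assertFalse(accepts_value(0))    # just below the lower bound
        self.assertTrue(accepts_value(1))     # lower bound
        self.assertTrue(accepts_value(10))    # upper bound
        self.assertFalse(accepts_value(11))   # just above the upper bound

if __name__ == "__main__":
    unittest.main()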

Boundary value analysis is an excellent way to catch common user input errors which can disrupt
proper program functionality.

Boundary value analysis complements the technique of equivalence partitioning .

Some of the advantages of boundary value analysis are:

• Very good at exposing potential user interface/user input problems
• Very clear guidelines on determining test cases
• Very small set of test cases generated

Disadvantages to boundary value analysis:

• Does not test all possible inputs
• Does not test dependencies between combinations of inputs

Smoke testing

A sub-set of the black box test is the smoke test. A smoke test is a cursory examination of all of
the basic components of a software system to ensure that they work. Typically, smoke testing is
conducted immediately after a software build is made. The term comes from electrical
engineering, where in order to test electronic equipment, power is applied and the tester ensures
that the product does not spark or smoke.

White box testing

White box testing, glass box testing or structural testing is used in computer programming,
software engineering and software testing to check that the outputs of a program, given certain
inputs, conform to the structural specification of the program.

The term white box (or glass box) indicates that testing is done with a knowledge of the code
used to execute certain functionality. For this reason, a programmer is usually required to perform
white box tests. Often, multiple programmers will write tests based on certain code, so as to gain
varying perspectives on possible outcomes.

A complementary technique, black box testing or functional testing, performs testing based on
previously understood requirements (or understood functionality), without knowledge of how the
code executes.

Glass Box Testing

Black-box and glass-box are test design methods. Black-box test design treats the system as a "black-box", so it doesn't explicitly use knowledge of the internal structure. Black-box test design is usually described as focusing on testing functional requirements. Glass-box test design allows one to peek inside the "box", and it focuses specifically on using internal knowledge of the software to guide the selection of test data.

Glass box testing requires intimate knowledge of program internals, while black box testing is based solely on knowledge of the system requirements. Because it is primarily concerned with program internals, glass box testing has received the primary effort in the software engineering literature devoted to developing a testing methodology. However, since the importance of black box testing has gained general acknowledgement, a certain number of useful black box testing techniques have also been developed.

It is important to understand that these methods are used during the test design phase, and their
influence is hard to see in the tests once they're implemented.

Glass box testing definition

Software testing approaches that examine the program structure and derive test data from the
program logic. Structural testing is sometimes referred to as clear-box testing since white boxes
are considered opaque and do not really permit visibility into the code.

Synonyms for Glass box testing

o White Box testing

o Structural testing

o Clear Box testing

o Open Box Testing

Types of Glass Box testing

o Static and Dynamic Analysis: static analysis techniques do not necessitate the execution of the software, whereas dynamic analysis is what is generally considered as ``testing``, i.e. it involves running the system.

o Statement Coverage: testing performed so that every statement is executed at least once.

o Branch Coverage: running a series of tests to ensure that all branches are tested at least once.

o Path Coverage: testing all paths.

o All-definition-use-path Coverage: all paths between the definition of a variable and the use of that definition are identified and tested.
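
As a small illustration of the difference between statement and branch coverage (the classify() function is invented for this purpose), a single test that reaches the body of the if executes every statement, but branch coverage additionally requires a test in which the condition is false:

# Hypothetical illustration of statement coverage versus branch coverage.
def classify(n):
    label = "small"
    if n > 100:          # branch coverage needs both a True and a False outcome here
        label = "large"
    return label

# classify(500) alone executes every statement (statement coverage satisfied),
# but branch coverage also requires a case such as classify(5), in which the
# condition is false and the assignment inside the if is skipped.
assert classify(500) == "large"
assert classify(5) == "small"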

Tradeoffs of Glass box testing

Advantages

• forces test developer to reason carefully about implementation

o approximates the partitioning done by execution equivalence

o reveals errors in "hidden" code:

o beneficent side-effects

o optimizations (e.g. charTable that changes reps when size > 100)

Disadvantages

o expensive

o misses cases omitted in the code

Fuzz testing

Fuzz testing is a software testing technique. The basic idea is to attach the inputs of a program
to a source of random data. If the program fails (for example, by crashing, or by failing built-in
code assertions), then there are defects to correct.

The great advantage of fuzz testing is that the test design is extremely simple, and free of
preconceptions about system behavior.
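
The following sketch illustrates the basic loop (the parse_record() function and its expected "name:age" format are invented for this example): random byte strings are fed to the parser, expected validation errors are ignored, and any other failure is reported together with the input that triggered it.

# Hypothetical fuzz-testing sketch: feed random bytes to a parser and report
# any input that makes it fail unexpectedly, so the failure can be reproduced.
import random

def parse_record(data):
    """Unit under test (assumed): expects b"name:age" encoded as ASCII."""
    name, age = data.decode("ascii").split(":")
    return name, int(age)

random.seed(1234)                          # fixed seed keeps failures reproducible
for _ in range(1000):
    blob = bytes(random.randrange(256) for _ in range(random.randrange(1, 20)))
    try:
        parse_record(blob)
    except (UnicodeDecodeError, ValueError):
        pass                               # expected, well-handled bad input
    except Exception as error:
        print("possible defect with input", blob, "->", error)

Recording the failing input (and fixing the random seed) is what makes a failure reproducible, which matters for the methods described below.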

Uses

Fuzz testing is often used in large software development projects that perform black box testing.
These usually have a budget to develop test tools, and fuzz testing is one of the techniques which
offers a high benefit to cost ratio.

Fuzz testing is also used as a gross measurement of a large software system's quality. The
advantage here is that the cost of generating the tests is relatively low. For example, third party
testers have used fuzz testing to evaluate the relative merits of different operating systems and
application programs.

Fuzz testing is thought to enhance software security and software safety because it often finds
odd oversights and defects which human testers would fail to find, and even careful human test
designers would fail to create tests for.

However, fuzz testing is not a substitute for exhaustive testing or formal methods: it can only
provide a random sample of the system's behavior, and in many cases passing a fuzz test may
only demonstrate that a piece of software handles exceptions without crashing, rather than
behaving correctly. Thus, fuzz testing can only be regarded as a proxy for program correctness,
rather than a direct measure, with fuzz test failures actually being more useful as a bug-finding
tool than fuzz test passes as an assurance of quality.

Fuzz testing originated at the University of Wisconsin in 1989 with Professor Barton Miller and the students in his graduate Advanced Operating Systems class. Their work can be found at http://www.cs.wisc.edu/~bart/fuzz/.

Fuzz testing methods

As a practical matter, developers need to reproduce errors in order to fix them. For this reason,
almost all fuzz testing makes a record of the data it manufactures, usually before applying it to the
software, so that if the computer fails dramatically, the test data is preserved.

Modern software has several different types of inputs:

• Event driven inputs are usually from a graphical user interface, or possibly from a
mechanism in an embedded system.
• Character driven inputs are from files, or data streams.
• Database inputs are from tabular data, such as relational databases.

There are at least two different forms of fuzz testing:

• Valid fuzz attempts to assure that the random input is reasonable, or conforms to actual
production data.
• Simple fuzz usually uses a pseudo random number generator to provide input.
• A combined approach uses valid test data with some proportion of totally random input
injected.

By using all of these techniques in combination, fuzz-generated randomness can test the un-
designed behavior surrounding a wider range of designed system states.

Fuzz testing may use tools to simulate all of these domains.

Event-driven fuzz

Normally this is provided as a queue of data structures. The queue is filled with data structures that have random values.

The most common problem with an event-driven program is that it will often simply use the data
in the queue, without even crude validation. To succeed in a fuzz-tested environment, software
must validate all fields of every queue entry, decode every possible binary value, and then ignore
impossible requests.

One of the more interesting issues with real-time event handling is that if error reporting is too
verbose, simply providing error status can cause resource problems or a crash. Robust error
detection systems will report only the most significant, or most recent error over a period of time.

Character-driven fuzz

Normally this is provided as a stream of random data. The classic source in UNIX is the random
data generator.

One common problem with a character driven program is a buffer overrun, when the character
data exceeds the available buffer space. This problem tends to recur in every instance in which a
string or number is parsed from the data stream and placed in a limited-size area.

Another is that decode tables or logic may be incomplete, not handling every possible binary
value.

Database fuzz

The standard database schema is usually filled with fuzz that is random data of random sizes.
Some IT shops use software tools to migrate and manipulate such databases. Often the same
schema descriptions can be used to automatically generate fuzz databases.

Database fuzz is controversial, because input and comparison constraints reduce the invalid data
in a database. However, often the database is more tolerant of odd data than its client software,
and a general-purpose interface is available to users. Since major customer and enterprise
management software is starting to be open-source, database-based security attacks are
becoming more credible.

A common problem with fuzz databases is buffer overrun. A common data dictionary, with some form of automated enforcement, is quite helpful and entirely possible. To enforce this, normally all
the database clients need to be recompiled and retested at the same time. Another common
problem is that database clients may not understand the binary possibilities of the database field
type, or, legacy software might have been ported to a new database system with different
possible binary values. A normal, inexpensive solution is to have each program validate database
inputs in the same fashion as user inputs. The normal way to achieve this is to periodically
"clean" production databases with automated verifiers.
