+++++++++++++++++++++++++++++++
Writing Python Regression Tests
+++++++++++++++++++++++++++++++

:Author: Skip Montanaro
:Contact: skip@mojam.com

Introduction
============

If you add a new module to Python or modify the functionality of an existing
module, you should write one or more test cases to exercise that new
functionality.  There are different ways to do this within the regression
testing facility provided with Python; any particular test should use only
one of these options.  Each option requires writing a test module using the
conventions of the selected option:

- unittest_ based tests
- doctest_ based tests
- "traditional" Python test modules

Regardless of the mechanics of the testing approach you choose,
you will be writing unit tests (isolated tests of functions and objects
defined by the module) using white box techniques.  Unlike black box
testing, where you only have the external interfaces to guide your test case
writing, in white box testing you can see the code being tested and tailor
your test cases to exercise it more completely.  In particular, you will be
able to refer to the C and Python code in the CVS repository when writing
your regression test cases.

.. _unittest: http://www.python.org/doc/current/lib/module-unittest.html
.. _doctest: http://www.python.org/doc/current/lib/module-doctest.html

unittest-based tests
--------------------
The unittest_ framework is based on the ideas of unit testing as espoused
by Kent Beck and the `Extreme Programming`_ (XP) movement.  The specific
interface provided by the framework is tightly based on the JUnit_
Java implementation of Beck's original Smalltalk test framework.  Please
see the documentation of the unittest_ module for detailed information on
the interface and general guidelines on writing unittest-based tests.

The test_support helper module provides two functions for use by
unittest-based tests in the Python regression testing framework:

- ``run_unittest()`` takes a number of ``unittest.TestCase``-derived
  classes as parameters and runs the tests defined in those classes.

- ``run_suite()`` takes a populated ``TestSuite`` instance and runs the
  tests.

``run_suite()`` is preferred because unittest files typically grow multiple
test classes, and you might as well be prepared.

All test methods in the Python regression framework have names that
start with "``test_``" and use lower-case names with words separated with
underscores.

Test methods should *not* have docstrings!  The unittest module prints
the docstring if there is one, but otherwise prints the function name
and the full class name.  When there's a problem with a test, the
latter information makes it easier to find the source for the test
than the docstring.

All unittest-based tests in the Python test suite use boilerplate that
looks like this (with minor variations)::

    import unittest
    from test import test_support

    class MyTestCase1(unittest.TestCase):

        # Define setUp and tearDown only if needed

        def setUp(self):
            unittest.TestCase.setUp(self)
            ... additional initialization...

        def tearDown(self):
            ... additional finalization...
            unittest.TestCase.tearDown(self)

        def test_feature_one(self):
            # Testing feature one
            ...unit test for feature one...

        def test_feature_two(self):
            # Testing feature two
            ...unit test for feature two...

        ...etc...

    class MyTestCase2(unittest.TestCase):
        ...same structure as MyTestCase1...

    ...etc...

    def test_main():
        suite = unittest.TestSuite()
        suite.addTest(unittest.makeSuite(MyTestCase1))
        suite.addTest(unittest.makeSuite(MyTestCase2))
        ...add more suites...
        test_support.run_suite(suite)

    if __name__ == "__main__":
        test_main()

This has the advantage that it allows the unittest module to be used
as a script to run individual tests as well as working well with the
regrtest framework.

.. _Extreme Programming: http://www.extremeprogramming.org/
.. _JUnit: http://www.junit.org/
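
The boilerplate for unittest-based tests can be filled in along the following
lines.  This is only a sketch under some assumptions: ``clamp()`` and the two
test classes are hypothetical, ``TestLoader.loadTestsFromTestCase()`` is used
in place of ``unittest.makeSuite()`` (which newer Python releases no longer
provide), and ``unittest.TextTestRunner`` stands in for
``test_support.run_suite()`` so that the example runs outside the Python
source tree.

```python
import unittest


def clamp(value, low, high):
    """Hypothetical function under test: limit value to [low, high]."""
    return max(low, min(value, high))


class ClampInRangeTests(unittest.TestCase):

    def test_value_inside_range(self):
        # No docstring, so a failure report names the method and class
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_value_on_boundaries(self):
        self.assertEqual(clamp(0, 0, 10), 0)
        self.assertEqual(clamp(10, 0, 10), 10)


class ClampOutOfRangeTests(unittest.TestCase):

    def test_value_below_range(self):
        self.assertEqual(clamp(-3, 0, 10), 0)

    def test_value_above_range(self):
        self.assertEqual(clamp(42, 0, 10), 10)


def build_suite():
    # Collect both test classes into one populated TestSuite
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTest(loader.loadTestsFromTestCase(ClampInRangeTests))
    suite.addTest(loader.loadTestsFromTestCase(ClampOutOfRangeTests))
    return suite


def test_main():
    # Assumption: TextTestRunner stands in for test_support.run_suite(suite)
    runner = unittest.TextTestRunner(verbosity=0)
    return runner.run(build_suite())


if __name__ == "__main__":
    test_main()
```

Keeping suite construction in its own function makes it easy to add a third
test class later without touching the code that runs the suite.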

doctest-based tests
-------------------
Tests written to use doctest_ are actually part of the docstrings for
the module being tested.  Each test is written as a display of an
interactive session, including the Python prompts, statements that would
be typed by the user, and the output of those statements (including
tracebacks, although only the exception message needs to be retained then).
The module in the test package is simply a wrapper that causes doctest
to run over the tests in the module.  The test for the difflib module
provides a convenient example::

    import difflib
    from test import test_support
    test_support.run_doctest(difflib)

If the test is successful, nothing is written to stdout (so you should not
create a corresponding output/test_difflib file), but running regrtest
with -v will give a detailed report, the same as if passing -v to doctest.

A second argument can be passed to run_doctest to tell doctest to search
``sys.argv`` for -v instead of using test_support's idea of verbosity.  This
is useful for writing doctest-based tests that aren't simply running a
doctest'ed Lib module, but contain the doctests themselves.  Then at
times you may want to run such a test directly as a doctest, independent
of the regrtest framework.  The tail end of test_descrtut.py is a good
example::

    def test_main(verbose=None):
        from test import test_support, test_descrtut
        test_support.run_doctest(test_descrtut, verbose)

    if __name__ == "__main__":
        test_main(1)

If run via regrtest, ``test_main()`` is called (by regrtest) without
specifying verbose, and then test_support's idea of verbosity is used.  But
when run directly, ``test_main(1)`` is called, and then doctest's idea of
verbosity is used.

See the documentation for the doctest module for information on
writing tests using the doctest framework.
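
To make the "interactive session in a docstring" idea concrete, here is a
minimal sketch.  The function ``average()`` and the helper
``run_docstring_tests()`` are hypothetical names chosen for illustration; the
helper drives the standard ``doctest`` classes directly rather than going
through ``test_support.run_doctest()``.

```python
import doctest


def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    The examples below are written as an interactive session and double
    as tests.  For the traceback, only the header line and the final
    exception line need to be kept.

    >>> average([1, 2, 3, 4])
    2.5
    >>> average([10])
    10.0
    >>> average([])
    Traceback (most recent call last):
        ...
    ValueError: average() requires at least one value
    """
    if not values:
        raise ValueError("average() requires at least one value")
    return sum(values) / float(len(values))


def run_docstring_tests(obj, verbose=False):
    # Find and run every example in obj's docstring; returns the
    # (failed, attempted) counts reported by doctest.
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=verbose)
    for test in finder.find(obj, obj.__name__, globs={obj.__name__: obj}):
        runner.run(test)
    return runner.summarize(verbose=verbose)


if __name__ == "__main__":
    print(run_docstring_tests(average))
```

As with regrtest, a successful run prints nothing from the examples
themselves; passing ``verbose=True`` shows each example as it is tried.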

"traditional" Python test modules
---------------------------------
The mechanics of how the "traditional" test system operates are fairly
straightforward.  When a test case is run, the output is compared with the
expected output that is stored in .../Lib/test/output.  If the test runs to
completion and the actual and expected outputs match, the test succeeds; if
not, it fails.  If an ``ImportError`` or ``test_support.TestSkipped`` error
is raised, the test is not run.
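
The got-vs-expected comparison described above can be sketched as follows.
This is not regrtest's actual implementation: ``traditional_test()``,
``EXPECTED_OUTPUT`` and ``output_matches()`` are hypothetical, and the
expected text is held in a string instead of being read from a file under
.../Lib/test/output.

```python
import contextlib
import io


def traditional_test():
    # A "traditional" test simply prints what it computes
    print("2 + 2 =", 2 + 2)
    print("upper:", "spam".upper())


# Hypothetical contents of an expected-output file for the test above
EXPECTED_OUTPUT = "2 + 2 = 4\nupper: SPAM\n"


def output_matches(test_func, expected):
    # Capture everything the test writes to stdout, then compare it with
    # the stored text; any difference counts as a failure.
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        test_func()
    return captured.getvalue() == expected


if __name__ == "__main__":
    ok = output_matches(traditional_test, EXPECTED_OUTPUT)
    print("passed" if ok else "failed")
```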

Executing Test Cases
====================
If you are writing test cases for module spam, you need to create a file
in .../Lib/test named test_spam.py.  In addition, if the tests are expected
to write to stdout during a successful run, you also need to create an
expected output file in .../Lib/test/output named test_spam ("..."
represents the top-level directory in the Python source tree, the directory
containing the configure script).  If needed, generate the initial version
of the test output file by executing::

    ./python Lib/test/regrtest.py -g test_spam.py

from the top-level directory.

Any time you modify test_spam.py you need to generate a new expected
output file.  Don't forget to desk check the generated output to make sure
it's really what you expected to find!  All in all it's usually better
not to have an expected-output file (note that doctest- and unittest-based
tests do not).

To run a single test after modifying a module, simply run regrtest.py
without the -g flag::

    ./python Lib/test/regrtest.py test_spam.py

While debugging a regression test, you can of course execute it
independently of the regression testing framework and see what it prints::

    ./python Lib/test/test_spam.py

To run the entire test suite:

- [UNIX, + other platforms where "make" works] Make the "test" target at the
  top level::

      make test

- [WINDOWS] Run rt.bat from your PCBuild directory.  Read the comments at
  the top of rt.bat for the use of special -d, -O and -q options processed
  by rt.bat.

- [OTHER] You can simply execute the two runs of regrtest (optimized and
  non-optimized) directly::

      ./python Lib/test/regrtest.py
      ./python -O Lib/test/regrtest.py

  But note that running regrtest directly picks up whatever .pyc and .pyo
  files happen to be around.  The makefile and rt.bat ways run the tests
  twice, the first time removing all .pyc and .pyo files from the subtree
  rooted at Lib/.

Test cases generate output based upon values computed by the test code.
When executed, regrtest.py compares the actual output generated by executing
the test case with the expected output and reports success or failure.  It
stands to reason that if the actual and expected outputs are to match, they
must not contain any machine dependencies.  This means your test cases
should not print out absolute machine addresses (e.g. the return value of
the id() builtin function) or floating point numbers with large numbers of
significant digits (unless you understand what you are doing!).
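
One way to keep printed output free of machine dependencies is sketched
below.  The helper ``describe()`` is a hypothetical name; it reports the type
name instead of ``id()`` and rounds floats to six significant digits.

```python
def describe(obj, value):
    # Report facts that are the same on every machine: the type name
    # rather than id(obj), and a float rounded to six significant digits.
    return "%s: %.6g" % (type(obj).__name__, value)


if __name__ == "__main__":
    print(describe([], 0.1 + 0.2))
    print(describe({}, 1.0 / 3))
```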

Test Case Writing Tips
======================
Writing good test cases is a skilled task and is too complex to discuss in
detail in this short document.  Many books have been written on the subject.
I'll show my age by suggesting that Glenford Myers' `"The Art of Software
Testing"`_, published in 1979, is still the best introduction to the subject
available.  It is short (177 pages), easy to read, and discusses the major
elements of software testing, though its publication predates the
object-oriented software revolution, so it doesn't cover that subject at all.
Unfortunately, it is very expensive (about $100 new).  If you can borrow it
or find it used (around $20), I strongly urge you to pick up a copy.

The most important goal when writing test cases is to break things.  A test
case that doesn't uncover a bug is much less valuable than one that does.
In designing test cases you should pay attention to the following:

* Your test cases should exercise all the functions and objects defined
  in the module, not just the ones meant to be called by users of your
  module.  This may require you to write test code that uses the module
  in ways you don't expect (explicitly calling internal functions, for
  example - see test_atexit.py).

* You should consider any boundary values that may tickle exceptional
  conditions (e.g. if you were writing regression tests for division,
  you might well want to generate tests with numerators and denominators
  at the limits of floating point and integer numbers on the machine
  performing the tests as well as a denominator of zero).

* You should exercise as many paths through the code as possible.  This
  may not always be possible, but is a goal to strive for.  In
  particular, when considering if statements (or their equivalent), you
  want to create test cases that exercise both the true and false
  branches.  For loops, you should create test cases that exercise the
  loop zero, one and multiple times.

* You should test with obviously invalid input.  If you know that a
  function requires an integer input, try calling it with other types of
  objects to see how it responds.

* You should test with obviously out-of-range input.  If the domain of a
  function is only defined for positive integers, try calling it with a
  negative integer.

* If you are going to fix a bug that wasn't uncovered by an existing
  test, try to write a test case that exposes the bug (preferably before
  fixing it).

* If you need to create a temporary file, you can use the filename in
  ``test_support.TESTFN`` to do so.  It is important to remove the file
  when done; other tests should be able to use the name without cleaning
  up after your test.

.. _"The Art of Software Testing":
        http://www.amazon.com/exec/obidos/ISBN=0471043281
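
Several of the tips above (loop counts, boundary values, invalid and
out-of-range input) can be illustrated with one small example.  Everything
here is hypothetical: ``int_sqrt()`` is an invented function under test, and
``run_tests()`` uses the stock ``unittest`` runner rather than the regrtest
helpers.

```python
import unittest


def int_sqrt(n):
    """Hypothetical function under test: floor of the square root of n."""
    if not isinstance(n, int) or isinstance(n, bool):
        raise TypeError("n must be an integer")
    if n < 0:
        raise ValueError("n must be non-negative")
    root = 0
    # The loop body runs zero times for n == 0, once for n in 1..3,
    # and several times for larger n.
    while (root + 1) * (root + 1) <= n:
        root += 1
    return root


class IntSqrtTests(unittest.TestCase):

    def test_loop_zero_one_many(self):
        self.assertEqual(int_sqrt(0), 0)
        self.assertEqual(int_sqrt(1), 1)
        self.assertEqual(int_sqrt(17), 4)

    def test_boundary_around_perfect_square(self):
        self.assertEqual(int_sqrt(15), 3)
        self.assertEqual(int_sqrt(16), 4)

    def test_invalid_type(self):
        self.assertRaises(TypeError, int_sqrt, "16")
        self.assertRaises(TypeError, int_sqrt, 16.0)

    def test_out_of_range(self):
        self.assertRaises(ValueError, int_sqrt, -1)


def run_tests():
    suite = unittest.TestLoader().loadTestsFromTestCase(IntSqrtTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```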

Regression Test Writing Rules
=============================
Each test case is different.  There is no "standard" form for a Python
regression test case, though there are some general rules (note that
these mostly apply only to the "traditional" tests; unittest_- and
doctest_-based tests should follow the conventions natural to those
frameworks):

* If your test case detects a failure, raise ``TestFailed`` (found in
  ``test.test_support``).

* Import everything you'll need as early as possible.

* If you'll be importing objects from a module that is at least
  partially platform-dependent, only import those objects you need for
  the current test case to avoid spurious ``ImportError`` exceptions
  that prevent the test from running to completion.

* Print all your test case results using the ``print`` statement.  For
  non-fatal errors, print an error message (or omit a successful
  completion print) to indicate the failure, but proceed instead of
  raising ``TestFailed``.

* Use ``assert`` sparingly, if at all.  It's usually better to just print
  what you got, and rely on regrtest's got-vs-expected comparison to
  catch deviations from what you expect.  ``assert`` statements aren't
  executed at all when regrtest is run in -O mode; and, because they
  cause the test to stop immediately, can lead to a long & tedious
  test-fix, test-fix, test-fix, ... cycle when things are badly broken
  (and note that "badly broken" often includes running the test suite
  for the first time on new platforms or under new implementations of
  the language).
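
The rules about printing results and reserving ``TestFailed`` for fatal
problems might look like this in practice.  This is a sketch only:
``TestFailed`` is redefined locally as a stand-in for the class in
``test.test_support``, and ``check_split()`` and ``run_checks()`` are
hypothetical names.

```python
class TestFailed(Exception):
    """Local stand-in for test_support.TestFailed (assumption)."""


def check_split(text, expected):
    # Print what we got; comparing the printed output with the expected
    # output is what flags deviations.
    got = text.split()
    print("split(%r) -> %r" % (text, got))
    if got != expected:
        # Non-fatal: report the problem and carry on with later checks
        print("ERROR: expected %r" % (expected,))
        return False
    return True


def run_checks():
    ok = check_split("spam and eggs", ["spam", "and", "eggs"])
    ok = check_split("   ", []) and ok
    if not isinstance("".join(["a", "b"]), str):
        # Fatal: nothing after this point could be meaningful
        raise TestFailed("str.join() did not return a string")
    return ok


if __name__ == "__main__":
    run_checks()
```

Because a failed check only prints a message, the remaining checks still run,
which avoids the test-fix, test-fix cycle that an early ``assert`` causes.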

Miscellaneous
=============
There is a test_support module in the test package you can import for
your test case.  Import this module using either::

    import test.test_support

or::

    from test import test_support

test_support provides the following useful objects:

* ``TestFailed`` - raise this exception when your regression test detects
  a failure.

* ``TestSkipped`` - raise this if the test could not be run because the
  platform doesn't offer all the required facilities (like large
  file support), even if all the required modules are available.

* ``ResourceDenied`` - this is raised when a test requires a resource that
  is not available.  Primarily used by ``requires()``.

* ``verbose`` - you can use this variable to control print output.  Many
  modules use it.  Search for "verbose" in the test_*.py files to see
  lots of examples.

* ``forget(module_name)`` - attempts to cause Python to "forget" that it
  loaded a module and erase any PYC files.

* ``is_resource_enabled(resource)`` - returns a boolean based on whether
  the resource is enabled or not.

* ``requires(resource [, msg])`` - if the required resource is not
  available, the ``ResourceDenied`` exception is raised.

* ``verify(condition, reason='test failed')``.  Use this instead of::

      assert condition[, reason]

  ``verify()`` has two advantages over ``assert``: it works even in -O
  mode, and it raises ``TestFailed`` on failure instead of
  ``AssertionError``.

* ``have_unicode`` - true if Unicode is available, false otherwise.

* ``is_jython`` - true if the interpreter is Jython, false otherwise.

* ``TESTFN`` - a string that should always be used as the filename when
  you need to create a temp file.  Also use ``try``/``finally`` to
  ensure that your temp files are deleted before your test completes.
  Note that you cannot unlink an open file on all operating systems, so
  also be sure to close temp files before trying to unlink them.

* ``sortdict(dict)`` - acts like ``repr(dict.items())``, but sorts the
  items first.  This is important when printing a dict value, because
  the order of items produced by ``dict.items()`` is not defined by the
  language.

* ``findfile(file)`` - you can call this function to locate a file
  somewhere along sys.path or in the Lib/test tree - see
  test_linuxaudiodev.py for an example of its use.

* ``fcmp(x,y)`` - you can call this function to compare two floating
  point numbers when you expect them to only be approximately equal
  within a fuzz factor (``test_support.FUZZ``, which defaults to 1e-6).

* ``check_syntax(statement)`` - make sure that the statement is *not*
  correct Python syntax.
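
The behaviour described for ``verify()``, ``sortdict()`` and ``TESTFN`` can be
approximated with the stand-ins below.  None of this is the test_support
source: ``TestFailed``, ``verify()`` and ``TESTFN`` are local
re-creations written from the descriptions above, and ``sorted_items_repr()``
and ``roundtrip()`` are hypothetical helpers whose exact output format may
differ from the real ones.

```python
import os
import tempfile


class TestFailed(Exception):
    """Local stand-in for test_support.TestFailed (assumption)."""


def verify(condition, reason="test failed"):
    # Unlike an assert statement, this is an ordinary function call, so
    # it still runs under -O, and it raises TestFailed.
    if not condition:
        raise TestFailed(reason)


def sorted_items_repr(mapping):
    # Same idea as sortdict(): a repr whose order does not depend on the
    # dict's internal ordering.
    return repr(sorted(mapping.items()))


def roundtrip(filename, text):
    # Write and re-read a temp file; close it before unlinking, and unlink
    # in a finally clause so that later tests can reuse the name.
    try:
        f = open(filename, "w")
        try:
            f.write(text)
        finally:
            f.close()
        f = open(filename)
        try:
            return f.read()
        finally:
            f.close()
    finally:
        if os.path.exists(filename):
            os.unlink(filename)


# Hypothetical stand-in for test_support.TESTFN
TESTFN = os.path.join(tempfile.gettempdir(), "@test_%d_tmp" % os.getpid())


if __name__ == "__main__":
    verify(roundtrip(TESTFN, "spam\n") == "spam\n", "file round trip failed")
    print(sorted_items_repr({"b": 2, "a": 1}))
```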

Python and C statement coverage results are currently available at

    http://www.musi-cal.com/~skip/python/Python/dist/src/

As of this writing (July, 2000) these results are being generated nightly.
You can refer to the summaries and the test coverage output files to see
where coverage is adequate or lacking and write test cases to beef up the
coverage.

Some Non-Obvious regrtest Features
==================================
* Automagic test detection:  When you create a new test file
  test_spam.py, you do not need to modify regrtest (or anything else)
  to advertise its existence.  regrtest searches for and runs all
  modules in the test directory with names of the form test_xxx.py.

* Miranda output:  If, when running test_spam.py, regrtest does not
  find an expected-output file test/output/test_spam, regrtest
  pretends that it did find one, containing the single line

      test_spam

  This allows new tests that don't expect to print anything to stdout
  to not bother creating expected-output files.

* Two-stage testing:  To run test_spam.py, regrtest imports test_spam
  as a module.  Most tests run to completion as a side-effect of
  getting imported.  After importing test_spam, regrtest also executes
  ``test_spam.test_main()``, if test_spam has a ``test_main`` attribute.
  This is rarely required with the "traditional" Python tests, and
  you shouldn't create a module global with name test_main unless
  you're specifically exploiting this gimmick.  This usage does
  prove useful with unittest-based tests as well, however; defining
  a ``test_main()`` which is run by regrtest and a script-stub in the
  test module ("``if __name__ == '__main__': test_main()``") allows
  the test to be used like any other Python test and also work
  with the unittest.py-as-a-script approach, allowing a developer
  to run specific tests from the command line.
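
The test detection and two-stage behaviour described in this section can be
sketched roughly as follows.  This is an illustration, not regrtest's code:
``find_test_names()`` and ``run_second_stage()`` are hypothetical helpers, and
a synthetic module object is used in place of a real imported test module.

```python
import fnmatch
import types


def find_test_names(filenames):
    # Keep files named test_*.py and strip the extension, mirroring the
    # test_xxx.py naming rule.
    return sorted(name[:-3] for name in filenames
                  if fnmatch.fnmatch(name, "test_*.py"))


def run_second_stage(module):
    # Stage one (importing the module) has already happened by the time
    # we hold the module object; stage two calls test_main() if present.
    test_main = getattr(module, "test_main", None)
    if test_main is None:
        return False
    test_main()
    return True


if __name__ == "__main__":
    calls = []
    fake = types.ModuleType("test_spam")  # hypothetical test module
    fake.test_main = lambda: calls.append("ran")
    print(find_test_names(["test_spam.py", "README", "regrtest.py"]))
    print(run_second_stage(fake), calls)
```

Looking the attribute up with ``getattr()`` is what lets "traditional" tests,
which do all their work at import time, coexist with tests that define
``test_main()``.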