Unit Testing

Theano relies heavily on unit testing. Its importance cannot be stressed enough!

Unit Testing revolves around the following principles:

  • ensuring correctness: making sure that your Op, Type or Optimization works in the way you intended it to work. It is important for this testing to be as thorough as possible: test not only the obvious cases, but more importantly the corner cases, which are more likely to trigger bugs down the line.
  • testing all possible failure paths: making sure that your code fails in the appropriate manner, by raising the correct errors in the relevant situations.
  • sanity check: making sure that everything still runs after you’ve made your modification. If your changes cause unit tests to start failing, it could be that you’ve changed an API on which other users rely. It is therefore your responsibility to either a) provide the fix or b) inform the author of the affected code about your changes and coordinate with that person to produce a fix. If this sounds like too much of a burden… then good! APIs aren’t meant to be changed on a whim!

This page is in no way meant to replace tutorials on Python’s unittest module; for this we refer the reader to the official documentation. We will however address certain specifics about how unittests relate to Theano.

Unittest Primer

A unittest is a subclass of unittest.TestCase, with member functions whose names start with the string test. For example:

    import unittest

    class MyTestCase(unittest.TestCase):
        def test0(self):
            pass
            # test passes cleanly

        def test1(self):
            self.assertTrue(2+2 == 5)
            # raises an AssertionError, causes test to fail

        def test2(self):
            assert 2+2 == 5
            # a bare assert also raises AssertionError,
            # which unittest reports as a failure as well

        def test2(self):
            assert 2+2 == 4
            # this test has the same name as a previous one,
            # so this is the one that runs.
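
To see how the runner classifies outcomes, the following sketch runs a small throwaway test case programmatically and counts the results. The class and helper names are illustrative and not part of Theano, and the sketch assumes Python is run without the -O flag, so that bare assert statements are active. unittest treats an AssertionError as a failure, whether it comes from an assert* method or from a bare assert, and any other exception as an error.

```python
import unittest


def run_outcome_demo():
    """Run a throwaway test case and return (tests run, failures, errors)."""

    class OutcomeDemo(unittest.TestCase):
        def test_passes(self):
            self.assertEqual(2 + 2, 4)

        def test_assert_method_fails(self):
            self.assertTrue(2 + 2 == 5)  # AssertionError: counted as a failure

        def test_bare_assert_fails(self):
            assert 2 + 2 == 5  # AssertionError too: also counted as a failure

        def test_unexpected_exception(self):
            raise ValueError("boom")  # any other exception: counted as an error

    suite = unittest.TestLoader().loadTestsFromTestCase(OutcomeDemo)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, len(result.failures), len(result.errors)


counts = run_outcome_demo()
```

Here counts holds the number of tests run, failures and errors, in that order.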

How to Run Unit Tests

Two options are available:

theano-nose

The easiest by far is to use theano-nose, which is a command line utility that recurses through a given directory, finds all unittests matching specific criteria and executes them. By default, it will find and execute test cases in test*.py files whose method names start with ‘test’.

theano-nose is a wrapper around nosetests. You should be able to execute it if you installed Theano using pip, or if you ran “python setup.py develop” after the installation. If theano-nose is not found by your shell, you will need to add Theano/bin to your PATH environment variable.

Note

In Theano versions <= 0.5, theano-nose was not included. If you are working with such a version, you can call nosetests instead of theano-nose in all the examples below.

Running all unit tests

    cd Theano/
    theano-nose

Running unit tests with standard out

    theano-nose -s

Running unit tests contained in a specific .py file

    theano-nose <filename>.py

Running a specific unit test

    theano-nose <filename>.py:<classname>.<method_name>

Using unittest module

To launch test cases from within Python, you can also use the functionality offered by the unittest module. The simplest thing is to run all the tests in a file using unittest.main(). This call inspects the current module, collects all the unittest.TestCase classes you have created and runs them all, printing ‘.’ for passed tests, and a stack trace for exceptions. The standard footer code in Theano’s test files is:

    if __name__ == '__main__':
        unittest.main()

You can also choose to run a subset of the full test suite.

To run all the tests in one or more TestCase subclasses:

    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(MyTestCase0))
    suite.addTests(loader.loadTestsFromTestCase(MyTestCase1))
    ...
    unittest.TextTestRunner(verbosity=2).run(suite)

To run just a single MyTestCase member test function called test0:

    MyTestCase('test0').debug()
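
As a self-contained illustration of running a chosen subset, the sketch below collects the tests of two small TestCase classes into one suite and runs it with a text runner. The two test classes and the run_selected helper are made up for this example; the report is written to an in-memory stream so that it can be inspected afterwards.

```python
import io
import unittest


class MyTestCase0(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 1, 2)


class MyTestCase1(unittest.TestCase):
    def test_mul(self):
        self.assertEqual(2 * 3, 6)

    def test_sub(self):
        self.assertEqual(3 - 1, 2)


def run_selected(*cases):
    """Run every test method of the given TestCase classes in one suite."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in cases:
        suite.addTests(loader.loadTestsFromTestCase(case))
    stream = io.StringIO()  # capture the textual report instead of printing it
    result = unittest.TextTestRunner(stream=stream, verbosity=2).run(suite)
    return result, stream.getvalue()


result, report = run_selected(MyTestCase0, MyTestCase1)
```

The returned result object exposes testsRun, failures, errors and wasSuccessful(), which is convenient when the outcome needs to be checked from a script.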

Folder Layout

“tests” directories are scattered throughout Theano. Each tests subfolder is meant to contain the unittests which validate the .py files in the parent folder.

Files containing unittests should be prefixed with the word “test”.

Optimally every Python module should have a unittest file associated with it, as shown below. Unittests testing functionality of module <module>.py should therefore be stored in tests/test_<module>.py:

    Theano/theano/tensor/basic.py
    Theano/theano/tensor/elemwise.py
    Theano/theano/tensor/tests/test_basic.py
    Theano/theano/tensor/tests/test_elemwise.py
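
The naming convention can be expressed as a small path transformation. The helper below is purely illustrative (expected_test_path is not a Theano function) and assumes forward-slash paths like the ones in the listing.

```python
import posixpath


def expected_test_path(module_path):
    """Map <dir>/<module>.py to <dir>/tests/test_<module>.py."""
    directory, filename = posixpath.split(module_path)
    return posixpath.join(directory, "tests", "test_" + filename)


basic_tests = expected_test_path("Theano/theano/tensor/basic.py")
elemwise_tests = expected_test_path("Theano/theano/tensor/elemwise.py")
```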

How to Write a Unittest

Test Cases and Methods

Unittests should be grouped “logically” into test cases, which are meant to group all unittests operating on the same element and/or concept. Test cases are implemented as Python classes which inherit from unittest.TestCase.

Test cases contain multiple test methods. These should be prefixed with the word “test”.

Test methods should be as specific as possible and cover a particular aspect of the problem. For example, when testing the TensorDot Op, one test method could check for validity, while another could verify that the proper errors are raised when inputs have invalid dimensions.

Test method names should be as explicit as possible, so that users can see at first glance what functionality is being tested and what tests need to be added.

Example:

    import unittest

    class TestTensorDot(unittest.TestCase):
        def test_validity(self):
            # do stuff
            ...
        def test_invalid_dims(self):
            # do more stuff
            ...

Test cases can define a special setUp method, which will get called before each test method is executed. This is a good place to put functionality which is shared amongst all test methods in the test case (e.g. initializing data, parameters, seeding random number generators; more on this later).

    import unittest

    import numpy

    class TestTensorDot(unittest.TestCase):
        def setUp(self):
            # data which will be used in various test methods
            self.avals = numpy.array([[1,5,3],[2,4,1]])
            self.bvals = numpy.array([[2,3,1,8],[4,2,1,1],[1,4,8,5]])

Similarly, test cases can define a tearDown method, which will be implicitly called at the end of each test method.
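
The call order can be observed directly. In this minimal sketch (the LifecycleDemo class and the calls list are illustrative only), every hook appends its name to a shared list, which shows that setUp and tearDown wrap each test method individually.

```python
import unittest

calls = []  # records the order in which the hooks and tests run


class LifecycleDemo(unittest.TestCase):
    def setUp(self):
        calls.append("setUp")

    def tearDown(self):
        calls.append("tearDown")

    def test_a(self):
        calls.append("test_a")

    def test_b(self):
        calls.append("test_b")


suite = unittest.TestLoader().loadTestsFromTestCase(LifecycleDemo)
suite.run(unittest.TestResult())
recorded = list(calls)  # snapshot taken right after this run
```

The default loader sorts test methods by name, so test_a runs before test_b here.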

Checking for correctness

When checking for correctness of mathematical expressions, the user should preferably compare Theano’s output to the equivalent numpy implementation.

Example:

    class TestTensorDot(unittest.TestCase):
        def setUp(self):
            ...

        def test_validity(self):
            a = T.dmatrix('a')
            b = T.dmatrix('b')
            c = T.dot(a, b)
            f = theano.function([a, b], [c])
            cmp = f(self.avals, self.bvals) == numpy.dot(self.avals, self.bvals)
            self.assertTrue(numpy.all(cmp))

Avoid hard-coding variables, as in the following case:

    self.assertTrue(numpy.all(f(self.avals, self.bvals) == numpy.array([[25, 25, 30, 28], [21, 18, 14, 25]])))

This makes the test case less manageable and forces the user to update the variables each time the input is changed or possibly when the module being tested changes (after a bug fix for example). It also constrains the test case to specific input/output data pairs. The section on random values covers why this might not be such a good idea.
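
The same idea can be tried without Theano. In the sketch below, naive_dot is a hypothetical stand-in for the function under test (in a real test it would be the compiled Theano function), and the expected values are computed by numpy.dot rather than typed in by hand. numpy.allclose is used here as an assumption of this sketch, since it tolerates small floating-point differences.

```python
import unittest

import numpy


def naive_dot(a, b):
    """Hypothetical stand-in for the function under test: a plain-loop matrix product."""
    rows, inner = a.shape
    cols = b.shape[1]
    out = numpy.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = sum(a[i, k] * b[k, j] for k in range(inner))
    return out


class TestNaiveDot(unittest.TestCase):
    def setUp(self):
        self.avals = numpy.array([[1, 5, 3], [2, 4, 1]])
        self.bvals = numpy.array([[2, 3, 1, 8], [4, 2, 1, 1], [1, 4, 8, 5]])

    def test_matches_numpy(self):
        # The reference is computed, so changing the inputs needs no other edits.
        expected = numpy.dot(self.avals, self.bvals)
        self.assertTrue(numpy.allclose(naive_dot(self.avals, self.bvals), expected))


suite = unittest.TestLoader().loadTestsFromTestCase(TestNaiveDot)
result = unittest.TestResult()
suite.run(result)
```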

Here is a list of useful functions, as defined by TestCase:

  • checking the state of boolean variables: assert, assertTrue, assertFalse
  • checking for (in)equality constraints: assertEqual, assertNotEqual
  • checking for (in)equality constraints up to a given precision (very useful in Theano): assertAlmostEqual, assertNotAlmostEqual
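
A short sketch of the precision-aware assertions follows; the test class is illustrative only. By default assertAlmostEqual rounds the difference to 7 decimal places, and the places or delta arguments adjust the tolerance.

```python
import unittest


class FloatComparisonDemo(unittest.TestCase):
    def test_exact_equality_is_fragile(self):
        # 0.1 + 0.2 is not exactly 0.3 in binary floating point
        self.assertNotEqual(0.1 + 0.2, 0.3)

    def test_almost_equal_default_places(self):
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

    def test_almost_equal_with_delta(self):
        # delta gives an absolute tolerance instead of a number of places
        self.assertAlmostEqual(1.0, 1.0005, delta=1e-3)

    def test_not_almost_equal(self):
        self.assertNotAlmostEqual(1.0, 1.01, places=3)


suite = unittest.TestLoader().loadTestsFromTestCase(FloatComparisonDemo)
result = unittest.TestResult()
suite.run(result)
```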

Checking for errors

On top of verifying that your code provides the correct output, it is equally important to test that it fails in the appropriate manner, raising the appropriate exceptions, etc. Silent failures are deadly, as they can go unnoticed for a long time and are hard to detect “after-the-fact”.

Example:

    import unittest

    class TestTensorDot(unittest.TestCase):
        ...
        def test_3D_dot_fail(self):
            def func():
                a = T.TensorType('float64', (False,False,False)) # create 3d tensor
                b = T.dmatrix()
                c = T.dot(a,b) # we expect this to fail
            # above should fail as dot operates on 2D tensors only
            self.assertRaises(TypeError, func)

Useful function, as defined by TestCase:

  • assertRaises
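
assertRaises can be used in two forms, shown in the sketch below. checked_dot_shape is a hypothetical helper written for this example (it is not a Theano function); it validates the shapes of a 2-D dot product so that there is something to fail on.

```python
import unittest


def checked_dot_shape(shape_a, shape_b):
    """Hypothetical helper: result shape of a 2-D dot product, with validation."""
    if len(shape_a) != 2 or len(shape_b) != 2:
        raise TypeError("dot expects 2-D inputs")
    if shape_a[1] != shape_b[0]:
        raise ValueError("inner dimensions do not match")
    return (shape_a[0], shape_b[1])


class TestCheckedDotShape(unittest.TestCase):
    def test_valid_shapes(self):
        self.assertEqual(checked_dot_shape((5, 7), (7, 9)), (5, 9))

    def test_3d_input_raises_type_error(self):
        # Callable form: assertRaises(exception, callable, *args)
        self.assertRaises(TypeError, checked_dot_shape, (2, 3, 4), (4, 5))

    def test_mismatch_raises_value_error(self):
        # Context-manager form keeps the failing call visible in the test body
        with self.assertRaises(ValueError):
            checked_dot_shape((5, 7), (8, 3))


suite = unittest.TestLoader().loadTestsFromTestCase(TestCheckedDotShape)
result = unittest.TestResult()
suite.run(result)
```

Naming the most specific exception type that applies makes the test stricter: a test that expects TypeError fails if the code raises ValueError instead.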

Test Cases and Theano Modes

When compiling Theano functions or modules, a mode parameter can be given to specify which linker and optimizer to use.

Example:

    from theano import function

    f = function([a, b], [c], mode='FAST_RUN')

Whenever possible, unit tests should omit this parameter. Leaving out the mode will ensure that unit tests use the default mode. This default mode is set by the configuration variable config.mode, which defaults to ‘FAST_RUN’, and can be set by various mechanisms (see config).

In particular, the environment variable THEANO_FLAGS allows the user to easily switch the mode in which unittests are run. For example, to run all tests in all modes from a BASH script, type this:

    THEANO_FLAGS='mode=FAST_COMPILE' theano-nose
    THEANO_FLAGS='mode=FAST_RUN' theano-nose
    THEANO_FLAGS='mode=DebugMode' theano-nose

Using Random Values in Test Cases

numpy.random is often used in unit tests to initialize large data structures, for use as inputs to the function or module being tested. When doing this, it is imperative that the random number generator be seeded at the beginning of each unit test. This will ensure that unittest behaviour is consistent from one execution to another (i.e. always pass or always fail).

Instead of using numpy.random.seed to do this, we encourage users to do the following:

    import unittest

    from theano.tests import unittest_tools

    class TestTensorDot(unittest.TestCase):
        def setUp(self):
            unittest_tools.seed_rng()
            # OR ... call with an explicit seed
            unittest_tools.seed_rng(234234) # use only if really necessary!

The behaviour of seed_rng is as follows:

  • If an explicit seed is given, it will be used for seeding numpy’s rng.
  • If not, it will use config.unittests.rseed (its default value is 666).
  • If config.unittests.rseed is set to “random”, it will seed the rng with None, which is equivalent to seeding with a random seed.
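
The three rules above can be sketched in a few lines. This is an illustration of the described behaviour, not Theano's actual implementation: resolve_seed and make_rng are hypothetical names, and the configured value is passed in as a plain argument instead of being read from config.unittests.rseed.

```python
import numpy

DEFAULT_RSEED = 666  # default value of config.unittests.rseed, as stated above


def resolve_seed(explicit_seed=None, configured_rseed=DEFAULT_RSEED):
    """Pick the seed following the three rules described in the text."""
    if explicit_seed is not None:
        return explicit_seed  # an explicit seed always wins
    if configured_rseed == "random":
        return None  # None lets numpy choose a fresh seed
    return int(configured_rseed)


def make_rng(explicit_seed=None, configured_rseed=DEFAULT_RSEED):
    """Build a RandomState seeded according to resolve_seed."""
    return numpy.random.RandomState(resolve_seed(explicit_seed, configured_rseed))


# Two generators built from the same configuration yield identical draws.
first_draw = make_rng().rand(3)
second_draw = make_rng().rand(3)
```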

The main advantage of using unittest_tools.seed_rng is that it allows us to change the seed used in the unittests, without having to manually edit all the files. For example, this allows the nightly build to run theano-nose repeatedly, changing the seed on every run (hence achieving a higher confidence that the variables are correct), while still making sure unittests are deterministic.

Users who prefer their unittests to be random (when run on their local machine) can simply set config.unittests.rseed to ‘random’ (see config).

Similarly, to provide a seed to numpy.random.RandomState, simply use:

    import numpy

    from theano.tests import unittest_tools

    rng = numpy.random.RandomState(unittest_tools.fetch_seed())
    # OR providing an explicit seed
    rng = numpy.random.RandomState(unittest_tools.fetch_seed(1231)) # again not recommended

Note that the ability to change the seed from one nosetest to another is incompatible with the method of hard-coding the baseline variables (against which we compare the Theano outputs). These must then be determined “algorithmically”. Although this represents more work, the test suite will be better because of it.

Creating an Op UnitTest

A few tools have been developed to help automate the development of unittests for Theano Ops.

Validating the Gradient

The verify_grad function can be used to validate that the grad function of your Op is properly implemented. verify_grad is based on the Finite Difference Method where the derivative of function f at point x is approximated as:

\frac{\partial{f}}{\partial{x}} = \lim_{\Delta \rightarrow 0} \frac{f(x+\Delta) - f(x-\Delta)}{2\Delta}
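
For a scalar function, the approximation amounts to the following minimal sketch. central_difference is an illustrative helper and the step size 1e-6 is an arbitrary choice for this example, not a value taken from Theano.

```python
import math


def central_difference(f, x, delta=1e-6):
    """Approximate df/dx at x with a symmetric finite difference."""
    return (f(x + delta) - f(x - delta)) / (2.0 * delta)


approx = central_difference(math.sin, 0.5)
exact = math.cos(0.5)  # analytic derivative of sin
gap = abs(approx - exact)
```

The step cannot be made arbitrarily small in practice: below a certain size, floating-point round-off in the numerator dominates the result, which is why the comparison is made up to a tolerance.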

verify_grad performs the following steps:

  • approximates the gradient numerically using the Finite Difference Method
  • calculates the gradient using the symbolic expression provided in the grad function
  • compares the two values. The test passes if they are equal to within a certain tolerance.

Here is the prototype for the verify_grad function.

    def verify_grad(fun, pt, n_tests=2, rng=None, eps=1.0e-7, abs_tol=0.0001, rel_tol=0.0001):

verify_grad raises an Exception if the difference between the analytic gradient and numerical gradient (computed through the Finite Difference Method) of a random projection of fun’s output to a scalar exceeds both the given absolute and relative tolerances.

The parameters are as follows:

  • fun: a Python function that takes Theano variables as inputs, and returns a Theano variable. For instance, an Op instance with a single output is such a function. It can also be a Python function that calls an op with some of its inputs being fixed to specific values, or that combines multiple ops.
  • pt: the list of numpy.ndarrays to use as input values
  • n_tests: number of times to run the test
  • rng: random number generator used to generate a random vector u; we check the gradient of sum(u * fun) at pt
  • eps: step size used in the Finite Difference Method
  • abs_tol: absolute tolerance used as threshold for gradient comparison
  • rel_tol: relative tolerance used as threshold for gradient comparison
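
The procedure can be imitated in plain numpy to make the role of each parameter concrete. The sketch below is not Theano's implementation: check_grad and numeric_grad are hypothetical names, the analytic gradient is supplied by hand as grad_fun(pt, u) instead of being derived symbolically, a single input array is used instead of a list, and the exact formula for the relative error is an assumption of this sketch.

```python
import numpy


def numeric_grad(cost, pt, eps):
    """Central finite-difference gradient of a scalar-valued cost at pt."""
    grad = numpy.zeros_like(pt)
    for idx in numpy.ndindex(*pt.shape):
        step = numpy.zeros_like(pt)
        step[idx] = eps
        grad[idx] = (cost(pt + step) - cost(pt - step)) / (2.0 * eps)
    return grad


def check_grad(fun, grad_fun, pt, rng, eps=1.0e-7, abs_tol=0.0001, rel_tol=0.0001):
    """Raise AssertionError if numeric and analytic gradients disagree."""
    u = rng.rand(*fun(pt).shape)  # random projection vector
    numeric = numeric_grad(lambda x: numpy.sum(u * fun(x)), pt, eps)
    analytic = grad_fun(pt, u)  # hand-written gradient of sum(u * fun(x))
    abs_err = numpy.abs(numeric - analytic)
    scale = numpy.maximum(numpy.abs(numeric) + numpy.abs(analytic), 1e-8)
    rel_err = abs_err / scale
    # Fail only where both tolerances are exceeded, as described above.
    if numpy.any((abs_err > abs_tol) & (rel_err > rel_tol)):
        raise AssertionError("gradient mismatch, max abs error %g" % abs_err.max())


def square(x):
    return x ** 2


def square_grad(x, u):
    return 2 * x * u  # d/dx of sum(u * x**2)


pt = numpy.asarray([[1.0, 2.0], [3.0, 4.0]])
check_grad(square, square_grad, pt, numpy.random.RandomState(42))  # passes silently
```

Projecting the output onto a random vector u reduces a tensor-valued function to a scalar cost, so that one finite-difference pass per input element is enough to check the whole gradient.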

In the general case, you can define fun as you want, as long as it takes as inputs Theano symbolic variables and returns a single Theano symbolic variable:

    import numpy

    from theano import tensor

    def test_verify_exprgrad():
        def fun(x,y,z):
            return (x + tensor.cos(y)) / (4 * z)**2

        x_val = numpy.asarray([[1], [1.1], [1.2]])
        y_val = numpy.asarray([0.1, 0.2])
        z_val = numpy.asarray(2)
        rng = numpy.random.RandomState(42)

        tensor.verify_grad(fun, [x_val, y_val, z_val], rng=rng)

Here is an example showing how to use verify_grad on an Op instance:

    def test_flatten_outdimNone():
        # Testing gradient w.r.t. all inputs of an op (in this example the op
        # being used is Flatten(), which takes a single input).
        a_val = numpy.asarray([[0,1,2],[3,4,5]], dtype='float64')
        rng = numpy.random.RandomState(42)
        tensor.verify_grad(tensor.Flatten(), [a_val], rng=rng)

Here is another example, showing how to verify the gradient w.r.t. a subset of an Op’s inputs. This is useful in particular when the gradient w.r.t. some of the inputs cannot be computed by finite difference (e.g. for discrete inputs), which would cause verify_grad to crash.

    def test_crossentropy_softmax_grad():
        op = tensor.nnet.crossentropy_softmax_argmax_1hot_with_bias
        def op_with_fixed_y_idx(x, b):
            # Input `y_idx` of this Op takes integer values, so we fix them
            # to some constant array.
            # Although this op has multiple outputs, we can return only one.
            # Here, we return the first output only.
            return op(x, b, y_idx=numpy.asarray([0, 2]))[0]

        x_val = numpy.asarray([[-1, 0, 1], [3, 2, 1]], dtype='float64')
        b_val = numpy.asarray([1, 2, 3], dtype='float64')
        rng = numpy.random.RandomState(42)

        tensor.verify_grad(op_with_fixed_y_idx, [x_val, b_val], rng=rng)

Note

Although verify_grad is defined in theano.tensor.basic, unittests should use the version of verify_grad defined in theano.tests.unittest_tools. This is simply a wrapper function which takes care of seeding the random number generator appropriately before calling theano.tensor.basic.verify_grad.

makeTester and makeBroadcastTester

Most Op unittests perform the same function. All such tests must verify that the op generates the proper output, that the gradient is valid, and that the Op fails in known/expected ways. Because so much of this is common, two helper functions exist to make your life easier: makeTester and makeBroadcastTester (defined in module theano.tensor.tests.test_basic).

Here is an example of makeTester generating test cases for the Dot product op:

    import numpy
    from numpy.random import rand

    # the Theano op under test (numpy.dot is only used as the reference)
    from theano.tensor import dot
    from theano.tensor.tests.test_basic import makeTester

    DotTester = makeTester(name = 'DotTester',
                           op = dot,
                           expected = lambda x, y: numpy.dot(x, y),
                           checks = {},
                           good = dict(correct1 = (rand(5, 7), rand(7, 5)),
                                       correct2 = (rand(5, 7), rand(7, 9)),
                                       correct3 = (rand(5, 7), rand(7))),
                           bad_build = dict(),
                           bad_runtime = dict(bad1 = (rand(5, 7), rand(5, 7)),
                                              bad2 = (rand(5, 7), rand(8,3))),
                           grad = dict())

In the above example, we provide a name and a reference to the op we want to test. We then provide, in the expected field, a function which makeTester can use to compute the correct values. The following five parameters are dictionaries which contain:

  • checks: dictionary of validation functions (dictionary key is a description of what each function does). Each function accepts two parameters and performs some sort of validation check on each op-input/op-output value pair. If the function returns False, an Exception is raised containing the check’s description.
  • good: contains valid input values, for which the output should match the expected output. Unittest will fail if this is not the case.
  • bad_build: invalid parameters which should generate an Exception when attempting to build the graph (call to make_node should fail). Fails unless an Exception is raised.
  • bad_runtime: invalid parameters which should generate an Exception at runtime, when trying to compute the actual output values (call to perform should fail). Fails unless an Exception is raised.
  • grad: dictionary containing input values which will be used in the call to verify_grad
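
The idea of generating a test case from dictionaries of inputs can be illustrated with a much smaller factory. make_simple_tester below is a hypothetical, heavily simplified analogue written for this page: it works on plain Python callables, supports only the good and bad_runtime dictionaries, and does not reflect how makeTester is implemented.

```python
import math
import unittest


def make_simple_tester(name, op, expected, good, bad_runtime):
    """Build a TestCase class that checks op against expected on the given inputs."""

    class Checker(unittest.TestCase):
        def test_good(self):
            # Every valid input must reproduce the reference output.
            for label, inputs in sorted(good.items()):
                self.assertEqual(op(*inputs), expected(*inputs), msg=label)

        def test_bad_runtime(self):
            # Every invalid input must raise when the op is evaluated.
            for label, inputs in sorted(bad_runtime.items()):
                with self.assertRaises(Exception):
                    op(*inputs)

    Checker.__name__ = name
    return Checker


def int_divide(a, b):
    return a // b


DivTester = make_simple_tester(
    name="DivTester",
    op=int_divide,
    expected=lambda a, b: math.floor(a / b),
    good=dict(with_remainder=(7, 2), exact=(8, 4)),
    bad_runtime=dict(zero_divisor=(1, 0)),
)

suite = unittest.TestLoader().loadTestsFromTestCase(DivTester)
result = unittest.TestResult()
suite.run(result)
```

Keeping the inputs in labelled dictionaries means that a new case is added by writing one more entry, without touching the test logic.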

makeBroadcastTester is a wrapper function for makeTester. If an inplace=True parameter is passed to it, it will take care of adding an entry to the checks dictionary. This check will ensure that inputs and outputs are equal, after the Op’s perform function has been applied.