Implementing some specific Ops

This page is a guide on the implementation of some specific types of Ops, and points to some examples of such implementations.

For the random number generating Ops, it explains different possible implementation strategies.

Scalar/Elemwise/Reduction Ops

Implementing a Theano scalar Op allows that scalar operation to be reused by our elemwise operations on tensors. If the scalar operation has C code, the elemwise implementation will automatically have C code too. This will enable the fusion of elemwise operations using your new scalar operation. It can also reuse the GPU elemwise code. The same applies to reduction operations.

For examples of how to add new scalar operations, you can have a look at those two pull requests, which add the GammaLn, Psi and Gamma scalar Ops.

Be careful about some possible problems in the definition of the grad method, and about dependencies that may not be available. In particular, see the following fixes to grad() methods and to impl() methods related to SciPy.
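As a sketch of what such a scalar Op can look like, here is a pure-Python mock-up of a GammaLn-like Op. The base class below is a hypothetical stand-in for Theano's real scalar Op base class (see the pull requests above for the actual API); only the shape of the impl()/c_code() pair is the point.

```python
import math

class UnaryScalarOpStub(object):
    """Hypothetical stand-in for theano.scalar.basic.UnaryScalarOp.

    The real base class handles dtype upgrades, elemwise reuse, etc.;
    this stub only carries a name so the sketch runs standalone.
    """
    def __init__(self, name):
        self.name = name

class GammaLn(UnaryScalarOpStub):
    """Log-gamma as a scalar Op sketch."""

    def impl(self, x):
        # Python implementation of the scalar operation.
        # A real Op might delegate to scipy.special.gammaln when available.
        return math.lgamma(x)

    def c_code(self, node, name, inputs, outputs, sub):
        # C implementation: lgamma() from <math.h>; having this is what
        # lets elemwise fusion generate C code automatically.
        (x,) = inputs
        (z,) = outputs
        return "%s = lgamma(%s);" % (z, x)

gammaln = GammaLn(name="gammaln")
```

For example, `gammaln.impl(5.0)` computes log(4!) and `gammaln.c_code(None, "gammaln", ["x0"], ["z0"], {})` emits the one-line C assignment.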

SciPy Ops

We can wrap SciPy functions in Theano, but SciPy is an optional dependency. Here is some code that allows the Op to be optional:

  try:
      import scipy.linalg
      imported_scipy = True
  except ImportError:
      # some Ops (e.g. Cholesky, Solve, A_Xinv_b) won't work
      imported_scipy = False

  class SomeOp(Op):
      ...
      def make_node(self, x):
          assert imported_scipy, (
              "SciPy not available. SciPy is needed for the SomeOp op.")
          ...

  from nose.plugins.skip import SkipTest

  class test_SomeOp(utt.InferShapeTester):
      ...
      def test_infer_shape(self):
          if not imported_scipy:
              raise SkipTest("SciPy needed for the SomeOp op.")
          ...

Sparse Ops

There are a few differences to keep in mind if you want to make an Op that uses sparse inputs or outputs, rather than the usual dense tensors. In particular, in the make_node() function, you have to call theano.sparse.as_sparse_variable(x) on sparse input variables, instead of as_tensor_variable(x).

Another difference is that you need to use SparseVariable and SparseType instead of TensorVariable and TensorType.

Do not forget that we support only sparse matrices (so only 2 dimensions), and that (as in SciPy) they do not support broadcasting operations by default (although a few Ops do it when called manually). Also, we support only two formats for the sparse type: csr and csc. So in make_node(), you can create output variables like this:

  out_format = inputs[0].format  # or 'csr' or 'csc' if the output format is fixed
  SparseType(dtype=inputs[0].dtype, format=out_format).make_variable()

See the theano.sparse.basic.Cast op code for a good example of a sparse Op with Python code.

Note

From the definition of the CSR and CSC formats, CSR column indices are not necessarily sorted. Likewise for CSC row indices. Use EnsureSortedIndices if your code does not support unsorted indices.

Also, there can be explicit zeros in your inputs. Use Remove0 or remove0 to make sure they aren't present in your input if you don't support that.

To remove explicit zeros and make sure indices are sorted, use clean.
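To see what these helpers guard against, here is a small SciPy-only illustration (assuming SciPy is available; eliminate_zeros and sort_indices are SciPy methods playing the role of the Theano Ops above, not the Ops themselves):

```python
import numpy as np
import scipy.sparse as sp

# A CSR matrix whose .data contains an explicitly stored zero.
data = np.array([1.0, 0.0, 2.0])
indices = np.array([0, 2, 1])
indptr = np.array([0, 2, 3])
m = sp.csr_matrix((data, indices, indptr), shape=(2, 3))

m.eliminate_zeros()  # like Remove0/remove0: drop explicitly stored zeros
m.sort_indices()     # like EnsureSortedIndices: sort column indices per row
```

After eliminate_zeros() the matrix stores two values instead of three, and has_sorted_indices is set.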

Sparse Gradient

There are two types of gradients for sparse operations: the normal gradient and the structured gradient. Please document which one your Op implements in its docstring. It is important that the user knows it, and it is not always easy to infer from the code. Also make clear which inputs/outputs are sparse and which ones are dense.

Sparse C code

Theano does not have a native C code interface for sparse matrices. The reason is simple: we use the SciPy sparse matrix objects and they don't have a C object. So we use a simple trick: a sparse matrix is made of 4 fields that are NumPy vector arrays: data, indices, indptr and shape. So to make an Op with C code that has sparse variables as inputs, we actually make an Op that takes as input the needed fields of those sparse variables.
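To make the four fields concrete, here is a plain-Python sketch (no Theano or SciPy required) that decomposes a dense matrix into the CSR data, indices, indptr and shape fields; the helper name is illustrative only, since in Theano the fields come from csm_properties():

```python
def dense_to_csr(rows):
    """Decompose a dense matrix (a list of rows) into the four CSR fields.

    Illustrative helper only; it mirrors what the CSR storage of a SciPy
    sparse matrix contains.
    """
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)   # non-zero values, stored row by row
                indices.append(col)  # column index of each stored value
        indptr.append(len(data))     # where each row starts/ends in data
    shape = (len(rows), len(rows[0]) if rows else 0)
    return data, indices, indptr, shape
```

For example, `dense_to_csr([[1, 0, 2], [0, 0, 3]])` returns `([1, 2, 3], [0, 2, 2], [0, 2, 3], (2, 3))` — the four NumPy-vector-shaped fields an Op with C code would receive.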

You can extract the 4 fields at once with theano.sparse.basic.csm_properties(), or use theano.sparse.basic.csm_data(), theano.sparse.basic.csm_indices(), theano.sparse.basic.csm_indptr() and theano.sparse.basic.csm_shape() to extract the individual fields.

You can look at the AddSD sparse Op for an example with C code. It implements the addition of a sparse matrix and a dense matrix.

Sparse Tests

You can reuse the test system for tensor variables. To generate the needed sparse variables and data, you can use theano.sparse.tests.test_basic.sparse_random_inputs(). It takes many parameters, including the format (csr or csc), the shape, the dtype, whether to have explicit zeros and whether to have unsorted indices.

Random distribution

We have 3 base random number generators: one that wraps NumPy's random generator, one that implements MRG31k3p and one that wraps CURAND.

The fastest, but least developed, is CURAND. It works only on CUDA-enabled GPUs; it does not work on the CPU and it has fewer random distributions implemented.

The recommended one, and the second fastest, is MRG. It works on both GPU and CPU and has more distributions implemented.

The slowest is our wrapper on NumPy’s random generator.

We explain and provide advice on three possible implementations of new distributions here:

  • Extend our wrapper around NumPy random functions. See this PR as an example.
  • Extend the MRG implementation by reusing existing Theano Ops. Look into the theano/sandbox/rng_mrg.py file and grep for all code about binomial(). This distribution uses the output of the uniform distribution and converts it to a binomial distribution with existing Theano operations. The tests go in theano/sandbox/test_rng_mrg.py.
  • Extend the MRG implementation with a new Op that takes a uniform sample as input. Look in the theano/sandbox/{rng_mrg,multinomial}.py files and their tests in theano/sandbox/test_multinomial.py. This is recommended when current Theano Ops aren't well suited to convert the uniform samples to the target distribution. This can happen in particular if there is a loop or a complicated condition.
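The idea behind the second strategy above, converting uniform samples into the target distribution, can be sketched in plain Python for the simplest binomial case (n=1, i.e. Bernoulli draws); this illustrates the conversion step only and is not the MRG code itself:

```python
import random

def binomial_from_uniform(uniforms, p):
    """Map uniform(0, 1) samples to binomial(n=1, p) samples.

    A draw is 1 exactly when the uniform sample falls below p,
    which happens with probability p.
    """
    return [1 if u < p else 0 for u in uniforms]

random.seed(42)  # deterministic samples for the example
uniforms = [random.random() for _ in range(10000)]
samples = binomial_from_uniform(uniforms, 0.3)
# The empirical mean of the samples is close to p = 0.3.
```

In the real implementation the uniform samples come from the MRG31k3p generator, and the conversion is expressed with Theano operations (strategy 2) or inside a dedicated Op (strategy 3).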

Note

In all cases, you must reuse the same interface as NumPy for compatibility.

OpenMP Ops

To allow a consistent interface for Ops that support OpenMP, we have some helper code. It also allows enabling/disabling OpenMP globally or per Op, for fine-grained control.

Your Op needs to inherit from theano.gof.OpenMPOp. If it overrides the __init__() method, it must have an openmp=None parameter and must call super(MyOpClass, self).__init__(openmp=openmp).
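The constructor convention can be sketched as follows; the base class here is a minimal stub standing in for theano.gof.OpenMPOp (an assumption, so the sketch runs standalone), and only the openmp=None forwarding is the point:

```python
class OpenMPOp(object):
    """Minimal stub standing in for theano.gof.OpenMPOp (assumption).

    The real class resolves openmp=None against the global Theano
    `openmp` flag and handles g++ flags in c_compile_args/make_thunk.
    """
    def __init__(self, openmp=None):
        self.openmp = openmp

class MyOpClass(OpenMPOp):
    def __init__(self, axis, openmp=None):
        # Forward the openmp keyword to the parent, as required.
        super(MyOpClass, self).__init__(openmp=openmp)
        self.axis = axis
```

A caller can then leave OpenMP to the global default (`MyOpClass(0)`) or force it per Op (`MyOpClass(0, openmp=True)`).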

The OpenMPOp class also implements c_compile_args and make_thunk. These add the correct g++ flags to compile with OpenMP. They also disable OpenMP and print a warning if the version of g++ does not support it.

The Theano flag openmp is currently False by default, as we do not have code that gets sped up with it. The only current implementation is ConvOp. It speeds up some cases, but slows down others. That is why we disable it by default. But we have all the code needed to enable it by default when there is more than one core and the environment variable OMP_NUM_THREADS is not 1. This allows Theano to respect the current convention.

Numba Ops

Want C speed without writing C code for your new Op? You can use Numba to generate the C code for you! Here is an example Op doing that.

Alternate Theano Types

Most Ops in Theano are used to manipulate tensors. However, Theano also supports many other variable types. The supported types are listed below, along with pointers to the relevant documentation.

  • TensorType : Theano type that represents a multidimensional array containing elements that all have the same type. Variables of this Theano type are represented in C as objects of class PyArrayObject.
  • TypedList : Theano type that represents a typed list (a list where every element in the list has the same Theano type). Variables of this Theano type are represented in C as objects of class PyListObject.
  • Scalar : Theano type that represents a C primitive type. The C type associated with this Theano type is the represented C primitive itself.
  • SparseType : Theano type used to represent sparse tensors. There is no equivalent C type for this Theano type, but you can split a sparse variable into its parts as TensorVariables. Those can then be used as inputs to an Op with C code.
  • Generic : Theano type that represents a simple Python Object. Variables of this Theano type are represented in C as objects of class PyObject.
  • CDataType : Theano type that represents a C data type. The C type associated with this Theano type depends on the data being represented.