Basic Tensor Functionality

Theano supports any kind of Python object, but its focus is support for symbolic matrix expressions. When you type,

    >>> import theano.tensor as T
    >>> x = T.fmatrix()

the x is a TensorVariable instance. The T.fmatrix object itself is an instance of TensorType. Theano knows what type of variable x is because x.type points back to T.fmatrix.

This chapter explains the various ways of creating tensor variables, the attributes and methods of TensorVariable and TensorType, and various basic symbolic math and arithmetic that Theano supports for tensor variables.

Creation

Theano provides a list of predefined tensor types that can be used to create tensor variables. Variables can be named to facilitate debugging, and all of these constructors accept an optional name argument. For example, each of the following produces a TensorVariable instance that stands for a 0-dimensional ndarray of integers with the name 'myvar':

    >>> from theano.tensor import scalar, iscalar, TensorType
    >>> x = scalar('myvar', dtype='int32')
    >>> x = iscalar('myvar')
    >>> x = TensorType(dtype='int32', broadcastable=())('myvar')

Constructors with optional dtype

These are the simplest and often-preferred methods for creating symbolic variables in your code. By default, they produce floating-point variables (with dtype determined by config.floatX, see floatX), so if you use these constructors it is easy to switch your code between different levels of floating-point precision.

  • theano.tensor.scalar(name=None, dtype=config.floatX)
    Return a Variable for a 0-dimensional ndarray.
  • theano.tensor.vector(name=None, dtype=config.floatX)
    Return a Variable for a 1-dimensional ndarray.
  • theano.tensor.row(name=None, dtype=config.floatX)
    Return a Variable for a 2-dimensional ndarray in which the number of rows is guaranteed to be 1.
  • theano.tensor.col(name=None, dtype=config.floatX)
    Return a Variable for a 2-dimensional ndarray in which the number of columns is guaranteed to be 1.
  • theano.tensor.matrix(name=None, dtype=config.floatX)
    Return a Variable for a 2-dimensional ndarray.
  • theano.tensor.tensor3(name=None, dtype=config.floatX)
    Return a Variable for a 3-dimensional ndarray.
  • theano.tensor.tensor4(name=None, dtype=config.floatX)
    Return a Variable for a 4-dimensional ndarray.
  • theano.tensor.tensor5(name=None, dtype=config.floatX)
    Return a Variable for a 5-dimensional ndarray.
  • theano.tensor.tensor6(name=None, dtype=config.floatX)
    Return a Variable for a 6-dimensional ndarray.
  • theano.tensor.tensor7(name=None, dtype=config.floatX)
    Return a Variable for a 7-dimensional ndarray.

All Fully-Typed Constructors

The following TensorType instances are provided in the theano.tensor module. They are all callable, and accept an optional name argument. So for example:

    from theano.tensor import *

    x = dmatrix()           # creates one Variable with no name
    x = dmatrix('x')        # creates one Variable with name 'x'
    xyz = dmatrix('xyz')    # creates one Variable with name 'xyz'

Constructor  dtype       ndim  shape            broadcastable
bscalar      int8        0     ()               ()
bvector      int8        1     (?,)             (False,)
brow         int8        2     (1,?)            (True, False)
bcol         int8        2     (?,1)            (False, True)
bmatrix      int8        2     (?,?)            (False, False)
btensor3     int8        3     (?,?,?)          (False, False, False)
btensor4     int8        4     (?,?,?,?)        (False, False, False, False)
btensor5     int8        5     (?,?,?,?,?)      (False, False, False, False, False)
btensor6     int8        6     (?,?,?,?,?,?)    (False,) * 6
btensor7     int8        7     (?,?,?,?,?,?,?)  (False,) * 7
wscalar      int16       0     ()               ()
wvector      int16       1     (?,)             (False,)
wrow         int16       2     (1,?)            (True, False)
wcol         int16       2     (?,1)            (False, True)
wmatrix      int16       2     (?,?)            (False, False)
wtensor3     int16       3     (?,?,?)          (False, False, False)
wtensor4     int16       4     (?,?,?,?)        (False, False, False, False)
wtensor5     int16       5     (?,?,?,?,?)      (False, False, False, False, False)
wtensor6     int16       6     (?,?,?,?,?,?)    (False,) * 6
wtensor7     int16       7     (?,?,?,?,?,?,?)  (False,) * 7
iscalar      int32       0     ()               ()
ivector      int32       1     (?,)             (False,)
irow         int32       2     (1,?)            (True, False)
icol         int32       2     (?,1)            (False, True)
imatrix      int32       2     (?,?)            (False, False)
itensor3     int32       3     (?,?,?)          (False, False, False)
itensor4     int32       4     (?,?,?,?)        (False, False, False, False)
itensor5     int32       5     (?,?,?,?,?)      (False, False, False, False, False)
itensor6     int32       6     (?,?,?,?,?,?)    (False,) * 6
itensor7     int32       7     (?,?,?,?,?,?,?)  (False,) * 7
lscalar      int64       0     ()               ()
lvector      int64       1     (?,)             (False,)
lrow         int64       2     (1,?)            (True, False)
lcol         int64       2     (?,1)            (False, True)
lmatrix      int64       2     (?,?)            (False, False)
ltensor3     int64       3     (?,?,?)          (False, False, False)
ltensor4     int64       4     (?,?,?,?)        (False, False, False, False)
ltensor5     int64       5     (?,?,?,?,?)      (False, False, False, False, False)
ltensor6     int64       6     (?,?,?,?,?,?)    (False,) * 6
ltensor7     int64       7     (?,?,?,?,?,?,?)  (False,) * 7
dscalar      float64     0     ()               ()
dvector      float64     1     (?,)             (False,)
drow         float64     2     (1,?)            (True, False)
dcol         float64     2     (?,1)            (False, True)
dmatrix      float64     2     (?,?)            (False, False)
dtensor3     float64     3     (?,?,?)          (False, False, False)
dtensor4     float64     4     (?,?,?,?)        (False, False, False, False)
dtensor5     float64     5     (?,?,?,?,?)      (False, False, False, False, False)
dtensor6     float64     6     (?,?,?,?,?,?)    (False,) * 6
dtensor7     float64     7     (?,?,?,?,?,?,?)  (False,) * 7
fscalar      float32     0     ()               ()
fvector      float32     1     (?,)             (False,)
frow         float32     2     (1,?)            (True, False)
fcol         float32     2     (?,1)            (False, True)
fmatrix      float32     2     (?,?)            (False, False)
ftensor3     float32     3     (?,?,?)          (False, False, False)
ftensor4     float32     4     (?,?,?,?)        (False, False, False, False)
ftensor5     float32     5     (?,?,?,?,?)      (False, False, False, False, False)
ftensor6     float32     6     (?,?,?,?,?,?)    (False,) * 6
ftensor7     float32     7     (?,?,?,?,?,?,?)  (False,) * 7
cscalar      complex64   0     ()               ()
cvector      complex64   1     (?,)             (False,)
crow         complex64   2     (1,?)            (True, False)
ccol         complex64   2     (?,1)            (False, True)
cmatrix      complex64   2     (?,?)            (False, False)
ctensor3     complex64   3     (?,?,?)          (False, False, False)
ctensor4     complex64   4     (?,?,?,?)        (False, False, False, False)
ctensor5     complex64   5     (?,?,?,?,?)      (False, False, False, False, False)
ctensor6     complex64   6     (?,?,?,?,?,?)    (False,) * 6
ctensor7     complex64   7     (?,?,?,?,?,?,?)  (False,) * 7
zscalar      complex128  0     ()               ()
zvector      complex128  1     (?,)             (False,)
zrow         complex128  2     (1,?)            (True, False)
zcol         complex128  2     (?,1)            (False, True)
zmatrix      complex128  2     (?,?)            (False, False)
ztensor3     complex128  3     (?,?,?)          (False, False, False)
ztensor4     complex128  4     (?,?,?,?)        (False, False, False, False)
ztensor5     complex128  5     (?,?,?,?,?)      (False, False, False, False, False)
ztensor6     complex128  6     (?,?,?,?,?,?)    (False,) * 6
ztensor7     complex128  7     (?,?,?,?,?,?,?)  (False,) * 7

Plural Constructors

There are several constructors that can produce multiple variables at once. These are not frequently used in practice, but are often used in tutorial examples to save space!

  • iscalars, lscalars, fscalars, dscalars
    Return one or more scalar variables.
  • ivectors, lvectors, fvectors, dvectors
    Return one or more vector variables.
  • irows, lrows, frows, drows
    Return one or more row variables.
  • icols, lcols, fcols, dcols
    Return one or more col variables.
  • imatrices, lmatrices, fmatrices, dmatrices
    Return one or more matrix variables.

Each of these plural constructors accepts an integer or several strings. If an integer is provided, the method will return that many Variables, and if strings are provided, it will create one Variable for each string, using the string as the Variable’s name. For example:

    from theano.tensor import *

    x, y, z = dmatrices(3)              # creates three matrix Variables with no names
    x, y, z = dmatrices('x', 'y', 'z')  # creates three matrix Variables named 'x', 'y' and 'z'

Custom tensor types

If you would like to construct a tensor variable with a non-standard broadcasting pattern, or a larger number of dimensions, you’ll need to create your own TensorType instance. You create such an instance by passing the dtype and broadcasting pattern to the constructor. For example, you can create your own 8-dimensional tensor type:

    >>> dtensor8 = TensorType('float64', (False,)*8)
    >>> x = dtensor8()
    >>> z = dtensor8('z')

You can also redefine some of the provided types and they will interact correctly:

    >>> my_dmatrix = TensorType('float64', (False,)*2)
    >>> x = my_dmatrix()  # allocate a matrix variable
    >>> my_dmatrix == dmatrix
    True

See TensorType for more information about creating new types of Tensor.

Converting from Python Objects

Another way of creating a TensorVariable (a TensorSharedVariable to be precise) is by calling shared():

    x = shared(numpy.random.randn(3,4))

This will return a shared variable whose .value is a numpy ndarray. The number of dimensions and dtype of the Variable are inferred from the ndarray argument. The argument to shared will not be copied, and subsequent changes will be reflected in x.value.

For additional information, see the shared() documentation.

Finally, when you use a numpy ndarray or a Python number together with TensorVariable instances in arithmetic expressions, the result is a TensorVariable. What happens to the ndarray or the number? Theano requires that the inputs to all expressions be Variable instances, so Theano automatically wraps them in a TensorConstant.

Note

Theano makes a copy of any ndarray that you use in an expression, so subsequent changes to that ndarray will not have any effect on the Theano expression.

For numpy ndarrays the dtype is given, but the broadcastable pattern must be inferred. The TensorConstant is given a type with a matching dtype, and a broadcastable pattern with a True for every shape dimension that is 1.
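The inference rule above can be sketched in plain NumPy. This is only an illustration of the rule as stated in the text, not Theano's own code; the helper name infer_broadcastable is hypothetical.

```python
import numpy as np

def infer_broadcastable(value):
    """Hypothetical helper: True for every dimension whose length is 1."""
    return tuple(dim == 1 for dim in np.asarray(value).shape)

row_pattern = infer_broadcastable(np.zeros((1, 4)))     # shape (1, 4)
matrix_pattern = infer_broadcastable(np.zeros((3, 4)))  # shape (3, 4)
scalar_pattern = infer_broadcastable(2.5)               # 0-dimensional
print(row_pattern, matrix_pattern, scalar_pattern)
```

Under this rule a (1, 4) array is treated like a row, while a 0-dimensional value gets the empty pattern ().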

For Python numbers, the broadcastable pattern is () but the dtype must be inferred. Python integers are stored in the smallest dtype that can hold them, so small constants like 1 are stored in a bscalar. Likewise, Python floats are stored in an fscalar if fscalar suffices to hold them perfectly, but a dscalar otherwise.

Note

When config.floatX==float32 (see config), then Python floats are stored instead as single-precision floats.

For fine control of this rounding policy, see theano.tensor.basic.autocast_float.
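As a rough illustration of the dtype inference described above for Python numbers, the following NumPy sketch picks the smallest signed integer dtype that holds a value, and picks float32 only when the value survives a round trip through single precision. The helper names are hypothetical, and the sketch deliberately ignores config.floatX and the autocast settings mentioned above.

```python
import numpy as np

def infer_int_dtype(value):
    """Hypothetical helper: smallest signed integer dtype that can hold value."""
    for name in ("int8", "int16", "int32", "int64"):
        info = np.iinfo(name)
        if info.min <= value <= info.max:
            return name
    raise OverflowError("value does not fit in int64")

def infer_float_dtype(value):
    """Hypothetical helper: float32 if it represents value exactly, else float64."""
    return "float32" if float(np.float32(value)) == value else "float64"

print(infer_int_dtype(1), infer_int_dtype(300))
print(infer_float_dtype(0.5), infer_float_dtype(0.1))
```

The value 0.5 is exactly representable in single precision, whereas 0.1 is not, which is why the two floats end up with different dtypes in this sketch.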

  • theano.tensor.as_tensor_variable(x, name=None, ndim=None)
    Turn an argument x into a TensorVariable or TensorConstant.

Many tensor Ops run their arguments through this function as pre-processing. It passes through TensorVariable instances, and tries to wrap other objects into TensorConstant.

When x is a Python number, the dtype is inferred as described above.

When x is a list or tuple, it is passed through numpy.asarray.

If the ndim argument is not None, it must be an integer, and the output will be broadcasted if necessary in order to have this many dimensions.

Return type: TensorVariable or TensorConstant

TensorType and TensorVariable

  • class theano.tensor.TensorType(Type)
    The Type class used to mark Variables that stand for numpy.ndarray values (numpy.memmap, which is a subclass of numpy.ndarray, is also allowed). Recalling the tutorial, the purple box in the tutorial’s graph-structure figure is an instance of this class.

    • broadcastable
      A tuple of True/False values, one for each dimension. True in position 'i' indicates that at evaluation-time, the ndarray will have size 1 in that 'i'-th dimension. Such a dimension is called a broadcastable dimension (see Broadcasting).

The broadcastable pattern indicates both the number of dimensions and whether a particular dimension must have length 1.

Here is a table mapping some broadcastable patterns to what they mean:

pattern                interpretation
[]                     scalar
[True]                 1D scalar (vector of length 1)
[True, True]           2D scalar (1x1 matrix)
[False]                vector
[False, False]         matrix
[False] * n            nD tensor
[True, False]          row (1xN matrix)
[False, True]          column (Mx1 matrix)
[False, True, False]   A Mx1xP tensor (a)
[True, False, False]   A 1xNxP tensor (b)
[False, False, False]  A MxNxP tensor (pattern of a + b)

For dimensions in which broadcasting is False, the length of this dimension can be 1 or more. For dimensions in which broadcasting is True, the length of this dimension must be 1.

When two arguments to an element-wise operation (like addition or subtraction) have a different number of dimensions, the broadcastable pattern is expanded to the left, by padding with True. For example, a vector’s pattern, [False], could be expanded to [True, False], and would behave like a row (1xN matrix). In the same way, a matrix ([False, False]) would behave like a 1xNxP tensor ([True, False, False]).
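The left-padding rule can be written out in a few lines of plain Python. This is a sketch of the rule as described here, with hypothetical helper names; it assumes that an output dimension is broadcastable only when it is broadcastable in every input, which matches the "pattern of a + b" row of the table above.

```python
def pad_left(pattern, ndim):
    """Hypothetical helper: expand a pattern to ndim entries by padding with True on the left."""
    pattern = tuple(pattern)
    return (True,) * (ndim - len(pattern)) + pattern

def elemwise_pattern(p1, p2):
    """Hypothetical helper: pattern of the result of an element-wise op on two inputs."""
    ndim = max(len(p1), len(p2))
    a, b = pad_left(p1, ndim), pad_left(p2, ndim)
    # Assumption: the result is broadcastable in a dimension only if both inputs are.
    return tuple(x and y for x, y in zip(a, b))

vector_as_row = pad_left([False], 2)                  # vector behaves like a row
matrix_as_3d = pad_left([False, False], 3)            # matrix behaves like a 1xNxP tensor
sum_pattern = elemwise_pattern([False, True, False],  # (a) Mx1xP
                               [True, False, False])  # (b) 1xNxP
print(vector_as_row, matrix_as_3d, sum_pattern)
```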

If we wanted to create a type representing a matrix that would broadcast over the middle dimension of a 3-dimensional tensor when adding them together, we would define it like this:

    >>> middle_broadcaster = TensorType('complex64', [False, True, False])

  • ndim
    The number of dimensions that a Variable’s value will have at evaluation-time. This must be known when we are building the expression graph.

  • dtype
    A string indicating the numerical type of the ndarray for which a Variable of this Type is standing.

The dtype attribute of a TensorType instance can be any of the following strings.

dtype         domain            bits
'int8'        signed integer    8
'int16'       signed integer    16
'int32'       signed integer    32
'int64'       signed integer    64
'uint8'       unsigned integer  8
'uint16'      unsigned integer  16
'uint32'      unsigned integer  32
'uint64'      unsigned integer  64
'float32'     floating point    32
'float64'     floating point    64
'complex64'   complex           64 (two float32)
'complex128'  complex           128 (two float64)

  • __init__(self, dtype, broadcastable)
    If you wish to use a type of tensor which is not already available (for example, an 8D tensor) you can build an appropriate type by instantiating TensorType.

TensorVariable

  • class theano.tensor.TensorVariable(Variable, _tensor_py_operators)
    The results of symbolic operations typically have this type.

See _tensor_py_operators for most of the attributes and methods you’ll want to call.

  • class theano.tensor.TensorConstant(Variable, _tensor_py_operators)
    Python and numpy numbers are wrapped in this type.

See _tensor_py_operators for most of the attributes and methods you’ll want to call.

  • class theano.tensor.TensorSharedVariable(Variable, _tensor_py_operators)
    This type is returned by shared() when the value to share is a numpy ndarray.

See _tensor_py_operators for most of the attributes and methods you’ll want to call.

  • class theano.tensor._tensor_py_operators

This mix-in class adds convenient attributes, methods, and support to TensorVariable, TensorConstant and TensorSharedVariable for Python operators (see Operator Support).

type

A reference to the TensorType instance describing the sort of values that might be associated with this variable.

ndim

The number of dimensions of this tensor. Aliased to TensorType.ndim.

dtype

The numeric type of this tensor. Aliased to TensorType.dtype.

reshape(shape, ndim=None)

Returns a view of this tensor that has been reshaped as in numpy.reshape. If the shape is a Variable argument, then you might need to use the optional ndim parameter to declare how many elements the shape has, and therefore how many dimensions the reshaped Variable will have.

See reshape().

dimshuffle(pattern)

Returns a view of this tensor with permuted dimensions. Typically the pattern will include the integers 0, 1, … ndim-1, and any number of 'x' characters in dimensions where this tensor should be broadcasted.

A few examples of patterns and their effect:

  • ('x') -> make a 0d (scalar) into a 1d vector
  • (0, 1) -> identity for 2d tensors (matrices)
  • (1, 0) -> swaps the first and second dimensions
  • ('x', 0) -> make a row out of a 1d vector (N to 1xN)
  • (0, 'x') -> make a column out of a 1d vector (N to Nx1)
  • (2, 0, 1) -> AxBxC to CxAxB
  • (0, 'x', 1) -> AxB to Ax1xB
  • (1, 'x', 0) -> AxB to Bx1xA
  • (1,) -> removes dimension 0, which must be a broadcastable dimension (1xA to A)
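To make the patterns above concrete, here is a NumPy imitation of their effect on array shapes. It is a sketch only: dimshuffle_sketch is a hypothetical helper, it works on concrete arrays rather than symbolic variables, and it checks the length of a dropped dimension at run time instead of consulting a broadcastable flag.

```python
import numpy as np

def dimshuffle_sketch(a, pattern):
    """Hypothetical NumPy imitation of a dimshuffle pattern (ints mixed with 'x')."""
    a = np.asarray(a)
    kept = [p for p in pattern if p != 'x']
    dropped = [axis for axis in range(a.ndim) if axis not in kept]
    for axis in dropped:
        if a.shape[axis] != 1:
            raise ValueError("only length-1 dimensions can be dropped")
    # Reorder the kept axes, move dropped (length-1) axes to the end, then remove them.
    a = a.transpose(kept + dropped).reshape([a.shape[axis] for axis in kept])
    # Insert a new length-1 axis wherever the pattern has 'x'.
    index = tuple(np.newaxis if p == 'x' else slice(None) for p in pattern)
    return a[index]

vec = np.zeros(5)           # N
mat = np.zeros((3, 4))      # AxB
ten = np.zeros((2, 3, 4))   # AxBxC

row = dimshuffle_sketch(vec, ('x', 0))        # N -> 1xN
col = dimshuffle_sketch(vec, (0, 'x'))        # N -> Nx1
swapped = dimshuffle_sketch(mat, (1, 0))      # AxB -> BxA
rotated = dimshuffle_sketch(ten, (2, 0, 1))   # AxBxC -> CxAxB
padded = dimshuffle_sketch(mat, (1, 'x', 0))  # AxB -> Bx1xA
squeezed = dimshuffle_sketch(np.zeros((1, 5)), (1,))  # 1xA -> A
print(row.shape, col.shape, swapped.shape, rotated.shape, padded.shape, squeezed.shape)
```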

flatten(ndim=1)

Returns a view of this tensor with ndim dimensions, whose shape for the first ndim-1 dimensions will be the same as self, and shape in the remaining dimension will be expanded to fit in all the data from self.

See flatten().

ravel()

Returns self.flatten(). Provided for NumPy compatibility.

T

Transpose of this tensor.

    >>> x = T.zmatrix()
    >>> y = 3+.2j * x.T

Note

In numpy and in Theano, the transpose of a vector is exactly the same vector! Use reshape or dimshuffle to turn your vector into a row or column matrix.
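The NumPy half of this note is easy to check directly. The snippet below uses only NumPy; the Theano behaviour is as described in the note and is not exercised here.

```python
import numpy as np

v = np.arange(3)            # a 1-d vector of length 3
same = v.T                  # transposing a 1-d array changes nothing
row = v.reshape(1, -1)      # explicit 1x3 row matrix
col = v.reshape(-1, 1)      # explicit 3x1 column matrix
print(same.shape, row.shape, col.shape)
```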

{any,all}(axis=None, keepdims=False)
{sum,prod,mean}(axis=None, dtype=None, keepdims=False, acc_dtype=None)
{var,std,min,max,argmin,argmax}(axis=None, keepdims=False)
diagonal(offset=0, axis1=0, axis2=1)
astype(dtype)
take(indices, axis=None, mode='raise')
copy()
    Return a new symbolic variable that is a copy of the variable. Does not copy the tag.
norm(L, axis=None)
nonzero(self, return_matrix=False)
nonzero_values(self)
sort(self, axis=-1, kind='quicksort', order=None)
argsort(self, axis=-1, kind='quicksort', order=None)
clip(self, a_min, a_max)
conj()
repeat(repeats, axis=None)
round(mode="half_away_from_zero")
trace()
get_scalar_constant_value()
zeros_like(model, dtype=None)

All of the above methods are the Theano counterparts of the NumPy methods of the same name, applied to the current tensor.

{__abs__, __neg__, __lt__, __le__, __gt__, __ge__, __invert__, __and__, __or__, __add__, __sub__, __mul__, __div__, __truediv__, __floordiv__}

These elementwise operations are supported via Python operator syntax.

  • argmax(axis=None, keepdims=False)
    See theano.tensor.argmax.

  • argmin(axis=None, keepdims=False)
    See theano.tensor.argmin.

  • argsort(axis=-1, kind='quicksort', order=None)
    See theano.tensor.argsort.

  • broadcastable
    The broadcastable signature of this tensor.

See also

broadcasting

  • choose(choices, out=None, mode='raise')
    Construct an array from an index array and a set of arrays to choose from.

  • clip(a_min, a_max)
    Clip (limit) the values in an array.

  • compress(a, axis=None)
    Return selected slices only.

  • conj()
    See theano.tensor.conj.

  • conjugate()
    See theano.tensor.conj.

  • copy(name=None)
    Return a symbolic copy and optionally assign a name.

Does not copy the tags.

  • dimshuffle(*pattern)
    Reorder the dimensions of this variable, optionally inserting broadcasted dimensions.

Parameters: pattern – List/tuple of int mixed with 'x' for broadcastable dimensions.

Examples

For example, to create a 3D view of a [2D] matrix, call dimshuffle([0,'x',1]). This will create a 3D view such that the middle dimension is an implicit broadcasted dimension. To do the same thing on the transpose of that matrix, call dimshuffle([1, 'x', 0]).

Notes

This function supports the pattern passed as a tuple, or as a variable-length argument (e.g. a.dimshuffle(pattern) is equivalent to a.dimshuffle(*pattern) where pattern is a list/tuple of ints mixed with 'x' characters).

See also

DimShuffle()

  • dtype
    The dtype of this tensor.

  • fill(value)
    Fill this tensor with the given value.

  • imag
    Return the imaginary component of a complex-valued tensor z.

Generalizes a scalar op to tensors.

All the inputs must have the same number of dimensions. When the Op is performed, for each dimension, each input’s size for that dimension must be the same. As a special case, it can also be 1 but only if the input’s broadcastable flag is True for that dimension. In that case, the tensor is (virtually) replicated along that dimension to match the size of the others.

The dtypes of the outputs mirror those of the scalar Op that is being generalized to tensors. In particular, if the calculations for an output are done inplace on an input, the output type must be the same as the corresponding input type (see the doc of scalar.ScalarOp to get help about controlling the output type).

Parameters:

  • scalar_op – An instance of a subclass of scalar.ScalarOp which works uniquely on scalars.
  • inplace_pattern – A dictionary that maps the index of an output to the index of an input so the output is calculated inplace using the input’s storage. (Just like destroy_map, but without the lists.)
  • nfunc_spec – Either None or a tuple of three elements, (nfunc_name, nin, nout), such that getattr(numpy, nfunc_name) implements this operation, takes nin inputs and nout outputs. Note that nin cannot always be inferred from the scalar op’s own nin field because that value is sometimes 0 (meaning a variable number of inputs), whereas the numpy function may not have varargs.

Note

  • Elemwise(add) represents + on tensors (x + y)
  • Elemwise(add, {0 : 0}) represents the += operation (x += y)
  • Elemwise(add, {0 : 1}) represents += on the second argument (y += x)
  • Elemwise(mul)(rand(10, 5), rand(1, 5)): the second input is completed along the first dimension to match the first input
  • Elemwise(true_div)(rand(10, 5), rand(10, 1)): same, but along the second dimension
  • Elemwise(int_div)(rand(1, 5), rand(10, 1)): the output has size (10, 5)
  • Elemwise(log)(rand(3, 4, 5))
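The output shapes claimed in the note can be reproduced with ordinary NumPy arithmetic, which follows the same replication rule for length-1 dimensions. This is a NumPy analogue rather than Theano code: NumPy decides at run time from the actual shapes, whereas, as stated above, Theano additionally requires the broadcastable flag to be True for the dimension being replicated.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((10, 5))
b = rng.random((1, 5))
c = rng.random((10, 1)) + 1.0   # shifted away from zero so division is safe

product = a * b      # b is replicated along the first dimension
quotient = a / c     # c is replicated along the second dimension
outer = b // c       # both inputs are replicated: (1, 5) with (10, 1)
logs = np.log(rng.random((3, 4, 5)) + 1.0)
print(product.shape, quotient.shape, outer.shape, logs.shape)
```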

  • max(axis=None, keepdims=False)
    See theano.tensor.max.

  • mean(axis=None, dtype=None, keepdims=False, acc_dtype=None)
    See theano.tensor.mean.

  • min(axis=None, keepdims=False)
    See theano.tensor.min.

  • ndim
    The rank of this tensor.

  • nonzero(return_matrix=False)
    See theano.tensor.nonzero.

  • nonzero_values()
    See theano.tensor.nonzero_values.

  • prod(axis=None, dtype=None, keepdims=False, acc_dtype=None)
    See theano.tensor.prod.

  • ptp(axis=None)
    See theano.tensor.ptp.

  • real
    Return the real component of a complex-valued tensor z.


  • repeat(repeats, axis=None)
    See theano.tensor.repeat.

  • reshape(shape, ndim=None)
    Return a reshaped view/copy of this variable.

Parameters:

  • shape – Something that can be converted to a symbolic vector of integers.
  • ndim – The length of the shape. Passing None here means that Theano will try to guess the length of shape.

Warning

This has a different signature than numpy’s ndarray.reshape! In numpy you do not need to wrap the shape arguments in a tuple; in Theano you do.

  • round(mode=None)
    See theano.tensor.round.

  • sort(axis=-1, kind='quicksort', order=None)
    See theano.tensor.sort.

  • squeeze()
    Remove broadcastable dimensions from the shape of an array.

It returns the input array, but with the broadcastable dimensions removed. This is always x itself or a view into x.

  • std(axis=None, ddof=0, keepdims=False, corrected=False)
    See theano.tensor.std.

  • sum(axis=None, dtype=None, keepdims=False, acc_dtype=None)
    See theano.tensor.sum.

  • swapaxes(axis1, axis2)
    Return tensor.swapaxes(self, axis1, axis2).

If a matrix is provided with the right axes, its transpose will be returned.

  • transfer(target)
    If target is 'cpu' this will transfer to a TensorType (if not already one). Other types may define additional targets.

Parameters: target (str) – The desired location of the output variable

  • transpose(*axes)
    Returns: tensor.transpose(self, axes) or tensor.transpose(self, axes[0]). If only one axes argument is provided and it is iterable, then it is assumed to be the entire axes tuple, and passed intact to tensor.transpose.

  • var(axis=None, ddof=0, keepdims=False, corrected=False)
    See theano.tensor.var.

Shaping and Shuffling

To re-order the dimensions of a variable, or to insert or remove broadcastable dimensions, see _tensor_py_operators.dimshuffle().

  • theano.tensor.shape(x)
    Returns an lvector representing the shape of x.

  • theano.tensor.reshape(x, newshape, ndim=None)

Parameters:

  • x (any TensorVariable (or compatible)) – variable to be reshaped
  • newshape (lvector (or compatible)) – the new shape for x
  • ndim – optional - the length that newshape’s value will have. If this is None, then reshape() will infer it from newshape.

Return type: variable with x’s dtype, but ndim dimensions

Note

This function can infer the length of a symbolic newshape in some cases, but if it cannot and you do not provide the ndim, then this function will raise an Exception.

  • theano.tensor.shape_padleft(x, n_ones=1)
    Reshape x by left padding the shape with n_ones 1s. Note that all the new dimensions will be broadcastable. To make them non-broadcastable, see unbroadcast().

Parameters: x (any TensorVariable (or compatible)) – variable to be reshaped

  • theano.tensor.shape_padright(x, n_ones=1)
    Reshape x by right padding the shape with n_ones 1s. Note that all the new dimensions will be broadcastable. To make them non-broadcastable, see unbroadcast().

Parameters: x (any TensorVariable (or compatible)) – variable to be reshaped

  • theano.tensor.shape_padaxis(t, axis)
    Reshape t by inserting 1 at the dimension axis. Note that this new dimension will be broadcastable. To make it non-broadcastable, see unbroadcast().

Parameters:

  • t (any TensorVariable (or compatible)) – variable to be reshaped
  • axis (int) – axis where to add the new dimension to t

Example:

    >>> import theano
    >>> tensor = theano.tensor.tensor3()
    >>> theano.tensor.shape_padaxis(tensor, axis=0)
    InplaceDimShuffle{x,0,1,2}.0
    >>> theano.tensor.shape_padaxis(tensor, axis=1)
    InplaceDimShuffle{0,x,1,2}.0
    >>> theano.tensor.shape_padaxis(tensor, axis=3)
    InplaceDimShuffle{0,1,2,x}.0
    >>> theano.tensor.shape_padaxis(tensor, axis=-1)
    InplaceDimShuffle{0,1,2,x}.0

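For readers more familiar with NumPy, the three padding functions above correspond roughly to the following shape manipulations on a concrete array. This is a NumPy analogue under that assumption, not Theano code, and NumPy has no notion of the broadcastable flag that the Theano functions set on the new dimensions.

```python
import numpy as np

t = np.zeros((2, 3, 4))

left = t.reshape((1,) * 2 + t.shape)   # like shape_padleft(t, n_ones=2)
right = t.reshape(t.shape + (1,))      # like shape_padright(t)
middle = np.expand_dims(t, axis=1)     # like shape_padaxis(t, axis=1)
last = np.expand_dims(t, axis=-1)      # like shape_padaxis(t, axis=-1)
print(left.shape, right.shape, middle.shape, last.shape)
```
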
  • theano.tensor.unbroadcast(x, *axes)
    Make the input impossible to broadcast in the specified axes.

For example, unbroadcast(x, 0) will make the first dimension of x unbroadcastable.

We apply the opt here so as not to pollute the graph, especially during the GPU optimization.

Parameters:

  • x (tensor_like) – Input theano tensor.
  • axes (an int or an iterable object such as list or tuple of int values) – The dimensions along which the tensor x should be unbroadcastable.

Returns: A theano tensor, which is unbroadcastable along the specified dimensions.

Return type: tensor

  • theano.tensor.addbroadcast(x, *axes)
    Make the input broadcastable in the specified axes.

For example, addbroadcast(x, 0) will make the first dimension of x broadcastable. When performing the function, if the length of x along that dimension is not 1, a ValueError will be raised.

We apply the opt here so as not to pollute the graph, especially during the GPU optimization.

Parameters:

  • x (tensor_like) – Input theano tensor.
  • axes (an int or an iterable object such as list or tuple of int values) – The dimensions along which the tensor x should be broadcastable. If the length of x along these dimensions is not 1, a ValueError will be raised.

Returns: A theano tensor, which is broadcastable along the specified dimensions.

Return type: tensor

  • theano.tensor.patternbroadcast(x, broadcastable)
    Make the input adopt a specific broadcasting pattern.

Broadcastable must be iterable. For example, patternbroadcast(x, (True, False)) will make the first dimension of x broadcastable and the second dimension not broadcastable, so x will now be a row.

We apply the opt here so as not to pollute the graph, especially during the GPU optimization.

Parameters:

  • x (tensor_like) – Input theano tensor.
  • broadcastable (an iterable object such as list or tuple of bool values) – A set of boolean values indicating whether a dimension should be broadcastable or not. If the length of x along a dimension marked as broadcastable is not 1, a ValueError will be raised.

Returns: A theano tensor, which has the specified broadcasting pattern.

Return type: tensor

  • theano.tensor.flatten(x, ndim=1)
    Similar to reshape(), but the shape is inferred from the shape of x.

Parameters:

  • x (any TensorVariable (or compatible)) – variable to be flattened
  • ndim (int) – the number of dimensions in the returned variable

Return type: variable with same dtype as x and ndim dimensions

Returns: variable with the same shape as x in the leading ndim-1 dimensions, but with all remaining dimensions of x collapsed into the last dimension.

For example, if we flatten a tensor of shape (2, 3, 4, 5) with flatten(x, ndim=2), then we’ll have the same (2-1=1) leading dimensions (2,), and the remaining dimensions are collapsed. So the output in this example would have shape (2, 60).
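The shape arithmetic in this example can be reproduced with a NumPy reshape. The helper below is a hypothetical sketch of the rule described above (keep the leading ndim-1 dimensions, collapse the rest); it operates on concrete arrays and is not the Theano implementation.

```python
import numpy as np

def flatten_sketch(x, ndim=1):
    """Hypothetical helper: keep the first ndim-1 dimensions, collapse the rest into one."""
    x = np.asarray(x)
    if not 1 <= ndim <= max(x.ndim, 1):
        raise ValueError("ndim must be between 1 and the number of dimensions of x")
    return x.reshape(x.shape[:ndim - 1] + (-1,))

x = np.zeros((2, 3, 4, 5))
flat1 = flatten_sketch(x)           # everything collapsed into one dimension
flat2 = flatten_sketch(x, ndim=2)   # leading (2,) kept, 3*4*5 = 60 collapsed
flat3 = flatten_sketch(x, ndim=3)   # leading (2, 3) kept, 4*5 = 20 collapsed
print(flat1.shape, flat2.shape, flat3.shape)
```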

  • theano.tensor.tile(x, reps, ndim=None)
    Construct an array by repeating the input x according to the reps pattern.

Tiles its input according to reps. The length of reps is the number of dimensions of x, and reps contains the number of times to tile x in each dimension.

See: numpy.tile documentation for examples.

See: theano.tensor.extra_ops.repeat

Note: Currently, reps must be a constant, x.ndim and len(reps) must be equal and, if specified, ndim must be equal to both.
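Since the text points to numpy.tile for examples, here is a small one using NumPy directly. It keeps len(reps) equal to x.ndim, in line with the restriction noted above; it illustrates the tiling pattern only and does not call the Theano function.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])

# Repeat x twice along the first dimension and three times along the second.
tiled = np.tile(x, (2, 3))
print(tiled.shape)
print(tiled)
```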

  • theano.tensor.roll(x, shift, axis=None)
    Convenience function to roll TensorTypes along the given axis.

Syntax copies numpy.roll function.

Parameters:

  • x (tensor_like) – Input tensor.
  • shift (int (symbolic or literal)) – The number of places by which elements are shifted.
  • axis (int (symbolic or literal), optional) – The axis along which elements are shifted. By default, the array is flattened before shifting, after which the original shape is restored.

Returns: Output tensor, with the same shape as x.

Return type: tensor
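Because the syntax is stated to copy numpy.roll, the effect of the axis argument can be illustrated with NumPy itself. This shows the NumPy behaviour only, on the assumption that the Theano function mirrors it as described.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]

flat_roll = np.roll(x, 1)           # flattened, shifted by one, shape restored
row_roll = np.roll(x, 1, axis=0)    # rows shifted down by one
col_roll = np.roll(x, 1, axis=1)    # columns shifted right by one
print(flat_roll)
print(row_roll)
print(col_roll)
```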

Creating Tensor

  • theano.tensor.zeros_like(x, dtype=None)

Parameters:

  • x – tensor that has the same shape as output
  • dtype – data-type, optional. By default, it will be x.dtype.

Returns a tensor the shape of x filled with zeros of the type of dtype.

  • theano.tensor.ones_like(x, dtype=None)

Parameters:

  • x – tensor that has the same shape as output
  • dtype – data-type, optional. By default, it will be x.dtype.

Returns a tensor the shape of x filled with ones of the type of dtype.

  • theano.tensor.zeros(shape, dtype=None)[source]

Parameters:

  • shape – a tuple/list of scalar with the shape information.
  • dtype – the dtype of the new tensor. If None, will use floatX.

Returns a tensor filled with 0s of the provided shape.

  • theano.tensor.ones(shape, dtype=None)[source]

Parameters:

  • shape – a tuple or list of scalars giving the shape.
  • dtype – the dtype of the new tensor. If None, will use floatX.

Returns a tensor filled with 1s of the provided shape.

  • theano.tensor.fill(a, b)

Parameters:

  • a – tensor that has the same shape as the output
  • b – Theano scalar or value with which you want to fill the output

Create a tensor by filling the shape of a with b.

  • theano.tensor.alloc(value, *shape)[source]

Parameters:

  • value – a value with which to fill the output
  • shape – the dimensions of the returned array

Returns: an N-dimensional tensor initialized by value and having the specified shape.

  • theano.tensor.eye(n, m=None, k=0, dtype=theano.config.floatX)[source]

Parameters:

  • n – number of rows in output (value or theano scalar)
  • m – number of columns in output (value or theano scalar)
  • k – Index of the diagonal: 0 refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal. It can be a Theano scalar.

Returns: An array where all elements are equal to zero, except for the k-th diagonal, whose values are equal to one.

  • theano.tensor.identity_like(x)

Parameters: x – tensor

Returns: A tensor of the same shape as x that is filled with 0s everywhere except for the main diagonal, whose values are equal to one. The output will have the same dtype as x.

  • theano.tensor.stack(tensors, axis=0)[source]
  • Stack tensors in sequence on given axis (default is 0).

Take a sequence of tensors and stack them on the given axis to make a single tensor. The size in dimension axis of the result will be equal to the number of tensors passed.

Parameters:

  • tensors – a list or a tuple of one or more tensors of the same rank.
  • axis – the axis along which the tensors will be stacked. Default value is 0.

Returns: A tensor such that rval[0] == tensors[0], rval[1] == tensors[1], etc.

Examples:

    >>> import numpy as np
    >>> import theano
    >>> a = theano.tensor.scalar()
    >>> b = theano.tensor.scalar()
    >>> c = theano.tensor.scalar()
    >>> x = theano.tensor.stack([a, b, c])
    >>> x.ndim # x is a vector of length 3.
    1
    >>> a = theano.tensor.tensor4()
    >>> b = theano.tensor.tensor4()
    >>> c = theano.tensor.tensor4()
    >>> x = theano.tensor.stack([a, b, c])
    >>> x.ndim # x is a 5d tensor.
    5
    >>> rval = x.eval(dict((t, np.zeros((2, 2, 2, 2))) for t in [a, b, c]))
    >>> rval.shape # 3 tensors are stacked on axis 0
    (3, 2, 2, 2, 2)

We can also specify an axis other than the default value 0:

    >>> x = theano.tensor.stack([a, b, c], axis=3)
    >>> x.ndim
    5
    >>> rval = x.eval(dict((t, np.zeros((2, 2, 2, 2))) for t in [a, b, c]))
    >>> rval.shape # 3 tensors are stacked on axis 3
    (2, 2, 2, 3, 2)
    >>> x = theano.tensor.stack([a, b, c], axis=-2)
    >>> x.ndim
    5
    >>> rval = x.eval(dict((t, np.zeros((2, 2, 2, 2))) for t in [a, b, c]))
    >>> rval.shape # 3 tensors are stacked on axis -2
    (2, 2, 2, 3, 2)

  • theano.tensor.stack(*tensors)

Warning

The interface stack(*tensors) is deprecated! Use stack(tensors, axis=0) instead.

Stack tensors in sequence vertically (row wise).

Take a sequence of tensors and stack them vertically to make a single tensor.

Parameters: tensors – one or more tensors of the same rank

Returns: A tensor such that rval[0] == tensors[0], rval[1] == tensors[1], etc.

    >>> import theano.tensor as T
    >>> x0 = T.scalar()
    >>> x1 = T.scalar()
    >>> x2 = T.scalar()
    >>> x = T.stack(x0, x1, x2)
    >>> x.ndim # x is a vector of length 3.
    1

  • theano.tensor.concatenate(tensor_list, axis=0)[source]

Parameters:

  • tensor_list (a list or tuple of Tensors that all have the same shape in the axes not specified by the axis argument) – one or more Tensors to be concatenated together into one.
  • axis (literal or symbolic integer) – Tensors will be joined along this axis, so they may have different shape[axis].

    >>> import theano.tensor as T
    >>> x0 = T.fmatrix()
    >>> x1 = T.ftensor3()
    >>> x2 = T.fvector()
    >>> x = T.concatenate([x0, x1[0], T.shape_padright(x2)], axis=1)
    >>> x.ndim
    2

  • theano.tensor.stacklists(tensor_list)[source]

Parameters: tensor_list (an iterable that contains either tensors or other iterables of the same type as tensor_list; in other words, this is a tree whose leaves are tensors) – tensors to be stacked together.

Recursively stack lists of tensors to maintain similar structure.

This function can create a tensor from a shaped list of scalars:

    >>> from theano.tensor import stacklists, scalars, matrices
    >>> from theano import function
    >>> a, b, c, d = scalars('abcd')
    >>> X = stacklists([[a, b], [c, d]])
    >>> f = function([a, b, c, d], X)
    >>> f(1, 2, 3, 4)
    array([[ 1., 2.],
           [ 3., 4.]])

We can also stack arbitrarily shaped tensors. Here we stack matrices into a 2 by 2 grid:

    >>> from numpy import ones
    >>> a, b, c, d = matrices('abcd')
    >>> X = stacklists([[a, b], [c, d]])
    >>> f = function([a, b, c, d], X)
    >>> x = ones((4, 4), 'float32')
    >>> f(x, x, x, x).shape
    (2, 2, 4, 4)

  • theano.tensor.basic.choose(a, choices, out=None, mode='raise')[source]
  • Construct an array from an index array and a set of arrays to choose from.

First of all, if confused or uncertain, definitely look at the Examples. In its full generality, this function is less simple than it might seem from the following code description (below ndi = numpy.lib.index_tricks):

np.choose(a,c) == np.array([c[a[I]][I] for I in ndi.ndindex(a.shape)]).

But this omits some subtleties. Here is a fully general summary:

Given an index array (a) of integers and a sequence of n arrays (choices), a and each choice array are first broadcast, as necessary, to arrays of a common shape; calling these Ba and Bchoices[i], i = 0,…,n-1 we have that, necessarily, Ba.shape == Bchoices[i].shape for each i. Then, a new array with shape Ba.shape is created as follows:

  • if mode=raise (the default), then, first of all, each element of a (and thus Ba) must be in the range [0, n-1]; now, suppose that i (in that range) is the value at the (j0, j1, …, jm) position in Ba; then the value at the same position in the new array is the value in Bchoices[i] at that same position;
  • if mode=wrap, values in a (and thus Ba) may be any (signed) integer; modular arithmetic is used to map integers outside the range [0, n-1] back into that range; and then the new array is constructed as above;
  • if mode=clip, values in a (and thus Ba) may be any (signed) integer; negative integers are mapped to 0; values greater than n-1 are mapped to n-1; and then the new array is constructed as above.

Parameters:

  • a (int array) – This array must contain integers in [0, n-1], where n is the number of choices, unless mode=wrap or mode=clip, in which cases any integers are permissible.
  • choices (sequence of arrays) – Choice arrays. a and all of the choices must be broadcastable to the same shape. If choices is itself an array (not recommended), then its outermost dimension (i.e., the one corresponding to choices.shape[0]) is taken as defining the sequence.
  • out (array, optional) – If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype.
  • mode ({raise (default), wrap, clip}, optional) – Specifies how indices outside [0, n-1] will be treated:
    • raise : an exception is raised
    • wrap : value becomes value mod n
    • clip : values < 0 are mapped to 0, values > n-1 are mapped to n-1

Returns: The merged result.

Return type: merged array

Raises: ValueError (shape mismatch) – If a and each choice array are not all broadcastable to the same shape.
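The three modes differ only in how an out-of-range index is mapped before the selection happens. The following pure-Python sketch shows that mapping for the simplest case, where a and every choice are 1-D lists that already share a common length (so no broadcasting is needed). The helper names map_index and choose_1d are hypothetical and are not part of Theano or NumPy:

```python
def map_index(i, n, mode="raise"):
    """Map index i into [0, n-1] according to mode (illustrative helper)."""
    if mode == "raise":
        if not 0 <= i < n:
            raise ValueError("index %d out of range [0, %d]" % (i, n - 1))
        return i
    if mode == "wrap":
        return i % n                  # Python's % already lands in [0, n-1]
    if mode == "clip":
        return min(max(i, 0), n - 1)
    raise ValueError("unknown mode: %r" % (mode,))


def choose_1d(a, choices, mode="raise"):
    """At each position, pick the value from the choice selected by a."""
    n = len(choices)
    return [choices[map_index(i, n, mode)][pos] for pos, i in enumerate(a)]


choices = [[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]

print(choose_1d([2, 3, 1, 0], choices))               # [20, 31, 12, 3]
print(choose_1d([2, 4, 1, 0], choices, mode="clip"))  # [20, 31, 12, 3]
print(choose_1d([2, 4, 1, 0], choices, mode="wrap"))  # [20, 1, 12, 3]
```

With mode="raise", the index 4 in the last two calls would raise a ValueError instead.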

Reductions

  • theano.tensor.max(x, axis=None, keepdims=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis or axes along which to compute the maximum
Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Returns: maximum of x along axis

  • axis can be:
    • None - in which case the maximum is computed along all axes (like numpy)
    • an int - computed along this axis
    • a list of ints - computed along these axes
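The keepdims option is easiest to appreciate with concrete shapes. The sketch below uses NumPy, whose max accepts the same axis and keepdims arguments; it illustrates the broadcasting behaviour described above and is not Theano code:

```python
import numpy as np

x = np.array([[1.0, 5.0, 2.0],
              [4.0, 3.0, 6.0]])

row_max = np.max(x, axis=1)                       # shape (2,)
row_max_kept = np.max(x, axis=1, keepdims=True)   # shape (2, 1)

# The (2, 1) result broadcasts against the original (2, 3) tensor, so each
# row can be shifted by its own maximum. The (2,) result would not
# broadcast against (2, 3) here.
centered = x - row_max_kept
```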
  • theano.tensor.argmax(x, axis=None, keepdims=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis along which to compute the index of the maximum
Parameter: keepdims - (boolean) If this is set to True, the axis which is reduced is left in the result as a dimension with size one. With this option, the result will broadcast correctly against the original tensor.
Returns: the index of the maximum value along a given axis

  • if axis=None, Theano 0.5rc1 or later: argmax over the flattened tensor (like numpy)
  • older: then axis is assumed to be ndim(x)-1
  • theano.tensor.max_and_argmax(x, axis=None, keepdims=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis along which to compute the maximum and its index
Parameter: keepdims - (boolean) If this is set to True, the axis which is reduced is left in the result as a dimension with size one. With this option, the result will broadcast correctly against the original tensor.
Returns: the maximum value along a given axis and its index.

  • if axis=None, Theano 0.5rc1 or later: max_and_argmax over the flattened tensor (like numpy)
  • older: then axis is assumed to be ndim(x)-1
  • theano.tensor.min(x, axis=None, keepdims=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis or axes along which to compute the minimum
Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Returns: minimum of x along axis

  • axis can be:
    • None - in which case the minimum is computed along all axes (like numpy)
    • an int - computed along this axis
    • a list of ints - computed along these axes
  • theano.tensor.argmin(x, axis=None, keepdims=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis along which to compute the index of the minimum
Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Returns: the index of the minimum value along a given axis

  • if axis=None, Theano 0.5rc1 or later: argmin over the flattened tensor (like numpy)
  • older: then axis is assumed to be ndim(x)-1
  • theano.tensor.sum(x, axis=None, dtype=None, keepdims=False, acc_dtype=None)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis or axes along which to compute the sum
Parameter: dtype - The dtype of the returned tensor. If None, then we use the default dtype, which is the same as the input tensor's dtype except when:

  • the input dtype is a signed integer of precision < 64 bit, in which case we use int64
  • the input dtype is an unsigned integer of precision < 64 bit, in which case we use uint64

This default dtype does not depend on the value of acc_dtype.

Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Parameter: acc_dtype - The dtype of the internal accumulator. If None (default), we use the dtype in the list below, or the input dtype if its precision is higher:

  • for int dtypes, we use at least int64;
  • for uint dtypes, we use at least uint64;
  • for float dtypes, we use at least float64;
  • for complex dtypes, we use at least complex128.

Returns: sum of x along axis

  • axis can be:
    • None - in which case the sum is computed along all axes (like numpy)
    • an int - computed along this axis
    • a list of ints - computed along these axes
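The default output dtype rule for sum can be written down as a small lookup. This is only a restatement of the rule quoted above in plain Python; the function name default_sum_dtype is hypothetical and dtypes are represented as strings for simplicity:

```python
def default_sum_dtype(input_dtype):
    """Output dtype used when dtype=None, per the rule stated above.

    Illustrative helper only; dtypes are plain strings here.
    """
    if input_dtype in ("int8", "int16", "int32"):
        return "int64"     # signed integers below 64 bits are upcast
    if input_dtype in ("uint8", "uint16", "uint32"):
        return "uint64"    # unsigned integers below 64 bits are upcast
    return input_dtype     # everything else keeps the input dtype


print(default_sum_dtype("int8"))     # int64
print(default_sum_dtype("uint16"))   # uint64
print(default_sum_dtype("float32"))  # float32
```

Note that a float32 input keeps a float32 result even though, by default, the internal accumulator (acc_dtype) is float64.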
  • theano.tensor.prod(x, axis=None, dtype=None, keepdims=False, acc_dtype=None, no_zeros_in_input=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis or axes along which to compute the product
Parameter: dtype - The dtype of the returned tensor. If None, then we use the default dtype, which is the same as the input tensor's dtype except when:

  • the input dtype is a signed integer of precision < 64 bit, in which case we use int64
  • the input dtype is an unsigned integer of precision < 64 bit, in which case we use uint64

This default dtype does not depend on the value of acc_dtype.

Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Parameter: acc_dtype - The dtype of the internal accumulator. If None (default), we use the dtype in the list below, or the input dtype if its precision is higher:

  • for int dtypes, we use at least int64;
  • for uint dtypes, we use at least uint64;
  • for float dtypes, we use at least float64;
  • for complex dtypes, we use at least complex128.

Parameter: no_zeros_in_input - The grad of prod is complicated, as we need to handle 3 different cases: without zeros in the input reduced group, with 1 zero, or with more zeros.

This could slow you down, but more importantly, we currently don't support the second derivative of the 3 cases. So you cannot take the second derivative of the default prod().

To remove the handling of the special cases of 0, and so get a small speed-up and allow the second derivative, set no_zeros_in_input to True. It defaults to False.

It is the user's responsibility to make sure there are no zeros in the inputs. If there are, the grad will be wrong.

Returns: product of every term in x along axis

  • axis can be:
    • None - in which case the product is computed along all axes (like numpy)
    • an int - computed along this axis
    • a list of ints - computed along these axes
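To see why the gradient of prod needs three cases, consider the derivative of a product with respect to each factor. The sketch below is a plain-Python illustration of the mathematics, not Theano's implementation; the name prod_grad is hypothetical:

```python
from math import prod


def prod_grad(xs):
    """Gradient of prod(xs) with respect to each element of xs.

    Illustrative only; this is not how Theano implements the grad.
    """
    zero_positions = [i for i, v in enumerate(xs) if v == 0]

    if not zero_positions:
        # No zeros: the shortcut total / x_i is valid everywhere.
        total = prod(xs)
        return [total / v for v in xs]

    if len(zero_positions) == 1:
        # Exactly one zero: only that position has a non-zero derivative,
        # equal to the product of all the other elements.
        z = zero_positions[0]
        others = prod(v for i, v in enumerate(xs) if i != z)
        return [others if i == z else 0.0 for i in range(len(xs))]

    # Two or more zeros: every partial derivative still contains a zero.
    return [0.0] * len(xs)


print(prod_grad([2.0, 3.0, 4.0]))  # [12.0, 8.0, 6.0]
print(prod_grad([2.0, 0.0, 4.0]))  # [0.0, 8.0, 0.0]
print(prod_grad([0.0, 0.0, 4.0]))  # [0.0, 0.0, 0.0]
```

The first branch is the only one that remains when zeros are assumed absent, which is why applying it to an input that does contain a zero gives a wrong result (here it would divide by zero).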
  • theano.tensor.mean(x, axis=None, dtype=None, keepdims=False, acc_dtype=None)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis or axes along which to compute the mean
Parameter: dtype - The dtype to cast the result of the inner summation into. For instance, by default, a sum of a float32 tensor will be done in float64 (acc_dtype would be float64 by default), but that result will be cast back to float32.
Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Parameter: acc_dtype - The dtype of the internal accumulator of the inner summation. This will not necessarily be the dtype of the output (in particular if it is a discrete (int/uint) dtype, the output will be in a float type). If None, then we use the same rules as sum().
Returns: mean value of x along axis

  • axis can be:
    • None - in which case the mean is computed along all axes (like numpy)
    • an int - computed along this axis
    • a list of ints - computed along these axes
  • theano.tensor.var(x, axis=None, keepdims=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis or axes along which to compute the variance
Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Returns: variance of x along axis

  • axis can be:
    • None - in which case the variance is computed along all axes (like numpy)
    • an int - computed along this axis
    • a list of ints - computed along these axes
  • theano.tensor.std(x, axis=None, keepdims=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis or axes along which to compute the standard deviation
Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Returns: standard deviation of x along axis

  • axis can be:
    • None - in which case the standard deviation is computed along all axes (like numpy)
    • an int - computed along this axis
    • a list of ints - computed along these axes
  • theano.tensor.all(x, axis=None, keepdims=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis or axes along which to apply 'bitwise and'
Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Returns: bitwise and of x along axis

  • axis can be:
    • None - in which case the ‘bitwise and’ is computed along all axes (like numpy)
    • an int - computed along this axis
    • a list of ints - computed along these axes
  • theano.tensor.any(x, axis=None, keepdims=False)[source]

Parameter: x - symbolic Tensor (or compatible)
Parameter: axis - axis or axes along which to apply 'bitwise or'
Parameter: keepdims - (boolean) If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original tensor.
Returns: bitwise or of x along axis

  • axis can be:
    • None - in which case the ‘bitwise or’ is computed along all axes (like numpy)
    • an int - computed along this axis
    • a list of ints - computed along these axes
  • theano.tensor.ptp(x, axis=None)[source]
  • Range of values (maximum - minimum) along an axis. The name of the function comes from the acronym for peak to peak.

Parameter: x - Input tensor.
Parameter: axis - Axis along which to find the peaks. By default, flatten the array.
Returns: A new array holding the result.

Indexing

Like NumPy, Theano distinguishes between basic and advanced indexing. Theano fully supports basic indexing (see NumPy's indexing) and integer advanced indexing. Since version 0.10.0 Theano also supports boolean indexing with boolean NumPy arrays or Theano tensors.

Index-assignment is not supported. If you want to do something like a[5] = b or a[5]+=b, see theano.tensor.set_subtensor() and theano.tensor.inc_subtensor() below.

  • theano.tensor.set_subtensor(x, y, inplace=False, tolerate_inplace_aliasing=False)[source]
  • Return x with the given subtensor overwritten by y.

Parameters:

  • x – Symbolic variable for the lvalue of = operation.
  • y – Symbolic variable for the rvalue of = operation.
  • tolerate_inplace_aliasing – See inc_subtensor for documentation.

Examples

To replicate the numpy expression “r[10:] = 5”, type

    >>> from theano.tensor import ivector, set_subtensor
    >>> r = ivector()
    >>> new_r = set_subtensor(r[10:], 5)

  • theano.tensor.inc_subtensor(x, y, inplace=False, set_instead_of_inc=False, tolerate_inplace_aliasing=False)[source]
  • Return x with the given subtensor incremented by y.

Parameters:

  • x – The symbolic result of a Subtensor operation.
  • y – The amount by which to increment the subtensor in question.
  • inplace – Don’t use. Theano will do it when possible.
  • set_instead_of_inc – If True, do a set_subtensor instead.
  • tolerate_inplace_aliasing – Allow x and y to be views of a single underlying array even while working inplace. For correct results, x and y must not be overlapping views; if they overlap, the result of this Op will generally be incorrect. This value has no effect if inplace=False.

Examples

To replicate the numpy expression “r[10:] += 5”, type

    >>> from theano.tensor import ivector, inc_subtensor
    >>> r = ivector()
    >>> new_r = inc_subtensor(r[10:], 5)
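Both functions return a new symbolic value rather than modifying their input, which is the main difference from the NumPy statements they replace. The following plain-Python sketch mimics that functional style on ordinary lists; set_slice and inc_slice are hypothetical helpers, not Theano functions:

```python
def set_slice(values, start, new_value):
    """Return a copy of values with values[start:] overwritten."""
    return values[:start] + [new_value] * len(values[start:])


def inc_slice(values, start, amount):
    """Return a copy of values with values[start:] incremented."""
    return values[:start] + [v + amount for v in values[start:]]


r = list(range(12))
set_r = set_slice(r, 10, 5)   # analogous to set_subtensor(r[10:], 5)
inc_r = inc_slice(r, 10, 5)   # analogous to inc_subtensor(r[10:], 5)

print(set_r[10:])  # [5, 5]
print(inc_r[10:])  # [15, 16]
print(r[10:])      # [10, 11]  (the original is left untouched)
```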

Operator Support

Many Python operators are supported.

    >>> import theano.tensor as T
    >>> a, b = T.itensor3(), T.itensor3() # example inputs

Arithmetic

    >>> a + 3 # T.add(a, 3) -> itensor3
    >>> 3 - a # T.sub(3, a)
    >>> a * 3.5 # T.mul(a, 3.5) -> ftensor3 or dtensor3 (depending on casting)
    >>> 2.2 / a # T.true_div(2.2, a)
    >>> 2.2 // a # T.int_div(2.2, a)
    >>> 2.2**a # T.pow(2.2, a)
    >>> b % a # T.mod(b, a)

Bitwise

    >>> a & b # T.and_(a,b) bitwise and (alias T.bitwise_and)
    >>> a ^ 1 # T.xor(a,1) bitwise xor (alias T.bitwise_xor)
    >>> a | b # T.or_(a,b) bitwise or (alias T.bitwise_or)
    >>> ~a # T.invert(a) bitwise invert (alias T.bitwise_not)

Inplace

In-place operators are not supported. Theano's graph optimizations will determine which intermediate values to use for in-place computations. If you would like to update the value of a shared variable, consider using the updates argument to theano.function().

Elementwise

Casting

  • theano.tensor.cast(x, dtype)[source]
  • Cast any tensor x to a Tensor of the same shape, but with a different numerical type dtype.

This is not a reinterpret cast, but a coercion cast, similar to numpy.asarray(x, dtype=dtype).

    import theano.tensor as T
    x = T.matrix()
    x_as_int = T.cast(x, 'int32')

Attempting to cast a complex value to a real value is ambiguous and will raise an exception. Use real(), imag(), abs(), or angle() instead.

  • theano.tensor.real(x)[source]
  • Return the real (not imaginary) components of Tensor x. For non-complex x this function returns x.
  • theano.tensor.imag(x)[source]
  • Return the imaginary components of Tensor x. For non-complex x this function returns zeros_like(x).

Comparisons

  • The six usual equality and inequality operators share the same interface.

Parameter: a - symbolic Tensor (or compatible)
Parameter: b - symbolic Tensor (or compatible)
Return type: symbolic Tensor
Returns: a symbolic tensor representing the application of the logical elementwise operator.

Note

Theano has no boolean dtype. Instead, all boolean tensors are represented in 'int8'.

Here is an example with the less-than operator.

    import theano.tensor as T
    x, y = T.dmatrices('x', 'y')
    z = T.lt(x, y)

  • theano.tensor.lt(a, b)[source]
  • Returns a symbolic 'int8' tensor representing the result of logical less-than (a<b).

Also available using syntax a < b

  • theano.tensor.gt(a, b)[source]

  • Returns a symbolic 'int8' tensor representing the result of logical greater-than (a>b).

Also available using syntax a > b

  • theano.tensor.le(a, b)[source]
  • Returns a variable representing the result of logical less than or equal (a<=b).

Also available using syntax a <= b

  • theano.tensor.ge(a, b)[source]
  • Returns a variable representing the result of logical greater than or equal (a>=b).

Also available using syntax a >= b

  • theano.tensor.eq(a, b)[source]
  • Returns a variable representing the result of logical equality (a==b).
  • theano.tensor.neq(a, b)[source]
  • Returns a variable representing the result of logical inequality (a!=b).
  • theano.tensor.isnan(a)[source]
  • Returns a variable representing the comparison of a elements with nan.

This is equivalent to numpy.isnan.

  • theano.tensor.isinf(a)[source]
  • Returns a variable representing the comparison of a elements with inf or -inf.

This is equivalent to numpy.isinf.

  • theano.tensor.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)[source]
  • Returns a symbolic 'int8' tensor representing where two tensors are equal within a tolerance.

The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.

For finite values, isclose uses the following equation to test whether two floating point values are equivalent: |a - b| <= (atol + rtol * |b|)

For infinite values, isclose checks if both values are the same signed inf value.

If equal_nan is True, isclose considers NaN values in the same position to be close.Otherwise, NaN values are not considered close.

This is equivalent to numpy.isclose.
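For finite scalars the test above is a one-line comparison. The sketch below restates it in plain Python (the name is_close is hypothetical and the inf/NaN handling is deliberately left out). Because the relative term uses |b| only, the test is not symmetric in a and b:

```python
def is_close(a, b, rtol=1e-05, atol=1e-08):
    """Finite-value tolerance test: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


print(is_close(1.0, 1.0 + 1e-6))                     # True
print(is_close(1.0, 1.1))                            # False

# Asymmetry: the tolerance scales with the second argument only.
print(is_close(10.0, 11.0, rtol=0.095, atol=0.0))    # True  (1 <= 0.095 * 11)
print(is_close(11.0, 10.0, rtol=0.095, atol=0.0))    # False (1 >  0.095 * 10)
```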

  • theano.tensor.allclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)[source]
  • Returns a symbolic 'int8' value representing if all elements in two tensors are equal within a tolerance.

See notes in isclose for determining values equal within a tolerance.

This is equivalent to numpy.allclose.

Condition

  • theano.tensor.switch(cond, ift, iff)[source]
  • Returns a variable representing a switch between ift (if true) and iff (if false) based on the condition cond. This is the Theano equivalent of numpy.where.

Parameter: cond - symbolic Tensor (or compatible)
Parameter: ift - symbolic Tensor (or compatible)
Parameter: iff - symbolic Tensor (or compatible)
Return type: symbolic Tensor

    import theano.tensor as T
    a, b = T.dmatrices('a', 'b')
    x, y = T.dmatrices('x', 'y')
    z = T.switch(T.lt(a, b), x, y)

  • theano.tensor.where(cond, ift, iff)[source]
  • Alias for switch. where is the numpy name.
  • theano.tensor.clip(x, min, max)[source]
  • Return a variable representing x, but with all elements greater than max clipped to max and all elements less than min clipped to min.

Normal broadcasting rules apply to each of x, min, and max.
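Since switch is described as the equivalent of numpy.where, and clip follows the usual broadcasting rules, both can be previewed with concrete NumPy arrays. This is a NumPy sketch of the semantics with illustrative values, not Theano code:

```python
import numpy as np

a = np.array([[1.0, 5.0], [3.0, 2.0]])
b = np.array([[4.0, 4.0], [4.0, 1.0]])
x = np.zeros((2, 2))
y = np.ones((2, 2))

# Elementwise selection: take from x where a < b, otherwise from y.
z = np.where(a < b, x, y)                          # [[0, 1], [0, 1]]

# A scalar lower bound and a per-column upper bound are both broadcast
# against the (2, 2) input.
clipped = np.clip(a, 2.0, np.array([3.0, 4.0]))    # [[2, 4], [3, 2]]
```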

Bit-wise

  • The bitwise operators possess this interface:

Parameter: a - symbolic Tensor of integer type.
Parameter: b - symbolic Tensor of integer type.

Note

The bitwise operators must have an integer type as input.

The bit-wise not (invert) takes only one parameter.

Return type: symbolic Tensor with corresponding dtype.

  • theano.tensor.and_(a, b)[source]
  • Returns a variable representing the result of the bitwise and.
  • theano.tensor.or_(a, b)[source]
  • Returns a variable representing the result of the bitwise or.
  • theano.tensor.xor(a, b)[source]
  • Returns a variable representing the result of the bitwise xor.
  • theano.tensor.invert(a)[source]
  • Returns a variable representing the result of the bitwise not.
  • theano.tensor.bitwise_and(a, b)[source]
  • Alias for and_. bitwise_and is the numpy name.
  • theano.tensor.bitwise_or(a, b)[source]
  • Alias for or_. bitwise_or is the numpy name.
  • theano.tensor.bitwise_xor(a, b)[source]
  • Alias for xor. bitwise_xor is the numpy name.
  • theano.tensor.bitwise_not(a)[source]
  • Alias for invert. bitwise_not is the numpy name.

Here is an example using the bit-wise and_ via the & operator:

    import theano.tensor as T
    x, y = T.imatrices('x', 'y')
    z = x & y

Mathematical

  • theano.tensor.abs_(a)[source]
  • Returns a variable representing the absolute value of a, i.e. |a|.

Note

Can also be accessed with abs(a).

  • theano.tensor.angle(a)[source]
  • Returns a variable representing angular component of complex-valued Tensor a.
  • theano.tensor.exp(a)[source]
  • Returns a variable representing the exponential of a, ie e^a.
  • theano.tensor.maximum(a, b)[source]
  • Returns a variable representing the maximum element by element of a and b
  • theano.tensor.minimum(a, b)[source]
  • Returns a variable representing the minimum element by element of a and b
  • theano.tensor.neg(a)[source]
  • Returns a variable representing the negation of a (also -a).
  • theano.tensor.inv(a)[source]
  • Returns a variable representing the inverse of a, ie 1.0/a. Also called reciprocal.
  • theano.tensor.log(a), log2(a), log10(a)[source]
  • Returns a variable representing the base e, 2 or 10 logarithm of a.
  • theano.tensor.sgn(a)[source]
  • Returns a variable representing the sign of a.
  • theano.tensor.ceil(a)[source]
  • Returns a variable representing the ceiling of a (for example ceil(2.1) is 3).
  • theano.tensor.floor(a)[source]
  • Returns a variable representing the floor of a (for example floor(2.9) is 2).
  • theano.tensor.round(a, mode="half_away_from_zero")[source]
  • Returns a variable representing the rounding of a in the same dtype as a. The implemented rounding modes are half_away_from_zero and half_to_even.
  • theano.tensor.iround(a, mode="half_away_from_zero")[source]
  • Shorthand for cast(round(a, mode), 'int64').
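The two rounding modes differ only on values that lie exactly halfway between two integers. The following plain-Python sketch illustrates the difference on scalars; the function names are hypothetical, and the floor-based formula is a simple illustration that does not try to handle every floating-point edge case:

```python
import math


def round_half_away_from_zero(v):
    """Ties go to the integer farther from zero (2.5 -> 3, -2.5 -> -3)."""
    return math.copysign(math.floor(abs(v) + 0.5), v)


def round_half_to_even(v):
    """Ties go to the nearest even integer (2.5 -> 2, 3.5 -> 4)."""
    return float(round(v))   # Python's built-in round uses half-to-even


for v in (0.5, 2.5, 3.5, -2.5):
    print(v, round_half_away_from_zero(v), round_half_to_even(v))
# 0.5 1.0 0.0
# 2.5 3.0 2.0
# 3.5 4.0 4.0
# -2.5 -3.0 -2.0
```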
  • theano.tensor.sqr(a)[source]
  • Returns a variable representing the square of a, ie a^2.
  • theano.tensor.sqrt(a)[source]
  • Returns a variable representing the square root of a, i.e. a^0.5.
  • theano.tensor.cos(a), sin(a), tan(a)[source]
  • Returns a variable representing the trigonometric functions of a (cosine, sine and tangent).
  • theano.tensor.cosh(a), sinh(a), tanh(a)[source]
  • Returns a variable representing the hyperbolic trigonometric functions of a (hyperbolic cosine, sine and tangent).
  • theano.tensor.erf(a), erfc(a)[source]
  • Returns a variable representing the error function or the complementary error function.
  • theano.tensor.erfinv(a), erfcinv(a)[source]
  • Returns a variable representing the inverse error function or the inverse complementary error function.
  • theano.tensor.gamma(a)[source]
  • Returns a variable representing the gamma function.
  • theano.tensor.gammaln(a)[source]
  • Returns a variable representing the logarithm of the gamma function.
  • theano.tensor.psi(a)[source]
  • Returns a variable representing the derivative of the logarithm of the gamma function (also called the digamma function).
  • theano.tensor.chi2sf(a, df)[source]
  • Returns a variable representing the survival function (1-cdf, sometimes more accurate).

C code is provided in the Theano_lgpl repository. This makes it faster.

https://github.com/Theano/Theano_lgpl.git

You can find more information about Broadcasting in the Broadcasting tutorial.

Linear Algebra

  • theano.tensor.dot(X, Y)
  • For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors (without complex conjugation). For N dimensions it is a sum product over the last axis of X and the second-to-last of Y.

Parameters:

  • X (symbolic tensor) – left term
  • Y (symbolic tensor) – right term

Return type: symbolic matrix or vector

Returns: the inner product of X and Y.

  • theano.tensor.outer(X, Y)

Parameters:

  • X (symbolic vector) – left term
  • Y (symbolic vector) – right term

Return type: symbolic matrix

Returns: vector-vector outer product

  • theano.tensor.tensordot(a, b, axes=2)[source]
  • Given two tensors a and b, tensordot computes a generalized dot product over the provided axes. Theano's implementation reduces all expressions to matrix or vector dot products and is based on code from Tijmen Tieleman's gnumpy (http://www.cs.toronto.edu/~tijmen/gnumpy.html).

Parameters:

  • a (symbolic tensor) – the first tensor variable
  • b (symbolic tensor) – the second tensor variable
  • axes (int or array-like of length 2) – an integer or array. If an integer, the number of axes to sum over. If an array, it must have two array elements containing the axes to sum over in each tensor.

Note that the default value of 2 is not guaranteed to work for all values of a and b, and an error will be raised if that is the case. The reason for keeping the default is to maintain the same signature as numpy's tensordot function (and np.tensordot raises analogous errors for non-compatible inputs).

If an integer i, it is converted to an array containing the last i dimensions of the first tensor and the first i dimensions of the second tensor:

axes = [range(a.ndim - i, a.ndim), range(i)]

If an array, its two elements must contain compatible axes of the two tensors. For example, [[1, 2], [2, 0]] means sum over the 2nd and 3rd axes of a and the 3rd and 1st axes of b. (Remember axes are zero-indexed!) The 2nd axis of a and the 3rd axis of b must have the same shape; the same is true for the 3rd axis of a and the 1st axis of b.

Returns: a tensor with shape equal to the concatenation of a's shape (less any dimensions that were summed over) and b's shape (less any dimensions that were summed over).

Return type: symbolic tensor
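The axes rules can be checked on shapes alone, without building any tensors. The sketch below follows the prose description (last i dimensions of the first tensor, first i dimensions of the second); expand_axes and tensordot_shape are hypothetical helpers written for illustration and are not part of Theano or NumPy:

```python
def expand_axes(a_ndim, i):
    """Turn an integer axes argument into the explicit two-list form."""
    return [list(range(a_ndim - i, a_ndim)), list(range(i))]


def tensordot_shape(a_shape, b_shape, axes):
    """Output shape of a tensordot, or ValueError if axes are incompatible."""
    a_axes, b_axes = axes
    for p, q in zip(a_axes, b_axes):
        if a_shape[p] != b_shape[q]:
            raise ValueError("axis %d of a and axis %d of b differ" % (p, q))
    kept_a = [d for k, d in enumerate(a_shape) if k not in a_axes]
    kept_b = [d for k, d in enumerate(b_shape) if k not in b_axes]
    return tuple(kept_a + kept_b)


a_shape, b_shape = (2, 3, 4), (5, 6, 4, 3)

print(expand_axes(3, 2))                                      # [[1, 2], [0, 1]]
print(tensordot_shape(a_shape, b_shape, [[1, 2], [3, 2]]))    # (2, 5, 6)
print(tensordot_shape(a_shape, b_shape, expand_axes(3, 0)))   # (2, 3, 4, 5, 6, 4, 3)
```

For these two shapes the default axes=2 expands to [[1, 2], [0, 1]], which pairs sizes 3 with 5 and 4 with 6, so tensordot_shape raises a ValueError. This is the kind of incompatibility the note about the default value refers to.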

It may be helpful to consider an example to see what tensordot does. Theano's implementation is identical to NumPy's. Here a has shape (2, 3, 4) and b has shape (5, 6, 4, 3). The axes to sum over are [[1, 2], [3, 2]]. Note that a.shape[1] == b.shape[3] and a.shape[2] == b.shape[2]; these axes are compatible. The resulting tensor will have shape (2, 5, 6), the dimensions that are not being summed:

    import numpy as np

    a = np.random.random((2, 3, 4))
    b = np.random.random((5, 6, 4, 3))

    # tensordot
    c = np.tensordot(a, b, [[1, 2], [3, 2]])

    # loop replicating tensordot
    a0, a1, a2 = a.shape
    b0, b1, _, _ = b.shape
    cloop = np.zeros((a0, b0, b1))

    # loop over non-summed indices -- these exist
    # in the tensor product.
    for i in range(a0):
        for j in range(b0):
            for k in range(b1):
                # loop over summed indices -- these don't exist
                # in the tensor product.
                for l in range(a1):
                    for m in range(a2):
                        cloop[i, j, k] += a[i, l, m] * b[j, k, m, l]

    assert np.allclose(c, cloop)

This specific implementation avoids a loop by transposing a and b such that the summed axes of a are last and the summed axes of b are first. The resulting arrays are reshaped to 2 dimensions (or left as vectors, if appropriate) and a matrix or vector dot product is taken. The result is reshaped back to the required output dimensions.
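
The transpose, reshape and dot strategy described above can be sketched in NumPy. This is not Theano's actual code: the helper name tensordot_via_dot is hypothetical, it always reshapes to 2 dimensions, and it assumes non-negative axes given as two lists:

```python
import numpy as np

def tensordot_via_dot(a, b, axes):
    """Hypothetical helper: tensordot as transpose + reshape + 2-D dot."""
    a_sum, b_sum = [list(ax) for ax in axes]
    a_keep = [d for d in range(a.ndim) if d not in a_sum]
    b_keep = [d for d in range(b.ndim) if d not in b_sum]

    # Move the summed axes of a to the end and those of b to the front.
    a_t = a.transpose(a_keep + a_sum)
    b_t = b.transpose(b_sum + b_keep)

    a_keep_shape = [a.shape[d] for d in a_keep]
    b_keep_shape = [b.shape[d] for d in b_keep]
    n_sum = int(np.prod([a.shape[d] for d in a_sum]))

    # Collapse both operands to matrices, multiply, then restore
    # the dimensions that were not summed over.
    a_2d = a_t.reshape(-1, n_sum)
    b_2d = b_t.reshape(n_sum, -1)
    return np.dot(a_2d, b_2d).reshape(a_keep_shape + b_keep_shape)

a = np.random.random((2, 3, 4))
b = np.random.random((5, 6, 4, 3))
axes = [[1, 2], [3, 2]]
c = tensordot_via_dot(a, b, axes)

# The sketch agrees with NumPy's own tensordot on the example shapes.
assert np.allclose(c, np.tensordot(a, b, axes))
```

Listing the summed axes of both operands in matching order is what makes the flattened inner dimension line up, so a single matrix product performs every pairwise sum at once.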

In an extreme case, no axes may be specified. The resulting tensor will have shape equal to the concatenation of the shapes of a and b:

    >>> c = np.tensordot(a, b, 0)
    >>> a.shape
    (2, 3, 4)
    >>> b.shape
    (5, 6, 4, 3)
    >>> print(c.shape)
    (2, 3, 4, 5, 6, 4, 3)

Note: See the documentation of numpy.tensordot for more examples.

  • theano.tensor.batched_dot(X, Y)[source]

Parameters:

  • x – A Tensor with sizes e.g.: for 3D (dim1, dim3, dim2)
  • y – A Tensor with sizes e.g.: for 3D (dim1, dim2, dim4)

This function computes the dot product between the two tensors by iterating over the first dimension using scan. It returns a tensor of size, e.g. for 3D inputs, (dim1, dim3, dim4).

Example:

    >>> first = T.tensor3('first')
    >>> second = T.tensor3('second')
    >>> result = T.batched_dot(first, second)

Note: This is a subset of numpy.einsum, but we do not provide it for now. But numpy einsum is slower than dot or tensordot: http://mail.scipy.org/pipermail/numpy-discussion/2012-October/064259.html

Parameters:

  • X (symbolic tensor) – left term
  • Y (symbolic tensor) – right term

Returns: tensor of products
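
The shape rule for batched_dot can be illustrated with plain NumPy arrays. The sketch below only mimics the semantics (one dot product per index of the first dimension); it does not use scan, and the sizes and variable names are assumptions made for the example:

```python
import numpy as np

# Illustrative sizes (assumptions): dim1=4 (batch), dim3=2, dim2=3, dim4=5.
x = np.random.random((4, 2, 3))
y = np.random.random((4, 3, 5))

# One ordinary dot product per index of the first (batch) dimension.
result_loop = np.stack([np.dot(x[n], y[n]) for n in range(x.shape[0])])

# np.matmul treats leading dimensions as batch dimensions and
# produces the same values.
result_matmul = np.matmul(x, y)

assert result_loop.shape == (4, 2, 5)
assert np.allclose(result_loop, result_matmul)
```

The output has shape (dim1, dim3, dim4): the batch dimension is preserved and the shared dim2 is summed away.
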
  • theano.tensor.batched_tensordot(X, Y, axes=2)[source]

Parameters:

  • x – A Tensor with sizes e.g.: for 3D (dim1, dim3, dim2)
  • y – A Tensor with sizes e.g.: for 3D (dim1, dim2, dim4)
  • axes (int or array-like of length 2) – an integer or array. If an integer, the number of axes to sum over. If an array, it must have two array elements containing the axes to sum over in each tensor.

If an integer i, it is converted to an array containing the last i dimensions of the first tensor and the first i dimensions of the second tensor (excluding the first (batch) dimension):

    axes = [range(a.ndim - i, a.ndim), range(1, i + 1)]

If an array, its two elements must contain compatible axes of the two tensors. For example, [[1, 2], [2, 4]] means sum over the 2nd and 3rd axes of a and the 3rd and 5th axes of b. (Remember axes are zero-indexed!) The 2nd axis of a and the 3rd axis of b must have the same shape; the same is true for the 3rd axis of a and the 5th axis of b.

Returns: a tensor with shape equal to the concatenation of a’s shape (less any dimensions that were summed over) and b’s shape (less first dimension and any dimensions that were summed over).

Return type: tensor of tensordots

A hybrid of batched_dot and tensordot, this function computes the tensordot product between the two tensors by iterating over the first dimension using scan to perform a sequence of tensordots.

Note: See tensordot() and batched_dot() for supplementary documentation.
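
The "sequence of tensordots" can be sketched with NumPy. The helper name batched_tensordot_loop is hypothetical and the shapes are assumptions. As in the description above, the axes index the full tensors, including the batch dimension at position 0:

```python
import numpy as np

def batched_tensordot_loop(x, y, axes):
    """Hypothetical NumPy helper: one tensordot per batch entry.

    axes index the full tensors, so each is shifted down by one
    once the batch dimension has been sliced away.
    """
    x_axes = [ax - 1 for ax in axes[0]]
    y_axes = [ax - 1 for ax in axes[1]]
    return np.stack([
        np.tensordot(x[n], y[n], axes=[x_axes, y_axes])
        for n in range(x.shape[0])
    ])

# Illustrative shapes (assumptions): batch size 7.
x = np.random.random((7, 2, 3, 4))
y = np.random.random((7, 5, 4, 3))

# Sum axes 2 and 3 of x against axes 3 and 2 of y.
out = batched_tensordot_loop(x, y, axes=[[2, 3], [3, 2]])

# Result: x's shape less summed axes, then y's shape less the
# batch axis and summed axes.
assert out.shape == (7, 2, 5)
```

The same contraction can be written as np.einsum('nilm,njml->nij', x, y), which makes the shared batch index n explicit.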

  • theano.tensor.mgrid

Returns: an instance which returns a dense (or fleshed out) mesh-grid when indexed, so that each returned argument has the same shape. The dimensions and number of the output arrays are equal to the number of indexing dimensions. If the step length is not a complex number, then the stop is not inclusive.

Example:

    >>> a = T.mgrid[0:5, 0:3]
    >>> a[0].eval()
    array([[0, 0, 0],
           [1, 1, 1],
           [2, 2, 2],
           [3, 3, 3],
           [4, 4, 4]])
    >>> a[1].eval()
    array([[0, 1, 2],
           [0, 1, 2],
           [0, 1, 2],
           [0, 1, 2],
           [0, 1, 2]])

  • theano.tensor.ogrid

Returns: an instance which returns an open (i.e. not fleshed out) mesh-grid when indexed, so that only one dimension of each returned array is greater than 1. The dimension and number of the output arrays are equal to the number of indexing dimensions. If the step length is not a complex number, then the stop is not inclusive.

Example:

    >>> b = T.ogrid[0:5, 0:3]
    >>> b[0].eval()
    array([[0],
           [1],
           [2],
           [3],
           [4]])
    >>> b[1].eval()
    array([[0, 1, 2]])
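
The relationship between the dense and open grids can be checked with NumPy's np.mgrid and np.ogrid, on the assumption that the Theano objects follow the NumPy objects of the same name (the outputs shown above are consistent with this). The variable names are illustrative:

```python
import numpy as np

dense = np.mgrid[0:5, 0:3]   # one (5, 3) array per indexing dimension
open_ = np.ogrid[0:5, 0:3]   # arrays of shape (5, 1) and (1, 3)

assert dense[0].shape == (5, 3) and dense[1].shape == (5, 3)
assert open_[0].shape == (5, 1) and open_[1].shape == (1, 3)

# Broadcasting the open grid against itself recovers the dense grid.
rows, cols = np.broadcast_arrays(open_[0], open_[1])
assert np.array_equal(rows, dense[0])
assert np.array_equal(cols, dense[1])
```

The open form stores only one axis per array, so it is the cheaper choice when the grids are consumed by broadcasting arithmetic.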

Gradient / Differentiation

Driver for gradient calculations.

  • theano.gradient.grad(cost, wrt, consider_constant=None, disconnected_inputs='raise', add_names=True, known_grads=None, return_disconnected='zero', null_gradients='raise')[source]
  • Return symbolic gradients of one cost with respect to one or more variables.

For more information about how automatic differentiation works in Theano, see gradient. For information on how to implement the gradient of a certain Op, see grad().

Parameters:

  • cost (Variable scalar (0-dimensional) tensor variable or None) – Value that we are differentiating (that we want the gradient of). May be None if known_grads is provided.
  • wrt (Variable or list of Variables) – Term[s] with respect to which we want gradients
  • consider_constant (list of variables) – Expressions not to backpropagate through
  • disconnected_inputs ({'ignore', 'warn', 'raise'}) – Defines the behaviour if some of the variables in wrt are not part of the computational graph computing cost (or if all links are non-differentiable). The possible values are:

    • ‘ignore’: considers that the gradient on these parameters is zero.
    • ‘warn’: consider the gradient zero, and print a warning.
    • ‘raise’: raise DisconnectedInputError.
  • add_names (bool) – If True, variables generated by grad will be named (d<cost.name>/d<wrt.name>) provided that both cost and wrt have names
  • known_grads (OrderedDict, optional) – An ordered dictionary mapping variables to their gradients. This is useful in the case where you know the gradient on some variables but do not know the original cost.
  • return_disconnected ({'zero', 'None', 'Disconnected'}) –
    • ‘zero’ : If wrt[i] is disconnected, return value i will be wrt[i].zeros_like()
    • ‘None’ : If wrt[i] is disconnected, return value i will be None
    • ‘Disconnected’ : returns variables of type DisconnectedType
  • null_gradients ({'raise', 'return'}) – Defines the behaviour if some of the variables in wrt have a null gradient. The possible values are:

    • ‘raise’ : raise a NullTypeGradError exception
    • ‘return’ : return the null gradients

Returns: Symbolic expression of gradient of cost with respect to each of the wrt terms. If an element of wrt is not differentiable with respect to the output, then a zero variable is returned.

Return type: variable or list/tuple of variables (matches wrt)

See the gradient page for complete documentation of the gradient module.