Baby Steps - Algebra

Adding two Scalars

To get us started with Theano and get a feel for what we’re working with, let’s make a simple function: add two numbers together. Here is how you do it:

    >>> import numpy
    >>> import theano.tensor as T
    >>> from theano import function
    >>> x = T.dscalar('x')
    >>> y = T.dscalar('y')
    >>> z = x + y
    >>> f = function([x, y], z)

And now that we’ve created our function we can use it:

    >>> f(2, 3)
    array(5.0)
    >>> numpy.allclose(f(16.3, 12.1), 28.4)
    True

Let’s break this down into several steps. The first step is to define two symbols (Variables) representing the quantities that you want to add. Note that from now on, we will use the term Variable to mean “symbol” (in other words, x, y, z are all Variable objects). The output of the function f is a numpy.ndarray with zero dimensions.
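A zero-dimensional ndarray behaves much like a Python number but is still an array object. The sketch below uses NumPy only (no Theano) and builds the array by hand as a stand-in for the value that f(2, 3) returns; the names result, as_float and as_item are chosen for illustration.

```python
import numpy

# Stand-in for the value returned by f(2, 3); built by hand here.
result = numpy.array(5.0)

print(result.ndim)    # number of dimensions
print(result.shape)   # shape tuple of a zero-dimensional array
print(result.dtype)   # element type of the array

# Convert to a plain Python float when an ordinary scalar is needed.
as_float = float(result)
as_item = result.item()
```

Either float() or .item() is a reasonable way to hand the value to code that expects an ordinary Python number.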

If you are following along and typing into an interpreter, you may have noticed a slight delay when the function call was executed. Behind the scenes, f was being compiled into C code.

Step 1

    >>> x = T.dscalar('x')
    >>> y = T.dscalar('y')

In Theano, all symbols must be typed. In particular, T.dscalar is the type we assign to “0-dimensional arrays (scalar) of doubles (d)”. It is a Theano Type.

dscalar is not a class. Therefore, neither x nor y is actually an instance of dscalar. They are instances of TensorVariable. x and y are, however, assigned the Theano Type dscalar in their type field, as you can see here:

    >>> type(x)
    <class 'theano.tensor.var.TensorVariable'>
    >>> x.type
    TensorType(float64, scalar)
    >>> T.dscalar
    TensorType(float64, scalar)
    >>> x.type is T.dscalar
    True

By calling T.dscalar with a string argument, you create a Variable representing a floating-point scalar quantity with the given name. If you provide no argument, the symbol will be unnamed. Names are not required, but they can help with debugging.

More will be said in a moment regarding Theano’s inner structure. You could also learn more by looking into Graph Structures.

Step 2

The second step is to combine x and y into their sum z:

    >>> z = x + y

z is yet another Variable, which represents the addition of x and y. You can use the pp function to pretty-print the computation associated with z.

    >>> from theano import pp
    >>> print(pp(z))
    (x + y)
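To make the idea of a symbolic expression more concrete, here is a toy model in plain Python. It is not how Theano is implemented: the classes Sym and Add and the evaluate method are hypothetical names invented for this sketch. It only shows that adding two symbols can build a small graph that is printed or evaluated later.

```python
class Sym:
    """Toy symbolic variable (hypothetical; not Theano's Variable class)."""

    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        # Adding builds a new graph node instead of computing a number.
        return Add(self, other)

    def __str__(self):
        return self.name

    def evaluate(self, values):
        # Look up this symbol's numeric value by name.
        return values[self.name]


class Add(Sym):
    """Toy node representing the sum of two symbolic expressions."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __str__(self):
        return "({} + {})".format(self.left, self.right)

    def evaluate(self, values):
        # Evaluate both operands, then add the resulting numbers.
        return self.left.evaluate(values) + self.right.evaluate(values)


x = Sym("x")
y = Sym("y")
z = x + y                                   # no arithmetic happens here

printed = str(z)                            # textual form of the expression
value = z.evaluate({"x": 2.0, "y": 3.0})    # numbers are supplied only now
print(printed)
print(value)
```

The real library does far more (typing, optimization, compilation to C), but the separation between building an expression and computing with it is the same idea.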

Step 3

The last step is to create a function taking x and y as inputs and giving z as output:

    >>> f = function([x, y], z)

The first argument to function is a list of Variables that will be provided as inputs to the function. The second argument is a single Variable or a list of Variables. In either case, the second argument is what we want to see as output when we apply the function. f may then be used like a normal Python function.

Note

As a shortcut, you can skip step 3 and just use a variable’s eval method. The eval() method is not as flexible as function(), but it can do everything we’ve covered in the tutorial so far. It has the added benefit of not requiring you to import function(). Here is how eval() works:

    >>> import numpy
    >>> import theano.tensor as T
    >>> x = T.dscalar('x')
    >>> y = T.dscalar('y')
    >>> z = x + y
    >>> numpy.allclose(z.eval({x : 16.3, y : 12.1}), 28.4)
    True

We passed eval() a dictionary mapping symbolic Theano variables to the values to substitute for them, and it returned the numerical value of the expression.

eval() will be slow the first time you call it on a variable, because it needs to call function() to compile the expression behind the scenes. Subsequent calls to eval() on that same variable will be fast, because the variable caches the compiled function.

Adding two Matrices

You might already have guessed how to do this. Indeed, the only change from the previous example is that you need to instantiate x and y using the matrix Types:

    >>> x = T.dmatrix('x')
    >>> y = T.dmatrix('y')
    >>> z = x + y
    >>> f = function([x, y], z)

dmatrix is the Type for matrices of doubles. Then we can use our new function on 2D arrays:

    >>> f([[1, 2], [3, 4]], [[10, 20], [30, 40]])
    array([[ 11.,  22.],
           [ 33.,  44.]])

The returned value is a NumPy array. We can also use NumPy arrays directly as inputs:

    >>> import numpy
    >>> f(numpy.array([[1, 2], [3, 4]]), numpy.array([[10, 20], [30, 40]]))
    array([[ 11.,  22.],
           [ 33.,  44.]])

It is possible to add scalars to matrices, vectors to matrices, scalars to vectors, etc. The behavior of these operations is defined by broadcasting.
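For the simple combinations listed above, the results follow the same broadcasting rules that NumPy uses, so they can be previewed with plain NumPy arrays. The sketch below uses NumPy only, not Theano, and the array names are illustrative.

```python
import numpy

matrix = numpy.array([[1.0, 2.0], [3.0, 4.0]])
vector = numpy.array([10.0, 20.0])
scalar = 100.0

# Scalar + matrix: the scalar is added to every element.
scalar_plus_matrix = scalar + matrix

# Vector + matrix: the vector is added to every row of the matrix.
vector_plus_matrix = vector + matrix

# Scalar + vector: the scalar is added to every element.
scalar_plus_vector = scalar + vector

print(scalar_plus_matrix)
print(vector_plus_matrix)
print(scalar_plus_vector)
```

In each case the smaller operand is conceptually stretched to the shape of the larger one before the element-wise addition takes place.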

The following types are available:

  • byte: bscalar, bvector, bmatrix, brow, bcol, btensor3, btensor4
  • 16-bit integers: wscalar, wvector, wmatrix, wrow, wcol, wtensor3, wtensor4
  • 32-bit integers: iscalar, ivector, imatrix, irow, icol, itensor3, itensor4
  • 64-bit integers: lscalar, lvector, lmatrix, lrow, lcol, ltensor3, ltensor4
  • float: fscalar, fvector, fmatrix, frow, fcol, ftensor3, ftensor4
  • double: dscalar, dvector, dmatrix, drow, dcol, dtensor3, dtensor4
  • complex: cscalar, cvector, cmatrix, crow, ccol, ctensor3, ctensor4

The previous list is not exhaustive; a guide to all types compatible with NumPy arrays may be found under tensor creation.

Note

You, the user, not the system architecture, have to choose whether your program will use 32- or 64-bit integers (i prefix vs. the l prefix) and floats (f prefix vs. the d prefix).
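The practical difference between the prefixes is the storage size and precision of each element. The sketch below uses NumPy dtypes as stand-ins for the element types behind each prefix (int8, int16, int32, int64, float32, float64 and complex64). The dictionary prefix_to_dtype is a hypothetical helper written for this illustration, not part of Theano.

```python
import numpy

# Hypothetical lookup table: type prefix -> corresponding NumPy dtype.
prefix_to_dtype = {
    "b": numpy.int8,
    "w": numpy.int16,
    "i": numpy.int32,
    "l": numpy.int64,
    "f": numpy.float32,
    "d": numpy.float64,
    "c": numpy.complex64,
}

for prefix, dtype in prefix_to_dtype.items():
    # itemsize is the number of bytes used per element.
    print(prefix, numpy.dtype(dtype).name, numpy.dtype(dtype).itemsize)

# float32 keeps roughly 7 significant decimal digits, float64 roughly 16.
single = numpy.float32(1.0) + numpy.float32(1e-8)
double = numpy.float64(1.0) + numpy.float64(1e-8)
print(single == 1.0)   # the small term is lost in single precision
print(double == 1.0)   # the small term survives in double precision
```

Choosing the f prefix halves memory use relative to d, at the cost of precision; which trade-off is right depends on the program.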

Exercise

    import theano
    a = theano.tensor.vector()      # declare variable
    out = a + a ** 10               # build symbolic expression
    f = theano.function([a], out)   # compile function
    print(f([0, 1, 2]))

    [    0.     2.  1026.]

Modify and execute this code to compute this expression: a ** 2 + b ** 2 + 2 * a * b.
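To check your result, the expected values can be computed with plain NumPy. This sketch does not use Theano and is not the exercise solution; the sample inputs are arbitrary, and the comparison relies on the identity a ** 2 + b ** 2 + 2 * a * b == (a + b) ** 2.

```python
import numpy

a = numpy.array([0.0, 1.0, 2.0])
b = numpy.array([3.0, 4.0, 5.0])

# Reference values for the expression in the exercise.
expected = a ** 2 + b ** 2 + 2 * a * b
print(expected)

# The same values follow from the identity (a + b) ** 2.
matches_identity = numpy.allclose(expected, (a + b) ** 2)
print(matches_identity)
```

Feeding the same a and b to your compiled Theano function and comparing with numpy.allclose is one way to confirm the answer.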

Solution