Broadcasting

Broadcasting is a mechanism which allows tensors with different numbers of dimensions to be added or multiplied together by (virtually) replicating the smaller tensor along the dimensions that it is lacking.

Broadcasting is the mechanism by which a scalar may be added to a matrix, a vector to a matrix or a scalar to a vector.
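As a rough illustration, the three cases can be sketched with plain NumPy arrays. This is not Theano code, and the variable names are chosen here purely for illustration:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)   # shape (2, 3)
vector = np.array([10, 20, 30])       # shape (3,)
scalar = 100                          # no dimensions at all

# The scalar is (virtually) replicated over every entry of the matrix.
scalar_plus_matrix = scalar + matrix

# The vector is (virtually) replicated along the first dimension of the matrix.
vector_plus_matrix = vector + matrix

# The scalar is (virtually) replicated over every entry of the vector.
scalar_plus_vector = scalar + vector

print(scalar_plus_matrix.shape, vector_plus_matrix.shape, scalar_plus_vector.shape)
```

In each case the result takes the shape of the larger operand, and no copy of the smaller operand is written out by hand.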

../_images/bcast.png

Broadcasting a row matrix. T and F respectively stand for True and False and indicate along which dimensions we allow broadcasting.

If the second argument were a vector, its shape would be (2,) and its broadcastable pattern (False,). They would be automatically expanded to the left to match the dimensions of the matrix (adding 1 to the shape and True to the pattern), resulting in (1, 2) and (True, False). It would then behave just like the example above.
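The left-expansion rule described above can be written down in a few lines of plain Python. The helper name expand_left is hypothetical and is not part of Theano; it only mirrors the bookkeeping in the paragraph:

```python
def expand_left(shape, pattern, ndim):
    """Left-pad `shape` with 1s and `pattern` with True until both have `ndim` entries.

    Hypothetical helper for illustration only; not a Theano function.
    """
    missing = ndim - len(shape)
    if missing < 0:
        raise ValueError("ndim is smaller than the current number of dimensions")
    new_shape = (1,) * missing + tuple(shape)
    new_pattern = (True,) * missing + tuple(pattern)
    return new_shape, new_pattern


# The vector from the text: shape (2,), pattern (False,), expanded to match a matrix.
new_shape, new_pattern = expand_left((2,), (False,), ndim=2)
print(new_shape, new_pattern)  # (1, 2) (True, False)
```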

Unlike NumPy, which broadcasts dynamically, Theano needs to know, for any operation that supports broadcasting, which dimensions will need to be broadcast. When applicable, this information is given in the Type of a Variable.
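To make the NumPy side of this contrast concrete, here is a small sketch showing that NumPy decides whether to broadcast only when it sees the actual shapes at run time. The function name add is illustrative, and the snippet shows NumPy behaviour only, not Theano's:

```python
import numpy as np


def add(a, b):
    # Nothing here declares which dimensions may broadcast;
    # NumPy inspects the shapes of a and b when the call happens.
    return a + b


M = np.arange(9).reshape(3, 3)

full = add(np.ones((3, 3)), M)  # shapes match, nothing is broadcast
row = add(np.ones((1, 3)), M)   # first dimension has length 1, so it is broadcast

print(full.shape, row.shape)
```

The same call handles both inputs because the decision is made per call, whereas the text above says Theano takes this information from the Type of a Variable.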

The following code illustrates how rows and columns are broadcast in order to perform an addition operation with a matrix:

    >>> import numpy as np
    >>> import theano
    >>> import theano.tensor as T
    >>> r = T.row()
    >>> r.broadcastable
    (True, False)
    >>> mtr = T.matrix()
    >>> mtr.broadcastable
    (False, False)
    >>> f_row = theano.function([r, mtr], [r + mtr])
    >>> R = np.arange(3).reshape(1, 3)
    >>> R
    array([[0, 1, 2]])
    >>> M = np.arange(9).reshape(3, 3)
    >>> M
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> f_row(R, M)
    [array([[ 0.,  2.,  4.],
           [ 3.,  5.,  7.],
           [ 6.,  8., 10.]])]
    >>> c = T.col()
    >>> c.broadcastable
    (False, True)
    >>> f_col = theano.function([c, mtr], [c + mtr])
    >>> C = np.arange(3).reshape(3, 1)
    >>> C
    array([[0],
           [1],
           [2]])
    >>> M = np.arange(9).reshape(3, 3)
    >>> f_col(C, M)
    [array([[ 0.,  1.,  2.],
           [ 4.,  5.,  6.],
           [ 8.,  9., 10.]])]

In these examples, we can see that both the row vector and the column vector are broadcast in order to be added to the matrix.
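The "virtual replication" behind these results can be made explicit with NumPy alone. The sketch below is a cross-check, not Theano code; it uses np.broadcast_to to expand the row and the column to the matrix shape before adding:

```python
import numpy as np

R = np.arange(3).reshape(1, 3)   # row, shape (1, 3)
C = np.arange(3).reshape(3, 1)   # column, shape (3, 1)
M = np.arange(9).reshape(3, 3)   # matrix, shape (3, 3)

# broadcast_to returns a read-only view; the data is not physically copied.
R_full = np.broadcast_to(R, M.shape)   # the row repeated down the first dimension
C_full = np.broadcast_to(C, M.shape)   # the column repeated across the second dimension

row_sum = R_full + M
col_sum = C_full + M

print(row_sum)
print(col_sum)
```

The values match the outputs of f_row and f_col above, except that these arrays are integers because no floating-point Theano variable is involved.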

See also: