NumPy refresher

Matrix conventions for machine learning

Rows are horizontal and columns are vertical. Every row is an example. Therefore, inputs[10,5] is a matrix of 10 examples where each example has dimension 5. If this were the input of a neural network, then the weights from the input to the first hidden layer would form a matrix of size (5, #hid).
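
As a minimal sketch of these shapes (the hidden-layer size of 4 and the random values are made up for this illustration), a (10, 5) input matrix multiplied by a (5, 4) weight matrix yields one row of hidden activations per example:

  >>> import numpy
  >>> rng = numpy.random.default_rng(0)
  >>> inputs = rng.standard_normal((10, 5))   # 10 examples, each of dimension 5
  >>> W = rng.standard_normal((5, 4))          # weights from 5 inputs to 4 hidden units
  >>> numpy.dot(inputs, W).shape               # one row of hidden activations per example
  (10, 4)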

Consider this array:

  >>> numpy.asarray([[1., 2], [3, 4], [5, 6]])
  array([[ 1.,  2.],
         [ 3.,  4.],
         [ 5.,  6.]])
  >>> numpy.asarray([[1., 2], [3, 4], [5, 6]]).shape
  (3, 2)

This is a 3x2 matrix, i.e. there are 3 rows and 2 columns.

To access the entry in the 3rd row (row #2) and the 1st column (column #0):

  >>> numpy.asarray([[1., 2], [3, 4], [5, 6]])[2, 0]
  5.0

To remember this, keep in mind that we read left-to-right, top-to-bottom, so each contiguous block of values is a row. That is, there are 3 rows and 2 columns.
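
As a quick check (reusing the same 3x2 array, bound here to a hypothetical variable a), indexing with a single integer picks out a whole row, while slicing the first axis picks out a column:

  >>> a = numpy.asarray([[1., 2], [3, 4], [5, 6]])
  >>> a[2]         # the 3rd row
  array([ 5.,  6.])
  >>> a[:, 0]      # the 1st column
  array([ 1.,  3.,  5.])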

Broadcasting

Numpy does broadcasting of arrays of different shapes during arithmetic operations. What this means in general is that the smaller array (or scalar) is broadcast across the larger array so that they have compatible shapes. The example below shows an instance of broadcasting:

  >>> a = numpy.asarray([1.0, 2.0, 3.0])
  >>> b = 2.0
  >>> a * b
  array([ 2.,  4.,  6.])

The smaller array b (actually a scalar here, which behaves like a 0-d array) is broadcast to the same size as a during the multiplication. This trick is often useful for simplifying how expressions are written. More detail about broadcasting can be found in the numpy user guide.
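
Broadcasting is not limited to scalars. As a small sketch (the values here are made up), adding a length-2 vector to the 3x2 matrix from above adds the vector to every row, because the trailing dimensions match:

  >>> m = numpy.asarray([[1., 2], [3, 4], [5, 6]])
  >>> v = numpy.asarray([10., 100.])
  >>> m + v        # v is broadcast across each row of m
  array([[  11.,  102.],
         [  13.,  104.],
         [  15.,  106.]])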