Early Binding for Speed

As a dynamic language, Python encourages a programming style of consideringclasses and objects in terms of their methods and attributes, more than wherethey fit into the class hierarchy.

This can make Python a very relaxed and comfortable language for rapiddevelopment, but with a price - the ‘red tape’ of managing data types isdumped onto the interpreter. At run time, the interpreter does a lot of worksearching namespaces, fetching attributes and parsing argument and keywordtuples. This run-time ‘late binding’ is a major cause of Python’s relativeslowness compared to ‘early binding’ languages such as C++.

However with Cython it is possible to gain significant speed-ups through theuse of ‘early binding’ programming techniques.

For example, consider the following (silly) code example:

  1. cdef class Rectangle:
  2. cdef int x0, y0
  3. cdef int x1, y1
  4.  
  5. def __init__(self, int x0, int y0, int x1, int y1):
  6. self.x0 = x0
  7. self.y0 = y0
  8. self.x1 = x1
  9. self.y1 = y1
  10.  
  11. def area(self):
  12. area = (self.x1 - self.x0) * (self.y1 - self.y0)
  13. if area < 0:
  14. area = -area
  15. return area
  16.  
  17. def rectArea(x0, y0, x1, y1):
  18. rect = Rectangle(x0, y0, x1, y1)
  19. return rect.area()

In the rectArea() method, the call to rect.area() and thearea() method contain a lot of Python overhead.

However, in Cython, it is possible to eliminate a lot of this overhead in caseswhere calls occur within Cython code. For example:

  1. cdef class Rectangle:
  2. cdef int x0, y0
  3. cdef int x1, y1
  4.  
  5. def __init__(self, int x0, int y0, int x1, int y1):
  6. self.x0 = x0
  7. self.y0 = y0
  8. self.x1 = x1
  9. self.y1 = y1
  10.  
  11. cdef int _area(self):
  12. area = (self.x1 - self.x0) * (self.y1 - self.y0)
  13. if area < 0:
  14. area = -area
  15. return area
  16.  
  17. def area(self):
  18. return self._area()
  19.  
  20. def rectArea(x0, y0, x1, y1):
  21. cdef Rectangle rect = Rectangle(x0, y0, x1, y1)
  22. return rect._area()

Here, in the Rectangle extension class, we have defined two different areacalculation methods, the efficient _area() C method, and thePython-callable area() method which serves as a thin wrapper around_area(). Note also in the function rectArea() how we ‘early bind’by declaring the local variable rect which is explicitly given the typeRectangle. By using this declaration, instead of just dynamically assigning torect, we gain the ability to access the much more efficient C-callable_area() method.

But Cython offers us more simplicity again, by allowing us to declaredual-access methods - methods that can be efficiently called at C level, butcan also be accessed from pure Python code at the cost of the Python accessoverheads. Consider this code:

  1. cdef class Rectangle:
  2. cdef int x0, y0
  3. cdef int x1, y1
  4.  
  5. def __init__(self, int x0, int y0, int x1, int y1):
  6. self.x0 = x0
  7. self.y0 = y0
  8. self.x1 = x1
  9. self.y1 = y1
  10.  
  11. cpdef int area(self):
  12. area = (self.x1 - self.x0) * (self.y1 - self.y0)
  13. if area < 0:
  14. area = -area
  15. return area
  16.  
  17. def rectArea(x0, y0, x1, y1):
  18. cdef Rectangle rect = Rectangle(x0, y0, x1, y1)
  19. return rect.area()

Here, we just have a single area method, declared as cpdef to make itefficiently callable as a C function, but still accessible from pure Python(or late-binding Cython) code.

If within Cython code, we have a variable already ‘early-bound’ (ie, declaredexplicitly as type Rectangle, (or cast to type Rectangle), then invoking itsarea method will use the efficient C code path and skip the Python overhead.But if in Cython or regular Python code we have a regular object variablestoring a Rectangle object, then invoking the area method will require:

  • an attribute lookup for the area method
  • packing a tuple for arguments and a dict for keywords (both empty in this case)
  • using the Python API to call the method

and within the area method itself:

  • parsing the tuple and keywords
  • executing the calculation code
  • converting the result to a python object and returning it

So within Cython, it is possible to achieve massive optimisations byusing strong typing in declaration and casting of variables. For tight loopswhich use method calls, and where these methods are pure C, the difference canbe huge.