Types in Jug

Any type that can be pickled can be used with jug tasks. However, it is sometimes wiser to break up your tasks in ways that minimise the amount of data passed between them. For example, consider the following image processing code:

  from glob import glob
  from jug import Task
  from mahotas import imread

  def process(img):
      # complex image processing
      ...

  files = glob('inputs/*.png')
  # One task per file to load the image, then one task per image to process it
  imgs = [Task(imread, f) for f in files]
  props = [Task(process, img) for img in imgs]

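This will work just fine, but it saves more intermediate data than necessary: the result of every imread task (a full image) is written to the jug store only so that process can read it back. Consider rewriting this to: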

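  from glob import glob
  from jug import Task

  def process(f):
      # Imported here so that only running the task pays for the import
      from mahotas import imread
      img = imread(f)
      # complex image processing

  files = glob('inputs/*.png')
  # Only the file name is passed to each task; the image is loaded inside it
  props = [Task(process, f) for f in files]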

I have also moved the import of mahotas.imread into the process function. This is another micro-optimisation: it makes jug status and friends just that little bit faster, as they load the jugfile to find the tasks but do not need to perform this import to do their jobs.
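To get a rough feel for the difference between the two versions, you can compare what pickle produces for a file name and for image-sized data. This is only a sketch using the standard library: the helper names (pickled_size, is_picklable), the file name, and the fake 512x512 image are made up for illustration, and the exact on-disk size in a real jug store may differ from a plain pickle.

```python
import pickle


def pickled_size(value):
    """Return the number of bytes pickle needs to serialise value."""
    return len(pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL))


def is_picklable(value):
    """Return True if value can be pickled, False otherwise."""
    try:
        pickle.dumps(value)
    except Exception:
        return False
    return True


# Hypothetical stand-ins: a file name, and raw bytes the size of a
# 512x512 8-bit greyscale image (not a real image).
path = 'inputs/cell_0001.png'
fake_image = bytes(512 * 512)

path_size = pickled_size(path)
image_size = pickled_size(fake_image)

print('file name:', path_size, 'bytes')
print('image data:', image_size, 'bytes')
print('lambda picklable?', is_picklable(lambda x: x))
```

Passing file names between tasks keeps each stored argument to a few dozen bytes, while passing loaded images stores at least one byte per pixel. The is_picklable helper is a quick way to check a value before using it as a task argument or result; a lambda, for instance, does not pass.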