Parallelisation in Python with Joblib

In all computationally intensive tasks, sooner or later, the topic of parallelisation comes into focus. Python offers a variety of ways to achieve this – all with strengths but also weaknesses. In the following, I want to present some approaches with a focus on one framework: joblib. It allows the parallelisation and reusability of intermediate results without much overhead. It’s particularly suitable for the parallelisation of existing code.

TL;DR

Installation of joblib:

conda install joblib

With joblib you can add parallelisation and lazy evaluation to your existing code with ease:

from joblib import Parallel, delayed
from math import sqrt

Parallel(n_jobs=2)(delayed(sqrt)(i**2) for i in range(10))

# Output: [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

Parallelisation Frameworks for Python

There are several parallelisation frameworks for Python, which differ more or less in their purpose:

  • Dask parallelises tasks across processes and nodes and is thus suitable for use on a single computer, a data centre or cloud cluster. To achieve this functionality, Dask is based on special data structures (similar to Apache Spark).
  • Multiprocessing parallelises tasks on a single computer. Especially for small tasks, it is simpler and has less overhead than Dask, but is also less flexible and scalable.
  • Numba is a just-in-time (JIT) compiler for Python that can significantly speed up code execution. Especially code based on Numpy in combination with high-level language features can be optimised significantly.
  • joblib has a certain similarity to Dask, is perhaps not quite as powerful, but easier to use. A special feature is the reusability of cached results, which is especially useful for recursions.
  • PyOpenCL shifts the calculation of arrays to a GPU to parallelise calculations. This makes the library very different from the others and can be used in conjunction with Numba or Dask.

Which library to choose depends very much on your use case. Each library has its advantages and trade-offs, so there is no all-purpose solution.
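For comparison with the joblib snippet above, the same square-root task can be expressed with the standard-library multiprocessing module. A minimal sketch; the pool size of 2 is arbitrary:

```python
from multiprocessing import Pool
from math import sqrt

if __name__ == '__main__':
    # Distribute the calls across a pool of worker processes
    with Pool(processes=2) as pool:
        results = pool.map(sqrt, [i**2 for i in range(10)])
    print(results)  # [0.0, 1.0, 2.0, ..., 9.0]
```

Note that multiprocessing requires the `if __name__ == '__main__'` guard on some platforms, and the mapped function must be picklable; joblib's `delayed` syntax avoids some of this ceremony.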

⚠ Except for PyOpenCL, it is recommended to use only one of these frameworks – mixing them can lead to undesirable side effects.

Joblib

Joblib is a lightweight framework that allows you to add lazy evaluation, caching and parallelisation to computational pipelines. The main features are:

Caching

Transparent and efficient caching of result values for Python functions – for Python objects of any type and size. Efficient caching means that repeated computationally intensive calculations are only performed once. Joblib takes the caching logic off your hands and you can concentrate fully on the domain logic:

from joblib import Memory
import numpy as np

cachedir = 'your_cache_dir_goes_here'
mem = Memory(cachedir)

a = np.vander(np.arange(3)).astype(float)
square = mem.cache(np.square)

b = square(a)                                   
# The first call triggers the initial evaluation
c = square(a)
# The second call doesn't trigger an evaluation
# it just reuses the result from the first call
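Memory.cache can also be applied as a decorator, which keeps the caching concern out of the call sites entirely. A small sketch; the function name expensive_square and the temporary cache directory are illustrative:

```python
import tempfile
import numpy as np
from joblib import Memory

# A throwaway cache directory for this example
mem = Memory(tempfile.mkdtemp(), verbose=0)

@mem.cache
def expensive_square(x):
    # Stand-in for an expensive computation
    return np.square(x)

a = np.vander(np.arange(3)).astype(float)
b = expensive_square(a)  # computed and written to the cache
c = expensive_square(a)  # served from the on-disk cache
```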

Parallelisation

With the Parallel helper and the delayed function wrapper, readable parallelised function calls can be implemented. In particular, for-loops can be parallelised easily in this way:

from joblib import Parallel, delayed
from math import sqrt

Parallel(n_jobs=2)(delayed(sqrt)(i**2) for i in range(10))

# Output: [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
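The n_jobs argument controls the degree of parallelism: n_jobs=1 runs the calls sequentially, while n_jobs=-1 uses all available cores. A sketch with a slightly more realistic loop, where slow_power is an illustrative stand-in for real CPU-bound work:

```python
from joblib import Parallel, delayed

def slow_power(x, p):
    # Stand-in for a CPU-bound computation
    return x ** p

# n_jobs=-1 spreads the calls over all available CPU cores;
# the results come back in the original order
results = Parallel(n_jobs=-1)(delayed(slow_power)(i, 2) for i in range(10))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```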

Efficient persistence

A faster alternative to pickle for persisting Python objects efficiently (joblib.dump and joblib.load), optimised in particular for objects containing large NumPy arrays.
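A minimal sketch of dump and load; the file name and the dictionary contents are arbitrary:

```python
import os
import tempfile
import numpy as np
from joblib import dump, load

data = {'weights': np.arange(10.0), 'name': 'model'}

path = os.path.join(tempfile.mkdtemp(), 'data.joblib')
dump(data, path)       # serialise the object to disk
restored = load(path)  # read it back into memory
```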

Conclusion

There are many ways to parallelise your code in Python – all with certain strengths and trade-offs. Joblib is one of these ways: it allows you to parallelise code on a single computer and store results for reuse without much overhead. It's quite intuitive to use and lets you parallelise existing code with ease.
