Thursday, October 24, 2019

Multiprocessing to change class variables in Python - experiments

I am trying to build a library that runs certain numerical computations in parallel for data analytics tasks.

Python's standard multiprocessing module
Summary: Works well; computation time comes down drastically compared to serial computation. Uses pickle to serialise the parameters being passed to the function.
Drawback: class variables remain in their initial state, or in whatever state they were in before multiprocessing began. Each worker process modifies its own copy of the class, so changes made inside the workers never reach the parent process.
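
To make the drawback concrete, here is a minimal sketch using only the standard library. The names Counter, work and run_demo are made up for this illustration and are not part of the library I am building. The workers each update their own copy of the class variable, while the copy in the parent process stays where it was.

```python
import multiprocessing as mp


class Counter:
    # Class variable: shared by all instances, but only within ONE process.
    total = 0

    @classmethod
    def add(cls, n):
        cls.total += n
        return cls.total


def work(n):
    # Runs in a worker process and mutates that worker's copy of Counter.
    return Counter.add(n)


def run_demo():
    Counter.total = 0
    with mp.Pool(processes=2) as pool:
        # Each value returned is the total as seen inside some worker.
        worker_totals = pool.map(work, [1, 2, 3, 4])
    # The parent's class variable was never touched by the workers.
    return Counter.total, worker_totals


if __name__ == "__main__":
    parent_total, worker_totals = run_demo()
    print("parent total:", parent_total)
    print("totals seen inside workers:", worker_totals)
```

The totals seen inside the workers depend on how the tasks happen to be split between processes, so they can differ from run to run. The parent total does not change.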

Pathos' multiprocessing module
Summary: Faster than the standard multiprocessing module in my experiments; uses dill instead of pickle to serialise objects.
Drawback: same as with the standard module. Class variables remain in their initial state, or in whatever state they were in before multiprocessing began.
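
The pickle versus dill difference matters mostly for what can be sent to a worker. The sketch below uses only the standard library to show one object pickle refuses to serialise, a lambda. The helper can_pickle and the name square are hypothetical, made up for this illustration. dill is designed to handle objects like this, but it is not imported here so that the sketch runs without extra installs.

```python
import pickle

# A lambda has no importable name, so pickle cannot store a reference to it.
square = lambda x: x * x


def can_pickle(obj):
    """Return True if the standard pickle module can serialise obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print("list picklable:", can_pickle([1, 2, 3]))
    print("lambda picklable:", can_pickle(square))
```

Neither serialiser changes the drawback above: whatever gets sent to a worker is still a copy.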

Next, I am about to try the Ray library (GitHub).