Extra notes on parallelization efficiencies (#1046)

blue-yonder · Oct 24, 2023 · e2dfc6f
1 parent 3ec96dc
commit e2dfc6f
Showing 2 changed files with 26 additions and 0 deletions.
diff --git a/docs/text/faq.rst b/docs/text/faq.rst
@@ -46,3 +46,7 @@ FAQ
  Beyond sorting, tsfresh does not use the timestamp in calculations.
  While many features do not need a timestamp (or only need it for ordering), others will assume that observations are evenly spaced in time (e.g., one second between each observation).
  Since tsfresh ignores spacing, care should be taken when selecting features to use with a highly irregular series.
+
+ 6. **Even when just extracting the :class:`tsfresh.feature_extraction.settings.EfficientFCParameters`, tsfresh is taking a long time to run. Is there anything further I can do to speed up the processing?**
+
+ If you are using parallelization (the default option), check that you are not over-provisioning your available CPU cores. Take a look at :ref:`notes-for-efficient-parallelization-label` for steps to eliminate this, which can speed up processing significantly.
diff --git a/docs/text/tsfresh_on_a_cluster.rst b/docs/text/tsfresh_on_a_cluster.rst
@@ -207,3 +207,25 @@ If you want to use other framework instead of Dask, you will have to write your
 To construct your custom Distributor, you need to define an object that inherits from the abstract base class
 :class:`tsfresh.utilities.distribution.DistributorBaseClass`.
 The :mod:`tsfresh.utilities.distribution` module contains more information about what you need to implement.
+
+Notes for efficient parallelization
+'''''''''''''''''''''''''''''''''''
+
+By default, tsfresh uses parallelization to distribute the single-threaded Python code across the multiple cores available on the host machine.
+
+However, this can create an issue known as over-provisioning. Many of the underlying Python libraries (e.g. numpy) used in the feature calculators have C code implementations for their low-level processing. These libraries *also* try to spread their workload across as many cores as are available, which conflicts with the parallelization done by tsfresh.
+
+Over-provisioning is inefficient because more threads compete for the cores than there are cores available, which adds the overhead of repeated context switching.
+
+This issue can be solved by constraining the C libraries to single threads, using the following environment variables:
+
+.. code:: python
+
+ import os
+ os.environ['OMP_NUM_THREADS'] = "1"
+ os.environ['MKL_NUM_THREADS'] = "1"
+ os.environ['OPENBLAS_NUM_THREADS'] = "1"
+
+Put these lines at the beginning of your notebook or Python script, before you call any tsfresh code or import any other module.
+
+The more cores your host computer has, the larger the gain in processing speed from these environment changes. Speed increases of between 6x and 26x have been observed, depending on the type of host machine.
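
The environment-variable step in the note above could be wrapped in a small helper so that it is easy to call once at the very top of a script. This is only a sketch: the names `limit_native_threads` and `THREAD_ENV_VARS` are hypothetical and not part of tsfresh, and it only covers the three variables the commit mentions.

```python
import os

# The three thread-count variables named in the documentation change.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def limit_native_threads(n_threads=1):
    """Hypothetical helper: set each variable to n_threads, return what was set."""
    for name in THREAD_ENV_VARS:
        # Environment values must be strings.
        os.environ[name] = str(n_threads)
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


# Call this before importing numpy, pandas or tsfresh, because the native
# libraries typically read these variables when they are first loaded.
limit_native_threads()
```

The helper returns the values it set, so a notebook cell can display them as a quick check that the limits are in place before any heavy imports run.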