This repository has been archived by the owner on May 8, 2021. It is now read-only.

MultiProcessing #43

Open
itzAmirali opened this issue Jul 26, 2020 · 3 comments

Comments

@itzAmirali

Hello,

I want to use multiple CPUs to accelerate the function.
When I use your class in multiprocessing, the kernel just freezes, and it does not do anything.

Can you help with this?

Thanks,

@YoniSchirris

YoniSchirris commented Sep 3, 2020

I'm having the same issue. @Amiiirali did you already find a way to fix this?

I'm using PyTorch for data loading and transformations. I've added the Macenko normalization to my transformation pipeline. When I set num_workers=0 in my dataloader, it works fine. However, as soon as I increase the number of workers, it freezes when calling the transform() function.
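
For context, here is a minimal sketch of the kind of setup I mean (the wrapper class, the dataset, and the random placeholder images are mine, not actual project code; in practice the target and patches are real H&E tiles):

    import numpy as np
    import staintools
    import torch
    from torch.utils.data import Dataset, DataLoader


    class MacenkoNormalize:
        """Torchvision-style transform wrapping staintools' Macenko normalizer."""
        def __init__(self, target_image):
            self.normalizer = staintools.StainNormalizer(method='macenko')
            self.normalizer.fit(target_image)

        def __call__(self, img):
            # img: H x W x 3 uint8 RGB patch
            return self.normalizer.transform(img)


    class PatchDataset(Dataset):
        def __init__(self, patches, transform=None):
            self.patches = patches
            self.transform = transform

        def __len__(self):
            return len(self.patches)

        def __getitem__(self, idx):
            img = self.patches[idx]
            if self.transform is not None:
                img = self.transform(img)
            return torch.from_numpy(img)


    # Placeholders only; use real H&E tiles here.
    target = (np.random.rand(256, 256, 3) * 255).astype(np.uint8)
    patches = [(np.random.rand(256, 256, 3) * 255).astype(np.uint8) for _ in range(8)]

    loader = DataLoader(PatchDataset(patches, transform=MacenkoNormalize(target)),
                        batch_size=2, num_workers=4)  # num_workers=0 works, >0 freezes

    for batch in loader:  # with num_workers > 0, the workers hang inside transform()
        pass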

====== EDIT =======

I've dug deeper, and it's breaking inside spams.py, in the lasso() function. By default, lasso() sets numThreads=-1, which in the C++ backend means it uses all available CPUs/threads, as mentioned in the docstring:

numThreads: (optional, number of threads for exploiting
          multi-core / multi-cpus. By default, it takes the value -1,
          which automatically selects all the available CPUs/cores).

This is probably a problem when the call happens inside a DataLoader worker: multiple processes have already been spawned, and each SPAMS call then tries to grab every core.

When I change this to numThreads=1, it works as expected.

@Peter554 I see that you have a stale branch where you removed the SPAMS dependency and used the sklearn lasso function. Is that stable, or unfinished functionality? Either way, would it be worth creating a PR that adds arguments to the relevant functions so that numThreads=1 ends up being passed to the SPAMS lasso regression?
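
For discussion, a rough sketch of what an sklearn-based replacement for the SPAMS call could look like (this is not the code on that branch; the function name, the expected shapes, and the alpha rescaling are my own assumptions, and since sklearn's Lasso divides the squared error by the number of samples while SPAMS mode=2 does not, the scaling below would still need checking):

    from sklearn.linear_model import Lasso


    def get_concentrations_sklearn(OD, stain_matrix, regularizer=0.01):
        """Positive Lasso on optical densities.

        OD: (n_pixels, 3) optical-density pixels; stain_matrix: (n_stains, 3).
        Returns concentrations of shape (n_pixels, n_stains).
        """
        n_channels = OD.shape[1]  # the 3 colour channels act as the "samples" here
        lasso = Lasso(alpha=regularizer / n_channels,  # rough match for SPAMS' lambda1
                      positive=True,
                      fit_intercept=False)
        # Predictors: one column per stain vector; targets: one column per pixel.
        lasso.fit(stain_matrix.T, OD.T)
        return lasso.coef_  # (n_pixels, n_stains)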

@itzAmirali
Author

@YoniSchirris Not really. I spent two days on it but couldn't find a solution. The problem is the spams library: as soon as you use SPAMS inside a multiprocessing process, it freezes. I don't know the details, but it probably comes down to that library's C++ code.

About numThreads=-1, what do you mean? If we set that, will it use all the processes?

Please keep me updated if you find a solution.

@YoniSchirris

@Amiiirali By default it sets numThreads=-1, which means it will try to use all the available CPUs. This clashes with PyTorch, which is also managing multiple worker processes. If you change it to numThreads=1, it works.

So go to e.g. ~/miniconda3/envs/<envname>/lib/python3.7/site-packages/staintools/miscellaneous/get_concentrations.py and change spams.lasso(X=OD.T, D=stain_matrix.T, mode=2, lambda1=regularizer, pos=True).toarray().T to spams.lasso(X=OD.T, D=stain_matrix.T, mode=2, lambda1=regularizer, pos=True, numThreads=1).toarray().T.

It's probably safest to make a new file in your own repo that overrides this one rather than editing the installed package. Let me know if this solves it for you; it works for me.
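
For anyone going the override-in-your-own-repo route, a patched copy could look roughly like this (the convert_RGB_to_OD import path and the function signature are what I believe the current staintools layout to be and may differ between versions):

    import spams
    from staintools.miscellaneous.optical_density_conversion import convert_RGB_to_OD


    def get_concentrations(I, stain_matrix, regularizer=0.01):
        """Copy of staintools' get_concentrations, with SPAMS pinned to one thread."""
        OD = convert_RGB_to_OD(I).reshape((-1, 3))
        # numThreads=1 keeps SPAMS from grabbing every core inside an
        # already-spawned DataLoader worker.
        return spams.lasso(X=OD.T, D=stain_matrix.T, mode=2, lambda1=regularizer,
                           pos=True, numThreads=1).toarray().T

You would then make sure your code imports this copy (or assigns it over the name staintools bound at import time) instead of the packaged get_concentrations.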
