`dipy==1.9.0` is causing our macOS GitHub Actions runners to stall on unrelated `torch` tests #3125
Comments
Hi @joshuacwnewton, I can confirm this. The difference between 1.8.0 and 1.9.0 on macOS is OpenMP. I'm not sure what the issue is, so maybe you are encountering a concurrency problem or an OpenMP conflict with PyTorch. I will experiment next week.
Thank you for the confirmation! I'll try some experimentation on our end, too. :)
dipy==1.9.0 is built with OpenMP enabled (dipy/dipy#3125). This is very similar to the issues we had with onnxruntime 1.5 and 1.6, and installing `libomp` in our CI fixed this issue back then, too. If this works, then I will report this upstream to the dipy devs, then we will pin `dipy!=1.9.0`, since we probably don't want to make OpenMP a requirement for our users.
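One way to explore the "OpenMP conflict" hunch would be to check whether more than one OpenMP runtime file ends up in the same process. The sketch below is only an illustration, not something either project ships: `find_openmp_runtimes` and `has_duplicate_openmp` are hypothetical helper names, the runtime names it looks for (`libomp`, `libiomp5`, `libgomp`) are the common LLVM, Intel, and GNU ones, and the list of library paths has to be supplied by the caller.

```python
import os
import re
from collections import defaultdict

# Base names of common OpenMP runtimes: LLVM (libomp), Intel (libiomp5), GNU (libgomp).
# The trailing \b keeps unrelated names such as "libomptarget" from matching.
_OPENMP_PATTERN = re.compile(r"^(libomp|libiomp5|libgomp)\b", re.IGNORECASE)


def find_openmp_runtimes(library_paths):
    """Group shared-library paths that look like OpenMP runtimes by family."""
    found = defaultdict(set)
    for path in library_paths:
        match = _OPENMP_PATTERN.match(os.path.basename(path))
        if match:
            found[match.group(1).lower()].add(path)
    return {family: sorted(paths) for family, paths in found.items()}


def has_duplicate_openmp(library_paths):
    """Return True if more than one distinct OpenMP runtime file is listed."""
    runtimes = find_openmp_runtimes(library_paths)
    return sum(len(paths) for paths in runtimes.values()) > 1
```

If `threadpoolctl` happens to be installed, the `filepath` entries from `threadpoolctl.threadpool_info()` (called after importing both `dipy` and `torch`) could serve as the input list. Two copies showing up would be a hint worth following, not proof of the cause.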
The OpenMP mention reminded me of a similar issue we had with an upstream change to
tl;dr: Going off of what we did previously to fix this (install openmp in CI), I tried this on our test PR to see if it would prevent the runner from timing out. But,
I will keep investigating, but my hunch is that our package will just have to specific
Description
This is a bit of a weird one. In a routine maintenance PR where we're upgrading the versions of Python packages installed into our virtual environment, we noticed that some tests were timing out for macOS runners. After a lot of debugging, we narrowed it down to `dipy==1.9.0`. (Holding all other packages at their same versions, `dipy==1.8.0` passes but `dipy==1.9.0` times out.)

The tests that stall appear to be ones related to `torch`-based deep learning pipelines. For example, we have a deep learning-based registration model called `algo='dl'` that uses the packages `voxelmorph`/`neurite`/`pystrum`.

pipdeptree output for voxelmorph

However, I've found that our usage of `dipy` is entirely independent of this test. In fact, if I uninstall `dipy` locally and comment out our `dipy` imports, the test still runs without error on my local Ubuntu machine. So, I'm at a bit of a loss as to why the `dipy` upgrade would cause this test (as well as other non-voxelmorph tests) to time out, and only for our `macos` GHA runners.

I'd be happy to do more debugging and narrow things down to a more reproducible example. But, before I do that, I wanted to report this issue upstream, just in case my general description of the issue rings a bell for a relevant change that dipy may have made between 1.8.0 and 1.9.0.

(Skimming https://github.com/dipy/dipy/releases/tag/1.9.0, the only macOS-related change I see is #3061, but I'm not entirely sure if that's related.)
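When narrowing this down to a reproducible example, it may help to tell a stalled run apart from an ordinary failure without waiting for the CI runner's own timeout. A minimal sketch, assuming each candidate test can be launched as a separate command; `run_with_timeout` is a hypothetical helper, not part of either project:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome as 'passed', 'failed', or 'stalled'."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return "stalled"
    return "passed" if result.returncode == 0 else "failed"
```

For example, `run_with_timeout([sys.executable, "-m", "pytest", "path/to/test_file.py"], 600)` (placeholder path) could be run once with `dipy==1.8.0` and once with `dipy==1.9.0` installed to compare outcomes.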
Way to reproduce
[If reporting a bug, please include the following important information:]
- `python -c "import platform; print(platform.platform())"`
- `python -c "import sys; print('Python', sys.version)"`
- `python -c "import dipy; print(dipy.__version__)"`

Note
To be added by @joshuacwnewton.
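The three one-liners requested by the template can also be gathered in a single script. This is a sketch rather than an official dipy tool: `collect_environment_info` is a hypothetical name, and it reads the installed version from package metadata instead of `dipy.__version__`, so the report can be produced without importing `dipy` itself.

```python
import platform
import sys
from importlib import metadata


def collect_environment_info(package="dipy"):
    """Return platform, Python version, and the installed version of package."""
    try:
        pkg_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        pkg_version = "not installed"
    return {
        "platform": platform.platform(),
        "python": sys.version,
        package: pkg_version,
    }


if __name__ == "__main__":
    for key, value in collect_environment_info().items():
        print(f"{key}: {value}")
```

The metadata version should normally match `dipy.__version__`, but that is an assumption worth double-checking if the two ever disagree.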