testPackUnpackExternal alignment error on sparc64 #147
Comments
I don't think I made any change between 3.1.1 and 3.1.2 that would explain such a failure. Did you notice that the old logs correspond to Open MPI 4.1.1, but the new failing logs correspond to Open MPI 4.1.2? It is not the first time that an Open MPI patch release breaks the mpi4py testsuite. @jeffhammond Can you curse on my behalf?
Fair point. We don't run CI tests on sparc64, so we didn't catch the openmpi regression outside of the new builds.
@drew-parsons Did you confirm whether this was an Open MPI regression from 4.1.1 to 4.1.2?
Our sparc64 porterbox is down at the moment, so I can only judge by the past build logs at https://buildd.debian.org/status/logs.php?pkg=mpi4py&arch=sparc64 . The last successful sparc64 build was mpi4py 3.1.1 with openmpi 4.1.1. After that, mpi4py 3.1.2 and 3.1.3 have been failing with 4.1.2~rc1 and 4.1.2.
@drew-parsons What should we do with this issue? Did you report the problem upstream? Is the problem still there with the Open MPI 5.0.0rc tarball? Perhaps we could disable the pack/unpack external tests when running Open MPI < 5 on sparc64? What's the output of
Eh, our sparc64 porterbox is still offline. Apparently a new bigger, better one is being commissioned. In the meantime, the Debian porters suggest requesting access to the GCC Compile Farm, https://gcc.gnu.org/wiki/CompileFarm . The Compile Farm is located at https://cfarm.tetaneutral.net/ . Their sparc64 box gcc202 is also down with hardware trouble, but their gcc102 is running fine. "sparc64" is used in the library triplet, so there's a good chance it's what's returned by Debian hasn't got a build of OpenMPI 5 yet (time doesn't permit me to build it separately). I haven't reported the error separately. Would it be the same as the problem you raised at open-mpi/ompi#8918 ?
I'm not sure. Note, however, that you reported a Bus Error (alignment issues), and that's a bit different from bad binary packing/unpacking in a specific binary representation (external32).
@drew-parsons Looks like the sparc64 machine is back to life, right? Any chance you can try openmpi from git at branch v5.0.x to see whether the bus error is still there? If the issue is gone, then we can mark it as a known failure under
Eh no, the sparc64 porterbox (kyoto.debian.net) is still down. The official buildd is running, but it's not as simple to load up a build onto it as it would be to run a manual build on the porterbox.
OK, sorry for the confusion. Anyway, why don't you decorate the failing test with
It's a sensible workaround, I'll do that.
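A minimal sketch of what such a workaround could look like, assuming the skip condition is "sparc64 with Open MPI 4.1.2 or later" (the last passing build logged in this thread used 4.1.1). The helper `should_skip_pack_external` is hypothetical and not part of mpi4py's test suite, and the assumption that `platform.machine()` reports `"sparc64"` on the affected hosts is untested here. In a real test module the vendor tuple would come from `MPI.get_vendor()`; placeholder values are used so the sketch runs without an MPI installation.

```python
import platform
import unittest


def should_skip_pack_external(machine, vendor_name, vendor_version):
    """Return True when the external32 pack/unpack test should be skipped.

    Hypothetical helper: the condition reflects the builds discussed in this
    issue, not any logic that exists in mpi4py itself.
    """
    return (
        machine == "sparc64"
        and vendor_name == "Open MPI"
        and vendor_version >= (4, 1, 2)
    )


# With mpi4py available, the vendor would be obtained like this:
#     from mpi4py import MPI
#     vendor_name, vendor_version = MPI.get_vendor()
# Placeholder values (assumption) keep the sketch self-contained.
vendor_name, vendor_version = "Open MPI", (4, 1, 2)


class TestPackExternal(unittest.TestCase):

    @unittest.skipIf(
        should_skip_pack_external(
            platform.machine(), vendor_name, vendor_version
        ),
        "Bus Error in external32 pack/unpack on sparc64 with Open MPI",
    )
    def testPackUnpackExternal(self):
        pass  # stand-in for the real test body
```

Keeping the condition in a plain function makes it easy to check on a machine that is not sparc64, which matters when the only affected host is hard to reach.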
Confirming this error still occurs on sparc64 with OpenMPI 4.1.6
@drew-parsons Any news about Open MPI v5? Did it land in Debian?
OpenMPI 5 is now available in experimental. Getting it into debian unstable has been slowed down by Debian's decision to introduce 64-bit We also now have a new sparc64 porterbox we can test on. I'll try to test soon, or we can request access for you if you'd like to inspect directly yourself.
Still failing with OpenMPI 5.0.3
My wild guess is that Open MPI is somewhere implementing pack/unpack with unaligned loads/stores rather than
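To illustrate the alignment concern only (this is not Open MPI's code, and the cause of the failure is unconfirmed in this thread): on a strict-alignment architecture such as sparc64, a native 8-byte load or store at an address that is not a multiple of 8 raises a bus error, whereas byte-wise access works at any offset. The sketch below uses Python's `struct` module, which does not require aligned offsets for the standard-size `">d"` format, to write and read a big-endian double at an odd offset. The buffer layout is an assumption made up for the example.

```python
import struct

# One leading byte so the 8-byte double starts at an odd (unaligned) offset.
buf = bytearray(1 + 8)
offset = 1

# ">d" is a big-endian IEEE 754 double; external32 also stores data big-endian.
struct.pack_into(">d", buf, offset, 3.5)

# Reading back from the same unaligned offset works, because struct does not
# rely on an aligned native load for this format.
(value,) = struct.unpack_from(">d", buf, offset)
```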
sparc64 is not the most common architecture around, but for what it's worth 3.1.2 has started giving a Bus Error (Invalid address alignment) in testPackUnpackExternal (test_pack.TestPackExternal),
Full log at https://buildd.debian.org/status/fetch.php?pkg=mpi4py&arch=sparc64&ver=3.1.2-1&stamp=1636215944&raw=0
It previously passed with 3.1.1.
Ongoing sparc64 build logs at https://buildd.debian.org/status/logs.php?pkg=mpi4py&arch=sparc64