Some improvements to handling of OpenMP on macOS #6489

Open
wants to merge 2 commits into
base: master

barracuda156

Please review.

This is still hackish and makes assumptions which may not hold. However, it is arguably a bit more sane:

  1. While retaining default usage of Homebrew, let the user disable it.
  2. Do not bake in paths to libomp with GCC, which uses its own libgomp (and which normally does not need specific paths at all).

@borchero
Collaborator

@jameslamb we've repeatedly had issues with OpenMP on macOS, so I'm wondering whether we should just advertise a conda-based compilation process. I successfully compiled locally with the following environment:

dependencies:
  - python
  - cxx-compiler
  - llvm-openmp
  - cmake
  - make

and by setting CXXFLAGS="-I${CONDA_PREFIX}/include".

@barracuda156
Author

@borchero I think any default which you (upstream) prefer is fine, just do not hardcode it, otherwise everyone else is forced to patch the code to get around that.

Compilation works perfectly fine in MacPorts, for example, but I have to throw away huge chunks from CMakeLists now, because we do not want rpaths, and certainly do not want brewisms, and even less so hardcoded usage of incompatible libraries.

That solves the problem for us, but it still exists elsewhere, since some third-party software borrows pre-built LightGBM, and that has hardcoded paths to the Homebrew prefix, which of course cannot work in any other setup: ankane/lightgbm-ruby#7

@jameslamb
Collaborator

Thanks for your interest in LightGBM.

To start, please... don't come here and say that the current state is not "sane". We can discuss the relative benefits and disadvantages of different approaches without insulting each other.


Compilation works perfectly fine in MacPorts, for example, but I have to throw away huge chunks from CMakeLists now, because we do not want rpaths, and certainly do not want brewisms, and even less so hardcoded usage of incompatible libraries.

Can you link us to the patches you're using to do that, so we can see specifically what you're referring to?


that has hardcoded paths to Homebrew prefix, which of course cannot work in any other setup: ankane/lightgbm-ruby#7

The hard-coded install name /opt/homebrew/opt/libomp/lib/libomp.dylib mentioned in that issue was removed in v4.4.0, thanks to the changes in #6391.


I'm wondering whether we should just advertise a conda-based compilation process.

The project should already support this without any need to add any other headers, and without adding any new conda-specific changes to its CMakeLists. conda's compilers do all that manipulation of includes, linker paths, etc. for you as a part of how they work.

How I tested that:
conda create \
    --name delete-me \
    -c conda-forge \
    --yes \
        python=3.10 \
        cmake \
        cxx-compiler \
        llvm-openmp

source activate delete-me

cmake -B build -S .
cmake --build build --target _lightgbm -j4

That produces a library with the expected path entries.

otool -L lib_lightgbm.dylib
# lib_lightgbm.dylib:
#	@rpath/lib_lightgbm.dylib (compatibility version 0.0.0, current version 0.0.0)
#	@rpath/libomp.dylib (compatibility version 5.0.0, current version 5.0.0)
#	@rpath/libc++.1.dylib (compatibility version 1.0.0, current version 1.0.0)
#	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

And an RPATH entry pointing to where libomp.dylib was found in conda's libraries during compilation.

otool -l lib_lightgbm.dylib
# Load command 15
#          cmd LC_RPATH
#      cmdsize 56
#         path /Users/jlamb/miniforge3/envs/delete-me/lib (offset 12)
# Load command 16
#          cmd LC_RPATH
#      cmdsize 48
#         path /opt/homebrew/opt/libomp/lib (offset 12)

The Python package built in that way works without issue.

source activate delete-me
sh build-python.sh bdist_wheel install
conda install -c conda-forge --yes pandas scikit-learn
python examples/python-guide/sklearn_example.py

If you tried this and observed something different, please tell me.


Do not bake in paths to libomp with GCC, which uses its own libgomp

Can you share an example where you ran into this issue? Because the install name LightGBM uses for the OpenMP it found at build time should be libgomp.dylib when using gcc.

That's what I see (on my M2 Mac).

brew install gcc
export CC=gcc-14
export CXX=g++-14

cmake -B build -S .
Build logs showing 'gcc -fopenmp' was used:
-- The C compiler identification is GNU 14.1.0
-- The CXX compiler identification is GNU 14.1.0
-- Checking whether C compiler has -isysroot
-- Checking whether C compiler has -isysroot - yes
-- Checking whether C compiler supports OSX deployment target flag
-- Checking whether C compiler supports OSX deployment target flag - yes
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/homebrew/bin/gcc-14 - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Checking whether CXX compiler has -isysroot
-- Checking whether CXX compiler has -isysroot - yes
-- Checking whether CXX compiler supports OSX deployment target flag
-- Checking whether CXX compiler supports OSX deployment target flag - yes
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/homebrew/bin/g++-14 - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Performing Test MM_PREFETCH
-- Performing Test MM_PREFETCH - Failed
-- Performing Test MM_MALLOC
-- Performing Test MM_MALLOC - Failed
-- Configuring done (2.8s)
-- Generating done (0.1s)
-- Build files have been written to: /Users/jlamb/repos/LightGBM/build
cmake --build build --target _lightgbm -j4

otool showing that libgomp.dylib, not libomp.dylib, was linked.

otool -L lib_lightgbm.dylib
./lib_lightgbm.dylib:
	@rpath/lib_lightgbm.dylib (compatibility version 0.0.0, current version 0.0.0)
	/opt/homebrew/opt/gcc/lib/gcc/current/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.33.0)
	/opt/homebrew/opt/gcc/lib/gcc/current/libgomp.1.dylib (compatibility version 2.0.0, current version 2.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)

It would be a problem for relocation to have that absolute, Homebrew-specific install name for libgomp.dylib in the binary... something not caught here because all of the macOS binaries we build for redistribution from this repo are built with clang. I'd welcome a change to allow this project to produce more relocation-friendly binaries on macOS using gcc... although I'm not sure that the approach you're currently proposing in this PR would do that. Take this part of the change, for example:

if(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
    INSTALL_RPATH "/opt/homebrew/opt/libomp/lib;/opt/local/lib/libomp;${OpenMP_LIBRARY_DIR}"
else()
    INSTALL_RPATH "${OpenMP_LIBRARY_DIR}"
endif()

I suspect OpenMP_LIBRARY_DIR will still be an absolute path at that point, and maybe not a highly portable one if using Homebrew's gcc / g++.
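
To make that concern concrete, here is a minimal sketch of how a fragment like the one above might sit in a complete CMakeLists.txt. The project name, the `demo` target, `demo.cpp`, the `demo_rpath` variable, and the way `OpenMP_LIBRARY_DIR` is derived from FindOpenMP's variables are all assumptions made for illustration; this is not LightGBM's actual code.

```cmake
cmake_minimum_required(VERSION 3.18)
project(rpath_sketch LANGUAGES CXX)

find_package(OpenMP REQUIRED)

# Hypothetical target and source file, standing in for the real library.
add_library(demo SHARED demo.cpp)
target_link_libraries(demo PRIVATE OpenMP::OpenMP_CXX)

if(APPLE)
    # Assumption: take the directory of the first library FindOpenMP reports.
    # The list may be empty if the compiler links OpenMP implicitly.
    set(OpenMP_LIBRARY_DIR "")
    if(OpenMP_CXX_LIB_NAMES)
        list(GET OpenMP_CXX_LIB_NAMES 0 first_omp_lib)
        get_filename_component(
            OpenMP_LIBRARY_DIR "${OpenMP_${first_omp_lib}_LIBRARY}" DIRECTORY
        )
    endif()
    # Printed at configure time so the (absolute) path can be inspected.
    message(STATUS "OpenMP library directory: ${OpenMP_LIBRARY_DIR}")

    if(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
        set(demo_rpath
            "/opt/homebrew/opt/libomp/lib;/opt/local/lib/libomp;${OpenMP_LIBRARY_DIR}")
    else()
        set(demo_rpath "${OpenMP_LIBRARY_DIR}")
    endif()

    set_target_properties(
        demo
        PROPERTIES
            BUILD_WITH_INSTALL_RPATH TRUE
            INSTALL_RPATH "${demo_rpath}"
    )
endif()
```

Note that INSTALL_RPATH only adds LC_RPATH search entries to the binary. It does not change the install name recorded for the OpenMP dependency, so an absolute name such as the libgomp one in the otool -L output above would still be embedded.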

While retaining default usage of Homebrew, let the user disable it.

I'm not convinced that the case you've described justifies further complicating the API of the project's CMakeLists with a new top-level option like USE_BREW. The code I think you're referring to, starting here:

LightGBM/CMakeLists.txt

Lines 164 to 167 in d56a7a3

if(NOT OpenMP_FOUND)
# libomp 15.0+ from brew is keg-only, so have to search in other locations.
# See https://github.com/Homebrew/homebrew-core/issues/112107#issuecomment-1278042927.
execute_process(COMMAND brew --prefix libomp

will run only if find_package(OpenMP) has not found OpenMP by other means. If MacPorts is placing libomp.dylib at a standard path like /usr/local/lib, I'd be surprised to learn that find_package(OpenMP) is not finding it.

Can you describe precisely, in a way that I could reproduce, a case where you saw different behavior?

@barracuda156
Author

@jameslamb Thank you for responding in detail.

We can discuss the relative benefits and disadvantages of different approaches without insulting each other.

Absolutely. I apologize if my choice of words created such an impression, it was not intended. (I readily admit that my own code is not sane in some instances.)

Can you share an example where you ran into this issue? Because the install name LightGBM uses for the OpenMP it found at build time should be libgomp.dylib when using gcc.

Yes, you are right, of course. I just did not see any relevant condition on the compiler here:

INSTALL_RPATH "/opt/homebrew/opt/libomp/lib;${OpenMP_LIBRARY_DIR}"

So my impression was that this would still be used regardless. Sorry if I misread the code.

It would be a problem for relocation to have that absolute, Homebrew-specific install name for libgomp.dylib in the binary.

Any specific hardcoded path is problematic for general-case distribution, even without any concern for relocating a binary, since there are no standard paths for non-system libraries. I do not know a solution for the general case, as long as pre-built binaries/libraries are distributed. (Using rpaths is not perfect either, though preferable to baked-in paths to any package manager.)
For building from source, a general solution is to make the build configurable (while maintaining the default behavior which you consider optimal). A downstream distribution using Unix-style installations may not need or want any rpaths being used, since absolute paths are more robust (provided they match the actual environment, of course).

I'm not convinced that the case you've described justifies further complicating the API of the project's CMakeLists with a new top-level option like USE_BREW.

To be honest, I would rather remove all package-manager-specific code, since it is redundant at best: package managers handle the install prefix etc. in their own build systems. But I think someone may get upset, whether now or at some point in the future, if such a commit is merged, that somebody from MacPorts removed Homebrew code :) So I do not want to do that.
But I hope Homebrew is capable enough to handle its own installations, and in that case this code is indeed unneeded.

However, if the approach is to leave defaults as they are (there may be reasons for that which I did not think of, after all), then there is a case for allowing certain default behavior to be disabled. It is conceivable that someone may have both Homebrew and MacPorts, or have Homebrew but be trying to build without relying on a package manager, etc.
(I have seen cases when some other software packages tried to download and install some random stuff. It is good if it fails explicitly, worse if it succeeds without one noticing what is going on.)
But yes, I have noticed it is a fallback.

If MacPorts is placing libomp.dylib at a standard path like /usr/local/lib, I'd be surprised to learn that find_package(OpenMP) is not finding it.

MacPorts is placing it in /opt/local/lib/libomp because we do not want it to be found accidentally. But the codebase takes care of finding it when it is needed, so there is no problem in this sense. I rather see a problem in something being found when it should not be (from a point of view of a given user).
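
For readers building outside a package manager's own build system, one way to point find_package(OpenMP) at a non-default location like this is an initial-cache script that pre-sets FindOpenMP's hint variables. This is only a sketch: the file name is hypothetical, the library directory is the one named above, and the header directory /opt/local/include/libomp is an assumption that should be checked against the actual installation.

```cmake
# macports-openmp.cmake (hypothetical file name)
# Pre-load the CMake cache with hints for FindOpenMP, e.g.:
#   cmake -C macports-openmp.cmake -B build -S .

# Assumption: libomp headers live in /opt/local/include/libomp.
set(OpenMP_C_FLAGS
    "-Xpreprocessor -fopenmp -I/opt/local/include/libomp"
    CACHE STRING "Flags used to compile C sources with OpenMP")
set(OpenMP_CXX_FLAGS
    "-Xpreprocessor -fopenmp -I/opt/local/include/libomp"
    CACHE STRING "Flags used to compile C++ sources with OpenMP")

# Name of the OpenMP runtime library to link (LLVM's libomp).
set(OpenMP_C_LIB_NAMES "omp" CACHE STRING "OpenMP libraries for C")
set(OpenMP_CXX_LIB_NAMES "omp" CACHE STRING "OpenMP libraries for C++")

# Library location, using the directory described above.
set(OpenMP_omp_LIBRARY
    "/opt/local/lib/libomp/libomp.dylib"
    CACHE FILEPATH "Location of libomp")
```

Because the hints are supplied explicitly by whoever runs the build, nothing is discovered by accident, which matches the intent described above.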

Can you link us to the patches you're using to do that, so we can see specifically what you're referring to?

Sure, but this is our local patch, not a proposal for changes (I understand it may not fit the needs/preferences of others).
https://github.com/macports/macports-ports/blob/b9671ddc017ddf902b248ea760e7fd2a05178792/math/LightGBM/files/0001-Fix-CMakeLists.txt.patch
(Though IMO it would be cool to have configure options to use external libraries instead of building and installing duplicates.)

@barracuda156
Author

@jameslamb Should I drop the second commit and leave only 26cc564 which should be uncontroversial?

@jameslamb
Collaborator

It may be a few days until I'm able to provide a thoughtful answer here, sorry.

The state of these codepaths is very focused on building the library for redistribution (e.g. in Python wheels) and you've brought up some excellent points about how that might make other types of builds more difficult. I need to find a bit of time to think carefully about this.

@barracuda156
Author

@jameslamb Sure, thank you. No hurry here.

@jameslamb
Collaborator

IMO it would be cool to have configure options to use external libraries instead of building and installing duplicates.

Thanks for this. We intentionally vendor sources of specific, fixed commits of Eigen, fmt, fast_double_parser, and (in some builds) Boost here for stability reasons and because the sources are relatively small. I'd like to preserve that pattern... this project has been struggling for years from a lack of maintainer availability (relative to its size), and I don't want to take on the packaging and maintenance burden of allowing those dependencies to be pluggable.

But if you do want to propose that separately, we'd be happy to talk about it more on a separate issue.

A downstream distribution using Unix-style installations may not need or want any rpaths being used, since absolute paths are more robust

Sure, and this is why conda re-writes all of the paths embedded in the binaries it creates (as one example): https://docs.conda.io/projects/conda-build/en/latest/resources/make-relocatable.html.

if the approach is to leave defaults as they are (there may be reasons for that which I did not think of, after all)

Yes I'd like to preserve the defaults. In short, we want to support the following:

  • distribute pre-compiled binaries for macOS that are compiled with clang + use LLVM OpenMP (libomp)
  • where those are Python wheels (which contain a lib_lightgbm.dylib):
    • if the library is loaded into a process that already has a libomp.dylib loaded (e.g. by some other library), dynamically link to that instead of loading a second, different copy of libomp.dylib
    • if the linker needs to search for libomp.dylib, it should eventually try wherever Homebrew puts libomp.dylib before raising a runtime error

In this project, we are producing shared libraries that are distributed in Python wheels installed with e.g. pip... a package manager that does not have a distribution of OpenMP. And to make things even more fun, we want to support the common case of installing such a wheel into an environment otherwise managed by conda (for, e.g., building one of the many variants of the Python package that we do not publish precompiled binaries for, like pip install --no-binary lightgbm -C cmake.define=-DUSE_CUDA=ON).

Given that, I strongly think the project should continue to set the install name for its OpenMP dependency to @rpath/lib[go|io|o]mp.dylib.

MacPorts is placing it in /opt/local/lib/libomp because we do not want it to be found accidentally.

Ah sorry, my mistake. Thank you for explaining, that makes sense to me.


Based on my read of the things you've written, and reviewing other OpenMP codepaths in LightGBM's CMake configuration, I'm open to adding a new CMake option as you've proposed.

Here's what I'd like to propose, please let me know what you think:

  1. call this new option USE_HOMEBREW_FALLBACK
    • default ON to preserve the current behavior
    • docstring "(macOS-only) set to OFF to avoid looking in 'brew --prefix' for libraries (e.g. OpenMP)"
  2. change this other compiler condition from STREQUAL "Clang" to MATCHES "Clang" as you have in this PR
  3. change any other mentions of 'libomp' in code comments in CMakeLists.txt that are not LLVM-specific to "OpenMP" or "lib[go|io|o]mp" to make it clearer that they shouldn't be LLVM-specific
  4. put OpenMP_LIBRARY_DIR as the first entry on the list of clang paths added to the RPATH, like this:
if(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
    INSTALL_RPATH "${OpenMP_LIBRARY_DIR};/opt/homebrew/opt/libomp/lib;/opt/local/lib/libomp"
else()
    INSTALL_RPATH "${OpenMP_LIBRARY_DIR}"
endif()

That should ensure that if you use the shared library on the same system where you built it, the location where libomp.dylib was found at build time will be the one that's loaded at runtime.
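
A minimal sketch of how item 1 of this proposal could look, assuming the option simply gates the existing `brew --prefix libomp` fallback. The project name and the `HOMEBREW_LIBOMP_PREFIX` and `BREW_EXIT_CODE` variables are illustrative, and the merged implementation may differ.

```cmake
cmake_minimum_required(VERSION 3.18)
project(homebrew_fallback_sketch LANGUAGES C CXX)

option(
    USE_HOMEBREW_FALLBACK
    "(macOS-only) set to OFF to avoid looking in 'brew --prefix' for libraries (e.g. OpenMP)"
    ON
)

# First attempt: whatever the compiler and CMake find on their own.
find_package(OpenMP)

# Only consult Homebrew if that failed and the user has not opted out.
if(APPLE AND NOT OpenMP_FOUND AND USE_HOMEBREW_FALLBACK)
    execute_process(
        COMMAND brew --prefix libomp
        OUTPUT_VARIABLE HOMEBREW_LIBOMP_PREFIX
        OUTPUT_STRIP_TRAILING_WHITESPACE
        RESULT_VARIABLE BREW_EXIT_CODE
        ERROR_QUIET
    )
    # Use the prefix only if brew succeeded and the library really exists.
    if(BREW_EXIT_CODE STREQUAL "0"
       AND EXISTS "${HOMEBREW_LIBOMP_PREFIX}/lib/libomp.dylib")
        # Hint variables documented by CMake's FindOpenMP module.
        set(OpenMP_C_FLAGS
            "-Xpreprocessor -fopenmp -I${HOMEBREW_LIBOMP_PREFIX}/include")
        set(OpenMP_CXX_FLAGS
            "-Xpreprocessor -fopenmp -I${HOMEBREW_LIBOMP_PREFIX}/include")
        set(OpenMP_C_LIB_NAMES omp)
        set(OpenMP_CXX_LIB_NAMES omp)
        set(OpenMP_omp_LIBRARY "${HOMEBREW_LIBOMP_PREFIX}/lib/libomp.dylib")
    endif()
endif()

# Second attempt (with or without hints); fail loudly if still not found.
if(NOT OpenMP_FOUND)
    find_package(OpenMP REQUIRED)
endif()
```

Configuring with -DUSE_HOMEBREW_FALLBACK=OFF would skip the brew call entirely, so a build on a machine that happens to have Homebrew installed fails explicitly instead of silently picking up its libomp.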

Collaborator

@jameslamb left a comment

leaving a blocking review based on my last comment (meant to post that as a review not a regular comment)

@barracuda156
Author

barracuda156 commented Jun 23, 2024

@jameslamb Thank you for reviewing, sounds good to me. I will deal with this tomorrow and rebase the PR.

P.S. As for supporting external dependencies, I do not think that, as a non-default option, it is likely to increase the maintenance burden. A notice can be added that such a configuration is not tested / not guaranteed to work (or something to this effect). Maybe also use mark_as_advanced.
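
As a rough illustration of that suggestion, a non-default, advanced option for one dependency might look like the sketch below. The option name `USE_SYSTEM_EIGEN`, the `demo` target, `demo.cpp`, and the vendored path are hypothetical and do not exist in LightGBM as far as this thread shows.

```cmake
cmake_minimum_required(VERSION 3.18)
project(external_deps_sketch LANGUAGES CXX)

# Hypothetical option: OFF by default, so the vendored copy stays the default.
option(
    USE_SYSTEM_EIGEN
    "(untested, unsupported) use an Eigen found on the system instead of the vendored copy"
    OFF
)
# Hide the option from the default view in cmake-gui / ccmake.
mark_as_advanced(USE_SYSTEM_EIGEN)

# Hypothetical target and source file, standing in for the real library.
add_library(demo SHARED demo.cpp)

if(USE_SYSTEM_EIGEN)
    message(WARNING "USE_SYSTEM_EIGEN=ON is not tested and not guaranteed to work.")
    # Eigen installs a CMake package config that defines Eigen3::Eigen.
    find_package(Eigen3 REQUIRED NO_MODULE)
    target_link_libraries(demo PRIVATE Eigen3::Eigen)
else()
    # Assumed location of the vendored sources, for illustration only.
    target_include_directories(
        demo PRIVATE "${CMAKE_CURRENT_SOURCE_DIR}/external_libs/eigen"
    )
endif()
```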

3 participants