onnxruntime: fails to run with CUDA execution provider #310860

Open
anpin opened this issue May 11, 2024 · 0 comments

Describe the bug

onnxruntime with CUDA support is packaged with both onnxruntime_USE_CUDA and onnxruntime_DISABLE_CONTRIB_OPS enabled, which effectively disables CUDA and leads to the following error at runtime:

> onnxruntime_test Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx
2024-05-11 10:27:29.056205252 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:2103 CreateInferencePybindStateModule] Init provider bridge failed.
Traceback (most recent call last):
  File "/nix/store/3wc2a16gdvms53vgr2jp9f8z2mv55dkw-python3.11-onnxruntime-1.17.3/bin/.onnxruntime_test-wrapped", line 9, in <module>
    sys.exit(main())
             ^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 159, in main
    exit_code, _, _ = run_model(args.model_path, args.num_iters, args.debug, args.profile, args.symbolic_dims)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 88, in run_model
    sess = onnxrt.InferenceSession(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 472, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx failed:This is an invalid model. In Node, ("/model/layers.0/input_layernorm/LayerNorm", SimplifiedLayerNormalization, "", -1) : ("/model/embed_tokens/Gather/output_0": tensor(float16),"model.layers.0.input_layernorm.weight": tensor(float16),) -> ("/model/layers.0/input_layernorm/output_0": tensor(float16),) , Error No Op registered for SimplifiedLayerNormalization with domain_version of 14

> onnxruntime_test Phi-3-mini-128k-instruct-onnx/cuda/cuda-fp16/phi3-mini-128k-instruct-cuda-fp16.onnx
2024-05-11 10:27:46.458111088 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:2103 CreateInferencePybindStateModule] Init provider bridge failed.
Traceback (most recent call last):
  File "/nix/store/3wc2a16gdvms53vgr2jp9f8z2mv55dkw-python3.11-onnxruntime-1.17.3/bin/.onnxruntime_test-wrapped", line 9, in <module>
    sys.exit(main())
             ^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 159, in main
    exit_code, _, _ = run_model(args.model_path, args.num_iters, args.debug, args.profile, args.symbolic_dims)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/tools/onnxruntime_test.py", line 88, in run_model
    sess = onnxrt.InferenceSession(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/nix/store/igdsm7xzsfsbyjfhrvgw23xsxj21fgln-python3-3.11.9-env/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 472, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from Phi-3-mini-128k-instruct-onnx/cuda/cuda-fp16/phi3-mini-128k-instruct-cuda-fp16.onnx failed:This is an invalid model. In Node, ("/model/layers.0/input_layernorm/LayerNorm", SimplifiedLayerNormalization, "", -1) : ("/model/embed_tokens/Gather/output_0": tensor(float16),"model.layers.0.input_layernorm.weight": tensor(float16),) -> ("/model/layers.0/input_layernorm/output_0": tensor(float16),) , Error No Op registered for SimplifiedLayerNormalization with domain_version of 14

Related upstream issue: microsoft/onnxruntime#20658
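
For what it's worth, the unregistered op can be seen directly in the model graph. A minimal sketch, assuming the onnx Python package is also added to the environment (it is not in the shell below):

import onnx  # assumption: python-pkgs.onnx added to the environment

m = onnx.load(
    "Phi-3-mini-128k-instruct-onnx/cuda/cuda-fp16/phi3-mini-128k-instruct-cuda-fp16.onnx"
)
# Print every (domain, op_type) pair the graph uses. SimplifiedLayerNormalization
# shows up in the default domain, but its kernel is only built as part of
# ONNX Runtime's contrib ops, hence INVALID_GRAPH when those are compiled out.
for domain, op_type in sorted({(n.domain, n.op_type) for n in m.graph.node}):
    print(domain or "ai.onnx", op_type)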

Steps To Reproduce

Steps to reproduce the behavior:

  1. Enter a shell with onnxruntime and CUDA enabled (I'm not sure how to do this as a one-liner nix-shell command, so use a shell.nix). In your config or overlay:

config = { allowUnfree = true; cudaSupport = true; };

shell.nix (made self-contained so it picks up the same config):

{ pkgs ? import <nixpkgs> { config = { allowUnfree = true; cudaSupport = true; }; } }:

pkgs.mkShell {
  packages = [
    pkgs.onnxruntime
    (pkgs.python3.withPackages (python-pkgs: [
      python-pkgs.huggingface-hub
      python-pkgs.numpy
      python-pkgs.onnxruntime
      # genai is not packaged yet, but is available as a PR
      # python-pkgs.onnxruntime-genai
    ]))
  ];
}
  2. Get the model:

huggingface-cli download microsoft/Phi-3-mini-128k-instruct-onnx --include "cuda/cuda-int4-rtn-block-32/*" --local-dir Phi-3-mini-128k-instruct-onnx

  3. Run the test (a minimal Python equivalent is sketched after this list):

onnxruntime_test Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx
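
Equivalently, without the onnxruntime_test wrapper, a minimal sketch of the same session construction the tool performs (same model path as above):

import onnxruntime as ort

model_path = (
    "Phi-3-mini-128k-instruct-onnx/cuda/cuda-int4-rtn-block-32/"
    "phi3-mini-128k-instruct-cuda-int4-rtn-block-32.onnx"
)
# With the current package this raises INVALID_GRAPH while loading the model,
# before any execution provider is ever used.
sess = ort.InferenceSession(
    model_path,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())  # expect CUDAExecutionProvider listed first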

Expected behavior

onnxruntime starts executing on the GPU.

Additional context

Removing onnxruntime_DISABLE_CONTRIB_OPS allows the model to start. SimplifiedLayerNormalization is registered in the default domain but implemented as part of ONNX Runtime's contrib ops, so building with that flag enabled leaves no kernel for it:

diff --git a/pkgs/development/libraries/onnxruntime/default.nix b/pkgs/development/libraries/onnxruntime/default.nix
index 85e2c70ba408..2e09b541f1a7 100644
--- a/pkgs/development/libraries/onnxruntime/default.nix
+++ b/pkgs/development/libraries/onnxruntime/default.nix
@@ -187,7 +187,7 @@ effectiveStdenv.mkDerivation rec {
     "-D_SILENCE_ALL_CXX23_DEPRECATION_WARNINGS=1"
     (lib.cmakeBool "onnxruntime_USE_CUDA" cudaSupport)
     (lib.cmakeBool "onnxruntime_USE_NCCL" cudaSupport)
-    (lib.cmakeBool "onnxruntime_DISABLE_CONTRIB_OPS" cudaSupport)
+    # (lib.cmakeBool "onnxruntime_DISABLE_CONTRIB_OPS" cudaSupport)
   ] ++ lib.optionals pythonSupport [
     "-Donnxruntime_ENABLE_PYTHON=ON"
   ] ++ lib.optionals cudaSupport [
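
After rebuilding with that change, a quick sanity check that the CUDA provider is actually available (this only verifies provider registration, not kernel coverage):

import onnxruntime as ort

# Expect CUDAExecutionProvider in the list and "GPU" from get_device()
# for a CUDA-enabled build; the INVALID_GRAPH error above should be gone.
print(ort.get_available_providers())
print(ort.get_device())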

Notify maintainers

@jonringer @puffnfresh @ck3d @cbourjau @wexder

Metadata

 - system: `"x86_64-linux"`
 - host os: `Linux 6.8.9, NixOS, 24.05 (Uakari), 24.05.20240508.8892ecd`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.2`
 - channels(a): `"nixpkgs"`
 - channels(root): `"nixos"`
 - nixpkgs: `/home/a/.nix-defexpr/channels/nixpkgs`

Add a 👍 reaction to issues you find important.
