CUDA shared memory registration fails when sending inference requests from DeepStream to an external Triton server #528

Open
yoo-wonjun opened this issue Apr 12, 2024 · 0 comments

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) : GPU
• DeepStream Version : 6.1
• JetPack Version (valid for Jetson only)
• TensorRT Version : 8.4.0.11
• NVIDIA GPU Driver Version (valid for GPU only) : 525.105.17
• Issue Type (questions, new requirements, bugs)
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)

The execution environment is as follows:

1. The Triton server version is 23.10.
2. DeepStream sends inference requests to a Triton server running in a separately started Docker container.
3. In DeepStream's config (config.pbtxt), enable_cuda_buffer_sharing: true is set (see the config sketch below).
4. When one DeepStream instance makes inference requests to one GPU, it runs normally.
5. When multiple DeepStream instances make inference requests, the errors below occur, but over time things stabilize and the multiple DeepStream instances run normally:
ERROR: infer_grpc_client.cpp:223 Failed to register CUDA shared memory.
ERROR: infer_grpc_client.cpp:311 Failed to set inference input: failed to register CUDA shared memory region 'inbuf_0x2be8300': failed to open CUDA IPC handle: invalid argument
ERROR: infer_grpc_backend.cpp:140 gRPC backend run failed to create request for model: yolov8_pose
ERROR: infer_trtis_backend.cpp:350 failed to specify dims when running inference on model:yolov8_pose, nvinfer error:NVDSINFER_TRITON_ERROR
I want to prevent the errors in item 5 when making multiple inference requests.
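
For reference, here is a minimal sketch of the kind of nvinferserver gRPC config I mean, assuming the flag sits in the triton/grpc backend block as in the nvinferserver gRPC samples; the unique_id, batch size, and URL are placeholders, not my exact values:

```
infer_config {
  unique_id: 1               # placeholder
  gpu_ids: [0]
  max_batch_size: 1          # placeholder
  backend {
    triton {
      model_name: "yolov8_pose"
      version: -1
      grpc {
        url: "<external-triton-host>:8001"   # external Triton gRPC endpoint (placeholder)
        enable_cuda_buffer_sharing: true     # CUDA shared memory between DeepStream and Triton
      }
    }
  }
}
```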
• Requirement details (This is for new requirements. Include the module name: which plugin or which sample application, and the function description.)

I want to prevent the errors in item 5 when making multiple inference requests.
The documentation indicates that enable_cuda_buffer_sharing: true applies when the Triton server runs inside the DeepStream Docker container, but I have confirmed that, given some time, it also operates normally against an external Triton server. Please tell me how to prevent the above errors from occurring.
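
In case it helps to reproduce this outside DeepStream, the same CUDA shared-memory registration path that fails in infer_grpc_client.cpp can be exercised standalone with the Triton Python client. A minimal sketch, assuming tritonclient[all] is installed and the external server's gRPC endpoint is reachable; the URL, region name, and byte size are placeholders:

```python
import tritonclient.grpc as grpcclient
import tritonclient.utils.cuda_shared_memory as cudashm

TRITON_URL = "localhost:8001"   # assumption: external Triton gRPC endpoint
REGION = "inbuf_check"          # hypothetical region name
BYTE_SIZE = 4 * 3 * 640 * 640   # hypothetical input size in bytes

client = grpcclient.InferenceServerClient(url=TRITON_URL)

# Allocate a CUDA buffer on GPU 0; this produces the cudaIpcMemHandle
# that the server will try to open.
handle = cudashm.create_shared_memory_region(REGION, BYTE_SIZE, 0)

# Register the region with the server; a failure at this step is where
# messages like "failed to register CUDA shared memory region ...:
# failed to open CUDA IPC handle" come from in the log above.
client.register_cuda_shared_memory(REGION, cudashm.get_raw_handle(handle), 0, BYTE_SIZE)

# Ask the server which CUDA shared-memory regions it has actually accepted.
print(client.get_cuda_shared_memory_status())

# Clean up on both sides.
client.unregister_cuda_shared_memory(REGION)
cudashm.destroy_shared_memory_region(handle)
```

The get_cuda_shared_memory_status() call shows which regions the server has accepted, which may help tell whether the "invalid argument" comes from the IPC handle itself or from the timing of multiple concurrent registrations.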
