
Runtime Error: Segmentation fault #10

Open

TriLoo opened this issue Apr 16, 2020 · 5 comments

TriLoo commented Apr 16, 2020

I ran demo.py on my own image data and got this error, with no other information displayed.

I have located the position causing the error: it is the ROI layer call. However, when I tested ROIAlign_cuda.cu using raw float * pointers as parameters instead of PyTorch Tensors, no error was raised.

My gcc version is 4.8.5. Is the gcc version critical? Any advice? Thanks.
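
In case it helps to reproduce: below is roughly how the ROI layer can be called in isolation. This is a minimal sketch; the import path and the tensor shapes are assumptions on my side, so adjust them to your checkout.

import torch
# assumed import path for the compiled ROIAlign wrapper in this repo; adjust if yours differs
from external.maskrcnn_benchmark.roi_layers import ROIAlign

torch.cuda.set_device(0)
feat = torch.randn(1, 256, 64, 64, device='cuda')             # dummy feature map
rois = torch.tensor([[0., 0., 0., 32., 32.]], device='cuda')  # one box: (batch_idx, x1, y1, x2, y2)
layer = ROIAlign(output_size=(7, 7), spatial_scale=1.0, sampling_ratio=2)
out = layer(feat, rois)
torch.cuda.synchronize()  # force the kernel to finish so the crash (if any) surfaces here
print(out.shape)          # expected: torch.Size([1, 256, 7, 7])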

xyang35 (Contributor) commented Apr 17, 2020

Thanks for your interest in our work! How many GPUs are you using for your job? Have you tried using 1 GPU?

TriLoo (Author) commented Apr 17, 2020

My server has 8 P40 GPUs. I tried using just one GPU (cuda:0) and the same error happened.

I used

os.environ['CUDA_VISIBLE_DEVICES'] = "0"

# OR

torch.cuda.set_device(0)

to use cuda:0 only.
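
(Side note, based on my understanding of CUDA_VISIBLE_DEVICES rather than anything specific to this repo: the variable only takes effect if it is set before the process makes its first CUDA call, so I set it at the very top of the script. A minimal sketch of the ordering:)

import os
os.environ['CUDA_VISIBLE_DEVICES'] = "0"  # must be set before the first CUDA call in the process

import torch
print(torch.cuda.device_count())          # should print 1 if the mask took effect
torch.cuda.set_device(0)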

Also, I manually set the gpu_count to 0.

The top frames of the call stack stored in the core file are shown below:

#0  0x00007f6df6eb13ac in construct<_object*, _object*> (__p=0xb, this=0x7f6e4ab5c318) at /usr/include/c++/4.8.2/ext/new_allocator.h:120
#1  _S_construct<_object*, _object*> (__p=0xb, __a=...) at /usr/include/c++/4.8.2/bits/alloc_traits.h:254
#2  construct<_object*, _object*> (__p=0xb, __a=...) at /usr/include/c++/4.8.2/bits/alloc_traits.h:393
#3  emplace_back<_object*> (this=0x7f6e4ab5c318) at /usr/include/c++/4.8.2/bits/vector.tcc:96
#4  push_back (__x=<unknown type in /search/odin/songminghui/githubs/STEP/external/maskrcnn_benchmark/roi_layers/_C.cpython-36m-x86_64-linux-gnu.so, CU 0x0, DIE 0x12877a>, this=0x7f6e4ab5c318) at /usr/include/c++/4.8.2/bits/stl_vector.h:920
#5  loader_life_support (this=0x7ffd998f01f0) at /search/odin/songminghui/anaconda3/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:44
#6  pybind11::cpp_function::dispatcher (self=<optimized out>, args_in=0x7f6deee3be28, kwargs_in=0x0) at /search/odin/songminghui/anaconda3/lib/python3.6/site-packages/torch/lib/include/pybind11/pybind11.h:618

quanh1990 commented

@TriLoo Hello, I ran into the same problem. I am using one Tesla P100 GPU with 16 GB of memory. Did you solve the problem? Looking forward to your reply!

TriLoo (Author) commented Apr 27, 2020

Sorry, not yet ...
It may be caused by the PyTorch version or the gcc version, but I am not sure.

By the way, my PyTorch version is 1.0.2.
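
If it helps to compare environments, this is roughly what I check on my side (a minimal sketch; I believe torch.version.cuda and torch.compiled_with_cxx11_abi() are available on PyTorch 1.0, but treat the exact calls as assumptions):

import subprocess
import torch

print('torch:', torch.__version__)                    # PyTorch version, e.g. 1.0.x
print('built with CUDA:', torch.version.cuda)         # CUDA version the wheel was built against
print('cxx11 ABI:', torch.compiled_with_cxx11_abi())  # C++ ABI the prebuilt binaries use
print('local gcc:', subprocess.check_output(['gcc', '--version']).decode().splitlines()[0])  # gcc that compiled _C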

quanh1990 commented

> Thanks for your interest in our work! How many GPUs are you using for your job? Have you tried using 1 GPU?

@xyang35 Could you please provide your version of gcc? Thanks a lot.
