
Training error in tubedetr.py file. #4

OliverHxh opened this issue May 16, 2022 · 4 comments
OliverHxh commented May 16, 2022

I am trying to train the network on the HC-STVGv2 dataset using the command provided in the README.md file:

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --ema \
  --load=pretrained_resnet101_checkpoint.pth --combine_datasets=hcstvg --combine_datasets_val=hcstvg \
  --v2 --dataset_config config/hcstvg.json --epochs=20 --output-dir=output --batch_size=8

Unfortunately, I encountered the following error at models/tubedetr.py line 180:

  File "/root/paddlejob/workspace/STVG/TubeDETR/models/tubedetr.py", line 180, in forward                                                                                 
    tpad_src = tpad_src.view(b * n_clips, f, h, w)                                                                                                                        
RuntimeError: shape '[160, 256, 7, 12]' is invalid for input of size 2817024

For reference, the durations of the eight samples in the batch are: [100, 100, 69, 100, 65, 86, 100, 100].

I think this problem is probably related to the padding approach. Do you have any idea what causes this bug and how to fix it? Thank you very much!
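For what it's worth, below is a minimal sketch of the kind of per-sample temporal padding I have in mind (the helper name pad_to_max_clips, the list-of-tensors input, and the mask convention are my own assumptions, not code from this repository): zero-padding every sample's clip features to the longest duration in the batch would make a batched view like the one at tubedetr.py line 180 valid.

    import torch

    def pad_to_max_clips(per_sample_feats):
        """Hypothetical helper: zero-pad each sample's clip features along time
        so every sample in the batch has the same number of clips, then stack.

        per_sample_feats: list of B tensors, each of shape (t_i, C, H, W)
        returns: padded tensor of shape (B, t_max, C, H, W) and a (B, t_max) mask
        """
        t_max = max(feats.size(0) for feats in per_sample_feats)
        b = len(per_sample_feats)
        c, h, w = per_sample_feats[0].shape[1:]
        padded = per_sample_feats[0].new_zeros(b, t_max, c, h, w)
        mask = torch.ones(b, t_max, dtype=torch.bool)  # True marks padded clips
        for i, feats in enumerate(per_sample_feats):
            t_i = feats.size(0)
            padded[i, :t_i] = feats
            mask[i, :t_i] = False
        return padded, mask

    # With a uniform number of clips per sample, the batched reshape succeeds:
    # padded.view(b * t_max, c, h, w)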

antoyang (Owner) commented

All experiments I did were with a batch size of 1 video per GPU given that it already takes quite a bit of GPU memory with long videos / high resolution, so there might be some padding to fix indeed.
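As a reference, the command from the original report can be run in that setting by only lowering the per-GPU batch size to 1 video (a workaround rather than a fix for the padding):

    python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --ema \
      --load=pretrained_resnet101_checkpoint.pth --combine_datasets=hcstvg --combine_datasets_val=hcstvg \
      --v2 --dataset_config config/hcstvg.json --epochs=20 --output-dir=output --batch_size=1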

Glupapa commented Aug 17, 2022

Hi, I encountered the same issue.
Did you fix it?

hyundodo commented Apr 6, 2023

Hi, I want to increase the batch size, too.
Did you fix it?

AKASH2907 commented
Hi, was anybody able to solve this issue?
