
[Bug] FocusDetr report min size error #308

Open
icicle4 opened this issue Sep 22, 2023 · 4 comments

Comments

icicle4 commented Sep 22, 2023

When running
python tools/train_net.py --config-file projects/focus_detr/configs/focus_detr_resnet/focus_detr_r101_4scale_24ep.py --num-gpus 8
with train.init_checkpoint = detectron2://ImageNetPretrained/torchvision/R-50.pkl,
it reports the error below. My dataset is the default COCO dataset.

Traceback (most recent call last):
  File "tools/train_net.py", line 313, in <module>
    args=(args,),
  File "/root/detrex/detectron2/detectron2/engine/launch.py", line 79, in launch
    daemon=False,
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 4 terminated with the following error:
Traceback (most recent call last):
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/root/detrex/detectron2/detectron2/engine/launch.py", line 126, in _distributed_worker
    main_func(*args)
  File "/root/detrex/tools/train_net.py", line 302, in main
    do_train(args, cfg)
  File "/root/detrex/tools/train_net.py", line 275, in do_train
    trainer.train(start_iter, cfg.train.max_iter)
  File "/root/detrex/detectron2/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/root/detrex/tools/train_net.py", line 101, in run_step
    loss_dict = self.model(data)
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/detrex/projects/focus_detr/modeling/focus_detr.py", line 269, in forward
    loss_dict = self.criterion(output, targets, dn_meta)
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/detrex/projects/focus_detr/modeling/dn_criterion.py", line 43, in forward
    losses = super(FOCUS_DETRCriterion, self).forward(outputs, targets)
  File "/root/detrex/projects/focus_detr/modeling/two_stage_criterion.py", line 87, in forward
    class_targets = self.target_layer(outputs['srcs'], batch_boxes, batch_classes)
  File "/root/miniconda3/envs/detrex/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/detrex/projects/focus_detr/modeling/foreground_supervision.py", line 75, in forward
    self.limit_range[level])
  File "/root/detrex/projects/focus_detr/modeling/foreground_supervision.py", line 129, in _gen_level_targets
    areas_min_ind = torch.min(areas, dim=-1)[1]  # [batch_size,h*w]
IndexError: min(): Expected reduction dim 2 to have non-zero size.
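
For reference, the failing reduction can be reproduced in isolation. This is a minimal sketch under the assumption that areas has shape [batch_size, h*w, num_gt] and that an image in the batch ends up with zero ground-truth boxes (num_gt == 0), which leaves the reduction dim empty:

import torch

# Hypothetical shapes: batch of 2, 100 feature locations, 0 ground-truth boxes.
batch_size, hw, num_gt = 2, 100, 0
areas = torch.rand(batch_size, hw, num_gt)   # shape [2, 100, 0]

# Same call as foreground_supervision.py line 129; the last dim has size 0, so this raises:
# IndexError: min(): Expected reduction dim 2 to have non-zero size.
areas_min_ind = torch.min(areas, dim=-1)[1]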
@baojunqi

Same problem here. Have you addressed it?

emotionee commented Dec 28, 2023

I have also encountered this problem. It seems to be related to the batch_sz setting. Have you found a solution?

@baojunqi

I have also encountered this problem. It seems to be related to the batch_sz setting. Have you found a solution?

How big is your batch size? I trained my own dataset with DETR using a batch size of 16 and it worked well. However, when I tried to train Focus-DETR with a batch size of 8 on 4 A4000 GPUs, it failed.

@SmalWhite

Have you solved this problem? I'm running into the same issue.
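
No fix is posted in this thread, but the traceback points at the ground-truth dimension of areas being empty, which typically means an image reached the criterion with no boxes. A hedged workaround sketch (the function name and the background handling are assumptions, not the actual detrex code):

import torch

def min_area_index_or_background(areas: torch.Tensor) -> torch.Tensor:
    # Workaround sketch, not the detrex implementation: when an image has no
    # ground-truth boxes, areas has shape [batch_size, h*w, 0] and torch.min over
    # the last dim would raise, so return dummy indices instead. The caller would
    # still need to mark every location as background in that case.
    if areas.shape[-1] == 0:
        return torch.zeros(areas.shape[:-1], dtype=torch.long, device=areas.device)
    return torch.min(areas, dim=-1)[1]

It may also be worth checking whether the training set, or the crop augmentation, can produce images with no remaining boxes, since that matches the empty reduction dim in the traceback.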
