[Feature] Support RT-DETR #11395

Open: wants to merge 19 commits into base: dev-3.x


flytocc commented Jan 17, 2024

Motivation

Support RT-DETR as discussed in this issue

Referred to the following repositories for implementation details:

Modification

  1. Added support for RT-DETR with variants (r18vd, r34vd, r50vd, r101vd).

  2. Added support for random sizes and interpolations in BatchSyncRandomResize.

  3. Modified ResNetV1d for depth 18 and 34.

  4. Added a specialized varifocal loss, RTDETRVarifocalLoss.
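For context on item 4, here is a minimal sketch of the standard varifocal loss as defined in VarifocalNet, written for a single prediction in plain Python. It is not the code from this PR: `RTDETRVarifocalLoss` is described as a specialized variant and may differ in how the target score is built. The function name `varifocal_loss` and the defaults `alpha=0.75`, `gamma=2.0` are assumptions for illustration.

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0, eps=1e-12):
    """Standard varifocal loss for one predicted score (illustrative sketch).

    p: predicted IoU-aware classification score in (0, 1).
    q: target score; the IoU with the matched box for positives, 0 for negatives.
    """
    # Clamp the prediction so the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    if q > 0:
        # Positives: binary cross-entropy against q, weighted by q itself.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Negatives: focal-style down-weighting by alpha * p ** gamma.
    return -alpha * (p ** gamma) * math.log(1.0 - p)


# A positive with a high-quality target is penalised more when the score is low.
loss_good = varifocal_loss(p=0.8, q=0.8)
loss_bad = varifocal_loss(p=0.2, q=0.8)
```

Weighting positives by `q` is the design choice that makes the classification score track localization quality, which is why the loss is a natural fit for query-based detectors that rank boxes by score.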

BC-breaking

When the depth is set to 18 or 34 in ResNetV1d, a downsample with conv_bn is now added to layer1.

Checklist

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMPreTrain.
  • The documentation has been modified accordingly, like docstring or example tutorials.

@flytocc
Author

flytocc commented Jan 17, 2024

Reproduction

All results were trained on 1 GPU (V100) with a total batch size of 16.

r18vd with amp

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.465
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.639
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.503
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.286
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.501
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.625
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.689
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 ] = 0.692
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.692
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.506
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.733
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.872

r18vd without amp

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.466
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.640
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.505
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.289
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.498
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.629
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.689
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 ] = 0.692
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.692
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.496
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.733
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.864

r50vd with amp

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.531
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.714
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.575
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.351
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.578
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.700
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.722
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 ] = 0.724
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.724
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.549
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.766
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.883
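To compare runs such as the AMP and non-AMP r18vd results above without reading the logs by eye, the summary lines can be parsed and diffed. The following is a minimal sketch using only the standard library; the helper name `parse_summary` and the regular expression are assumptions written against the line format shown in this thread, not part of MMDetection or pycocotools.

```python
import re

# Matches one COCO-style summary line as printed in the logs above.
LINE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ IoU=([\d.:]+)\s*\| "
    r"area=\s*(\w+) \| maxDets=\s*(\d+) \] = ([\d.]+)"
)


def parse_summary(text):
    """Return {(metric, iou, area, max_dets): value} for every summary line."""
    metrics = {}
    for match in LINE.finditer(text):
        kind, iou, area, max_dets, value = match.groups()
        metrics[(kind, iou, area, int(max_dets))] = float(value)
    return metrics


# First two lines of the r18vd logs quoted above.
with_amp = """
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.465
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.639
"""
without_amp = """
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.466
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.640
"""

a = parse_summary(with_amp)
b = parse_summary(without_amp)
# Difference (without AMP minus with AMP), rounded to the precision of the logs.
deltas = {key: round(b[key] - a[key], 3) for key in a}
```

On these two lines the difference is 0.001 AP in each case, consistent with AMP having little effect on accuracy for this variant.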

@hhaAndroid
Collaborator

@flytocc Thank you very much. I would like to confirm why a previous pull request (PR) could not align the precision. Was something incorrect there?

@flytocc
Author

flytocc commented Jan 18, 2024

There are many differences in detail, and here are the ones I think are more important (r50vd arch for example):

| flytocc/rtdetr | nijkah/rtdetr |
| --- | --- |
| norm_decay_mult=0 | default (1) |
| BatchSyncRandomResize | RandomChoiceResize |
| MinIoURandomCrop | RandomCrop |
| Init encoder with PyTorch-like uniform | Init HybridEncoder with mmcv-like normal |
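The first difference concerns the weight-decay multiplier applied to normalization-layer parameters. A minimal sketch of how this is typically expressed in an MMEngine-style optimizer config follows, assuming the `paramwise_cfg` convention. The optimizer type, learning rate, and weight decay are illustrative placeholders, not values taken from this PR.

```python
# Illustrative MMEngine-style config fragment; lr and weight_decay are placeholders.
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='AdamW', lr=1e-4, weight_decay=1e-4),
    # norm_decay_mult scales the weight decay of normalization layers.
    # 0 disables weight decay on them; leaving it unset keeps the base value.
    paramwise_cfg=dict(norm_decay_mult=0),
)

# Effective weight decay seen by normalization-layer parameters.
base_decay = optim_wrapper['optimizer']['weight_decay']
norm_decay = base_decay * optim_wrapper['paramwise_cfg']['norm_decay_mult']
```

With the multiplier at 0, the scale and shift parameters of normalization layers are not regularized, while all other parameters still receive the base weight decay.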

@flytocc
Author

flytocc commented Feb 5, 2024

  • The training (w. amp) AP of r50vd arch fluctuates between 52.9 and 53.1.

  • Random interpolation has almost no effect on AP.

@ychensu

ychensu commented Feb 20, 2024

Hello, I tried your RT-DETR implementation and found that training does not converge on my dataset: at around epoch 20, mAP suddenly drops to 0. I do not see this problem with the official RT-DETR code. Both runs used 4 GPUs and a batch size of 4.

@flytocc
Author

flytocc commented Feb 20, 2024

So far I have only tested on the COCO dataset. You could try checking your data augmentation.

@ychensu

ychensu commented Feb 20, 2024

I kept only the data augmentation that resizes to 640, and it still does not work: mAP drops at around epoch 21, even after I increased the learning rate. I am using the Total-Text text detection dataset, and this problem has only appeared with your code.

[image]

@ychensu

ychensu commented Feb 20, 2024

Correction to my previous comment: the problem persists even when I lower the learning rate or increase the batch size.

@flytocc
Author

flytocc commented Feb 20, 2024

@ychensu How about opening an issue at flytocc/mmdetection?

@mmeendez8
Contributor

Is this currently blocked?

5 participants