
Applying MogaNet to Object Detection

This repo is a PyTorch implementation of applying MogaNet to object detection and instance segmentation with Mask R-CNN and RetinaNet on COCO. The code is based on MMDetection. For more details, see Efficient Multi-order Gated Aggregation Network (ICLR 2024).

Note

Please note that we simply follow the hyper-parameters of PVT and ConvNeXt, which may not be optimal for MogaNet. Feel free to tune them for better performance.

Environment Setup

Install MMDetection from source code, or follow the steps below. This experiment requires MMDetection>=2.19.0; we reproduced the results with MMDetection v2.26.0 and PyTorch 1.10.

pip install openmim
mim install mmcv-full
pip install mmdet
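
To match our reproduction environment, a pinned install along these lines should work; the exact mmcv-full pin is an assumption, as any release compatible with MMDetection v2.26.0 will do:

pip install torch==1.10.0 torchvision==0.11.1  # the PyTorch version we reproduced results with
pip install openmim
mim install mmcv-full==1.7.0  # assumption: any mmcv-full release compatible with mmdet v2.26.0 works
pip install mmdet==2.26.0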

Install Apex (optional) for mixed-precision training with PyTorch<=1.6.0:

git clone https://github.com/NVIDIA/apex
cd apex
python setup.py install --cpp_ext --cuda_ext --user

By default, we run experiments with fp32 or fp16 (Apex). If you would like to disable Apex, change the runner type to EpochBasedRunner and comment out the following code block in the configuration files:

fp16 = None
optimizer_config = dict(
    type="DistOptimizerHook",
    update_interval=1,
    grad_clip=None,
    coalesce=True,
    bucket_size_mb=-1,
    use_fp16=True,
)
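
For reference, this is a minimal sketch of what those two entries look like with Apex disabled; it mirrors standard MMDetection 2.x defaults rather than code from this repo:

runner = dict(type='EpochBasedRunner', max_epochs=12)  # standard epoch-based runner; max_epochs follows the 1x schedule
optimizer_config = dict(grad_clip=None)  # default MMDetection hook in place of DistOptimizerHook; fp16 block removed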

Note: Since the MogaNet backbone code for detection, segmentation, and pose estimation lives in a single file, it also works with MMSegmentation and MMPose through @BACKBONES.register_module(). Install MMSegmentation or MMPose if you want to use those frameworks.
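
Because of that registration, an MMSegmentation or MMPose config can reference the backbone by its type name. The sketch below is hypothetical: the type string, arch, and out_indices keywords are assumptions, so check the backbone file in this repo for the actual signature:

model = dict(
    backbone=dict(
        type='MogaNet',            # assumption: the name registered via @BACKBONES.register_module()
        arch='small',              # assumption: variant selector (xt/t/s/b/l)
        out_indices=(0, 1, 2, 3),  # multi-scale feature maps for the neck/decoder head
        init_cfg=dict(type='Pretrained', checkpoint='/path/to/imagenet_pretrained.pth'),
    ),
)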

Data preparation

Download COCO2017 and prepare COCO experiments according to the guidelines in MMDetection.
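
After downloading and unzipping, the standard layout that MMDetection expects is the following; place or symlink the data under the repo's data/ folder:

data/coco/
├── annotations/
│   ├── instances_train2017.json
│   └── instances_val2017.json
├── train2017/    # training images
└── val2017/      # validation images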

(back to top)

Results and models on COCO

Notes: All the models can also be downloaded from Baidu Cloud (z8mf) at MogaNet/COCO_Detection. We perform object detection experiments with RetinaNet under the 1x training setting, and detection plus instance segmentation experiments with Mask R-CNN and Cascade Mask R-CNN under the 1x or MS 3x (multi-scale) training settings. The params (M) and FLOPs (G) are measured by get_flops at a 1280 $\times$ 800 input resolution.

python get_flops.py /path/to/config --shape 1280 800

MogaNet + RetinaNet

| Method | Backbone | Pretrain | Params | FLOPs | Lr schd | box mAP | Config | Download |
|---|---|---|---|---|---|---|---|---|
| RetinaNet | MogaNet-XT | ImageNet-1K | 12.1M | 167.2G | 1x | 39.7 | config | log / model |
| RetinaNet | MogaNet-T | ImageNet-1K | 14.4M | 173.4G | 1x | 41.4 | config | log / model |
| RetinaNet | MogaNet-S | ImageNet-1K | 35.1M | 253.0G | 1x | 45.8 | config | log / model |
| RetinaNet | MogaNet-B | ImageNet-1K | 53.5M | 354.5G | 1x | 47.7 | config | log / model |
| RetinaNet | MogaNet-L | ImageNet-1K | 92.4M | 476.8G | 1x | 48.7 | config | log / model |

MogaNet + Mask R-CNN

| Method | Backbone | Pretrain | Params | FLOPs | Lr schd | box mAP | mask mAP | Config | Download |
|---|---|---|---|---|---|---|---|---|---|
| Mask R-CNN | MogaNet-XT | ImageNet-1K | 22.8M | 185.4G | 1x | 40.7 | 37.6 | config | log / model |
| Mask R-CNN | MogaNet-T | ImageNet-1K | 25.0M | 191.7G | 1x | 42.6 | 39.1 | config | log / model |
| Mask R-CNN | MogaNet-S | ImageNet-1K | 45.0M | 271.6G | 1x | 46.6 | 42.2 | config | log / model |
| Mask R-CNN | MogaNet-B | ImageNet-1K | 63.4M | 373.1G | 1x | 49.0 | 43.8 | config | log / model |
| Mask R-CNN | MogaNet-L | ImageNet-1K | 102.1M | 495.3G | 1x | 49.4 | 44.2 | config | log / model |
| Mask R-CNN | MogaNet-T | ImageNet-1K | 25.0M | 191.7G | MS 3x | 45.3 | 40.7 | config | log / model |
| Mask R-CNN | MogaNet-S | ImageNet-1K | 45.0M | 271.6G | MS 3x | 48.5 | 43.1 | config | log / model |
| Mask R-CNN | MogaNet-B | ImageNet-1K | 63.4M | 373.1G | MS 3x | 50.3 | 44.4 | config | log / model |
| Mask R-CNN | MogaNet-L | ImageNet-1K | 102.1M | 495.3G | MS 3x | 50.6 | 44.6 | config | log / model |

MogaNet + Cascade Mask R-CNN

| Method | Backbone | Pretrain | Params | FLOPs | Lr schd | box mAP | mask mAP | Config | Download |
|---|---|---|---|---|---|---|---|---|---|
| Cascade Mask R-CNN | MogaNet-S | ImageNet-1K | 77.9M | 405.4G | MS 3x | 51.4 | 44.9 | config | log / model |
| Cascade Mask R-CNN | MogaNet-S | ImageNet-1K | 82.8M | 750.2G | GIOU+MS 3x | 51.7 | 45.1 | config | log / model |
| Cascade Mask R-CNN | MogaNet-B | ImageNet-1K | 101.2M | 851.6G | GIOU+MS 3x | 52.6 | 46.0 | config | log / model |
| Cascade Mask R-CNN | MogaNet-L | ImageNet-1K | 139.9M | 973.8G | GIOU+MS 3x | 53.3 | 46.1 | config | - |

Demo

We provide demos following MMDetection. Use inference_demo or run the following script:

cd demo
python image_demo.py demo.png ../configs/moganet/mask_rcnn_moganet_small_fpn_1x_coco.py ../../work_dirs/checkpoints/mask_rcnn_moganet_small_fpn_1x_coco.pth --out-file pred.png
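
The same demo can also be run through MMDetection's Python API (init_detector and inference_detector are standard mmdet 2.x calls; the config and checkpoint paths are the ones from the command above):

from mmdet.apis import inference_detector, init_detector

config = '../configs/moganet/mask_rcnn_moganet_small_fpn_1x_coco.py'
checkpoint = '../../work_dirs/checkpoints/mask_rcnn_moganet_small_fpn_1x_coco.pth'

# build the detector from the config and load the trained weights
model = init_detector(config, checkpoint, device='cuda:0')

# run inference on one image and save the visualized prediction
result = inference_detector(model, 'demo.png')
model.show_result('demo.png', result, out_file='pred.png')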

Training

We train the model on a single node with 8 GPUs (a total batch size of 16) by default. Start training with a config as follows:

PORT=29001 bash dist_train.sh /path/to/config 8
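
For example, to train the Mask R-CNN MogaNet-S 1x model (the config path follows the demo above):

PORT=29001 bash dist_train.sh configs/moganet/mask_rcnn_moganet_small_fpn_1x_coco.py 8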

Evaluation

To evaluate the trained model on a single node with 8 GPUs, run:

bash dist_test.sh /path/to/config /path/to/checkpoint 8 --out results.pkl --eval bbox # or `bbox segm`
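
For instance, to evaluate the Mask R-CNN MogaNet-S 1x checkpoint on both box and mask AP:

bash dist_test.sh configs/moganet/mask_rcnn_moganet_small_fpn_1x_coco.py /path/to/checkpoint 8 --out results.pkl --eval bbox segm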

Citation

If you find this repository helpful, please consider citing:

@inproceedings{iclr2024MogaNet,
  title={Efficient Multi-order Gated Aggregation Network},
  author={Siyuan Li and Zedong Wang and Zicheng Liu and Cheng Tan and Haitao Lin and Di Wu and Zhiyuan Chen and Jiangbin Zheng and Stan Z. Li},
  booktitle={International Conference on Learning Representations},
  year={2024}
}

Acknowledgment

Our implementation is mainly based on the following codebases. We gratefully thank the authors for their wonderful work.

(back to top)