Swin Transformer for Object Detection

Introduction

This directory contains configs and results for Swin Transformer object detectors. Most of the configs and results are based on the official repository.

When training new models, consider using mmdet's own configs instead.

Results and Models

ATSS

Backbone	Pretrain	Lr schd	box AP	config	model
Swin-T	ImageNet-1K	1x	43.7	config	github

Mask R-CNN

Backbone	Pretrain	Lr schd	box AP	mask AP	#params	FLOPs	config	log	model
Swin-T	ImageNet-1K	1x	43.7	39.8	48M	267G	config	github/baidu	github/baidu
Swin-T	ImageNet-1K	3x	46.0	41.6	48M	267G	config	github/baidu	github/baidu
Swin-S	ImageNet-1K	3x	48.5	43.3	69M	359G	config	github/baidu	github/baidu

Cascade Mask R-CNN

Backbone	Pretrain	Lr schd	box AP	mask AP	#params	FLOPs	config	log	model
Swin-T	ImageNet-1K	1x	48.1	41.7	86M	745G	config	github/baidu	github/baidu
Swin-T	ImageNet-1K	3x	50.4	43.7	86M	745G	config	github/baidu	github/baidu
Swin-S	ImageNet-1K	3x	51.9	45.0	107M	838G	config	github/baidu	github/baidu
Swin-B	ImageNet-1K	3x	51.9	45.0	145M	982G	config	github/baidu	github/baidu

Notes:

Pre-trained models can be downloaded from Swin Transformer for ImageNet Classification.
The access code for the Baidu links is swin.

Usage

Inference

# single-gpu testing
python tools/test.py <CONFIG_FILE> <DET_CHECKPOINT_FILE> --eval bbox segm

# multi-gpu testing
tools/dist_test.sh <CONFIG_FILE> <DET_CHECKPOINT_FILE> <GPU_NUM> --eval bbox segm

Training

To train a detector with pre-trained models, run:

# single-gpu training
python tools/train.py <CONFIG_FILE> --cfg-options model.pretrained=<PRETRAIN_MODEL> [model.backbone.use_checkpoint=True] [other optional arguments]

# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> --cfg-options model.pretrained=<PRETRAIN_MODEL> [model.backbone.use_checkpoint=True] [other optional arguments]

For example, to train a Cascade Mask R-CNN model with a Swin-T backbone on 8 GPUs, run:

tools/dist_train.sh configs/swin_original/cascade_mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_giou_4conv1f_adamw_3x_coco.py 8 --cfg-options model.pretrained=<PRETRAIN_MODEL>

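Note: use_checkpoint enables gradient checkpointing in the backbone, which saves GPU memory at the cost of extra computation during the backward pass. Please refer to this page for more details.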
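
The --cfg-options arguments in the commands above use dotted keys such as model.backbone.use_checkpoint to address nested config entries. The sketch below illustrates that idea with plain dictionaries. It is not MMCV's actual implementation: apply_dotted_overrides and base_cfg are hypothetical names introduced only for illustration, and the config is a heavily trimmed stand-in.

```python
from copy import deepcopy
from typing import Any, Dict


def apply_dotted_overrides(cfg: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of cfg with each 'a.b.c' key in overrides set at its nested path."""
    merged = deepcopy(cfg)  # leave the caller's config untouched
    for dotted_key, value in overrides.items():
        node = merged
        *parents, leaf = dotted_key.split(".")
        for name in parents:
            # Create intermediate dicts on demand so new keys can be introduced.
            node = node.setdefault(name, {})
        node[leaf] = value
    return merged


# Hypothetical, heavily trimmed stand-in for a detector config.
base_cfg = {"model": {"pretrained": None, "backbone": {"type": "SwinTransformer"}}}

cfg = apply_dotted_overrides(
    base_cfg,
    {
        "model.pretrained": "<PRETRAIN_MODEL>",
        "model.backbone.use_checkpoint": True,
    },
)
print(cfg["model"]["pretrained"])
print(cfg["model"]["backbone"])
```

Each dotted key sets one leaf value and leaves sibling entries, such as the backbone type here, unchanged.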

Mixed Precision Training

The current configs use mixed precision training via MMCV by default. Please install PyTorch >= 1.6.0 to use torch.cuda.amp.
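
As a quick sanity check before training, the version requirement above can be tested programmatically. This is a minimal sketch: supports_native_amp is a hypothetical helper, not part of PyTorch or MMCV, and it assumes the version string starts with "major.minor" (as in "1.6.0" or "1.7.1+cu110").

```python
import re


def supports_native_amp(torch_version: str) -> bool:
    """Return True if the given PyTorch version string is at least 1.6."""
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Compare as integers so that "1.10" is correctly treated as newer than "1.6".
    return (major, minor) >= (1, 6)


print(supports_native_amp("1.5.1+cu101"))
print(supports_native_amp("1.10.0"))
```

In practice the argument would be torch.__version__ from the environment used for training.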

If you observe a performance difference compared with apex (used by the original authors), please raise an issue. Otherwise, the apex-specific code will eventually be removed.

Using apex (optional)

To install apex, run:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Modify configs with the following code:

runner = dict(type='EpochBasedRunnerAmp', max_epochs=36)
fp16 = None
optimizer_config = dict(
    type='ApexOptimizerHook',
    update_interval=1,
    grad_clip=None,
    coalesce=True,
    bucket_size_mb=-1,
    use_fp16=True,
)

Citation

@article{liu2021Swin,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  journal={arXiv preprint arXiv:2103.14030},
  year={2021}
}
