GitHub - iduta/pyconv: Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition (https://arxiv.org/pdf/2006.11538.pdf)

Pyramidal Convolution

This is the PyTorch implementation of our paper "Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition". (Note that this is the code for image recognition on ImageNet. For semantic image segmentation/parsing refer to this repository: https://github.com/iduta/pyconvsegnet)

The models trained on ImageNet can be found here.

PyConv improves recognition accuracy over the ResNet baseline at every depth reported below (see the paper for details).

The accuracy on ImageNet (using the default training settings):

Network	50-layers	101-layers	152-layers
ResNet	76.12% (model)	78.00% (model)	78.45% (model)
PyConvHGResNet	78.48% (model)	79.22% (model)	79.36% (model)
PyConvResNet	77.88% (model)	79.01% (model)	79.52% (model)
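To make the gains over the baseline explicit, the short Python sketch below computes the difference between each PyConv variant and ResNet at the same depth, using only the numbers from the table above. The dictionary layout and the helper name `gain_over_baseline` are illustrative choices, not part of the repository.

```python
# Accuracy (%) copied from the table above, keyed by network and depth.
ACCURACY = {
    "ResNet": {50: 76.12, 101: 78.00, 152: 78.45},
    "PyConvHGResNet": {50: 78.48, 101: 79.22, 152: 79.36},
    "PyConvResNet": {50: 77.88, 101: 79.01, 152: 79.52},
}


def gain_over_baseline(network, depth, baseline="ResNet"):
    """Return the accuracy gain (in percentage points) over the baseline."""
    return round(ACCURACY[network][depth] - ACCURACY[baseline][depth], 2)


if __name__ == "__main__":
    for network in ("PyConvHGResNet", "PyConvResNet"):
        for depth in (50, 101, 152):
            gain = gain_over_baseline(network, depth)
            print(f"{network}-{depth}: +{gain:.2f}")
```

For example, `gain_over_baseline("PyConvResNet", 50)` gives 1.76 percentage points.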

The accuracy on ImageNet can be significantly improved with more complex training settings (for instance, additional data augmentation (CutMix), a batch size increased to 1024, a learning rate of 0.4, a cosine scheduler over 300 epochs, and mixed precision to speed up training):

Network	test crop: 224×224	test crop: 320×320	Model
PyConvResNet-50 (+augment)	79.44%	80.59%	(model)
PyConvResNet-101 (+augment)	80.58%	81.49%	(model)
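As a rough illustration of the cosine scheduler mentioned above, the sketch below computes a per-epoch learning rate that decays from 0.4 to 0 over 300 epochs. It assumes plain cosine annealing with no warmup and a final learning rate of zero; the text above does not specify these details, and the function name `cosine_lr` is hypothetical rather than taken from the repository.

```python
import math


def cosine_lr(epoch, base_lr=0.4, total_epochs=300, min_lr=0.0):
    """Cosine-annealed learning rate for a given (0-indexed) epoch.

    Assumptions (not confirmed by the repository): no warmup, and the
    rate reaches min_lr when epoch == total_epochs.
    """
    if not 0 <= epoch <= total_epochs:
        raise ValueError("epoch must be in [0, total_epochs]")
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return min_lr + (base_lr - min_lr) * cosine


if __name__ == "__main__":
    for epoch in (0, 75, 150, 225, 300):
        print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.4f}")
```

In PyTorch, `torch.optim.lr_scheduler.CosineAnnealingLR` provides a comparable schedule; whether the authors used it for these results is not stated here.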

Requirements

Install PyTorch and prepare the ImageNet dataset by following the official PyTorch ImageNet training code.

A fast alternative (without the need to install PyTorch and other deep learning libraries) is to use NVIDIA-Docker; we used this container image.

Training

To train a model (for instance, PyConvResNet with 50 layers) using DataParallel, run main.py. You also need to provide --result_path (the directory in which to save the results and logs) and --data (the path to the ImageNet dataset):

result_path=/your/path/to/save/results/and/logs/
mkdir -p "${result_path}"
python main.py \
--data /your/path/to/ImageNet/dataset/ \
--result_path "${result_path}" \
--arch pyconvresnet \
--model_depth 50
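
If you prefer to launch training from Python, for example to sweep over model depths, the command above can be assembled programmatically. The sketch below only creates the results directory and builds the argument list; the helper name `build_train_command` is hypothetical, and the flags are exactly those shown in the command above.

```python
import sys
import tempfile
from pathlib import Path


def build_train_command(data_dir, result_dir, arch="pyconvresnet", depth=50):
    """Create result_dir if needed and return the argv list for main.py."""
    result_path = Path(result_dir)
    result_path.mkdir(parents=True, exist_ok=True)  # equivalent to `mkdir -p`
    return [
        sys.executable, "main.py",
        "--data", str(data_dir),
        "--result_path", str(result_path),
        "--arch", arch,
        "--model_depth", str(depth),
    ]


if __name__ == "__main__":
    # A temporary directory stands in for the real results path in this demo.
    demo_results = Path(tempfile.mkdtemp()) / "pyconvresnet50"
    cmd = build_train_command("/your/path/to/ImageNet/dataset/", demo_results)
    print(" ".join(cmd))
    # To actually start training, run from the repository root:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Returning a list rather than a single string avoids shell quoting issues when paths contain spaces.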

To train using multi-processing distributed data parallel training, follow the instructions in the official PyTorch ImageNet training code.

Citation

If you find our work useful, please consider citing:

@article{duta2020pyramidal,
  author  = {Ionut Cosmin Duta and Li Liu and Fan Zhu and Ling Shao},
  title   = {Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition},
  journal = {arXiv preprint arXiv:2006.11538},
  year    = {2020},
}
