DeiT

PaddlePaddle reimplementation of facebookresearch's deit repository for the DeiT model, released with the paper Training data-efficient image transformers & distillation through attention. We only reimplement training for ViT, excluding distillation.

Requirements

PaddlePaddle 2.4 is required to enable the newest features. For installation instructions, refer to installation.md.
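
For a quick start, an install along these lines should work (a sketch only: the exact package variant and the paddlepaddle-gpu==2.4.2 version pin are assumptions that depend on your CUDA setup, so defer to installation.md):

# Install PaddlePaddle with GPU support (version pin is illustrative; match your CUDA toolkit).
python -m pip install paddlepaddle-gpu==2.4.2
# Verify that Paddle was installed correctly and can see your GPUs.
python -c "import paddle; paddle.utils.run_check()"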

How to Train

# Note: set the following environment variables,
# then run this script on each node.
export PADDLE_NNODES=2
export PADDLE_MASTER="xxx.xxx.xxx.xxx:12538"
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

python -m paddle.distributed.launch \
    --nnodes=$PADDLE_NNODES \
    --master=$PADDLE_MASTER \
    --devices=$CUDA_VISIBLE_DEVICES \
    plsc-train \
    -c ./configs/DeiT_base_patch16_224_in1k_2n16c_dp_fp16o2.yaml
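
The command above assumes the two-node, 16-GPU layout the 2n16c config was tuned for. For a single 8-GPU node, a launch along the following lines should work (a sketch: we assume the config file can be reused as-is, but the global batch size will halve unless the per-GPU batch size in the yaml is adjusted, so verify against the config):

# Single-node variant (run on one machine only).
export PADDLE_NNODES=1
export PADDLE_MASTER="127.0.0.1:12538"
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

python -m paddle.distributed.launch \
    --nnodes=$PADDLE_NNODES \
    --master=$PADDLE_MASTER \
    --devices=$CUDA_VISIBLE_DEVICES \
    plsc-train \
    -c ./configs/DeiT_base_patch16_224_in1k_2n16c_dp_fp16o2.yaml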

How to Evaluate

# [Optional] Download checkpoint
mkdir -p pretrained/
wget -O ./pretrained/DeiT_base_patch16_224_in1k_2n16c_dp_fp16o2.pdparams https://plsc.bj.bcebos.com/models/deit/v2.4/DeiT_base_patch16_224_in1k_2n16c_dp_fp16o2.pdparams
export PADDLE_NNODES=1
export PADDLE_MASTER="127.0.0.1:12538"
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python -m paddle.distributed.launch \
    --nnodes=$PADDLE_NNODES \
    --master=$PADDLE_MASTER \
    --devices=$CUDA_VISIBLE_DEVICES \
    plsc-eval \
    -c ./configs/DeiT_base_patch16_224_in1k_2n16c_dp_fp16o2.yaml \
    -o Global.pretrained_model=pretrained/DeiT_base_patch16_224_in1k_2n16c_dp_fp16o2 \
    -o Global.finetune=False
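
Before launching evaluation, you can sanity-check the downloaded weights with paddle.load (a minimal sketch, assuming the .pdparams file holds a flat name-to-tensor state dict, which is the usual Paddle layout):

python - <<'EOF'
# Sketch: load the checkpoint downloaded above and list a few parameter names.
import paddle
state = paddle.load("pretrained/DeiT_base_patch16_224_in1k_2n16c_dp_fp16o2.pdparams")
print(len(state), "tensors; sample keys:", list(state)[:3])
EOF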

Other Configurations

We provide more directly runnable configurations; see DeiT Configurations.

Models

| Model | DType | Phase | Dataset | Configs | GPUs | Img/sec | Top1 Acc | Pre-trained checkpoint | Fine-tuned checkpoint | Log |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ViT-B_16_224 | FP32 | pretrain | ImageNet2012 | config | A100*N1C8 | 2780 | 0.81870 | download | - | log |
| ViT-B_16_224 | FP16_O2 | pretrain | ImageNet2012 | config | A100*N1C8 | 3169 | 0.82098 | download | - | log |
| ViT-B_16_224 | FP16_O2 | pretrain | ImageNet2012 | config | A100*N2C16 | 6514 | 0.81831 | download | - | log |

Citations

@InProceedings{pmlr-v139-touvron21a,
  title =     {Training data-efficient image transformers \& distillation through attention},
  author =    {Touvron, Hugo and Cord, Matthieu and Douze, Matthijs and Massa, Francisco and Sablayrolles, Alexandre and Jegou, Herve},
  booktitle = {International Conference on Machine Learning},
  pages =     {10347--10357},
  year =      {2021},
  volume =    {139},
  month =     {July}
}