Implementation of a Paper related to Vision Transformer
Image captioning with pretrained encoder on MSCOCO
A repository of Agent-Attention-Models built on the PyTorch framework; it can be used to train on your own image datasets.
A repository for SBCFormer-pytorch-model; it can be used to train on your own dataset.
All assignments for the course Statistical Methods in AI at IIITH, Monsoon 2024
This repository covers the downstream task of face mask classification, performed on a self-curated custom dataset with various state-of-the-art deep learning models such as ViT, BEiT, DeiT, LeViT, ConvNeXt, VGG16, EfficientNetV2, RegNet and MobileNetV3.
Image classification with DeiT model, including data preprocessing, k-fold CV, early stopping and model saving.
Vision Transformer for TensorFlow 2
Final assignment in the NLP course at the Technion (IEM097215). In this assignment we propose a novel architecture to handle both Text-to-Image translation and Image-to-Text translation tasks on paired data, using a unified architecture of transformers and CNNs and enforcing cycle consistency.
A repository for DeiT-pytorch-model; it can be used to train on your own image dataset.
This project analyzes several vision transformers, examines their distinctive properties, and evaluates their performance on a common dataset. The study aims to identify the strengths and shortcomings of the various transformer designs created for computer vision tasks.
[CVPR 2024] Code for our Paper "DeiT-LT: Distillation Strikes Back for Vision Transformer training on Long-Tailed Datasets"
(Unofficial) PyTorch implementation of Training Vision Transformers for Image Retrieval (El-Nouby, Alaaeldin, et al. 2021).
An image model zoo for PaddlePaddle.
Paddle Large Scale Classification Tools, supporting ArcFace, CosFace, PartialFC, and Data Parallel + Model Parallel. Models include ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, CAE.
HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision
PASSL includes self-supervised image algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as basic vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.