
An implementation from scratch in PyTorch of the vision transformer architecture from the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale".


eyess-glitch/Vision-transformer


Vision-transformer

A from-scratch PyTorch implementation of the Vision Transformer (ViT) architecture from the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale". For testing purposes, the model was trained on the following Kaggle dataset: https://www.kaggle.com/datasets/mahmoudreda55/satellite-image-classification. To compensate for the small size of the dataset, data augmentation (specifically the AugMix algorithm) was applied during training.
