
Vision Transformer

input

(from https://pixabay.com/photos/labrador-retriever-dog-pet-labrador-6244939/)

output

usage

The ONNX and prototxt files are downloaded automatically on the first run, so an Internet connection is required at that time.

To run on the sample image:

$ python vit.py
(ex on CPU)  $ python vit.py -e 0
(ex on BLAS) $ python vit.py -e 1
(ex on GPU)  $ python vit.py -e 2

To use your own input image, pass its path with the --input option.
Use the --savepath option to change the name of the output file.

$ python3 vit.py --input IMAGE_PATH --savepath SAVE_IMAGE_PATH
$ python3 vit.py -i IMAGE_PATH -s SAVE_IMAGE_PATH

To run on a video instead of an image, pass the video path with the --video option.

$ python3 vit.py --video VIDEO_PATH --savepath SAVE_VIDEO_PATH
$ python3 vit.py -v VIDEO_PATH -s SAVE_VIDEO_PATH
(ex) $ python3 vit.py --video input.mp4 --savepath output.mp4
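
The options above can be pictured with a small argparse sketch. This is not the actual parser in vit.py, only a hypothetical illustration of how the short and long flags might map onto each other. The long name `--env_id`, the defaults `input.jpg` and `output.png`, and the helper names `build_parser` and `describe` are assumptions made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(description="Sketch of vit.py-style options")
    # Default file names are assumed from the files shipped in this directory.
    parser.add_argument("-i", "--input", default="input.jpg",
                        help="path of the input image")
    parser.add_argument("-v", "--video", default=None,
                        help="path of the input video (takes priority over --input)")
    parser.add_argument("-s", "--savepath", default="output.png",
                        help="path of the output file")
    # The long option name is an assumption; the README only shows -e.
    parser.add_argument("-e", "--env_id", type=int, default=None,
                        help="execution environment, e.g. 0, 1 or 2")
    return parser


def describe(args: argparse.Namespace) -> str:
    """Summarize which source would be processed and where the result goes."""
    if args.video is not None:
        return f"video: {args.video} -> {args.savepath}"
    return f"image: {args.input} -> {args.savepath}"


# Parse an explicit argument list instead of sys.argv so the sketch is easy to try.
example = build_parser().parse_args(["--video", "input.mp4", "--savepath", "output.mp4"])
print(describe(example))
```

Passing an explicit list to `parse_args` keeps the sketch independent of the command line; a real script would normally call `parse_args()` with no arguments.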

Reference

PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Framework

PyTorch

Model Format

ONNX opset = 10

Netron

ViT-B_16-224.onnx.prototxt
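
The model file name suggests a ViT-B variant with 16x16 patches and a 224x224 input. Under that assumption, the number of tokens the transformer processes can be worked out as below. The function name `vit_sequence_length` is hypothetical, and the extra class token follows the standard ViT design rather than anything stated in this README.

```python
def vit_sequence_length(image_size: int, patch_size: int, class_token: bool = True) -> int:
    """Return the token count for a square image split into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    # Standard ViT prepends one learnable class token to the patch sequence.
    return num_patches + (1 if class_token else 0)


# Assumed values read off the file name ViT-B_16-224: patch size 16, input size 224.
print(vit_sequence_length(224, 16, class_token=False))  # 196 patches (14 x 14)
print(vit_sequence_length(224, 16))                     # 197 tokens with the class token
```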
