PyTorch implementation of the Vision Transformer (ViT). Based on the paper:
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby
arXiv:2010.11929
Visualizations of attention maps for correctly classified samples can be found in the `_visualizations` folder.
The entire code is self-contained in the Jupyter notebook; just run the cells sequentially. It is structured this way for ease of training on Google Colab.
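The paper's core idea is to treat an image as a sequence of 16x16 patch tokens. A minimal sketch of such a patch-embedding module is shown below; the class name `PatchEmbed` is illustrative and not taken from the notebook:

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into fixed-size patches and linearly embed each one.

    A Conv2d with kernel_size = stride = patch_size is equivalent to
    flattening each non-overlapping patch and applying a shared linear
    projection to it.
    """
    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)                     # (B, D, H/P, W/P)
        return x.flatten(2).transpose(1, 2)  # (B, N, D): one token per patch

tokens = PatchEmbed()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 196, 768]) — 14x14 patches of a 224x224 image
```

In the full model, a learnable class token and position embeddings are prepended/added to this sequence before it enters the Transformer encoder.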
- Finetune on CIFAR-10 and CIFAR-100, and plot visualizations.
- Implement the hybrid approach based on ResNet feature maps.