PyTorch Implementation of SimSiam

The main idea behind this paper is that Siamese networks can learn meaningful representations even without any of the following:

  • (i) negative sample pairs
  • (ii) large batches
  • (iii) momentum encoders

This paper empirically demonstrates that the Siamese architecture is a strong inductive prior that complements the prior imposed by data augmentation (spatial invariance).

Architecture

Two augmented views of one image are processed by the same encoder network f (a backbone plus a projection MLP). Then a prediction MLP h is applied on one side, and a stop-gradient operation is applied on the other side. The model maximizes the similarity between both sides.

[Figure: SimSiam architecture]
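
The forward pass can be sketched roughly as below. This is a minimal illustration, assuming a torchvision ResNet-18 backbone; the layer sizes (`dim`, `pred_dim`) are assumptions for illustration and may differ from what this repo actually uses.

```python
import torch.nn as nn
from torchvision.models import resnet18


class SimSiam(nn.Module):
    """Sketch of SimSiam: encoder f (backbone + projection MLP) shared by both
    views, plus a prediction MLP h whose output is compared to the other view."""

    def __init__(self, dim=2048, pred_dim=512):
        super().__init__()
        backbone = resnet18()
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()  # keep only the convolutional trunk
        self.backbone = backbone

        # projection MLP (part of the encoder f)
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, dim), nn.BatchNorm1d(dim), nn.ReLU(inplace=True),
            nn.Linear(dim, dim), nn.BatchNorm1d(dim),
        )

        # prediction MLP h, applied to one branch at a time
        self.predictor = nn.Sequential(
            nn.Linear(dim, pred_dim), nn.BatchNorm1d(pred_dim), nn.ReLU(inplace=True),
            nn.Linear(pred_dim, dim),
        )

    def forward(self, x1, x2):
        # encoder f on both augmented views
        z1 = self.projector(self.backbone(x1))
        z2 = self.projector(self.backbone(x2))
        # predictor h on each branch; the targets get a stop-gradient
        p1, p2 = self.predictor(z1), self.predictor(z2)
        return p1, p2, z1.detach(), z2.detach()
```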

Loss

Negative cosine similarity between the two views obtained from the same image is used as the loss:

loss = -F.cosine_similarity(p, z.detach(), dim=-1).mean()
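
In the paper this loss is symmetrized over the two branches, each time with a stop-gradient on the target projection. A minimal sketch of that symmetric form, assuming the `p1, p2, z1, z2` outputs of the model sketch above (the function names here are assumptions, not necessarily the ones used in this repo):

```python
import torch.nn.functional as F


def negative_cosine(p, z):
    # z is detached, so it acts as a constant target (stop-gradient)
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()


def simsiam_loss(p1, p2, z1, z2):
    # symmetrized loss: each predictor output is pulled towards the
    # stop-gradient projection of the other view
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)
```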

Batch Size

SimSiam doesn't require large batch sizes. However, in my experiments, larger batch sizes converged faster than smaller ones. I used batch sizes of 128, 256, and 512. All of them reached roughly the same accuracy, but 512 converged faster than 128.

Optimizer

SGD with momentum = 0.95 and weight_decay=0.0005
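
A minimal sketch of this setup; the base learning rate and the cosine-decay horizon are assumptions here and may differ from the repo's training script:

```python
import torch

model = SimSiam()  # from the sketch above

# momentum and weight decay as stated above; lr=0.03 is an assumed base value
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.03, momentum=0.95, weight_decay=0.0005
)

# cosine learning-rate decay over the run (T_max of 150 epochs is assumed)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=150)
```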

Performance

Here's the training loss on CIFAR-10:

| Color  | Batch size | Loss   | Epochs |
|--------|------------|--------|--------|
| green  | 512        | -0.906 | 50     |
| pink   | 256        | -0.901 | 50     |
| orange | 512        | -0.919 | 150    |

[Figure: training loss curves]

LR Scheduling

[Figure: learning-rate schedule]

TODO:

  • Design script for training downstream tasks.
  • Benchmark on STL-10

Reference: Exploring Simple Siamese Representation Learning, https://arxiv.org/abs/2011.10566