Why is Cosine Similarity Scaled for Zero-Shot Image Classification? #763

rsorbello · 2023-12-14T19:54:21Z

Hi All,

I have a simple question about the zero-shot image classification example in the README. Why is the cosine similarity multiplied by 100? Is it to reverse the normalization done in the previous steps? The line I am referring to is pasted below. Thank you very much in advance!

text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

Answered by gabrielilharco

gabrielilharco · 2023-12-21T20:52:38Z

Maintainer

Hi @rsorbello. The 100 comes from the learnable logit scaling parameter used in the original paper, which they clip at 100. In general, many models have a logit_scale that is learned during training and used to scale the logits.
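
To illustrate the clipping described in this answer, here is a minimal sketch in plain Python. It assumes the scale is stored in log space and initialized from a temperature of 0.07, as described in the CLIP paper; the function and constant names are hypothetical and are not taken from the library.

```python
import math

# Assumption: the scale is stored as a log value and starts at log(1 / 0.07),
# i.e. an initial temperature of 0.07. Names here are illustrative only.
INIT_TEMPERATURE = 0.07
MAX_LOGIT_SCALE = 100.0


def effective_logit_scale(log_logit_scale: float) -> float:
    """Exponentiate the stored log-scale and clip the result at MAX_LOGIT_SCALE."""
    return min(math.exp(log_logit_scale), MAX_LOGIT_SCALE)


# At initialization the multiplier is 1 / 0.07, roughly 14.29.
initial = effective_logit_scale(math.log(1.0 / INIT_TEMPERATURE))

# If training pushes the learned value past the cap, the multiplier stays at 100.
saturated = effective_logit_scale(math.log(250.0))

print(round(initial, 2))
print(saturated)
```

Under these assumptions, a model whose learned scale has reached the cap multiplies its cosine similarities by exactly 100, which matches the constant in the README line.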

2 replies

rsorbello Jan 2, 2024
Author

Thank you @gabrielilharco!

rwightman Jan 6, 2024
Maintainer

It actually doesn't matter for the cosine similarity itself, though; it does matter for probability calibration. I guess you should technically use the model's actual logit scale, it just tends to be at or very near 100 at the end of training.
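
The point about calibration can be checked with a small sketch, written in plain Python rather than PyTorch so it runs without extra dependencies. The similarity values are made up for illustration; they stand in for one image compared against three text prompts.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical cosine similarities (not taken from any real model).
cosine_sims = [0.28, 0.22, 0.19]

unscaled = softmax(cosine_sims)
scaled = softmax([100.0 * s for s in cosine_sims])

# Multiplying by a positive constant never changes which class ranks first.
top_unscaled = unscaled.index(max(unscaled))
top_scaled = scaled.index(max(scaled))

print(top_unscaled, top_scaled)
print([round(p, 3) for p in unscaled])
print([round(p, 3) for p in scaled])
```

Both versions pick the same class, but without the scale the three probabilities are nearly uniform, because cosine similarities live in a narrow range. With the factor of 100 the softmax becomes sharply peaked on the best match.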
