
ITM loss #126

Open
MLAlex1 opened this issue May 12, 2023 · 1 comment

Comments

@MLAlex1

MLAlex1 commented May 12, 2023

Hi, thanks again for this great work!

During the pre-training phase, taking the VG dataset as an example, multiple captions correspond to the same image. It's not clear to me how the ITM loss handles the case where the same image appears multiple times in a batch with different captions: one of those captions can be sampled as a hard negative for the image even though it is actually a valid description, yet the implementation assigns it label 0, i.e., not a match. Could you please explain the reasoning here? Should we somehow prevent the same image from appearing multiple times in a batch to avoid this issue?
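One common way to avoid this (a hypothetical sketch, not the repo's actual code) is to track the underlying image id of each batch element and mask out same-image pairs before sampling ITM hard negatives, so a caption is never labeled 0 against another copy of its own image. The names `valid_negative_mask` and `image_ids` below are illustrative:

```python
def valid_negative_mask(image_ids):
    """Return an n x n boolean matrix where mask[i][j] is True iff the
    caption of element j may serve as an ITM negative for the image of
    element i, i.e. the two elements come from different images."""
    n = len(image_ids)
    return [[image_ids[i] != image_ids[j] for j in range(n)] for i in range(n)]

# Example batch: VG-style data where image 7 appears twice with different captions.
ids = [7, 7, 3, 9]
mask = valid_negative_mask(ids)
# mask[0][1] is False: caption 1 describes image 7, so it must not be
# sampled as a negative (label 0) for image 7's other occurrence.
```

Restricting negative sampling to positions where the mask is True removes the false negatives without having to deduplicate images at the dataloader level.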

@HWH-2000

HWH-2000 commented Aug 4, 2023

My recent work has the same problem. Because of overlapping text or images, the model cannot learn to separate the negative samples, so the ITC and ITM losses fail to converge. Have you solved this problem?

Thanks!
