
Large performance drop if trained with fp32. #96

Open
klauscc opened this issue Jan 16, 2023 · 1 comment
Comments

klauscc commented Jan 16, 2023

Hi authors,
Thanks for your great work!
In the file module_clip.py at L557:

convert_weights(model)
model.load_state_dict(state_dict)
return model.eval()

If I remove convert_weights, the model can only achieve an accuracy of ~40%; if convert_weights is kept, I get ~43%.
Do you know why this happens, and is there any way to train without convert_weights while still reaching ~43%? Thanks a lot!
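For context: convert_weights in CLIP casts the model's floating-point parameters down to fp16, which is why removing it changes both the numerics and the final accuracy. A rough sketch of the idea, written here against a plain dict of numpy arrays rather than the repo's actual code (the real function walks nn.Module layers and mutates parameters in place):

```python
import numpy as np

def convert_weights_sketch(state_dict):
    """Cast fp32 parameter arrays to fp16, mimicking what CLIP's
    convert_weights does to a model's layers. Illustrative only."""
    return {
        name: w.astype(np.float16) if w.dtype == np.float32 else w
        for name, w in state_dict.items()
    }

# Hypothetical parameter name, just for illustration.
params = {"visual.proj": np.ones((4, 4), dtype=np.float32)}
half = convert_weights_sketch(params)
print(half["visual.proj"].dtype)  # float16
```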

The reason I want to remove convert_weights is that it causes problems when I do post-pretraining on millions of videos with CLIP: with convert_weights, the loss becomes NaN at some point during training. If I train with FP32 or AMP instead, there is no NaN issue, but the final accuracy is about 3% lower than with FP16 (convert_weights).
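The NaN behaviour described above is consistent with fp16's narrow dynamic range: the largest finite fp16 value is about 65504, so a large activation or gradient overflows to inf, and any subsequent inf - inf (or 0 * inf) yields NaN. A minimal numpy illustration of that mechanism (not the repo's code):

```python
import numpy as np

np.seterr(over="ignore")     # silence the expected overflow warning

big = np.float16(60000.0)    # near fp16's max finite value (~65504)
doubled = big + big          # 120000 is not representable in fp16 -> inf
diff = doubled - doubled     # inf - inf -> nan; this is how fp16 losses blow up
print(doubled, diff)         # inf nan
```

This is also why PyTorch's AMP pairs fp16 compute with a GradScaler: it scales the loss to keep gradients inside fp16's representable range and skips optimizer steps whose gradients overflowed, which usually avoids the NaNs seen with a pure-fp16 (convert_weights) model.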


sweet132 commented Jun 9, 2023

Attached logs:
meanP: meanP.txt
seqTransf: Transf.txt

Sorry to bother you. I ran the code directly, but the loss became NaN because of some corrupted videos (the workaround in the provided code is to set such videos to 0).
After changing the video-processing code, I can only get 42.3% for meanP and 43.9% for seqTransf.

How did you get 43%? Did you modify the code for data processing?
