Large performance drop if trained with fp32. #96
Comments
(meanP / seqTransf) Sorry to bother you. I ran the code directly, but the loss becomes NaN because of some corrupted videos (the workaround in the provided code is to set such videos to 0). How did you get 43%? Did you modify the data-processing code?
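The workaround mentioned above (zeroing out broken videos so they cannot poison the loss) can be sketched as a small guard applied to each decoded clip. `sanitize_video` is a hypothetical helper for illustration, not a function from the repository:

```python
import torch

def sanitize_video(video: torch.Tensor) -> torch.Tensor:
    # If the decoded clip contains NaN/Inf values (e.g. the file was
    # corrupted or unreadable), replace the whole tensor with zeros so
    # one bad sample cannot turn the training loss into NaN.
    if not torch.isfinite(video).all():
        return torch.zeros_like(video)
    return video

bad = torch.full((2, 3), float("nan"))
print(sanitize_video(bad).abs().sum().item())  # 0.0
```

A guard like this hides the bad sample rather than fixing it; logging which paths get zeroed makes it easier to clean the dataset later.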
Hi authors,
Thanks for your great work!
In the file module_clip at L557: if I remove `convert_weights`, the model can only achieve an accuracy of ~40%, whereas I can achieve ~43% if `convert_weights` is kept. Do you know why this happens, and is there any way to train without `convert_weights` but still reach ~43%? Thanks a lot!
The reason I want to remove `convert_weights` is that it causes problems when I do post-pretraining on millions of videos with CLIP: with `convert_weights`, the loss becomes NaN at some point during training. If I train with FP32 or AMP instead, there is no such issue, but FP32 or AMP training ends up ~3% lower in accuracy than FP16 (`convert_weights`).
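For context, the `convert_weights` call under discussion casts the weights of selected layers to fp16. A minimal sketch of that behavior (an approximation of what CLIP's function does, not a verbatim copy) looks like:

```python
import torch
import torch.nn as nn

def convert_weights_sketch(model: nn.Module) -> None:
    """Cast Linear/Conv parameters to fp16, approximating the effect of
    CLIP's convert_weights on a model."""
    def _to_fp16(module: nn.Module) -> None:
        if isinstance(module, (nn.Conv1d, nn.Conv2d, nn.Linear)):
            module.weight.data = module.weight.data.half()
            if module.bias is not None:
                module.bias.data = module.bias.data.half()
    # nn.Module.apply visits every submodule recursively.
    model.apply(_to_fp16)

model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 2))
convert_weights_sketch(model)
print(model[0].weight.dtype)  # torch.float16
```

The accuracy gap being asked about would then come down to the numerics of running these layers in fp16 versus fp32/AMP, not to any architectural change.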