Force rampup_batch_size=None in config #83
Comments
I would do it at the config level rather than in the code, so that it's still possible to set it manually through the config if someone ever needs it.
Doing it at the config level would be a better option if possible. The proposal is something like
This is not too satisfying because it might cause some confusion.
I would just set `rampup_batch_size=None`.
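As a hedged sketch of what the config-level override discussed above might look like (the `model:` key layout and the illustrative values here are assumptions, not taken from this thread):

```yaml
model:
  # rampup_batch_size is only used during pretraining; null disables it
  # so alignment jobs are not forced to match a ramp-up schedule.
  rampup_batch_size: null
  global_batch_size: 128  # illustrative value, not from this thread
```

Leaving the key present but null keeps the option of re-enabling ramp-up through the config if someone ever needs it, which is the trade-off raised in the comments above.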
Is your feature request related to a problem? Please describe.

When the model config has `rampup_batch_size` set, model loading fails when `global_batch_size` is not set accordingly. Since `rampup_batch_size` is only used in pretraining, not in alignment, we should force it to None when loading the model.

Describe the solution you'd like
gpt_cfg.rampup_batch_size=None
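A minimal sketch of the code-level version of this fix, using a plain dict in place of the real model config object; the helper name and config keys are illustrative, not NeMo's actual API:

```python
def force_disable_rampup(gpt_cfg: dict) -> dict:
    """Clear any pretraining ramp-up schedule before loading for alignment.

    rampup_batch_size is only meaningful during pretraining; leaving it
    set can make model loading fail when global_batch_size does not
    match the schedule, so it is forced to None here.
    """
    gpt_cfg["rampup_batch_size"] = None
    return gpt_cfg

# Example: a config saved from a pretraining run with ramp-up enabled.
cfg = {"global_batch_size": 128, "rampup_batch_size": [16, 16, 1000]}
cfg = force_disable_rampup(cfg)
print(cfg["rampup_batch_size"])  # None
```

In a real loading path this assignment would run right after the checkpoint's config is restored, before the model is built, so stale pretraining settings can never reach the alignment run.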