
Force rampup_batch_size=None in config #83

Open
shengyangs opened this issue Jan 18, 2024 · 3 comments

@shengyangs
Collaborator

Is your feature request related to a problem? Please describe.

When the model config has rampup_batch_size set, we get model loading errors if global_batch_size is not set accordingly. Since rampup_batch_size is only used in pretraining, not in alignment, we should force it to None when loading the model.

Describe the solution you'd like

gpt_cfg.rampup_batch_size=None
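
A minimal sketch of what the code-level override could look like, assuming the pretrained model config is an OmegaConf object (as in NeMo) that gets modified before the aligner model is built; the helper name below is hypothetical, not an existing NeMo-Aligner function:

    from omegaconf import open_dict

    def prepare_gpt_cfg_for_alignment(gpt_cfg):
        # Illustrative helper: clear pretraining-only fields before loading.
        # Assumes gpt_cfg is the OmegaConf model config restored from the
        # pretrained checkpoint.
        with open_dict(gpt_cfg):
            # rampup_batch_size is only meaningful during pretraining; forcing
            # it to None avoids global_batch_size consistency errors at load time.
            gpt_cfg.rampup_batch_size = None
        return gpt_cfg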

@odelalleau
Collaborator

I would do it at the config level rather than in the code, so that it's still possible to set it manually through the config if someone ever needs it.

@shengyangs
Collaborator Author

Doing it at the config level would be a better option if possible.

The proposal is something like:

  1. When the user wants to reuse the rampup_batch_size from the pretrained model, set model.rampup_batch_size="model".
  2. When the user wants to turn off rampup_batch_size, set model.rampup_batch_size="off" or null.
  3. When the user wants to specify it explicitly, set model.rampup_batch_size=[xx, xx, xx].

This is not entirely satisfying because it might cause some confusion.

@odelalleau
Collaborator

odelalleau commented Jan 18, 2024

I would just set rampup_batch_size: null in our .yaml and not worry about use case 1 (I see no good reason why someone would want to reuse the rampup batch size from pre-training, and if they do, they can manually copy it with use case 3, or use a different config file that doesn't overwrite it).
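
A sketch of how that could look in the alignment config .yaml; the surrounding structure is illustrative, only the rampup_batch_size key is the point:

    model:
      # rampup_batch_size is a pretraining-only feature; default it off for alignment.
      rampup_batch_size: null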
