A user reported a crash when running SFT with 24.01.01 (the same setup works fine with 24.01):
  File "/opt/NeMo-Aligner/examples/nlp/gpt/train_gpt_sft.py", line 215, in main
    init_using_ptl(trainer, ptl_model, train_dataloader, train_ds)
  File "/opt/NeMo-Aligner/nemo_aligner/utils/train_script_utils.py", line 103, in init_using_ptl
    call._call_setup_hook(ptl_trainer)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 86, in _call_setup_hook
    _call_lightning_module_hook(trainer, "setup", stage=fn)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 145, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/opt/NeMo/nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py", line 1372, in setup
    self._reconfigure_val_batches()
  File "/opt/NeMo/nemo/collections/nlp/models/language_modeling/megatron_base_model.py", line 340, in _reconfigure_val_batches
    val_len_in_micro_batches = len(self._validation_dl)
TypeError: object of type 'NoneType' has no len()
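The last frame suggests that `self._validation_dl` is still `None` when `_reconfigure_val_batches` calls `len()` on it. The sketch below reproduces only that failure mode. `ModelSketch` and both of its methods are hypothetical stand-ins, not NeMo code, and the `None` guard is just one assumed workaround, not a confirmed fix.

```python
class ModelSketch:
    """Hypothetical stand-in for the model in the traceback; not NeMo's class."""

    def __init__(self, validation_dl=None):
        # Assumption: the attribute stays None if no validation dataloader
        # has been built by the time setup() runs.
        self._validation_dl = validation_dl

    def reconfigure_val_batches_unguarded(self):
        # Mirrors the failing line: len() on an attribute that may be None.
        return len(self._validation_dl)

    def reconfigure_val_batches_guarded(self):
        # Assumed workaround: skip the reconfiguration when there is no
        # validation dataloader to measure.
        if self._validation_dl is None:
            return None
        return len(self._validation_dl)


if __name__ == "__main__":
    model = ModelSketch()
    try:
        model.reconfigure_val_batches_unguarded()
    except TypeError as err:
        print(f"unguarded call failed: {err}")
    print(f"guarded call returned: {model.reconfigure_val_batches_guarded()}")
```

Whether the right fix is a guard like this or building the validation dataloader before `setup` is called depends on how the SFT script is expected to initialize the model.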