Model TypeError while running Lag_Llama_Fine_Tuning_Demo notebook #57

ilteralp opened this issue May 1, 2024 · 3 comments
ilteralp commented May 1, 2024

Hi all,

I have encountered the error below while running the following line of "Lag_Llama_Fine_Tuning_Demo.ipynb":

predictor = estimator.train(dataset.train, cache_data=True, shuffle_buffer_length=1000)

Error message:

 TypeError: `model` must be a `LightningModule` or `torch._dynamo.OptimizedModule`, got `LagLlamaLightningModule`

Here are the details:

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[12], line 1
----> 1 predictor = estimator.train(dataset.train, cache_data=True, shuffle_buffer_length=1000)

File ~\anaconda3\envs\lag_llama\lib\site-packages\gluonts\torch\model\estimator.py:237, in PyTorchLightningEstimator.train(self, training_data, validation_data, shuffle_buffer_length, cache_data, ckpt_path, **kwargs)
    228 def train(
    229     self,
    230     training_data: Dataset,
   (...)
    235     **kwargs,
    236 ) -> PyTorchPredictor:
--> 237     return self.train_model(
    238         training_data,
    239         validation_data,
    240         shuffle_buffer_length=shuffle_buffer_length,
    241         cache_data=cache_data,
    242         ckpt_path=ckpt_path,
    243     ).predictor

File ~\anaconda3\envs\lag_llama\lib\site-packages\gluonts\torch\model\estimator.py:205, in PyTorchLightningEstimator.train_model(self, training_data, validation_data, from_predictor, shuffle_buffer_length, cache_data, ckpt_path, **kwargs)
    202 trainer_kwargs = {**self.trainer_kwargs, "callbacks": callbacks}
    203 trainer = pl.Trainer(**trainer_kwargs)
--> 205 trainer.fit(
    206     model=training_network,
    207     train_dataloaders=training_data_loader,
    208     val_dataloaders=validation_data_loader,
    209     ckpt_path=ckpt_path,
    210 )
    212 logger.info(f"Loading best model from {checkpoint.best_model_path}")
    213 best_model = training_network.load_from_checkpoint(
    214     checkpoint.best_model_path
    215 )

File ~\anaconda3\envs\lag_llama\lib\site-packages\pytorch_lightning\trainer\trainer.py:529, in fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    504 def fit(
    505     self,
    506     model: "pl.LightningModule",
   (...)
    510     ckpt_path: Optional[_PATH] = None,
    511 ) -> None:
    512     r"""Runs the full optimization routine.
    513 
    514     Args:
    515         model: Model to fit.
    516 
    517         train_dataloaders: An iterable or collection of iterables specifying training samples.
    518             Alternatively, a :class:`~pytorch_lightning.core.datamodule.LightningDataModule` that defines
    519             the :class:`~pytorch_lightning.core.hooks.DataHooks.train_dataloader` hook.
    520 
    521         val_dataloaders: An iterable or collection of iterables specifying validation samples.
    522 
    523         datamodule: A :class:`~pytorch_lightning.core.datamodule.LightningDataModule` that defines
    524             the :class:`~pytorch_lightning.core.hooks.DataHooks.train_dataloader` hook.
    525 
    526         ckpt_path: Path/URL of the checkpoint from which training is resumed. Could also be one of two special
    527             keywords ``"last"`` and ``"hpc"``. If there is no checkpoint file at the path, an exception is raised.
    528 
--> 529     Raises:
    530         TypeError:
    531             If ``model`` is not :class:`~pytorch_lightning.core.LightningModule` for torch version less than
    532             2.0.0 and if ``model`` is not :class:`~pytorch_lightning.core.LightningModule` or
    533             :class:`torch._dynamo.OptimizedModule` for torch versions greater than or equal to 2.0.0 .
    534 
    535     For more information about multiple dataloaders, see this :ref:`section <multiple-dataloaders>`.
    536 
    537     """
    538     model = _maybe_unwrap_optimized(model)
    539     self.strategy._lightning_module = model

File ~\anaconda3\envs\lag_llama\lib\site-packages\pytorch_lightning\utilities\compile.py:125, in _maybe_unwrap_optimized(model)
    123         raise TypeError(f"`model` must be a `LightningModule`, got `{type(model).__qualname__}`")
    124     return model
--> 125 from torch._dynamo import OptimizedModule
    127 if isinstance(model, OptimizedModule):
    128     return from_compiled(model)

TypeError: `model` must be a `LightningModule` or `torch._dynamo.OptimizedModule`, got `LagLlamaLightningModule`
ashok-arjun (Contributor) commented

Hi! I just checked and it works fine for me on Google Colab.
Which environment did you get this error on? Can you check if it works on Google Colab?


ilteralp commented May 6, 2024

Hi Arjun! Thank you for your reply.

  1. The error occurred on a Windows environment using Conda.
  2. I also tested it on Google Colab, and it worked fine.

Could the issue be related to a conflict in packages? I'm attaching the output of conda list for reference.

Thanks a lot for your help!
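
A quick way to narrow down a Lightning package conflict, sketched below under the assumption that the module path lag_llama.gluon.lightning_module matches the lag-llama repository layout, is to check whether LagLlamaLightningModule subclasses the LightningModule class that the installed pytorch_lightning Trainer expects:

    # Minimal diagnostic sketch; assumes the lag_llama conda environment is active
    # and that lag_llama.gluon.lightning_module is the module path used by the repo.
    import pytorch_lightning as pl
    from lag_llama.gluon.lightning_module import LagLlamaLightningModule

    # False here would mean the model class and the Trainer come from two
    # different Lightning packages (e.g. `lightning` vs `pytorch_lightning`),
    # which is exactly what the TypeError above complains about.
    print(issubclass(LagLlamaLightningModule, pl.LightningModule))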

ashok-arjun (Contributor) commented

Hi! I'm not sure where the issue is; we never tested this on a Windows environment, so there might be a conflict in the packages.

Maybe check if the conda packages match between the Windows environment and the Colab one?
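
For example, the versions of the packages involved in the traceback could be printed in both environments and compared side by side (a sketch; the package list below is an assumption based on the traceback):

    # Print the versions of the packages that appear in the traceback so the
    # Windows environment and the Colab one can be compared directly.
    from importlib.metadata import PackageNotFoundError, version

    for pkg in ("torch", "pytorch-lightning", "lightning", "gluonts"):
        try:
            print(pkg, version(pkg))
        except PackageNotFoundError:
            print(pkg, "not installed")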
