`trainer.callback_metrics` included in `metric_dict` after training doesn't make sense #597

libokj · 2023-08-25T18:26:48Z

After checking https://github.com/Lightning-AI/lightning/blob/105b25c521e0cbc5d7b1160902ce7b64ae7c8c73/src/lightning/pytorch/trainer/connectors/logger_connector/logger_connector.py, it became clear that once a full `trainer.fit` loop finishes, `trainer.callback_metrics` holds only the metrics from the last training and validation epochs. I don't think it makes much sense to include these last-epoch metrics in `metric_dict` for sweeper optimization, since the last epoch is not necessarily the one that produced the best checkpoint.

There doesn't seem to be a straightforward way to get the training and validation metrics of the best model as monitored by `ModelCheckpoint`, but I see two viable options: a custom callback based on `ModelCheckpoint` that saves the `best_model_metrics`, or simply running `trainer.test(..., ckpt_path='best')` on the training and validation datasets respectively.
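To make the two options concrete, here is a rough sketch, not a tested patch for the template. `BestMetricsTracker` and `evaluate_best_checkpoint` are hypothetical names, and `val/loss` is just an assumed monitor key. For simplicity the callback subclasses the plain `Callback` rather than `ModelCheckpoint`, and the `ImportError` fallback only exists so the snippet can be run without Lightning installed. One caveat I have not verified in detail: epoch-aggregated training metrics may not yet be updated when `on_validation_end` fires, so the snapshot is most trustworthy for validation metrics.

```python
try:
    from lightning.pytorch.callbacks import Callback
except ImportError:  # assumption: fallback so the sketch runs without Lightning
    Callback = object


class BestMetricsTracker(Callback):
    """Hypothetical helper: snapshot callback_metrics whenever `monitor` improves."""

    def __init__(self, monitor="val/loss", mode="min"):
        super().__init__()
        if mode not in ("min", "max"):
            raise ValueError(f"mode must be 'min' or 'max', got {mode!r}")
        self.monitor = monitor
        self.mode = mode
        self.best_score = None
        self.best_metrics = {}

    def _improved(self, score):
        if self.best_score is None:
            return True
        if self.mode == "min":
            return score < self.best_score
        return score > self.best_score

    def on_validation_end(self, trainer, pl_module):
        # Skip the sanity-check pass that runs before real training starts.
        if getattr(trainer, "sanity_checking", False):
            return
        metrics = trainer.callback_metrics
        if self.monitor not in metrics:
            return
        score = float(metrics[self.monitor])
        if self._improved(score):
            self.best_score = score
            # Copy as plain floats so later epochs cannot overwrite the snapshot.
            self.best_metrics = {name: float(value) for name, value in metrics.items()}


def evaluate_best_checkpoint(trainer, model, loaders):
    """Option two: re-evaluate the best checkpoint on each named dataloader.

    `loaders` is an assumed dict such as {"train": train_loader, "val": val_loader}.
    """
    results = {}
    for name, loader in loaders.items():
        # trainer.test returns a list with one metrics dict per dataloader.
        output = trainer.test(model=model, dataloaders=loader, ckpt_path="best")
        results[name] = output[0] if output else {}
    return results
```

With the first option, the tracker would be added to the trainer's callbacks and `tracker.best_metrics` merged into `metric_dict` in place of `trainer.callback_metrics`. The second option costs extra evaluation passes, but it reports metrics computed from the checkpoint that `ModelCheckpoint` actually saved.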
