trainer.callback_metrics included in metric_dict after training doesn't make sense #597

Open
libokj opened this issue Aug 25, 2023 · 0 comments

After checking https://github.com/Lightning-AI/lightning/blob/105b25c521e0cbc5d7b1160902ce7b64ae7c8c73/src/lightning/pytorch/trainer/connectors/logger_connector/logger_connector.py, it became clear that once a complete trainer.fit loop finishes, trainer.callback_metrics holds only the metrics from the last training and validation epochs. I don't think it makes much sense to include these last-epoch training and validation metrics in metric_dict for sweeper optimization, since the last epoch is not necessarily the one that produced the best model.

There doesn't seem to be a straightforward way to get the training and validation metrics of the best model as monitored by ModelCheckpoint. I think there are two viable options: a custom callback based on ModelCheckpoint that saves the best_model_metrics, or simply running trainer.test(..., ckpt_path='best') on the training and validation datasets respectively.
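To make the first option more concrete, here is a minimal sketch of the bookkeeping such a callback would need. It is plain Python and does not import Lightning; `BestMetricsTracker`, `update`, and the metric names `train/loss` and `val/loss` are hypothetical names chosen for illustration. The assumption is that a callback would call `update(trainer.callback_metrics)` from its `on_validation_end` hook, skipping sanity-check runs.

```python
from typing import Any, Dict, Mapping, Optional


class BestMetricsTracker:
    """Keep a snapshot of all metrics from the epoch with the best monitored value.

    Hypothetical helper, not part of Lightning. Values only need to support
    float(), which holds for Python numbers and scalar tensors.
    """

    def __init__(self, monitor: str, mode: str = "min") -> None:
        if mode not in ("min", "max"):
            raise ValueError(f"mode must be 'min' or 'max', got {mode!r}")
        self.monitor = monitor
        self.mode = mode
        self.best_score: Optional[float] = None
        self.best_metrics: Dict[str, float] = {}

    def update(self, metrics: Mapping[str, Any]) -> bool:
        """Record `metrics` if the monitored value improved; return True if it did."""
        if self.monitor not in metrics:
            return False  # nothing to compare against in this call
        score = float(metrics[self.monitor])
        if self.best_score is None:
            improved = True
        elif self.mode == "min":
            improved = score < self.best_score
        else:
            improved = score > self.best_score
        if improved:
            self.best_score = score
            # Copy to plain floats so later epochs cannot overwrite the snapshot.
            self.best_metrics = {name: float(value) for name, value in metrics.items()}
        return improved


# Stand-in for what trainer.callback_metrics might contain after each epoch.
epochs = [
    {"train/loss": 0.9, "val/loss": 0.8},
    {"train/loss": 0.5, "val/loss": 0.4},  # best validation loss
    {"train/loss": 0.3, "val/loss": 0.6},  # last epoch, worse on validation
]

tracker = BestMetricsTracker(monitor="val/loss", mode="min")
for epoch_metrics in epochs:
    tracker.update(epoch_metrics)

last_metrics = epochs[-1]
print("last epoch:", last_metrics)
print("best epoch:", tracker.best_metrics)
```

In this toy run the last epoch and the best epoch differ, which is the mismatch described above: a sweeper reading the last-epoch values would optimize against `val/loss` of 0.6 rather than 0.4. The second option (re-evaluating with `ckpt_path='best'`) avoids the extra bookkeeping at the cost of additional evaluation passes.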
