
Hi, a question about an apparent inconsistency in the dynamics of tr(FIM) and tr(H). #11

Open

wizard1203 opened this issue Dec 11, 2021 · 0 comments

@wizard1203

Hi, thanks for your awesome work!

I noticed that in the paper PyHessian: Neural Networks Through the Lens of the Hessian, tr(H) keeps increasing during training.


And in the paper Hessian-based Analysis of Large Batch Training and Robustness to Adversaries, the dominant eigenvalue of the Hessian w.r.t. the weights can decrease during small-batch training.


And in the paper Critical Learning Periods in Deep Networks, the trace of the FIM first increases and then decreases.

Is there some relationship between these quantities? Are these observations inconsistent with one another?
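For context, one concrete link between the two quantities: for negative-log-likelihood losses the FIM is the expectation of the Hessian under the model's own output distribution, and for canonical-link GLMs such as logistic regression the two matrices coincide exactly, so tr(FIM) = tr(H) there; for deep nets they differ by a residual term, which is one reason their training dynamics can diverge. Below is a minimal NumPy sketch (my own illustration on made-up data, not code from any of the papers) that checks this identity and also estimates tr(H) with a Hutchinson-style randomized estimator, the kind of estimator PyHessian uses for large models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny logistic-regression problem.
n, d = 200, 5
X = rng.normal(size=(n, d))
w = rng.normal(size=d)
p = 1.0 / (1.0 + np.exp(-X @ w))  # model probabilities sigma(Xw)

# Hessian of the average negative log-likelihood:
#   H = (1/n) X^T diag(p(1-p)) X   (independent of the labels y)
H = (X.T * (p * (1.0 - p))) @ X / n

# Fisher information, F = (1/n) sum_i E_{y~p_i}[grad_i grad_i^T].
# With grad_i = (p_i - y_i) x_i and E[(p_i - y_i)^2] = p_i(1-p_i),
# this is the same matrix as H for the canonical-link logistic model.
F = (X.T * (p * (1.0 - p))) @ X / n

# Hutchinson's estimator: E[v^T H v] = tr(H) for Rademacher v.
vs = rng.choice([-1.0, 1.0], size=(2000, d))
est = np.mean([v @ H @ v for v in vs])

print("tr(H) =", np.trace(H))
print("tr(F) =", np.trace(F))      # identical to tr(H) here
print("Hutchinson estimate =", est)  # close to tr(H)
```

For deep networks the exact equality breaks down (the Hessian picks up an extra term involving the residuals and second derivatives of the network), so tr(FIM) and tr(H) can follow different trajectories during training.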
