The discussion started in #1696. It's particularly interesting to me because it may fulfill my years-long wish for a proper running estimate of $S^{-1}$, so I hurried this out while away for a meeting. Let's see if it really works well in our use cases.

We need to test whether it fully supports complex numbers and MPI, and also discuss the API.

I added the arguments `proj_reg` and `momentum` to the current API of `VMC_SRt`. When `momentum = 0`, it reduces to the previous SRt, and the matrix `P` will not affect the training as long as the training was already numerically stable.

Here I call the matrix `P` the 'projector regularization', or maybe we can call it 'ones regularization'. Multiplying it by any factor will not change the updates, as long as the computation is numerically stable, even if we split the complex parameters. I've tested this numerically.

An `N * N` matrix of ones has only one nonzero eigenvalue, which is `N`. To keep it from being too large, we divide it by `N` by default. In principle the user can adjust the factor `proj_reg`, but I guess that's not needed in most cases.

The momentum term subtracted from `dv` and added to `updates` is implemented in the preconditioner. For gradient norm clipping, however, we can use optax to do it after the preconditioner.

By the way, after we clean up the driver API (e.g. in #1674), maybe it's a good time to factor out SRt to be an actual preconditioner.
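
For reviewers who want to check the claims about `P` and `momentum` independently, here is a minimal dense NumPy sketch. It is not the code in this PR: the function `srt_update`, its argument `diag_shift`, the exact place where `P` enters the kernel matrix, and the exact form of the momentum term are all assumptions made for illustration. It assumes the Jacobian `X` and the vector `dv` are already centered over the samples, which is why the ones vector lies in the null space of `X @ X^H` and the scale of `P` drops out.

```python
import numpy as np


def srt_update(X, dv, old_updates, diag_shift=1e-4, proj_reg=1.0, momentum=0.0):
    """Hypothetical dense sketch of an SRt-style update (not the PR's code).

    X:           (n_samples, n_params) Jacobian, centered over samples
    dv:          (n_samples,) local-energy vector, centered over samples
    old_updates: (n_params,) updates from the previous step
    """
    n_samples = X.shape[0]
    # Ones matrix divided by n_samples, so its single nonzero
    # eigenvalue is proj_reg instead of proj_reg * n_samples.
    P = np.full((n_samples, n_samples), proj_reg / n_samples)
    T = X @ X.conj().T + diag_shift * np.eye(n_samples) + P
    # Assumed momentum form: subtract it from dv before the solve,
    # then add it back to the resulting updates.
    rhs = dv - momentum * (X @ old_updates)
    return X.conj().T @ np.linalg.solve(T, rhs) + momentum * old_updates


rng = np.random.default_rng(0)
n_samples, n_params = 8, 20

# Complex test data, centered over the sample axis.
X = rng.normal(size=(n_samples, n_params)) + 1j * rng.normal(size=(n_samples, n_params))
X -= X.mean(axis=0)
dv = rng.normal(size=n_samples) + 1j * rng.normal(size=n_samples)
dv -= dv.mean()
old = rng.normal(size=n_params) + 1j * rng.normal(size=n_params)

# Spectrum of the normalized ones matrix (ascending order).
eigs = np.linalg.eigvalsh(np.full((n_samples, n_samples), 1.0 / n_samples))

# Same update for different scales of P, with and without momentum.
plain = [srt_update(X, dv, old, proj_reg=c) for c in (0.0, 1.0, 5.0)]
with_momentum = [srt_update(X, dv, old, proj_reg=c, momentum=0.9) for c in (0.0, 1.0, 5.0)]

print("eigenvalues of ones / N:", np.round(eigs, 12))
print("proj_reg-independent (momentum = 0):", np.allclose(plain[0], plain[1]) and np.allclose(plain[0], plain[2]))
print("proj_reg-independent (momentum = 0.9):", np.allclose(with_momentum[0], with_momentum[1]) and np.allclose(with_momentum[0], with_momentum[2]))
```

In this sketch the invariance under `proj_reg` relies on both `dv` and `X @ old_updates` being orthogonal to the ones vector; if the centering is skipped, the factor in front of `P` does change the result.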
cc @attila-i-szabo @riccardo-rende @llviteritti