Incremental checkpoint serialisation #111
Comments
For what it's worth, I think some people consider 'multiple checkpoints per file' to be a misfeature. Multiple events per file makes sense: it saves a lot of overhead that would otherwise make acid-state way too slow. But I am not sure there is any advantage to multiple checkpoints per file; it is just something that can happen due to the current implementation. In fact, people use a combination of createCheckpoint and createArchive to try to ensure that they do not get multiple checkpoints in the same file.
Right, I was wondering if it might be worth moving to one-checkpoint-per-file. I haven't thought about implementation / backwards compatibility, but it seems like it would make things simpler. Perhaps we can ensure that an exception during checkpoint serialisation merely aborts the current checkpoint (perhaps leaving a half-written file on disk) and throws the exception from
At the moment, writing a checkpoint causes acid-state to realise the entire serialised representation in memory before it gets written to disk. This can be a significant memory cost for server applications with a large state. It seems unavoidable with the existing archive backend, because the format consists of (length, CRC, bytestring), where the first two fields are not known until the bytestring is fully evaluated. However, we should be able to do better with an alternate backend (given #96) that stores chunk lengths rather than a single overall length, and moves the CRC to the end.
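To make the chunked idea concrete, here is a minimal sketch of one possible layout. It is an assumption for illustration, not acid-state's format or API: each chunk is written as a little-endian Word32 length followed by its bytes, a zero length marks the end, and a CRC-32 over the payload comes last. The names encodeChunked, decodeChunked and writeCheckpoint are hypothetical, and CRC-32 merely stands in for whatever checksum a real backend would choose.

```haskell
module ChunkedCheckpoint where

import Data.Bits (complement, shiftR, testBit, xor)
import qualified Data.ByteString as B
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy as L
import Data.Word (Word32, Word8)
import System.IO (Handle)

-- Bit-at-a-time CRC-32 (reflected polynomial 0xEDB88320), one byte per step.
crcStep :: Word32 -> Word8 -> Word32
crcStep crc byte = go (8 :: Int) (crc `xor` fromIntegral byte)
  where
    go 0 c = c
    go n c
      | testBit c 0 = go (n - 1) ((c `shiftR` 1) `xor` 0xEDB88320)
      | otherwise   = go (n - 1) (c `shiftR` 1)

-- Fold one strict chunk into the running CRC state.
crcUpdate :: Word32 -> B.ByteString -> Word32
crcUpdate = B.foldl' crcStep

-- Hypothetical layout: (length, bytes)* then a zero length, then the CRC.
-- The Builder is produced lazily, so only one chunk needs to be live at a time.
encodeChunked :: L.ByteString -> BB.Builder
encodeChunked = go 0xFFFFFFFF . L.toChunks
  where
    go crc [] = BB.word32LE 0 <> BB.word32LE (complement crc)
    go crc (c : cs) =
      let crc' = crcUpdate crc c
      in crc' `seq`
           (BB.word32LE (fromIntegral (B.length c)) <> BB.byteString c <> go crc' cs)

-- Stream a serialised checkpoint to a handle chunk by chunk.
writeCheckpoint :: Handle -> L.ByteString -> IO ()
writeCheckpoint h = BB.hPutBuilder h . encodeChunked

-- Read a little-endian Word32 from the front of the input, if present.
readWord32LE :: L.ByteString -> Maybe (Word32, L.ByteString)
readWord32LE bs
  | L.length hd == 4 =
      Just (foldr (\b acc -> acc * 256 + fromIntegral b) 0 (L.unpack hd), tl)
  | otherwise = Nothing
  where
    (hd, tl) = L.splitAt 4 bs

-- Decode the layout above, rejecting truncated or corrupted input.
decodeChunked :: L.ByteString -> Either String L.ByteString
decodeChunked = go 0xFFFFFFFF []
  where
    go crc acc bs = case readWord32LE bs of
      Nothing -> Left "truncated chunk header"
      Just (0, rest) -> case readWord32LE rest of
        Nothing -> Left "missing CRC"
        Just (expected, _)
          | expected == complement crc -> Right (L.fromChunks (reverse acc))
          | otherwise -> Left "CRC mismatch"
      Just (len, rest) ->
        let (chunk, rest') = L.splitAt (fromIntegral len) rest
            strict = L.toStrict chunk
        in if B.length strict /= fromIntegral len
             then Left "truncated chunk"
             else go (crcUpdate crc strict) (strict : acc) rest'
```

With a layout like this, a checkpoint that was interrupted part-way lacks the terminator or the trailing CRC, so the decoder reports it as invalid rather than returning partial state. The decoder here accumulates all chunks for simplicity; a real reader would presumably feed them to the deserialiser incrementally.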
This raises a question: what should we do if an exception is thrown during checkpoint serialisation? In particular, this can happen if user code stores an unevaluated error thunk in the state. We already fail to handle this case gracefully (see #38). Should we simply document that the state must never contain pure exceptions? Given the possibility of multiple checkpoints per file, it seems hard to recover from a partially-written checkpoint.
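As a rough sketch of how a failure could be contained if each checkpoint lived in its own file: write to a temporary file, and rename it into place only once serialisation has finished without throwing. The function name writeCheckpointAtomically and the ".partial" suffix are hypothetical, and this is not how acid-state currently behaves.

```haskell
module AtomicCheckpoint where

import Control.Exception (SomeException, try)
import qualified Data.ByteString.Lazy as L
import System.Directory (removeFile, renameFile)

-- Write the lazily serialised checkpoint to "<path>.partial" and rename it to
-- <path> only if the whole payload was forced and written without an exception.
-- Catching SomeException is a simplification: it also catches asynchronous
-- exceptions, which a real implementation would want to treat separately.
writeCheckpointAtomically :: FilePath -> L.ByteString -> IO (Either SomeException ())
writeCheckpointAtomically path payload = do
  let tmp = path ++ ".partial"
  result <- try (L.writeFile tmp payload)
  case result of
    Right () -> do
      renameFile tmp path
      return (Right ())
    Left err -> do
      -- Best-effort cleanup of the half-written file; ignore failures here.
      _ <- try (removeFile tmp) :: IO (Either SomeException ())
      return (Left err)
```

For example, a payload such as `L.append prefix (error "thunk in state")` would make the write fail part-way; under this sketch the caller gets a `Left` and no file appears at the final path, so earlier checkpoints stay untouched.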