Usage and product changes
We implement the architecture and most of the machinery required for tracking data statistics, which will primarily be used for query planning. We achieved several design goals. However, we accept the trade-off that the statistics are not always up to date; the update frequency is a parameter we can optimise.
In the end, there is a single database-wide statistics struct, which is immutable and updated periodically. We update it by scanning the data WAL records since the last update, summing their count deltas, and then atomically replacing the Statistics struct held by the database. The statistics are checkpointed into the WAL, which also lets us time-travel to older snapshots and find a relatively accurate statistics entry from near that version.
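As a sketch (not the actual TypeDB code), the periodic refresh could look like the following in Rust. `WalRecord`, the delta layout, and the filtering on a data/schema commit tag are simplified stand-ins for illustration:

```rust
// Sketch: immutable database-wide statistics, rebuilt from WAL count deltas
// and swapped in atomically. All types here are simplified stand-ins.
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

#[derive(Clone, Copy, PartialEq)]
enum CommitType { Data, Schema }

/// Immutable snapshot of per-type instance counts, valid as of `sequence_number`.
#[derive(Debug)]
struct Statistics {
    sequence_number: u64,
    instance_counts: HashMap<String, i64>,
}

/// Simplified commit record read back from the WAL: per-type count deltas.
struct WalRecord {
    sequence_number: u64,
    commit_type: CommitType,
    count_deltas: HashMap<String, i64>,
}

/// Sum the deltas of data records newer than the current statistics, then
/// atomically swap in the rebuilt Statistics. Readers holding the old Arc
/// keep a consistent (if slightly stale) view.
fn refresh(current: &RwLock<Arc<Statistics>>, wal: &[WalRecord]) {
    let old: Arc<Statistics> = current.read().unwrap().clone();
    let mut counts = old.instance_counts.clone();
    let mut latest = old.sequence_number;
    for record in wal {
        if record.commit_type == CommitType::Data && record.sequence_number > old.sequence_number {
            for (type_label, delta) in &record.count_deltas {
                *counts.entry(type_label.clone()).or_insert(0) += delta;
            }
            latest = latest.max(record.sequence_number);
        }
    }
    *current.write().unwrap() =
        Arc::new(Statistics { sequence_number: latest, instance_counts: counts });
}

fn main() {
    let stats = RwLock::new(Arc::new(Statistics {
        sequence_number: 0,
        instance_counts: HashMap::new(),
    }));
    let wal = vec![
        WalRecord {
            sequence_number: 1,
            commit_type: CommitType::Data,
            count_deltas: HashMap::from([("person".to_string(), 3)]),
        },
        WalRecord {
            sequence_number: 2,
            commit_type: CommitType::Data,
            count_deltas: HashMap::from([("person".to_string(), -1)]),
        },
    ];
    refresh(&stats, &wal);
    let snapshot = stats.read().unwrap();
    assert_eq!(snapshot.sequence_number, 2);
    assert_eq!(snapshot.instance_counts["person"], 2);
}
```

The swap of a single `Arc` is what makes the statistics safe to read without locking the update path: in-flight queries simply keep the version they started with.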
This means we are solidifying the requirement that WAL cleaning and MVCC compaction are tied to the same time-scale: both are required for going back to previous data versions correctly.
Future work
We could find that reading from the WAL to update statistics is a bottleneck. We can solve several problems at once by extracting the commit data "cache" from the `IsolationManager` into the `DurabilityClient`, which can then be shared across isolation and statistics operations.

Implementation
Architecture
- Promote WAL and Checkpoint management into `Database`
- Move `Checkpoint` and associated `commit_replay` methods into a new module: `//storage/recovery`
Split database creation and loading into two separate entry points, and rearrange the corresponding methods all the way down into Storage, WAL, and Checkpointing. We also update corresponding tests.
Given a database directory, each module creates its own subdirectory:
- `MVCCStorage` creates `db-name/storage`
- `WAL` creates `db-name/wal`
- `Checkpoint` creates `db-name/checkpoint`
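A minimal sketch of the create/load split and the per-module subdirectories. The signatures here (`Database::create` / `Database::load`) are assumptions for illustration, not the real API:

```rust
// Sketch of the two bootup entry points: creating a fresh database directory
// versus loading an existing one. Names and error handling are illustrative.
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

struct Database {
    path: PathBuf,
}

impl Database {
    /// Fresh database: each module creates its own subdirectory.
    fn create(path: &Path) -> io::Result<Database> {
        for subdir in ["storage", "wal", "checkpoint"] {
            fs::create_dir_all(path.join(subdir))?;
        }
        Ok(Database { path: path.to_owned() })
    }

    /// Existing database: modules open their subdirectories instead of creating them.
    fn load(path: &Path) -> io::Result<Database> {
        if !path.join("wal").is_dir() {
            // A database directory without a WAL is an unrecoverable error state.
            return Err(io::Error::new(io::ErrorKind::NotFound, "missing WAL directory"));
        }
        Ok(Database { path: path.to_owned() })
    }
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("db-name-example");
    let db = Database::create(&dir)?;
    assert!(db.path.join("storage").is_dir());
    let _reloaded = Database::load(&dir)?;
    fs::remove_dir_all(&dir)?;
    Ok(())
}
```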
Introduce `//concept/thing/Statistics`, which stores `thing` statistics for instances of each type, role playing, relation indexing, etc. We checkpoint statistics into the WAL using a new record type, and load the last one on bootup. This also helps solve the MVCC time-travel problem, where going back in time could lead to mismatched statistics being used (or statistics that are not even relevant for the different schema!). We probably allow using the existing Statistics if the "old" sequence number being opened is no more than N (~100) versions behind the statistics version.
Statistics catch-up/synchronisation is implemented by reading data commit records from the WAL. We add a `CommitType` to the `CommitRecord` generated by `CommittableSnapshot`s, which allows deserialising and recreating the correct type of snapshot (data or schema) from a WAL entry.

Refactor out the `//storage/durability` module into a `//durability` package, which contains:
- a simplified `Service` trait
- a `DurabilityClient` trait, which is now used throughout the code base wherever `DurabilityService` was used before: a client to communicate with a set of durability servers and manage collecting ordering information, etc.
- a `WALClient`, which wraps a WAL but conforms to the `DurabilityClient` trait

UX
We create a more consistent/comprehensive error structure for what happens if any of storage/wal/checkpoint are not present on bootup.
- The presence or absence of the `storage` directory is irrelevant to bootup/recovery (same path). Being present simply optimises the recovery process, since we have to copy fewer files from the checkpoints. Otherwise, we recreate `storage` with the checkpoint and replay the WAL since the checkpoint.
- If the `db-name` database directory is present, but no WAL is present, this is an error state.
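The bootup cases above can be summarised as a small decision function. The variant and error names here are illustrative, not the real error structure:

```rust
// Sketch of the bootup/recovery decision, keyed on which of the database
// subdirectories are present. Names are illustrative stand-ins.
#[derive(Debug, PartialEq)]
enum Recovery {
    /// `storage` exists: reuse it and replay the WAL from the checkpoint
    /// (fewer files to copy, same code path).
    ReplayIntoExistingStorage,
    /// `storage` missing: recreate it from the checkpoint, then replay the
    /// WAL since that checkpoint.
    RestoreFromCheckpoint,
    /// Database directory exists but the WAL is gone: unrecoverable.
    Error(&'static str),
}

fn plan_recovery(storage_present: bool, wal_present: bool) -> Recovery {
    if !wal_present {
        // `db-name` is present but there is no WAL: error state.
        return Recovery::Error("database directory present but WAL missing");
    }
    if storage_present {
        Recovery::ReplayIntoExistingStorage
    } else {
        Recovery::RestoreFromCheckpoint
    }
}

fn main() {
    assert_eq!(plan_recovery(true, true), Recovery::ReplayIntoExistingStorage);
    assert_eq!(plan_recovery(false, true), Recovery::RestoreFromCheckpoint);
    assert!(matches!(plan_recovery(true, false), Recovery::Error(_)));
}
```

Either way the WAL is the source of truth; `storage` and the checkpoint only shorten how much of it must be replayed.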