Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ESDB-106-15] Always replicate the chunk header #4211

Merged
merged 1 commit into from
Jun 5, 2024

Conversation

shaan1337
Copy link
Member

@shaan1337 shaan1337 commented Mar 21, 2024

Changed: Always replicate the chunk header

This PR causes followers to always replicate the chunk header from the leader. It is a pre-requisite for the encryption at rest plugin. Prior to this PR, the chunk header was replicated only for scavenged chunks.

This change has an additional benefit: it's easy for sysadmins to compare files on a leader and a follower by using standard tools like md5sum, sha1sum, etc. without needing to exclude the first 128 bytes from the comparison (as previously, the chunk IDs were different on the leader and followers)

Copy link

linear bot commented Mar 21, 2024

@shaan1337 shaan1337 force-pushed the db-719-replicate-chunk-header branch 9 times, most recently from 898b1cb to b5278aa Compare March 22, 2024 08:26
@shaan1337 shaan1337 marked this pull request as ready for review March 22, 2024 12:00
@shaan1337 shaan1337 force-pushed the db-719-replicate-chunk-header branch 2 times, most recently from 95e86ca to 3461abd Compare March 22, 2024 12:46
@shaan1337
Copy link
Member Author

Please don't merge yet, as some additional tests are probably needed.

@hayley-jean hayley-jean changed the title Always replicate the chunk header [DB-719] Always replicate the chunk header [ESDB-106-2] Apr 22, 2024
@shaan1337 shaan1337 changed the title Always replicate the chunk header [ESDB-106-2] [ESDB-106-15] Always replicate the chunk header Apr 25, 2024
@shaan1337 shaan1337 force-pushed the db-719-replicate-chunk-header branch from 56ec6f1 to c1893f9 Compare May 15, 2024 10:31
@shaan1337 shaan1337 force-pushed the db-719-replicate-chunk-header branch 2 times, most recently from 8637e02 to c55c0ed Compare May 27, 2024 10:49
Copy link
Member

@hayley-jean hayley-jean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replicated chunks seem to lose the version in their filenames when they are replicated.
e.g. chunk-000001.000001 on the leader is replicated as chunk-000001.000000 on the follower.

src/EventStore.Core/Cluster/ReplicationPartials.cs Outdated Show resolved Hide resolved
@shaan1337
Copy link
Member Author

The replicated chunks seem to lose the version in their filenames when they are replicated. e.g. chunk-000001.000001 on the leader is replicated as chunk-000001.000000 on the follower.

nice find! I believe that this bug already existed prior to this PR due to this call in SwitchChunk:

_config.FileNamingStrategy.DetermineBestVersionFilenameFor(chunkHeader.ChunkStartNumber);

it could easily be fixed by specifying a minimum value of 1 for the chunk version (although the versions wouldn't necessarily match with the leader)

it's quite important to fix it as some places in the code assume that a chunk with version 0 hasn't been scavenged:
e.g on startup, it assumes that the next chunk number is just incremented by one:

@shaan1337
Copy link
Member Author

shaan1337 commented Jun 4, 2024

nice find! I believe that this bug already existed prior to this PR due to this call in SwitchChunk:

sorry, my bad, it is actually a bug introduced in this PR, i'll push a fix for it (EDIT: pushed)

@shaan1337 shaan1337 force-pushed the db-719-replicate-chunk-header branch 2 times, most recently from d8874a6 to 0f61015 Compare June 5, 2024 06:27
This commit brings the following changes:

* Always replicate the chunk header to followers - prior to this commit, chunk headers were replicated only for scavenged chunks
* Doing the above implies that we can no longer create new chunks in advance:
  * on followers, when completing a chunk
  * when a new database is spinned up. the chunk is now created when a node becomes a leader and is writing an epoch.
  * when an existing database is spinned up but the writer checkpoint was at a chunk boundary
  * when truncating to a chunk boundary, we also need to delete the chunk at the boundary to re-replicate the chunk (incl. the header) from the leader

Backward/Forward compatibility is preserved.

Other changes:
* cluster.proto: For clarity, rename `is_completed_chunk` to `is_scavenged_chunk` because that's what it really means (in the early days, completed chunks were replicated raw)
@shaan1337 shaan1337 force-pushed the db-719-replicate-chunk-header branch from 0f61015 to c607652 Compare June 5, 2024 10:20
Copy link
Contributor

@timothycoleman timothycoleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice looks good.
I've tested backwards and forwards compatibility for replication

@timothycoleman timothycoleman dismissed hayley-jean’s stale review June 5, 2024 14:51

Hayley on leave, her review points have been addressed

@timothycoleman timothycoleman merged commit 1d4aa54 into master Jun 5, 2024
8 checks passed
@timothycoleman timothycoleman deleted the db-719-replicate-chunk-header branch June 5, 2024 14:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants