Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix numDeletesToMerge for unchanged segments #13324

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Apr 28, 2024

Currently, we do not trigger numDeletesToMerge for unchanged segments that have not yet entered the reader pool. Consequently, there are cases where force merges overlook segments with soft-deletes, as the merge policy only takes into account hard deletes when determining merge specifications.

@@ -6128,7 +6128,7 @@ public final int numDeletesToMerge(SegmentCommitInfo info) throws IOException {
ensureOpen(false);
validate(info);
MergePolicy mergePolicy = config.getMergePolicy();
final ReadersAndUpdates rld = getPooledInstance(info, false);
final ReadersAndUpdates rld = getPooledInstance(info, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we pass create=true then we don't need to handle the case when rld is null below? There is a comment there saying that returning the number of hard deletes is safe, which confuses me since you're suggesting it's actually a problem? Maybe @s1monw can help shed a light?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just re-read the change that I made that added this comment and I think it's not allowed to pull a reader here if it's not already visible. I don't recall the side-effects of this but I am sure there are some. To me the test seems to be artificial since it's really just testing if that works on fresh writer?! I am not against the change I just think we need more coverage for this to ensure it's not triggering the same issues that caused the change in the first place?
I think IW should be very conservative here with pulling memory for segments that are not in memory yet.

Yet, the comment basically means that we can't return the soft del count since it might overcommit because of it's nature of being dynamic based on the MP.

Copy link

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label May 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants