Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read from branch with compaction data #7701

Open
wants to merge 27 commits into
base: master
Choose a base branch
from
Open

Conversation

idanovo
Copy link
Contributor

@idanovo idanovo commented Apr 25, 2024

No description provided.

@idanovo idanovo added the exclude-changelog PR description should not be included in next release changelog label Apr 25, 2024
@idanovo idanovo self-assigned this Apr 25, 2024
@idanovo idanovo linked an issue Apr 25, 2024 that may be closed by this pull request
Copy link

github-actions bot commented Apr 25, 2024

E2E Test Results - DynamoDB Local - Local Block Adapter

13 passed

Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Really exciting to see how we can separate the two functions of "the" (currently single) branch metarange ID: right now it is used both as the parent for the next commit, and also as the base for differences in staging.

I would prefer to avoid the term "compaction" here, and instead name the new metarange ID on a branch by some other name - maybe BaseMetarangeID. It may make our lives easier if we ever have to add user-requested "partial commits". The name we pick right now matters, because it it stored to the KV. So it will be expensive to change in future!

I am not sure why we need a "compaction manager". All it does is duplicate the code that we already have in Graveler, just for a different metarange. We even need the same if statements to select the right base metarange. Furthermore, AFAICT this compaction manager leads to these bugs:

  • An SSTable handle leaks whenever a compacted metarange exists on the branch. This leaks a file descriptor as well as memory.
  • The code opens an unused metarange whenever a compacted metarange exists on the branch. This is a loss of efficiency. IOPs (I/O operations) are sometimes an expensive resource on lakeFS installations.

Can we try to select the correct metarange in if statements before accessing the commit metarange, instead of having the manager? I believe that that code would be easier to understand, and provide fewer opportunities for bugs.

pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
@itaiad200
Copy link
Contributor

itaiad200 commented Apr 28, 2024

Agree with @arielshaqed comments. Is this PR goal to cover all branch reading scenarios? What about functions that are not covered in this PR, like listStagingArea, isStagingEmpty, ..?

Copy link
Contributor

@guy-har guy-har left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Ariel, I would expect compaction to be handled as any other metarange. furthermore, compaction made the term staging a tricky one
i.e some would expect listStagingArea to return only data existing under the staging token and others may expect it to return all staging data (IMO all staging data is the trivial one). Also, we still have many places not considering the compacted data, for example graveler.Get and graverler.Commit which should return "dirty" if staging is empty, etc...

pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tests! With Commit, StagingToken and SealedTokens already present, and with the addition of BaseMetaRange concept, the number of test scenarios is huge. I think we missed some important ones, like testing with tombstones, testing with all 4 (or 5 for multiple sealed tokens) keys, querying for empty staging area, etc. The reason we need to be pedantic about changes here, is that bugs in the versioning engine can be catastrophic. We intend on introducing the compaction operation itself later and that's even more worrying! Finding the bugs are normally easier during the PR itself, and it gets harder as we build on top of it.

You can look at graveler_v2_test.go file too. I find it easier to write the tests with proper generated mocks instead of limited fakes.

Copy link
Contributor

@itaiad200 itaiad200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed everything except graveler_test.go (will get back to it later).
Overall we're almost there! Bunch of medium/small comments..

pkg/catalog/catalog_test.go Show resolved Hide resolved
pkg/catalog/catalog_test.go Show resolved Hide resolved
pkg/catalog/gc_write_uncommitted.go Outdated Show resolved Hide resolved
pkg/catalog/gc_write_uncommitted.go Outdated Show resolved Hide resolved
pkg/catalog/gc_write_uncommitted.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
@@ -3109,6 +3164,14 @@ func (g *Graveler) Diff(ctx context.Context, repository *RepositoryRecord, left,
leftValueIterator.Close()
return nil, err
}
if rightBranch.CompactedBaseMetaRangeID != "" {
compactedDiffIterator, err := g.CommittedManager.Diff(ctx, repository.StorageNamespace, leftCommit.MetaRangeID, rightBranch.CompactedBaseMetaRangeID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftCommit & rightBranch.CompactedBaseMetaRangeID? Can you explain why this is true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case there's no compaction data on the right we return a combined iterator with:

  1. diff iterator between the left committed data and the right committed data
  2. iterator of committed data in the left branch
  3. iterator of the staging data of the right branch

So in case there's compacted data on the right branch we need to change the first iterator to contain also changes that have already compacted to the right branch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case there's no compaction data on the right we return a combined iterator with:

But this path handles the case where rightBranch.CompactedBaseMetaRangeID != ""

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, why isn't it JoinedDiff(stagingIterator, compactedDiffIterator)? Isn't iterator of committed data in the left branch already being diffed in compactedDiffIterator?

pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler_v2_test.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
@idanovo idanovo requested review from guy-har and itaiad200 May 7, 2024 09:05
Copy link
Contributor

@itaiad200 itaiad200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review, but I think there's some valuable comments already

pkg/catalog/catalog_test.go Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
}
}

if updatedValue == nil && reference.CompactedBaseMetaRangeID != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we get here with updatedValue != nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case the object didn't change in staging

pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
@@ -2492,7 +2512,7 @@ func (g *Graveler) ResetPrefix(ctx context.Context, repository *RepositoryRecord
newSealedTokens = append(newSealedTokens, branch.SealedTokens...)

// Reset keys by prefix on the new staging token
itr, err := g.listStagingArea(ctx, branch, 0)
itr, err := g.listStagingAreaWithoutCompaction(ctx, branch, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't compaction be involved too? AFAIR ResetPrefix resets all the uncommitted changes in a prefix. Why shouldn't it reset changes that exist in compacted area too?

pkg/graveler/graveler.go Outdated Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
@idanovo
Copy link
Contributor Author

idanovo commented May 8, 2024

Graveler function Tested Controller function Tested AI
ResetKey X ResetBranch V Add a controller test for ResetKey for a compacted branch when we add the CompactBranch API call
ResetPrefix X ResetBranch V Add a controller test for ResetPrefix for a compacted branch when we add the CompactBranch API call
prepareForCommitIDUpdate X Revert (dirty branch) V Added a test in graveler testsShould also add a test in the controller once we have the CompactBranch API call
prepareForCommitIDUpdate X CherryPick (dirty branch) V Add a controller test for CherryPick for a compacted branch when we add the CompactBranch API call
prepareForCommitIDUpdate X UpdateBranch (dirty branch) X Add a controller test for ResetPrefix for a compacted branch when we add the CompactBranch API call
prepareForCommitIDUpdate X Merge (dirty branch) V Added a test in graveler testsShould also add a test in the controller once we have the CompactBranch API call
prepareForCommitIDUpdate X Import (dirty branch) X Add a controller test for ResetPrefix for a compacted branch when we add the CompactBranch API call

@idanovo idanovo requested a review from itaiad200 May 8, 2024 12:58
Copy link
Contributor

@itaiad200 itaiad200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pushing this one through, it's certainly wasn't an easy one.
Once it's merged, please open an issue containing the tests table as a dependency for the next compaction task.

pkg/catalog/catalog_test.go Show resolved Hide resolved
pkg/graveler/graveler.go Show resolved Hide resolved
pkg/graveler/graveler.go Show resolved Hide resolved
pkg/graveler/graveler.go Outdated Show resolved Hide resolved
Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only blocking diffs are on JoinedDiffIterator. It is really important code, and needs to be as simple as possible. I think we may have simpler code if we revisit some decisions.

One way might be this. "Advance inner" (now, after seeing the code) seems like the wrong strategy. AFAICT it forces us to assume a particular special DiffIterator.Value() return of nil when the iterator is over. And then it looks like we check almost the same cases but not quite. Because there are so many, I worry that we will miss some edge cases.

numBranch: 3,
numRecords: 0,
expectedCalls: 1,
compactBranch: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Rather than double each test scenario, I would probably just loop over both scenarios by saying for _, compactBranch := range{false, true} { t.Run(...) } on l. 585.

}

if numRecords > 0 {
test.GarbageCollectionManager.EXPECT().
GetUncommittedLocation(gomock.Any(), gomock.Any()).
Times(expectedCalls).
AnyTimes().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Particularly worried that we might never call this. If no UGC occurs, will the test fail somewhere else?

Comment on lines +59 to +64
func normalizeStorageNamespace(namespace string) string {
if !strings.HasSuffix(namespace, DefaultPathDelimiter) {
namespace += DefaultPathDelimiter
}
return namespace
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we must already have this function somewhere!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have found in the controller the function 'normalizePhysicalAddress,' and it looks similar, but it doesn't perform the same function.

Comment on lines +403 to +404
// CompactedBaseMetaRangeID - the MetaRangeID of the last compaction's
CompactedBaseMetaRangeID MetaRangeID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really not a fan of the "compacted" part of the name. It describes how it happened not what it is. I would prefer to use "BaseMetaRangeID". It is important to get the name right before any users - because if we get it wrong it will be hard to fix existing KVs.

In future we might have other reasons to change the base. For one example: during concurrent merges, one of the merges will fail when it tries to update the branch record of the destination. That is inefficient and slow. We could use your work here to speed up concurrent merges! After the merge fails like this we might optionally update the base metarange of the source branch to point at the intended result. Now the source branch looks like the destination branch got merged into it, so retrying the merge will be faster.

(This is just one example, I have many plans for this feature. I don't want to tie them to the word "compacted"!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @itaiad200 was worried that the word "compact" was missing for this variable.
Decision time 😈

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@itaiad200 can you chime in, please?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not tying ourselves to Compacted means that we add another term on top of staging, sealed, compaction and uncommitted.. For example, LastStagingSnapshot - Pro: It describes what it is, not obscure. Con: Another term.

I still hold my opinion but not firmly, I'd rather it be CompactedStaging over BaseMetaRangeID. I believe it does describe what it is, not necessarily the action that happened.
So up to you, not blocking :)

pkg/graveler/graveler_v2_test.go Show resolved Hide resolved
type JoinedDiffIterator struct {
iterA DiffIterator
iterB DiffIterator
p DiffIterator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document p?

// first
nextA = c.iterA.Next()
nextB = c.iterB.Next()
case valA == nil && valB == nil:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer not to depend on nil values like this. I think that the problem is you already called Next on the iterators by the time you call this function; perhaps we should figure out how to avoid that?

Comment on lines 67 to 69
if !c.advanceInnerIterators() {
return false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is more complex than it should be. You have to call c.Next() the first time, before using the iterator. That's why you need to have the special case for p == nil in advanceInnerIterators. This function:

  1. Relies on DiffIterator returning nil value in certain conditions. Regardless of whether or not it does this, relying on this makes it a very special case - most Go iterators do not support this!
  2. We repeatedly call Value() on each iterator. This is weird.
  3. We have 2 switch statements. The first has 7 cases and 2 special cases in if statements, the second has 5 cases. This is too complex.

Comment on lines 97 to 99
if c.p != nil && c.p.Err() != nil {
return c.p.Err()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. c.p is always either c.iterA or c.iterB. Why do we need to check it separately?

if c.p != nil && c.p.Err() != nil {
return c.p.Err()
}
if c.iterA != nil && c.iterA.Err() != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is c.iterA == nil?

Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still worried about the iterator. The switch statement in there is huge, which makes it too hard for me to understand. I think it has issues, but TBH I really don't know.

So still requesting changes, if only to documentation.

Comment on lines +403 to +404
// CompactedBaseMetaRangeID - the MetaRangeID of the last compaction's
CompactedBaseMetaRangeID MetaRangeID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@itaiad200 can you chime in, please?

Comment on lines +1847 to +1848
if branchRecord.Branch.CompactedBaseMetaRangeID != "" {
metaRangeID = branchRecord.Branch.CompactedBaseMetaRangeID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand: AFAICT this is a performance degradation if there has been no compaction! Why wouldn't it be sufficient to take the "cachedMetaRangeID" in the "else"? (This is very important because otherwise we GetCommit. It's cached, but that's still unnecessary overhead on the commit cache to get a commit we might already have had!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call deleteUnsafe function from two places, both pass nil for cachedMetaRangeID

iterAHasMore bool
iterB DiffIterator
iterBHasMore bool
p DiffIterator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, please document p. The name is very unintuitive.

Comment on lines 47 to 50
case bytes.Equal(valA.Key, valB.Key):
c.iterAHasMore = c.iterA.Next()
c.iterBHasMore = c.iterB.Next()
case bytes.Compare(valA.Key, valB.Key) < 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This traverses the keys twice. Please compare once; it is fine to call a separate routine here, but that will need to be documented. In fact you can return immediately in all the above cases, and then compute bytes.Compare(val!.Key, valB.Key) once.

c.iterBHasMore = c.iterB.Next()
case bytes.Compare(valA.Key, valB.Key) < 0:
c.iterAHasMore = c.iterA.Next()
c.iterBHasMore = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this line? After line 43 we know that iterBHasMore.

default:
// value of iterA < value of iterB
c.iterBHasMore = c.iterB.Next()
c.iterAHasMore = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

return false
}
if !c.iterAHasMore && !c.iterBHasMore {
return false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. Suppose that the last values of iterA and iterB refer to the same key. Then these lines will make this condition true -- and the iterator misses the last value!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We process only values we already returned in the first switch-case statement.
I'll add a test case just to make sure.

Comment on lines 75 to 76
valA = c.iterA.Value()
valB = c.iterB.Value()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If c.iterAHasMore then it is incorrect to call c.iterA.Value(), and similarly for iterB.

Comment on lines +114 to +115
c.iterA.SeekGE(id)
c.iterB.SeekGE(id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems incorrect: AFAIR you must call Next() after SeekGE().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true.
This is the same behavior we have in our CombinedIterator.
I don't know if we want to have the same behavior here or change it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exclude-changelog PR description should not be included in next release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Manage read from branch with compaction
4 participants