-
Notifications
You must be signed in to change notification settings - Fork 517
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run full GC only for auto recovery scenarios #21134
Conversation
⯆ @fluid-example/bundle-size-tests: -3.41 KB
Baseline commit: 7dcb473 |
@@ -374,7 +371,7 @@ export class GarbageCollector implements IGarbageCollector { | |||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think initializeOrUpdateGCState
should early return if it's not enabled, to avoid all the logging on "connected"
(this comment is pinned to a random line, not related to this point)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initializeOrUpdateGCState
is called from setConnectionState
which already has this check.
@@ -855,7 +849,7 @@ export class GarbageCollector implements IGarbageCollector { | |||
trackState: boolean, | |||
telemetryContext?: ITelemetryContext, | |||
): ISummarizeResult | undefined { | |||
if (!this.configs.shouldRunGC || this.gcDataFromLastRun === undefined) { | |||
if (!this.shouldRunGC || this.gcDataFromLastRun === undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine (more future-proof) but I think we only need to check this.gcDataFromLastRun
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right.
@@ -515,7 +513,7 @@ export class SummarizerNodeWithGC extends SummarizerNode implements IRootSummari | |||
* was previously used and became unused (or vice versa), its used state has changed. | |||
*/ | |||
private hasUsedStateChanged(): boolean { | |||
// If GC is disabled, we are not tracking used state, return false. | |||
// If GC is disabled, it should not affect summary state, return false. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this got back to always false.
assert(gc.configs.sweepEnabled, "sweepEnabled incorrect"); | ||
assert.equal(gc.configs.shouldRunSweep, "NO", "shouldRunSweep incorrect"); | ||
assert( | ||
gc.configs.sessionExpiryTimeoutMs === undefined, | ||
"sessionExpiryTimeoutMs incorrect", | ||
); | ||
assert(gc.configs.tombstoneTimeoutMs === undefined, "tombstoneTimeoutMs incorrect"); | ||
assert.equal( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Want to add an assert on gc.configs.gcVersionInBaseSnapshot
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to this and other places where I removed this check.
); | ||
}); | ||
|
||
it("shouldRunGC should be false when gcVersionInEffect is older than gcVersionInBaseSnapshot", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we leave this and just flip false to true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Added it back.
@@ -762,72 +682,72 @@ describe("Garbage Collection configurations", () => { | |||
}); | |||
describe("shouldRunSweep", () => { | |||
const testCases: { | |||
shouldRunGC: boolean; | |||
gcEnabled: boolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rename to gcEnabled_doc
sweepEnabled_doc: true, | ||
sweepEnabled_session: true, | ||
disableDataStoreSweep: "viaConfigProvider", | ||
expectedShouldRunSweep: "ONLY_BLOBS", | ||
}, | ||
{ | ||
shouldRunGC: true, | ||
gcEnabled: true, | ||
sweepEnabled_doc: true, | ||
sweepEnabled_session: true, | ||
shouldRunSweep: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind renaming this to runSweepOverride
or runSweep_config
? It's very confusing that this input is called the same name as the output property being examined.
@@ -22,164 +22,13 @@ type GCSummaryStateTrackerWithPrivates = Omit<GCSummaryStateTracker, "latestSumm | |||
}; | |||
|
|||
describe("GCSummaryStateTracker tests", () => { | |||
describe("Summary state reset", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this red. Really nice that the summarizer node hasChanged
logic is sufficient for all these cases.
@@ -63,6 +64,7 @@ describeCompat("GC version update", "NoCompat", (getTestObjectProvider, apis) => | |||
let dataStore1Id: string; | |||
let dataStore2Id: string; | |||
let dataStore3Id: string; | |||
let baseGCDetailsSpy: Sinon.SinonSpy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused
const baseGCDetails: IGarbageCollectionDetailsBase = { | ||
gcData: { | ||
gcNodes: {}, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe explicitly say usedRoutes: undefined
*/ | ||
async function summarizeAndValidateGCStateReset(summarizer: ISummarizer) { | ||
const containerRuntime = (summarizer as any).runtime as IContainerRuntimeWithPrivates; | ||
const spy = sandbox.spy(containerRuntime.garbageCollector, "getBaseGCDetails"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the sandbox object do exactly? Just lets you spy on stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so. That among other things. I just copied the pattern from other tests.
dataStoresAsHandles, | ||
true /* gcEnabled */, | ||
); | ||
// Validate that the GC state is reset in the base GC details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember you talking about this. Let me make sure I got it -
The tests used to validate that no handles were used, because fullGC/fullTree summary was expected. Now we certainly don't force fullGC/fullTree, trusting that any node with a change to used routes will be resummarized based on summarizerNodeWithGc.hasChanged
.
Now since we don't force all nodes to regenerate summary tree, you're asserting the more direct outcome that the GC state was reset in the new Summary (by forcing it to summarize to trigger GC init and then checking was baseGCDetails in GC was).
Is it true that you also could assert that whichever dataStores had usedRoutes before would be Trees and the rest would be Handles? Not suggesting it, just checking my understanding.
Please correct me where I'm wrong!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is correct. Basically, there are 2 main points that ensure right things will happen:
- GC must re-run - For this, base GC state should be initialized as empty. This means all nodes start with empty GC data. So, next time GC runs, it will have to regenerate the GC data.
- Summary for nodes with incorrect reference state must be regenerated - After step 1 above, GC data is regenerated. If any node has its used routes (aka reference state) changed, it's summary will be regenerated because
hasChanged
will return true.
The tests validate # 1 above and # 2 is validated in unit tests. It's very hard to validate # 2 here because it will require changing the GC data generated by a node between 2 summaries without actually sending any ops (or that will trigger regeneration) because that's the real world scenario we are targeting. I could hack around and do it but it's not necessary.
@@ -98,14 +100,27 @@ describeCompat("GC version update", "NoCompat", (getTestObjectProvider, apis) => | |||
return summaryResult.summaryVersion; | |||
} | |||
|
|||
/** | |||
* Generates a summary and validates that the GC state is reset (empty) in the base GC details. This will ensure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the base GC details
This is about the snapshot loaded from, not the result of the summary here, right? The summarize just serves to ensure GC is initailized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, updated the comment to make that clear.
Currently, GC runs will fullGC a little often which also results in running summarize with fullTree which is expensive. This PR updates the logic to ensure that full GC / full tree summary will only happen if needed. It also removes the option to enable / disable GC mark phase via feature flags.
Here is the list of scenarios where full GC / full summary runs today and what is the change in this PR:
AB#8037