tiered compaction: `identify_levels` bails too early #7775

problame · 2024-05-15T22:33:56Z

Spent some time reading tiered compaction code, found this bug with identify_levels.

@hlinnaka please confirm that this is indeed a bug / unintended behavior.

github-actions · 2024-05-15T22:46:26Z

No tests were run or test report is not available

Test coverage report is not available

The comment gets automatically updated with the latest test results (6593ce0 at 2024-05-15T22:46:26.289Z).

hlinnaka · 2024-05-16T09:06:37Z

I believe the code is correct. It took me a while to re-understand how it works though, so more comments would probably be in order.

The repro test case returns 0 layers, not 3 as in the assertion, and that is correct.

hlinnaka · 2024-05-16T09:08:46Z

The trace from the test:

2024-05-16T09:05:47.647670Z  INFO pageserver_compaction::identify_levels: identify level at 0/10000, size 8192, num layers below: 3
2024-05-16T09:05:47.647687Z TRACE pageserver_compaction::identify_levels: inspecting 0000000000003000-0000000000004000__00008000-00009000 for candidate 0/10000, current best 0/10000
2024-05-16T09:05:47.647699Z TRACE pageserver_compaction::identify_levels: inspecting 0000000000002000-0000000000003000__00005000-00009000 for candidate 0/8000, current best 0/9000
2024-05-16T09:05:47.647710Z TRACE pageserver_compaction::identify_levels: too large 0000000000002000-0000000000003000__00005000-00009000, size 16384 vs 8192

That is correct. As soon as the function sees the too-large layer, it discards the "candidate" it was building, and returns with the "current_best", which is LSN 0/9000 and no layers. The sort function happens to reorder the layers to A, C, B, but you would get the same result with the A, B, C ordering. It would just bail out of the loop earlier.

hlinnaka · 2024-05-16T09:11:13Z

pageserver/compaction/src/identify_levels.rs

@@ -135,6 +135,7 @@ where
 // Is it small enough to be considered part of this level?
 if r.end.0 - r.start.0 > lsn_max_size {
 // Too large, this layer belongs to next level. Stop.
+ // Due to the sorting bug pointed out above there could still be smaller layers at same key range


An important point to notice here is that when we see a too-large layer and bail out of the loop, we discard the candidate that we were building, and return with the "current best" safe stopping point that we had seen earlier.

hlinnaka · 2024-05-16T09:14:23Z

pageserver/compaction/src/identify_levels.rs

+ // The `identify_levels` loop will bail out at the first layer that is too large,
+ // i.e., layer B (log message "too large").
+ // That leaves layer C out of the level, even though it belongs to it.


It will in fact leave out all the layers from the returned Level. That's correct. The layers overlap, so it must include either all of them, or none. Because B is too large, they are all left out.

hlinnaka · 2024-05-16T09:21:51Z

I drew this diagram to help myself walk through the scenario from the test case:

        // The layers are processed in order A, B, C
        //
        //                                A           B            C
        //                       10000
        //
        //                        9000    +           +            +
        //                                |           |            |
        //                        8000    +           |            +
        //                        7000                |
        //                        6000                |
        //                        5000                +
        //                        4000
        //
        // current_best_start_lsn: 10000    9000
        // current_best_layers:       []      []
        // candidate_start_lsn:    10000    8000         
        // candidate_layers:          []     [A]
        //
        // A: Hooray, there are no crossing LSNs.
        // B: Too large. Discard 'candidate' and return with 'current_best'
        //

Read the diagram above from left to right, walking through the iterations:

1. At entry to the function, current_best_start_lsn and candidate_start_lsn are set to 10000, and current_best_layers and candidate_layers are empty.
2. Process layer A. Because r.end (9000) <= candidate_start_lsn (10000), it takes the "Hooray, there are no crossing LSNs" codepath. It updates current_best_start_lsn to r.end == 9000, and starts building a new candidate with candidate_start_lsn set to r.start == 8000.
3. Process layer B. It crosses candidate_start_lsn (8000), so "current best" is not updated. It is found to be too large, so we break out of the loop and return with the "current best" of 9000 and an empty layer set.
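
The walkthrough above can be reproduced with a small standalone sketch. This is not the code in identify_levels.rs: `Layer`, `identify_level_sketch`, and the plain `u64` LSNs are hypothetical simplifications, and the handling of a small layer that crosses the candidate LSN (lowering `candidate_start_lsn`) and of reaching the end of the input (accepting the last candidate) are assumptions made so that the sketch is complete. It only models the accept / too-large / discard logic described in this thread.

```rust
/// Hypothetical stand-in for a delta layer: only its LSN range matters here.
#[derive(Clone, Debug)]
struct Layer {
    name: &'static str,
    lsn_start: u64,
    lsn_end: u64,
}

/// Simplified model of the loop discussed above (an assumption-laden sketch,
/// not the real function). Returns the accepted start LSN of the level and
/// the names of the layers that belong to it.
fn identify_level_sketch(
    layers: &[Layer],
    end_lsn: u64,
    lsn_max_size: u64,
) -> (u64, Vec<&'static str>) {
    let mut current_best_start_lsn = end_lsn;
    let mut current_best_layers: Vec<&'static str> = Vec::new();
    let mut candidate_start_lsn = end_lsn;
    let mut candidate_layers: Vec<&'static str> = Vec::new();
    let mut bailed = false;

    for l in layers {
        // No layer crosses the candidate LSN: the candidate becomes a safe
        // stopping point, and a new candidate starts below this layer.
        if l.lsn_end <= candidate_start_lsn {
            current_best_start_lsn = l.lsn_end;
            current_best_layers.append(&mut candidate_layers);
            candidate_start_lsn = l.lsn_start;
        }

        // Too large: stop. The candidate under construction is discarded
        // and only the last safe stopping point is returned.
        if l.lsn_end - l.lsn_start > lsn_max_size {
            bailed = true;
            break;
        }

        // Assumption: a small layer that crosses the candidate LSN pushes
        // the candidate down to the layer's start.
        candidate_start_lsn = candidate_start_lsn.min(l.lsn_start);
        candidate_layers.push(l.name);
    }

    // Assumption: if the input is exhausted without bailing out, the last
    // candidate is accepted.
    if !bailed {
        current_best_start_lsn = candidate_start_lsn;
        current_best_layers.append(&mut candidate_layers);
    }

    (current_best_start_lsn, current_best_layers)
}
```

With the layers from the test case (A and C covering 0x8000..0x9000, B covering 0x5000..0x9000), `end_lsn` 0x10000 and `lsn_max_size` 0x2000, the sketch returns 0x9000 and an empty layer list for both the A, B, C and the A, C, B ordering. The only difference between the two orderings is how many layers were collected into the candidate before it was discarded.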

hlinnaka · 2024-05-16T09:27:24Z

I wrote a comment with an overview description of how the loop in identify_levels() works: #7777. I hope that clarifies this to future readers.

arpad-m · 2024-05-16T18:03:12Z

pageserver/compaction/src/identify_levels.rs

@@ -308,6 +309,26 @@ mod tests {
 Ok(())
 }

+ #[tokio::test]
+ async fn repro_identify_levels_bails_too_ealy_if_partitioned_keyspace_same_lsn() -> anyhow::Result<()> {
+ tracing_subscriber::fmt::init(); // so that RUST_LOG=trace works


This will fail if there are multiple tests using it, because the global tracing subscriber can be set only once per process.

Instead, I suggest copy-pasting from tests.rs:

    static LOG_HANDLE: OnceCell<()> = OnceCell::new();

    pub(crate) fn setup_logging() {
        LOG_HANDLE.get_or_init(|| {
            logging::init(
                logging::LogFormat::Test,
                logging::TracingErrorLayerEnablement::EnableWithRustLogFilter,
                logging::Output::Stdout,
            )
            .expect("Failed to init test logging")
        });
    }
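
To illustrate why the guard helps, here is a minimal std-only sketch of the same once-per-process pattern. It swaps in `std::sync::OnceLock` for the `OnceCell` used in the snippet above, and `init_global_logger` and `INIT_CALLS` are hypothetical stand-ins for the real logging initialization, which may only happen once per process.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the (hypothetical) global initialization actually ran.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Guard that remembers whether initialization has already happened.
static LOG_HANDLE: OnceLock<()> = OnceLock::new();

/// Hypothetical stand-in for a call that must not run twice in one process,
/// such as installing a global tracing subscriber.
fn init_global_logger() {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
}

/// Safe to call from every test: only the first call runs the initializer,
/// later calls see the already-initialized guard and return immediately.
fn setup_logging() {
    LOG_HANDLE.get_or_init(init_global_logger);
}
```

Each test would call `setup_logging()` at its start instead of initializing the subscriber directly, so the order in which tests run within the process no longer matters.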

reprpducer: identify_levels bails too early

6593ce0

problame requested a review from hlinnaka May 15, 2024 22:33

problame mentioned this pull request May 15, 2024

Epic: productionize tiered compaction #7554

Open

hlinnaka reviewed May 16, 2024

hlinnaka mentioned this pull request May 16, 2024

improve comments in identify_levels #7777

Draft

arpad-m reviewed May 16, 2024
