-
Notifications
You must be signed in to change notification settings - Fork 456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split a per-partition WriteRequest into multiple Kafka records if bigger than max allowed size #8077
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave this PR an early look, and it would work.
I wonder if it would have been easier to work at serialized-message level, splitting the message by fields with tag 1 (timeseries
) and tag 3 (metadata
) until they fill the size, while copying tag 2 (source
) and 1000 (skip_label_name_validation
) into each submessage.
pkg/mimirpb/custom.go
Outdated
} | ||
|
||
// We assume that different timeseries roughly have the same size (no huge outliers) | ||
// so we preallocate the returned slice just adding 1 extra item (+2 because a +1 is to round up). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could understand +1, but why +2 again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to round up, and +1 for an extra item. The +1 round up doesn't guarantees us space for 1 extra item (it depends what was the reminder of the division), but it's guaranteed by the 2nd +2. Does this answer your question?
pkg/mimirpb/custom.go
Outdated
return []*WriteRequest{partialReq} | ||
} | ||
|
||
// We assume that different timeseries roughly have the same size (no huge outliers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that size of each timeseries is dominated by labels, I have doubts that this assumption holds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice we split into 16MB partial requests. At this scale, the size of labels shouldn't matter much.
0379d8b
to
4991bef
Compare
Timeseries: timeseries, | ||
Metadata: metadata, | ||
Source: p.Source, | ||
SkipLabelNameValidation: p.SkipLabelNameValidation, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to reviewers: not a bug today, because SkipLabelNameValidation
is only read in distributors before calling ForIndexes
but I think it's better to fix it.
…ger than max allowed size Fix partialReqSize reset Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…nt marshalling too Signed-off-by: Marco Pracucci <[email protected]>
goos: darwin goarch: amd64 pkg: github.com/grafana/mimir/pkg/mimirpb cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz │ marco-pr-initial.txt │ marco-pr-offsets.txt │ │ sec/op │ sec/op vs base │ WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12 3.846µ ± ∞ ¹ 2.977µ ± ∞ ¹ -22.59% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 10.481µ ± ∞ ¹ 6.351µ ± ∞ ¹ -39.40% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 10.504µ ± ∞ ¹ 7.907µ ± ∞ ¹ -24.72% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12 21.64µ ± ∞ ¹ 20.82µ ± ∞ ¹ -3.79% (p=0.016 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 66.02µ ± ∞ ¹ 41.98µ ± ∞ ¹ -36.41% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 65.66µ ± ∞ ¹ 42.75µ ± ∞ ¹ -34.88% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 216.4µ ± ∞ ¹ 131.3µ ± ∞ ¹ -39.36% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12 61.38µ ± ∞ ¹ 61.34µ ± ∞ ¹ ~ (p=1.000 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 202.4µ ± ∞ ¹ 126.4µ ± ∞ ¹ -37.55% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12 429.7µ ± ∞ ¹ 432.9µ ± ∞ ¹ ~ (p=0.548 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 1325.3µ ± ∞ ¹ 884.2µ ± ∞ ¹ -33.28% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 1343.9µ ± ∞ ¹ 875.8µ ± ∞ ¹ -34.83% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12 404.2n ± ∞ ¹ 401.7n ± ∞ ¹ -0.62% (p=0.016 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12 1.731µ ± ∞ ¹ 1.016µ ± ∞ ¹ -41.31% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12 2.034µ ± ∞ ¹ 1.678µ ± ∞ ¹ -17.50% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12 7.739µ ± ∞ ¹ 7.740µ ± ∞ ¹ ~ (p=1.000 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12 28.10µ ± ∞ ¹ 16.43µ ± ∞ ¹ -41.53% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12 29.45µ ± ∞ ¹ 17.67µ ± ∞ ¹ -40.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12 64.74µ ± ∞ ¹ 64.67µ ± ∞ ¹ ~ (p=0.690 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12 203.5µ ± ∞ ¹ 196.2µ ± ∞ ¹ -3.60% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12 205.7µ ± ∞ ¹ 194.9µ ± ∞ ¹ -5.24% (p=0.016 n=5) geomean 38.54µ 29.48µ -23.51% ¹ need >= 6 samples for confidence interval at level 0.95 │ marco-pr-initial.txt │ marco-pr-offsets.txt │ │ B/op │ B/op vs base │ WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 6.148Ki ± ∞ ¹ 1.648Ki ± ∞ ¹ -73.19% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 3.062Ki ± ∞ ¹ 3.062Ki ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 6.148Ki ± ∞ ¹ 1.648Ki ± ∞ ¹ -73.19% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 3.062Ki ± ∞ ¹ 3.062Ki ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 112.74Ki ± ∞ ¹ 40.72Ki ± ∞ ¹ -63.88% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 90.67Ki ± ∞ ¹ 26.65Ki ± ∞ ¹ -70.61% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 90.75Ki ± ∞ ¹ 26.65Ki ± ∞ ¹ -70.64% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 112.85Ki ± ∞ ¹ 40.72Ki ± ∞ ¹ -63.92% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12 1368.0 ± ∞ ¹ 440.0 ± ∞ ¹ -67.84% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12 1.188Ki ± ∞ ¹ 1.188Ki ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12 19.898Ki ± ∞ ¹ 5.398Ki ± ∞ ¹ -72.87% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12 21.719Ki ± ∞ ¹ 8.219Ki ± ∞ ¹ -62.16% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12 53.38Ki ± ∞ ¹ 13.49Ki ± ∞ ¹ -74.72% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12 59.05Ki ± ∞ ¹ 21.04Ki ± ∞ ¹ -64.37% (p=0.008 n=5) geomean 1.266Ki 700.3 -46.00% ¹ need >= 6 samples for confidence interval at level 0.95 ² all samples are equal │ marco-pr-initial.txt │ marco-pr-offsets.txt │ │ allocs/op │ allocs/op vs base │ WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 21.00 ± ∞ ¹ 21.00 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 21.00 ± ∞ ¹ 21.00 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 30.00 ± ∞ ¹ 21.00 ± ∞ ¹ -30.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 30.00 ± ∞ ¹ 21.00 ± ∞ ¹ -30.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12 21.00 ± ∞ ¹ 21.00 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12 30.00 ± ∞ ¹ 21.00 ± ∞ ¹ -30.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12 10.000 ± ∞ ¹ 8.000 ± ∞ ¹ -20.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12 30.00 ± ∞ ¹ 22.00 ± ∞ ¹ -26.67% (p=0.008 n=5) geomean 5.745 4.835 -15.84% ¹ need >= 6 samples for confidence interval at level 0.95 ² all samples are equal Signed-off-by: Marco Pracucci <[email protected]>
│ marco-pr-offsets.txt │ marco-pr-offsets-2.txt │ │ sec/op │ sec/op vs base │ WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12 20.82µ ± ∞ ¹ 20.74µ ± ∞ ¹ ~ (p=0.841 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 41.98µ ± ∞ ¹ 42.12µ ± ∞ ¹ ~ (p=0.690 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 42.75µ ± ∞ ¹ 42.64µ ± ∞ ¹ ~ (p=0.841 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12 61.34µ ± ∞ ¹ 59.99µ ± ∞ ¹ -2.20% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 126.4µ ± ∞ ¹ 125.1µ ± ∞ ¹ ~ (p=0.095 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 131.3µ ± ∞ ¹ 127.1µ ± ∞ ¹ -3.17% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 884.2µ ± ∞ ¹ 862.2µ ± ∞ ¹ ~ (p=0.151 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 875.8µ ± ∞ ¹ 865.1µ ± ∞ ¹ ~ (p=0.548 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12 432.9µ ± ∞ ¹ 425.1µ ± ∞ ¹ ~ (p=0.056 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12 401.7n ± ∞ ¹ 403.8n ± ∞ ¹ ~ (p=0.151 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12 1.016µ ± ∞ ¹ 1.053µ ± ∞ ¹ +3.64% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12 1.678µ ± ∞ ¹ 1.680µ ± ∞ ¹ ~ (p=0.952 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12 17.67µ ± ∞ ¹ 18.22µ ± ∞ ¹ +3.12% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12 7.740µ ± ∞ ¹ 7.740µ ± ∞ ¹ ~ (p=0.889 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12 16.43µ ± ∞ ¹ 17.18µ ± ∞ ¹ +4.56% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12 196.2µ ± ∞ ¹ 193.8µ ± ∞ ¹ ~ (p=0.421 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12 194.9µ ± ∞ ¹ 195.9µ ± ∞ ¹ ~ (p=0.421 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12 64.67µ ± ∞ ¹ 63.62µ ± ∞ ¹ ~ (p=0.095 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12 2.977µ ± ∞ ¹ 2.927µ ± ∞ ¹ -1.68% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 6.351µ ± ∞ ¹ 6.402µ ± ∞ ¹ ~ (p=0.095 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 7.907µ ± ∞ ¹ 7.274µ ± ∞ ¹ ~ (p=0.151 n=5) geomean 29.48µ 29.31µ -0.58% ¹ need >= 6 samples for confidence interval at level 0.95 Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
2dc28f5
to
5be5154
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but please check my comment. I think you can save some allocations :)
pkg/mimirpb/split.go
Outdated
|
||
newPartialReq := func(preallocTimeseries int) (*WriteRequest, int) { | ||
r := &WriteRequest{ | ||
Timeseries: preallocSliceIfNeeded[PreallocTimeseries](preallocTimeseries), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to allocate Timeseries
slice, if we never write to that slice. We always replace it with slices from original request.
(Same comment applies in splitMetadataByMaxMarshalSize
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! I changed the logic while working on the benchmark, and then forgot to remove the preallocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 5126f03
|
||
// If the fields of WriteRequest haven't changed, then you will probably need to modify | ||
// the SplitWriteRequestByMaxMarshalSize() implementation accordingly! | ||
assert.ElementsMatch(t, []string{"Timeseries", "Source", "Metadata", "SkipLabelNameValidation"}, fieldNames) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice check :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the first piece of code ever I asked to write to chatgpt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the fields of WriteRequest haven't changed
I just realised there's a typo. I wanted to say..."If the fields of WriteRequest HAVE changed".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 5126f03
Signed-off-by: Marco Pracucci <[email protected]>
pkg/mimirpb/split.go
Outdated
return nil | ||
} | ||
|
||
newPartialReq := func(preallocTimeseries int) (*WriteRequest, int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newPartialReq := func(preallocTimeseries int) (*WriteRequest, int) { | |
newPartialReq := func() (*WriteRequest, int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous commit was done in a rush. I should have cleaned up unused code in b6c6443
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
Signed-off-by: Marco Pracucci <[email protected]>
…ger than max allowed size (grafana#8077) * Split a per-partition WriteRequest into multiple Kafka records if bigger than max allowed size Fix partialReqSize reset Signed-off-by: Marco Pracucci <[email protected]> * Added BenchmarkWriteRequest_SplitByMaxMarshalSize Signed-off-by: Marco Pracucci <[email protected]> * Improved BenchmarkWriteRequest_SplitByMaxMarshalSize to take in account marshalling too Signed-off-by: Marco Pracucci <[email protected]> * Optimized implementation by reusing Timeseries and Metadata slice goos: darwin goarch: amd64 pkg: github.com/grafana/mimir/pkg/mimirpb cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz │ marco-pr-initial.txt │ marco-pr-offsets.txt │ │ sec/op │ sec/op vs base │ WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12 3.846µ ± ∞ ¹ 2.977µ ± ∞ ¹ -22.59% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 10.481µ ± ∞ ¹ 6.351µ ± ∞ ¹ -39.40% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 10.504µ ± ∞ ¹ 7.907µ ± ∞ ¹ -24.72% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12 21.64µ ± ∞ ¹ 20.82µ ± ∞ ¹ -3.79% (p=0.016 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 66.02µ ± ∞ ¹ 41.98µ ± ∞ ¹ -36.41% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 65.66µ ± ∞ ¹ 42.75µ ± ∞ ¹ -34.88% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 216.4µ ± ∞ ¹ 131.3µ ± ∞ ¹ -39.36% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12 61.38µ ± ∞ ¹ 61.34µ ± ∞ ¹ ~ (p=1.000 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 202.4µ ± ∞ ¹ 126.4µ ± ∞ ¹ -37.55% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12 429.7µ ± ∞ ¹ 432.9µ ± ∞ ¹ ~ (p=0.548 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 1325.3µ ± ∞ ¹ 884.2µ ± ∞ ¹ -33.28% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 1343.9µ ± ∞ ¹ 875.8µ ± ∞ ¹ -34.83% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12 404.2n ± ∞ ¹ 401.7n ± ∞ ¹ -0.62% (p=0.016 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12 1.731µ ± ∞ ¹ 1.016µ ± ∞ ¹ -41.31% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12 2.034µ ± ∞ ¹ 1.678µ ± ∞ ¹ -17.50% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12 7.739µ ± ∞ ¹ 7.740µ ± ∞ ¹ ~ (p=1.000 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12 28.10µ ± ∞ ¹ 16.43µ ± ∞ ¹ -41.53% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12 29.45µ ± ∞ ¹ 17.67µ ± ∞ ¹ -40.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12 64.74µ ± ∞ ¹ 64.67µ ± ∞ ¹ ~ (p=0.690 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12 203.5µ ± ∞ ¹ 196.2µ ± ∞ ¹ -3.60% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12 205.7µ ± ∞ ¹ 194.9µ ± ∞ ¹ -5.24% (p=0.016 n=5) geomean 38.54µ 29.48µ -23.51% ¹ need >= 6 samples for confidence interval at level 0.95 │ marco-pr-initial.txt │ marco-pr-offsets.txt │ │ B/op │ B/op vs base │ WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 6.148Ki ± ∞ ¹ 1.648Ki ± ∞ ¹ -73.19% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 3.062Ki ± ∞ ¹ 3.062Ki ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 6.148Ki ± ∞ ¹ 1.648Ki ± ∞ ¹ -73.19% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 3.062Ki ± ∞ ¹ 3.062Ki ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 112.74Ki ± ∞ ¹ 40.72Ki ± ∞ ¹ -63.88% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 90.67Ki ± ∞ ¹ 26.65Ki ± ∞ ¹ -70.61% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 90.75Ki ± ∞ ¹ 26.65Ki ± ∞ ¹ -70.64% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 112.85Ki ± ∞ ¹ 40.72Ki ± ∞ ¹ -63.92% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12 1368.0 ± ∞ ¹ 440.0 ± ∞ ¹ -67.84% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12 1.188Ki ± ∞ ¹ 1.188Ki ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12 19.898Ki ± ∞ ¹ 5.398Ki ± ∞ ¹ -72.87% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12 21.719Ki ± ∞ ¹ 8.219Ki ± ∞ ¹ -62.16% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12 8.000 ± ∞ ¹ 8.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12 53.38Ki ± ∞ ¹ 13.49Ki ± ∞ ¹ -74.72% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12 59.05Ki ± ∞ ¹ 21.04Ki ± ∞ ¹ -64.37% (p=0.008 n=5) geomean 1.266Ki 700.3 -46.00% ¹ need >= 6 samples for confidence interval at level 0.95 ² all samples are equal │ marco-pr-initial.txt │ marco-pr-offsets.txt │ │ allocs/op │ allocs/op vs base │ WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 21.00 ± ∞ ¹ 21.00 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 21.00 ± ∞ ¹ 21.00 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 30.00 ± ∞ ¹ 21.00 ± ∞ ¹ -30.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 30.00 ± ∞ ¹ 21.00 ± ∞ ¹ -30.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12 21.00 ± ∞ ¹ 21.00 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12 7.000 ± ∞ ¹ 5.000 ± ∞ ¹ -28.57% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12 30.00 ± ∞ ¹ 21.00 ± ∞ ¹ -30.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12 1.000 ± ∞ ¹ 1.000 ± ∞ ¹ ~ (p=1.000 n=5) ² WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12 10.000 ± ∞ ¹ 8.000 ± ∞ ¹ -20.00% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12 30.00 ± ∞ ¹ 22.00 ± ∞ ¹ -26.67% (p=0.008 n=5) geomean 5.745 4.835 -15.84% ¹ need >= 6 samples for confidence interval at level 0.95 ² all samples are equal Signed-off-by: Marco Pracucci <[email protected]> * Compute the actual partial request size instead of guessing it │ marco-pr-offsets.txt │ marco-pr-offsets-2.txt │ │ sec/op │ sec/op vs base │ WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12 20.82µ ± ∞ ¹ 20.74µ ± ∞ ¹ ~ (p=0.841 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 41.98µ ± ∞ ¹ 42.12µ ± ∞ ¹ ~ (p=0.690 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 42.75µ ± ∞ ¹ 42.64µ ± ∞ ¹ ~ (p=0.841 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12 61.34µ ± ∞ ¹ 59.99µ ± ∞ ¹ -2.20% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 126.4µ ± ∞ ¹ 125.1µ ± ∞ ¹ ~ (p=0.095 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 131.3µ ± ∞ ¹ 127.1µ ± ∞ ¹ -3.17% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12 884.2µ ± ∞ ¹ 862.2µ ± ∞ ¹ ~ (p=0.151 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12 875.8µ ± ∞ ¹ 865.1µ ± ∞ ¹ ~ (p=0.548 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12 432.9µ ± ∞ ¹ 425.1µ ± ∞ ¹ ~ (p=0.056 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12 401.7n ± ∞ ¹ 403.8n ± ∞ ¹ ~ (p=0.151 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12 1.016µ ± ∞ ¹ 1.053µ ± ∞ ¹ +3.64% (p=0.008 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12 1.678µ ± ∞ ¹ 1.680µ ± ∞ ¹ ~ (p=0.952 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12 17.67µ ± ∞ ¹ 18.22µ ± ∞ ¹ +3.12% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12 7.740µ ± ∞ ¹ 7.740µ ± ∞ ¹ ~ (p=0.889 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12 16.43µ ± ∞ ¹ 17.18µ ± ∞ ¹ +4.56% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12 196.2µ ± ∞ ¹ 193.8µ ± ∞ ¹ ~ (p=0.421 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12 194.9µ ± ∞ ¹ 195.9µ ± ∞ ¹ ~ (p=0.421 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12 64.67µ ± ∞ ¹ 63.62µ ± ∞ ¹ ~ (p=0.095 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12 2.977µ ± ∞ ¹ 2.927µ ± ∞ ¹ -1.68% (p=0.032 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12 6.351µ ± ∞ ¹ 6.402µ ± ∞ ¹ ~ (p=0.095 n=5) WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12 7.907µ ± ∞ ¹ 7.274µ ± ∞ ¹ ~ (p=0.151 n=5) geomean 29.48µ 29.31µ -0.58% ¹ need >= 6 samples for confidence interval at level 0.95 Signed-off-by: Marco Pracucci <[email protected]> * Added TestWriteRequest_SplitByMaxMarshalSize_Fuzzy Signed-off-by: Marco Pracucci <[email protected]> * Added BenchmarkWriteRequest_SplitByMaxMarshalSize_WithMarshalling Signed-off-by: Marco Pracucci <[email protected]> * Remove addressed TODO Signed-off-by: Marco Pracucci <[email protected]> * Fix linter issue Signed-off-by: Marco Pracucci <[email protected]> * Optimised the no splitting case Signed-off-by: Marco Pracucci <[email protected]> * Removed spurious files Signed-off-by: Marco Pracucci <[email protected]> * Tiny changes after a self code review Signed-off-by: Marco Pracucci <[email protected]> * Improved Writer tests Signed-off-by: Marco Pracucci <[email protected]> * Fixed linter issue Signed-off-by: Marco Pracucci <[email protected]> * Added a config option to configure the max record data size Signed-off-by: Marco Pracucci <[email protected]> * Address review comments Signed-off-by: Marco Pracucci <[email protected]> * Unused code cleanup Signed-off-by: Marco Pracucci <[email protected]> --------- Signed-off-by: Marco Pracucci <[email protected]>
What this PR does
When testing the experimental Kafka-based ingest storage, we've experienced an issue where the Kafka client rejected producing records because larger than the configured
ProducerBatchMaxBytes
(16MB).In this PR I'm adding a logic to split a single
mimirpb.WriteRequest
into multiple Kafka records if the marshalled request is bigger thanProducerBatchMaxBytes
(minus an overhead for the Kafka record). The wayWriter.WriteSync()
works in this PR is:Produce()
(the call toProduce()
doesn't block)Writer.WriteSync()
only after all records have been produced@pstibrany offered an alternative solution in #8167, which does the splitting starting from the marshalled version of
WriteRequest
. I personally think the cognitive load to learn how protobuf "parsing" work is higher than this PR, and a benchmark I've written doesn't show to be insanely faster doing it at binary level (results here).The splitting is a best-effort. If a single Timeseries or Metadata entry is bigger than the allowed max size (about 16MB) then the result record will still be rejected by Kafka. I want to treat such case as a client-side error, so that the client will not indefinitely try to push a WriteRequest which will consistently fail. To keep this PR smaller, I will do this change in a follow up PR.
I've also introduced a config option to allow to configure the max record data size. It's not intended to be changed in prod, but it will be useful to set (to a low value) in a test cluster to continuously stress test the splitting.
Benchmarks
I expect the splitting to trigger infrequently. For this reason, the most important benchmark to me is making sure there's no significant performance difference when no splitting is happening. We can see no impact on performance when no splitting is done here:
The next thing I want to highlight about performance, is the cost of marshalling the WriteRequest is way higher than the cost of splitting it (and we have to marshal the request even if there's no splitting to do), so I'm not too much concerned about the absolute speed of the marshalling function.
I've written a benchmark to show how much the unmarshalling/marshalling impacts compared to the mere splitting:
See comparison with and without marshalling
Which issue(s) this PR fixes or relates to
N/A
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.