Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split a per-partition WriteRequest into multiple Kafka records if bigger than max allowed size #8077

Merged
merged 17 commits into from
May 31, 2024

Conversation

pracucci
Copy link
Collaborator

@pracucci pracucci commented May 7, 2024

What this PR does

When testing the experimental Kafka-based ingest storage, we've experienced an issue where the Kafka client rejected producing records because larger than the configured ProducerBatchMaxBytes (16MB).

In this PR I'm adding a logic to split a single mimirpb.WriteRequest into multiple Kafka records if the marshalled request is bigger than ProducerBatchMaxBytes (minus an overhead for the Kafka record). The way Writer.WriteSync() works in this PR is:

  1. Split the request into multiple records
  2. Concurrently call Kafka client Produce() (the call to Produce() doesn't block)
  3. Return from Writer.WriteSync() only after all records have been produced

@pstibrany offered an alternative solution in #8167, which does the splitting starting from the marshalled version of WriteRequest. I personally think the cognitive load to learn how protobuf "parsing" work is higher than this PR, and a benchmark I've written doesn't show to be insanely faster doing it at binary level (results here).

The splitting is a best-effort. If a single Timeseries or Metadata entry is bigger than the allowed max size (about 16MB) then the result record will still be rejected by Kafka. I want to treat such case as a client-side error, so that the client will not indefinitely try to push a WriteRequest which will consistently fail. To keep this PR smaller, I will do this change in a follow up PR.

I've also introduced a config option to allow to configure the max record data size. It's not intended to be changed in prod, but it will be useful to set (to a low value) in a test cluster to continuously stress test the splitting.

Benchmarks

I expect the splitting to trigger infrequently. For this reason, the most important benchmark to me is making sure there's no significant performance difference when no splitting is happening. We can see no impact on performance when no splitting is done here:

BenchmarkMarshalWriteRequestToRecords_NoSplitting/marshalWriteRequestToRecords()-12         	    1454	    797148 ns/op	  401587 B/op	       4 allocs/op
BenchmarkMarshalWriteRequestToRecords_NoSplitting/marshalWriteRequestToRecords()-12         	    1388	    813155 ns/op	  401584 B/op	       4 allocs/op
BenchmarkMarshalWriteRequestToRecords_NoSplitting/marshalWriteRequestToRecords()-12         	    1513	    773326 ns/op	  401584 B/op	       4 allocs/op
BenchmarkMarshalWriteRequestToRecords_NoSplitting/marshalWriteRequestToRecords()-12         	    1580	    828526 ns/op	  401585 B/op	       4 allocs/op
BenchmarkMarshalWriteRequestToRecords_NoSplitting/marshalWriteRequestToRecords()-12         	    1490	    783778 ns/op	  401585 B/op	       4 allocs/op
BenchmarkMarshalWriteRequestToRecords_NoSplitting/WriteRequest.Marshal()-12                 	    1489	    777842 ns/op	  401408 B/op	       1 allocs/op
BenchmarkMarshalWriteRequestToRecords_NoSplitting/WriteRequest.Marshal()-12                 	    1509	    843568 ns/op	  401409 B/op	       1 allocs/op
BenchmarkMarshalWriteRequestToRecords_NoSplitting/WriteRequest.Marshal()-12                 	    1398	    782053 ns/op	  401408 B/op	       1 allocs/op
BenchmarkMarshalWriteRequestToRecords_NoSplitting/WriteRequest.Marshal()-12                 	    1521	    781184 ns/op	  401408 B/op	       1 allocs/op
BenchmarkMarshalWriteRequestToRecords_NoSplitting/WriteRequest.Marshal()-12                 	    1382	    841626 ns/op	  401408 B/op	       1 allocs/op

The next thing I want to highlight about performance, is the cost of marshalling the WriteRequest is way higher than the cost of splitting it (and we have to marshal the request even if there's no splitting to do), so I'm not too much concerned about the absolute speed of the marshalling function.

I've written a benchmark to show how much the unmarshalling/marshalling impacts compared to the mere splitting:

go test -run '^$' -bench 'BenchmarkSplitWriteRequestByMaxMarshalSize' -benchmem -count 6 ./pkg/mimirpb > bench.txt
cat bench.txt | grep 'BenchmarkSplitWriteRequestByMaxMarshalSize/' > bench-without-marshalling.txt
cat bench.txt | grep 'BenchmarkSplitWriteRequestByMaxMarshalSize_WithMarshalling/' | sed 's/_WithMarshalling//g' > bench-with-marshalling.txt
benchstat bench-without-marshalling.txt bench-with-marshalling.txt
See comparison with and without marshalling
                                                                                                                              │ bench-without-marshalling.txt │       bench-with-marshalling.txt        │
                                                                                                                              │            sec/op             │     sec/op      vs base                 │
SplitWriteRequestByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                                7.741µ ±  3%   219.820µ ±  7%  +2739.87% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                                       17.41µ ±  3%    233.84µ ±  9%  +1243.05% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                                      18.25µ ±  7%    236.64µ ±  8%  +1196.50% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                                    65.26µ ±  3%   1552.97µ ±  8%  +2279.71% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                           197.7µ ±  4%    1675.6µ ±  6%   +747.70% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                          202.3µ ±  3%    1673.2µ ±  6%   +726.94% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12                       6.785µ ±  5%    79.256µ ±  6%  +1068.18% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12                      7.607µ ±  2%    83.043µ ±  6%   +991.67% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                                3.070µ ±  4%    71.792µ ±  2%  +2238.50% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                               21.61µ ±  6%    489.85µ ±  3%  +2166.73% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12                      44.63µ ±  3%    517.72µ ±  3%  +1060.03% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12                     45.04µ ±  4%    517.96µ ±  7%  +1050.03% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12                     133.4µ ±  4%    1525.6µ ±  7%  +1043.28% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                               62.12µ ±  4%   1438.82µ ±  8%  +2216.04% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12                      128.0µ ±  5%    1517.1µ ± 10%  +1085.50% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12                    886.8µ ±  3%   10119.0µ ±  8%  +1041.01% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                              436.3µ ±  8%   10043.1µ ± 10%  +2201.77% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12                     894.4µ ±  4%   10361.8µ ±  6%  +1058.49% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                                 409.8n ± 13%   11505.5n ±  4%  +2707.25% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                                        1.085µ ±  6%    12.502µ ±  8%  +1052.79% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                                       1.766µ ±  4%    13.528µ ±  7%   +666.02% (p=0.002 n=6)
geomean                                                                                                                                          30.28µ           432.3µ        +1327.46%

                                                                                                                              │ bench-without-marshalling.txt │           bench-with-marshalling.txt            │
                                                                                                                              │             B/op              │       B/op         vs base                      │
SplitWriteRequestByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                                  8.000 ± 0%     226479.500 ± 0%    +2830893.75% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                                       5.398Ki ± 0%      224.563Ki ± 0%       +4059.79% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                                      8.219Ki ± 0%      230.509Ki ± 0%       +2704.67% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                                      8.000 ± 0%    1932334.500 ± 0%   +24154081.25% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                           13.49Ki ± 0%      1908.60Ki ± 0%      +14045.96% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                          21.04Ki ± 0%      1932.13Ki ± 0%       +9083.52% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12                       1.648Ki ± 0%       80.874Ki ± 0%       +4806.10% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12                      3.062Ki ± 0%       82.036Ki ± 0%       +2578.73% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                                  8.000 ± 0%      82412.000 ± 0%    +1030050.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                                 8.000 ± 0%     810570.500 ± 0%   +10132031.25% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12                      1.648Ki ± 0%      801.224Ki ± 0%      +48505.07% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12                     3.062Ki ± 0%      791.147Ki ± 0%      +25733.39% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12                     40.72Ki ± 0%      1585.04Ki ± 0%       +3792.66% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                                 8.000 ± 0%    1581346.500 ± 0%   +19766731.25% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12                      26.65Ki ± 0%      1570.97Ki ± 0%       +5795.17% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12                    40.72Ki ± 0%     15812.80Ki ± 0%      +38734.20% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                                8.000 ± 0%   16109291.500 ± 0%  +201366043.75% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12                     26.65Ki ± 0%     15767.82Ki ± 0%      +59069.77% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                                   8.000 ± 0%      10880.000 ± 0%     +135900.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                                          440.0 ± 0%        11632.0 ± 0%       +2543.64% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                                       1.188Ki ± 0%       11.867Ki ± 0%        +899.34% (p=0.002 n=6)
geomean                                                                                                                                            700.3             499.2Ki           +72894.49%

                                                                                                                              │ bench-without-marshalling.txt │        bench-with-marshalling.txt        │
                                                                                                                              │           allocs/op           │   allocs/op    vs base                   │
SplitWriteRequestByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                                  1.000 ± 0%   4013.000 ± 0%  +401200.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                                         5.000 ± 0%   4018.000 ± 0%   +80260.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                                        21.00 ± 0%    4042.00 ± 0%   +19147.62% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                                      1.000 ± 0%   5022.000 ± 0%  +502100.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                             8.000 ± 0%   5031.000 ± 0%   +62787.50% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                            22.00 ± 0%    5052.00 ± 0%   +22863.64% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12                         5.000 ± 0%    264.000 ± 0%    +5180.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12                        21.00 ± 0%     288.00 ± 0%    +1271.43% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                                  1.000 ± 0%    259.000 ± 0%   +25800.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                                 1.000 ± 0%    409.000 ± 0%   +40800.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12                        5.000 ± 0%    414.000 ± 0%    +8180.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12                       21.00 ± 0%     438.00 ± 0%    +1985.71% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12                       21.00 ± 0%    5042.00 ± 0%   +23909.52% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                                 1.000 ± 0%   5013.000 ± 0%  +501200.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12                        5.000 ± 0%   5018.000 ± 0%  +100260.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12                      21.00 ± 0%    8044.00 ± 0%   +38204.76% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                                1.000 ± 0%   8014.000 ± 0%  +801300.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12                       5.000 ± 0%   8020.000 ± 0%  +160300.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                                   1.000 ± 0%    209.000 ± 0%   +20800.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                                          5.000 ± 0%    214.000 ± 0%    +4180.00% (p=0.002 n=6)
SplitWriteRequestByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                                         21.00 ± 0%     238.00 ± 0%    +1033.33% (p=0.002 n=6)
geomean                                                                                                                                            4.835          1.538k        +31716.77%

Which issue(s) this PR fixes or relates to

N/A

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave this PR an early look, and it would work.

I wonder if it would have been easier to work at serialized-message level, splitting the message by fields with tag 1 (timeseries) and tag 3 (metadata) until they fill the size, while copying tag 2 (source) and 1000 (skip_label_name_validation) into each submessage.

}

// We assume that different timeseries roughly have the same size (no huge outliers)
// so we preallocate the returned slice just adding 1 extra item (+2 because a +1 is to round up).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could understand +1, but why +2 again?

Copy link
Collaborator Author

@pracucci pracucci May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to round up, and +1 for an extra item. The +1 round up doesn't guarantees us space for 1 extra item (it depends what was the reminder of the division), but it's guaranteed by the 2nd +2. Does this answer your question?

return []*WriteRequest{partialReq}
}

// We assume that different timeseries roughly have the same size (no huge outliers)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that size of each timeseries is dominated by labels, I have doubts that this assumption holds.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice we split into 16MB partial requests. At this scale, the size of labels shouldn't matter much.

pkg/mimirpb/custom.go Outdated Show resolved Hide resolved
Timeseries: timeseries,
Metadata: metadata,
Source: p.Source,
SkipLabelNameValidation: p.SkipLabelNameValidation,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to reviewers: not a bug today, because SkipLabelNameValidation is only read in distributors before calling ForIndexes but I think it's better to fix it.

@pracucci pracucci marked this pull request as ready for review May 30, 2024 11:23
@pracucci pracucci requested a review from a team as a code owner May 30, 2024 11:23
…ger than max allowed size

Fix partialReqSize reset

Signed-off-by: Marco Pracucci <[email protected]>
goos: darwin
goarch: amd64
pkg: github.com/grafana/mimir/pkg/mimirpb
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
                                                                                                                               │ marco-pr-initial.txt │        marco-pr-offsets.txt         │
                                                                                                                               │        sec/op        │    sec/op     vs base               │
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                       3.846µ ± ∞ ¹   2.977µ ± ∞ ¹  -22.59% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12             10.481µ ± ∞ ¹   6.351µ ± ∞ ¹  -39.40% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12            10.504µ ± ∞ ¹   7.907µ ± ∞ ¹  -24.72% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                      21.64µ ± ∞ ¹   20.82µ ± ∞ ¹   -3.79% (p=0.016 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12             66.02µ ± ∞ ¹   41.98µ ± ∞ ¹  -36.41% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12            65.66µ ± ∞ ¹   42.75µ ± ∞ ¹  -34.88% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12            216.4µ ± ∞ ¹   131.3µ ± ∞ ¹  -39.36% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                      61.38µ ± ∞ ¹   61.34µ ± ∞ ¹        ~ (p=1.000 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12             202.4µ ± ∞ ¹   126.4µ ± ∞ ¹  -37.55% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                     429.7µ ± ∞ ¹   432.9µ ± ∞ ¹        ~ (p=0.548 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12           1325.3µ ± ∞ ¹   884.2µ ± ∞ ¹  -33.28% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12          1343.9µ ± ∞ ¹   875.8µ ± ∞ ¹  -34.83% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                        404.2n ± ∞ ¹   401.7n ± ∞ ¹   -0.62% (p=0.016 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                               1.731µ ± ∞ ¹   1.016µ ± ∞ ¹  -41.31% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                              2.034µ ± ∞ ¹   1.678µ ± ∞ ¹  -17.50% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                       7.739µ ± ∞ ¹   7.740µ ± ∞ ¹        ~ (p=1.000 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                              28.10µ ± ∞ ¹   16.43µ ± ∞ ¹  -41.53% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                             29.45µ ± ∞ ¹   17.67µ ± ∞ ¹  -40.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                           64.74µ ± ∞ ¹   64.67µ ± ∞ ¹        ~ (p=0.690 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                  203.5µ ± ∞ ¹   196.2µ ± ∞ ¹   -3.60% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                 205.7µ ± ∞ ¹   194.9µ ± ∞ ¹   -5.24% (p=0.016 n=5)
geomean                                                                                                                                  38.54µ         29.48µ        -23.51%
¹ need >= 6 samples for confidence interval at level 0.95

                                                                                                                               │ marco-pr-initial.txt │          marco-pr-offsets.txt          │
                                                                                                                               │         B/op         │     B/op       vs base                 │
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                        8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12             6.148Ki ± ∞ ¹   1.648Ki ± ∞ ¹  -73.19% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12            3.062Ki ± ∞ ¹   3.062Ki ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                       8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12            6.148Ki ± ∞ ¹   1.648Ki ± ∞ ¹  -73.19% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12           3.062Ki ± ∞ ¹   3.062Ki ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12          112.74Ki ± ∞ ¹   40.72Ki ± ∞ ¹  -63.88% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                       8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12            90.67Ki ± ∞ ¹   26.65Ki ± ∞ ¹  -70.61% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                      8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12           90.75Ki ± ∞ ¹   26.65Ki ± ∞ ¹  -70.64% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12         112.85Ki ± ∞ ¹   40.72Ki ± ∞ ¹  -63.92% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                         8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                               1368.0 ± ∞ ¹     440.0 ± ∞ ¹  -67.84% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                             1.188Ki ± ∞ ¹   1.188Ki ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                        8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                            19.898Ki ± ∞ ¹   5.398Ki ± ∞ ¹  -72.87% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                           21.719Ki ± ∞ ¹   8.219Ki ± ∞ ¹  -62.16% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                            8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                 53.38Ki ± ∞ ¹   13.49Ki ± ∞ ¹  -74.72% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                59.05Ki ± ∞ ¹   21.04Ki ± ∞ ¹  -64.37% (p=0.008 n=5)
geomean                                                                                                                                 1.266Ki           700.3        -46.00%
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal

                                                                                                                               │ marco-pr-initial.txt │         marco-pr-offsets.txt         │
                                                                                                                               │      allocs/op       │  allocs/op   vs base                 │
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                        1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12               7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12              21.00 ± ∞ ¹   21.00 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                       1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12              7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12             21.00 ± ∞ ¹   21.00 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12             30.00 ± ∞ ¹   21.00 ± ∞ ¹  -30.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                       1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12              7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                      1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12             7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12            30.00 ± ∞ ¹   21.00 ± ∞ ¹  -30.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                         1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                                7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                               21.00 ± ∞ ¹   21.00 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                        1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                               7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                              30.00 ± ∞ ¹   21.00 ± ∞ ¹  -30.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                            1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                  10.000 ± ∞ ¹   8.000 ± ∞ ¹  -20.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                  30.00 ± ∞ ¹   22.00 ± ∞ ¹  -26.67% (p=0.008 n=5)
geomean                                                                                                                                   5.745         4.835        -15.84%
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal

Signed-off-by: Marco Pracucci <[email protected]>
                                                                                                                              │ marco-pr-offsets.txt │       marco-pr-offsets-2.txt       │                                                                                                                               │        sec/op        │    sec/op     vs base              │
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                      20.82µ ± ∞ ¹   20.74µ ± ∞ ¹       ~ (p=0.841 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12             41.98µ ± ∞ ¹   42.12µ ± ∞ ¹       ~ (p=0.690 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12            42.75µ ± ∞ ¹   42.64µ ± ∞ ¹       ~ (p=0.841 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                      61.34µ ± ∞ ¹   59.99µ ± ∞ ¹  -2.20% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12             126.4µ ± ∞ ¹   125.1µ ± ∞ ¹       ~ (p=0.095 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12            131.3µ ± ∞ ¹   127.1µ ± ∞ ¹  -3.17% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12            884.2µ ± ∞ ¹   862.2µ ± ∞ ¹       ~ (p=0.151 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12           875.8µ ± ∞ ¹   865.1µ ± ∞ ¹       ~ (p=0.548 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                     432.9µ ± ∞ ¹   425.1µ ± ∞ ¹       ~ (p=0.056 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                        401.7n ± ∞ ¹   403.8n ± ∞ ¹       ~ (p=0.151 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                               1.016µ ± ∞ ¹   1.053µ ± ∞ ¹  +3.64% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                              1.678µ ± ∞ ¹   1.680µ ± ∞ ¹       ~ (p=0.952 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                             17.67µ ± ∞ ¹   18.22µ ± ∞ ¹  +3.12% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                       7.740µ ± ∞ ¹   7.740µ ± ∞ ¹       ~ (p=0.889 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                              16.43µ ± ∞ ¹   17.18µ ± ∞ ¹  +4.56% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                  196.2µ ± ∞ ¹   193.8µ ± ∞ ¹       ~ (p=0.421 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                 194.9µ ± ∞ ¹   195.9µ ± ∞ ¹       ~ (p=0.421 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                           64.67µ ± ∞ ¹   63.62µ ± ∞ ¹       ~ (p=0.095 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                       2.977µ ± ∞ ¹   2.927µ ± ∞ ¹  -1.68% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12              6.351µ ± ∞ ¹   6.402µ ± ∞ ¹       ~ (p=0.095 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12             7.907µ ± ∞ ¹   7.274µ ± ∞ ¹       ~ (p=0.151 n=5)
geomean                                                                                                                                  29.48µ         29.31µ        -0.58%
¹ need >= 6 samples for confidence interval at level 0.95

Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
@pracucci pracucci requested a review from pstibrany May 31, 2024 04:15
@pracucci pracucci force-pushed the fix-max-write-request-size branch from 2dc28f5 to 5be5154 Compare May 31, 2024 04:18
Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but please check my comment. I think you can save some allocations :)


newPartialReq := func(preallocTimeseries int) (*WriteRequest, int) {
r := &WriteRequest{
Timeseries: preallocSliceIfNeeded[PreallocTimeseries](preallocTimeseries),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to allocate Timeseries slice, if we never write to that slice. We always replace it with slices from original request.

(Same comment applies in splitMetadataByMaxMarshalSize).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! I changed the logic while working on the benchmark, and then forgot to remove the preallocation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5126f03


// If the fields of WriteRequest haven't changed, then you will probably need to modify
// the SplitWriteRequestByMaxMarshalSize() implementation accordingly!
assert.ElementsMatch(t, []string{"Timeseries", "Source", "Metadata", "SkipLabelNameValidation"}, fieldNames)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice check :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the first piece of code ever I asked to write to chatgpt.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the fields of WriteRequest haven't changed

I just realised there's a typo. I wanted to say..."If the fields of WriteRequest HAVE changed".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 5126f03

Signed-off-by: Marco Pracucci <[email protected]>
return nil
}

newPartialReq := func(preallocTimeseries int) (*WriteRequest, int) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
newPartialReq := func(preallocTimeseries int) (*WriteRequest, int) {
newPartialReq := func() (*WriteRequest, int) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous commit was done in a rush. I should have cleaned up unused code in b6c6443

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Signed-off-by: Marco Pracucci <[email protected]>
@pracucci pracucci merged commit da8ffbe into main May 31, 2024
29 checks passed
@pracucci pracucci deleted the fix-max-write-request-size branch May 31, 2024 10:18
narqo pushed a commit to narqo/grafana-mimir that referenced this pull request Jun 6, 2024
…ger than max allowed size (grafana#8077)

* Split a per-partition WriteRequest into multiple Kafka records if bigger than max allowed size
Fix partialReqSize reset

Signed-off-by: Marco Pracucci <[email protected]>

* Added BenchmarkWriteRequest_SplitByMaxMarshalSize

Signed-off-by: Marco Pracucci <[email protected]>

* Improved BenchmarkWriteRequest_SplitByMaxMarshalSize to take in account marshalling too

Signed-off-by: Marco Pracucci <[email protected]>

* Optimized implementation by reusing Timeseries and Metadata slice

goos: darwin
goarch: amd64
pkg: github.com/grafana/mimir/pkg/mimirpb
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
                                                                                                                               │ marco-pr-initial.txt │        marco-pr-offsets.txt         │
                                                                                                                               │        sec/op        │    sec/op     vs base               │
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                       3.846µ ± ∞ ¹   2.977µ ± ∞ ¹  -22.59% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12             10.481µ ± ∞ ¹   6.351µ ± ∞ ¹  -39.40% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12            10.504µ ± ∞ ¹   7.907µ ± ∞ ¹  -24.72% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                      21.64µ ± ∞ ¹   20.82µ ± ∞ ¹   -3.79% (p=0.016 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12             66.02µ ± ∞ ¹   41.98µ ± ∞ ¹  -36.41% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12            65.66µ ± ∞ ¹   42.75µ ± ∞ ¹  -34.88% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12            216.4µ ± ∞ ¹   131.3µ ± ∞ ¹  -39.36% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                      61.38µ ± ∞ ¹   61.34µ ± ∞ ¹        ~ (p=1.000 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12             202.4µ ± ∞ ¹   126.4µ ± ∞ ¹  -37.55% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                     429.7µ ± ∞ ¹   432.9µ ± ∞ ¹        ~ (p=0.548 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12           1325.3µ ± ∞ ¹   884.2µ ± ∞ ¹  -33.28% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12          1343.9µ ± ∞ ¹   875.8µ ± ∞ ¹  -34.83% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                        404.2n ± ∞ ¹   401.7n ± ∞ ¹   -0.62% (p=0.016 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                               1.731µ ± ∞ ¹   1.016µ ± ∞ ¹  -41.31% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                              2.034µ ± ∞ ¹   1.678µ ± ∞ ¹  -17.50% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                       7.739µ ± ∞ ¹   7.740µ ± ∞ ¹        ~ (p=1.000 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                              28.10µ ± ∞ ¹   16.43µ ± ∞ ¹  -41.53% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                             29.45µ ± ∞ ¹   17.67µ ± ∞ ¹  -40.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                           64.74µ ± ∞ ¹   64.67µ ± ∞ ¹        ~ (p=0.690 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                  203.5µ ± ∞ ¹   196.2µ ± ∞ ¹   -3.60% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                 205.7µ ± ∞ ¹   194.9µ ± ∞ ¹   -5.24% (p=0.016 n=5)
geomean                                                                                                                                  38.54µ         29.48µ        -23.51%
¹ need >= 6 samples for confidence interval at level 0.95

                                                                                                                               │ marco-pr-initial.txt │          marco-pr-offsets.txt          │
                                                                                                                               │         B/op         │     B/op       vs base                 │
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                        8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12             6.148Ki ± ∞ ¹   1.648Ki ± ∞ ¹  -73.19% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12            3.062Ki ± ∞ ¹   3.062Ki ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                       8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12            6.148Ki ± ∞ ¹   1.648Ki ± ∞ ¹  -73.19% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12           3.062Ki ± ∞ ¹   3.062Ki ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12          112.74Ki ± ∞ ¹   40.72Ki ± ∞ ¹  -63.88% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                       8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12            90.67Ki ± ∞ ¹   26.65Ki ± ∞ ¹  -70.61% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                      8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12           90.75Ki ± ∞ ¹   26.65Ki ± ∞ ¹  -70.64% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12         112.85Ki ± ∞ ¹   40.72Ki ± ∞ ¹  -63.92% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                         8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                               1368.0 ± ∞ ¹     440.0 ± ∞ ¹  -67.84% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                             1.188Ki ± ∞ ¹   1.188Ki ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                        8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                            19.898Ki ± ∞ ¹   5.398Ki ± ∞ ¹  -72.87% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                           21.719Ki ± ∞ ¹   8.219Ki ± ∞ ¹  -62.16% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                            8.000 ± ∞ ¹     8.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                 53.38Ki ± ∞ ¹   13.49Ki ± ∞ ¹  -74.72% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                59.05Ki ± ∞ ¹   21.04Ki ± ∞ ¹  -64.37% (p=0.008 n=5)
geomean                                                                                                                                 1.266Ki           700.3        -46.00%
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal

                                                                                                                               │ marco-pr-initial.txt │         marco-pr-offsets.txt         │
                                                                                                                               │      allocs/op       │  allocs/op   vs base                 │
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                        1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12               7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12              21.00 ± ∞ ¹   21.00 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                       1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12              7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12             21.00 ± ∞ ¹   21.00 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12             30.00 ± ∞ ¹   21.00 ± ∞ ¹  -30.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                       1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12              7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                      1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12             7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12            30.00 ± ∞ ¹   21.00 ± ∞ ¹  -30.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                         1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                                7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                               21.00 ± ∞ ¹   21.00 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                        1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                               7.000 ± ∞ ¹   5.000 ± ∞ ¹  -28.57% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                              30.00 ± ∞ ¹   21.00 ± ∞ ¹  -30.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                            1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=5) ²
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                  10.000 ± ∞ ¹   8.000 ± ∞ ¹  -20.00% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                  30.00 ± ∞ ¹   22.00 ± ∞ ¹  -26.67% (p=0.008 n=5)
geomean                                                                                                                                   5.745         4.835        -15.84%
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal

Signed-off-by: Marco Pracucci <[email protected]>

* Compute the actual partial request size instead of guessing it
                                                                                                                              │ marco-pr-offsets.txt │       marco-pr-offsets-2.txt       │                                                                                                                               │        sec/op        │    sec/op     vs base              │
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/no_splitting-12                      20.82µ ± ∞ ¹   20.74µ ± ∞ ¹       ~ (p=0.841 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12             41.98µ ± ∞ ¹   42.12µ ± ∞ ¹       ~ (p=0.690 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12            42.75µ ± ∞ ¹   42.64µ ± ∞ ¹       ~ (p=0.841 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/no_splitting-12                      61.34µ ± ∞ ¹   59.99µ ± ∞ ¹  -2.20% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12             126.4µ ± ∞ ¹   125.1µ ± ∞ ¹       ~ (p=0.095 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12            131.3µ ± ∞ ¹   127.1µ ± ∞ ¹  -3.17% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_few_requests-12            884.2µ ± ∞ ¹   862.2µ ± ∞ ¹       ~ (p=0.151 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/split_in_many_requests-12           875.8µ ± ∞ ¹   865.1µ ± ∞ ¹       ~ (p=0.548 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_series,_many_labels_each,_and_no_metadata/no_splitting-12                     432.9µ ± ∞ ¹   425.1µ ± ∞ ¹       ~ (p=0.056 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/no_splitting-12                                        401.7n ± ∞ ¹   403.8n ± ∞ ¹       ~ (p=0.151 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_few_requests-12                               1.016µ ± ∞ ¹   1.053µ ± ∞ ¹  +3.64% (p=0.008 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_metadata,_and_no_series/split_in_many_requests-12                              1.678µ ± ∞ ¹   1.680µ ± ∞ ¹       ~ (p=0.952 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_many_requests-12                             17.67µ ± ∞ ¹   18.22µ ± ∞ ¹  +3.12% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/no_splitting-12                                       7.740µ ± ∞ ¹   7.740µ ± ∞ ¹       ~ (p=0.889 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_many_metadata,_and_no_series/split_in_few_requests-12                              16.43µ ± ∞ ¹   17.18µ ± ∞ ¹  +4.56% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_few_requests-12                                  196.2µ ± ∞ ¹   193.8µ ± ∞ ¹       ~ (p=0.421 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/split_in_many_requests-12                                 194.9µ ± ∞ ¹   195.9µ ± ∞ ¹       ~ (p=0.421 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_both_series_and_metadata/no_splitting-12                                           64.67µ ± ∞ ¹   63.62µ ± ∞ ¹       ~ (p=0.095 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/no_splitting-12                       2.977µ ± ∞ ¹   2.927µ ± ∞ ¹  -1.68% (p=0.032 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_few_requests-12              6.351µ ± ∞ ¹   6.402µ ± ∞ ¹       ~ (p=0.095 n=5)
WriteRequest_SplitByMaxMarshalSize/write_request_with_few_series,_few_labels_each,_and_no_metadata/split_in_many_requests-12             7.907µ ± ∞ ¹   7.274µ ± ∞ ¹       ~ (p=0.151 n=5)
geomean                                                                                                                                  29.48µ         29.31µ        -0.58%
¹ need >= 6 samples for confidence interval at level 0.95

Signed-off-by: Marco Pracucci <[email protected]>

* Added TestWriteRequest_SplitByMaxMarshalSize_Fuzzy

Signed-off-by: Marco Pracucci <[email protected]>

* Added BenchmarkWriteRequest_SplitByMaxMarshalSize_WithMarshalling

Signed-off-by: Marco Pracucci <[email protected]>

* Remove addressed TODO

Signed-off-by: Marco Pracucci <[email protected]>

* Fix linter issue

Signed-off-by: Marco Pracucci <[email protected]>

* Optimised the no splitting case

Signed-off-by: Marco Pracucci <[email protected]>

* Removed spurious files

Signed-off-by: Marco Pracucci <[email protected]>

* Tiny changes after a self code review

Signed-off-by: Marco Pracucci <[email protected]>

* Improved Writer tests

Signed-off-by: Marco Pracucci <[email protected]>

* Fixed linter issue

Signed-off-by: Marco Pracucci <[email protected]>

* Added a config option to configure the max record data size

Signed-off-by: Marco Pracucci <[email protected]>

* Address review comments

Signed-off-by: Marco Pracucci <[email protected]>

* Unused code cleanup

Signed-off-by: Marco Pracucci <[email protected]>

---------

Signed-off-by: Marco Pracucci <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants