Differing scaling factors across the same benchmark misrepresented in comparisons. #236

loonatick-src · 2024-03-19T10:52:52Z

OS: macOS 14.4 Sonoma
Swift version: swift 5.11-dev DEVELOPMENT-SNAPSHOT-2023-12-07-a
package-benchmark version: 1.22.3
I refactored a benchmark by adding a non-unit scaling factor to reduce variance, i.e. from this

Benchmark("my benchmark") { benchmark in
    ...
}

to

Benchmark("my benchmark", configuration: .init(scalingFactor: .kilo)) { benchmark in
    for _ in benchmark.scaledIterations {
        ...
    }
}

This shows up like so when using swift package benchmark baseline compare

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│         Time (wall clock) (μs) *         │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│                   prev                   │     487 │     522 │     526 │     530 │     539 │     565 │     667 │   10000 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                 current                  │  514414 │  516686 │  518259 │  519045 │  521929 │  543344 │  543344 │      20 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │  513927 │  516164 │  517733 │  518515 │  521390 │  542779 │  542677 │   -9980 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │ -105529 │  -98882 │  -98428 │  -97833 │  -96733 │  -96067 │  -81361 │   -9980 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│             Malloc (total) *             │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│                   prev                   │       0 │       0 │       0 │       0 │       0 │       0 │       0 │   10000 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                 current                  │       0 │       0 │       0 │       0 │       0 │       0 │       0 │      20 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │       0 │       0 │       0 │       0 │       0 │       0 │       0 │   -9980 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │       0 │       0 │       0 │       0 │       0 │       0 │       0 │   -9980 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

It appears that the new benchmark is not scaled and this caused some confusion. Is this the intended behavior?


hassila · 2024-03-27T16:20:17Z

Thanks for the report. Unfortunately I'm away for a couple of weeks and won't have time to check on this until then; sorry for the delay.

hassila · 2024-04-20T08:53:50Z

Had a quick look at this now; this is not expected. I will look into it. In the meantime, generating a new baseline with the new scaling factor should make things work.

hassila · 2024-04-20T08:54:48Z

Note to self:
result.timeUnits = base.timeUnits is invalid, we should instead scale the percentile results to base when they differ.
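
A rough sketch of what "scale the percentile results to base" could look like. This is not the package-benchmark implementation: `TimeUnit`, `PercentileResult`, and `rescaled(_:to:)` are hypothetical names made up for illustration, and it assumes (unconfirmed) that the `current` row in the table above holds nanosecond values shown under the baseline's μs label.

```swift
// Hypothetical stand-in for a time unit; NOT the real package-benchmark type.
// The raw value is the unit's size in nanoseconds.
enum TimeUnit: Int {
    case nanoseconds = 1
    case microseconds = 1_000
    case milliseconds = 1_000_000
}

// Hypothetical result type: percentile values expressed in `unit`.
struct PercentileResult {
    var unit: TimeUnit
    var percentiles: [Double]
}

// Convert the values into `target` units instead of only relabelling them
// (relabelling is what `result.timeUnits = base.timeUnits` amounts to).
func rescaled(_ result: PercentileResult, to target: TimeUnit) -> PercentileResult {
    guard result.unit != target else { return result }
    let from = Double(result.unit.rawValue)
    let to = Double(target.rawValue)
    return PercentileResult(
        unit: target,
        percentiles: result.percentiles.map { $0 * from / to }
    )
}

// p0, p50, p100 taken from the table in the report.
let base = PercentileResult(unit: .microseconds, percentiles: [487, 526, 667])
// Assumption: these are nanoseconds, not microseconds.
let current = PercentileResult(unit: .nanoseconds, percentiles: [514_414, 518_259, 543_344])

let comparable = rescaled(current, to: base.unit)
let deltas = zip(comparable.percentiles, base.percentiles).map { pair in pair.0 - pair.1 }
```

Under that assumption the p50 delta comes out at a few microseconds rather than several hundred thousand, which is closer to what one would expect from a refactor that only changes the scaling factor.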
