Use ecow::EcoVec for Record internals #12624

Open: wants to merge 16 commits into base: main

IanManske
Member

Description

This PR changes the internal data structure in Record from a Vec to an ecow::EcoVec. The ecow crate is somewhat new and is not yet 1.0, but it's already seeing regular use in several popular Rust projects:

  • Typst (the owner of the crate and repository)
  • Gleam
  • Uiua

So, I'm not that worried about the crate potentially seeing little development in the future.

Rather, switching to an EcoVec has several benefits:

  • Removes one level of indirection from Value::Record. Previously, the record data was behind two levels of indirection (a Vec followed by a SharedCow).
  • EcoVecs are only 16 bytes in size, so this gives us the freedom to split the columns and values in Record again without increasing the current size of Value.
  • Simplified interface -- EcoVec has most of the methods that Vec does.
  • Slightly reduced memory usage (no need to store the Arc weak count).
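
To illustrate the clone-on-write behavior that both representations rely on, here is a minimal sketch using only the standard library. It uses `Arc<Vec<_>>` with `Arc::make_mut` as a stand-in for the old layout (a shared pointer wrapping a `Vec`). It is not nushell's `SharedCow` or ecow's `EcoVec`, and the `Columns` alias and `push_on_copy` helper are hypothetical names chosen for this example.

```rust
use std::sync::Arc;

// Hypothetical stand-in for record storage: a shared pointer wrapping a Vec.
// Reaching an element means following the Arc pointer, then the Vec pointer.
type Columns = Arc<Vec<(String, i64)>>;

// Hypothetical helper: returns a handle with one extra column, leaving
// `original` untouched.
fn push_on_copy(original: &Columns, name: &str, value: i64) -> Columns {
    // Cloning the handle only bumps the reference count; the buffer is shared.
    let mut copy = Arc::clone(original);
    // make_mut sees a second owner, so it copies the Vec before mutating it.
    Arc::make_mut(&mut copy).push((name.to_string(), value));
    copy
}

fn main() {
    let original: Columns = Arc::new(vec![("a".to_string(), 1), ("b".to_string(), 2)]);
    let copy = push_on_copy(&original, "c", 3);

    // The two handles no longer point at the same allocation.
    assert!(!Arc::ptr_eq(&original, &copy));
    assert_eq!(original.len(), 2);
    assert_eq!(copy.len(), 3);

    // An Arc handle is a single pointer wide; the Vec it points to carries
    // its own pointer, capacity, and length.
    println!("Arc<Vec<_>> handle: {} bytes", std::mem::size_of::<Columns>());
    println!("Vec<_> handle: {} bytes", std::mem::size_of::<Vec<(String, i64)>>());
}
```

A type like EcoVec folds the reference count into the same allocation as the elements, which is where the saved level of indirection would come from.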

Comment on lines +261 to +268
// pub fn drain<R>(&mut self, range: R) -> Drain
// where
//     R: RangeBounds<usize> + Clone,
// {
//     Drain {
//         iter: self.inner.drain(range)
//     }
// }
Member Author

Unfortunately, this is one of the methods that EcoVec doesn't yet support. I couldn't come up with an easy workaround, so I've just removed this method.
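
For reference, one possible (and not free) workaround is to rebuild the container instead of draining it in place. The sketch below assumes only slice access and construction from an iterator. It uses a plain `Vec` as a stand-in, `drain_by_rebuild` is a hypothetical helper rather than an ecow API, and it clones every element instead of yielding a lazy `Drain` iterator.

```rust
use std::ops::Range;

// Hypothetical helper: removes `range` from `items` and returns the removed
// elements, using only slice access and rebuilding from an iterator.
// `Vec` is used as a stand-in here; this is not an ecow API.
fn drain_by_rebuild<T: Clone>(items: &mut Vec<T>, range: Range<usize>) -> Vec<T> {
    // Slicing panics on an out-of-bounds range, much like Vec::drain does.
    let removed: Vec<T> = items[range.clone()].to_vec();
    // Keep everything before and after the range, cloning each element.
    let kept: Vec<T> = items[..range.start]
        .iter()
        .chain(items[range.end..].iter())
        .cloned()
        .collect();
    *items = kept;
    removed
}

fn main() {
    let mut cols = vec!["a", "b", "c", "d", "e"];
    let removed = drain_by_rebuild(&mut cols, 1..3);
    assert_eq!(removed, ["b", "c"]);
    assert_eq!(cols, ["a", "d", "e"]);
}
```

The extra clones and the second allocation are the cost of this approach, so it would only make sense where `drain` is called rarely.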

@devyn
Contributor

devyn commented Apr 23, 2024

I think this would probably make SharedCow mostly unnecessary too: we can just use EcoVec for lists if we want to try that as well. It also gives us EcoString if we want to try using that. I like the idea.

@sholderbach
Member

Can you post some numbers for the full benchmark suite? I think we may need more record-mutation-heavy benchmarks, beyond the ones that purely grow a record.
(While the representation is more optimized, any atomic cost will now also show up in operations that just grow or mutate a record inside pure Record code.)

If we go that route we probably want to consider upstreaming some convenience APIs like retain and drain, to make sure the ecow crate gets the attention it deserves and ensure we get the soundness we want.
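
As a rough illustration of the kind of mutation-heavy measurement meant here, the sketch below compares in-place mutation of a plain `Vec` with the same work done through a reference-counted handle. It is a crude `std::time::Instant` loop rather than the project's benchmark harness, `Arc<Vec<u64>>` is only a stand-in for a clone-on-write vector, and the function names are hypothetical.

```rust
use std::sync::Arc;
use std::time::Instant;

// Mutates a plain slice in place: no reference-count checks involved.
fn bump_plain(values: &mut [u64], rounds: usize) -> u64 {
    for _ in 0..rounds {
        for v in values.iter_mut() {
            *v += 1;
        }
    }
    values.iter().sum()
}

// Same work through a shared handle: every round calls Arc::make_mut,
// which performs an atomic uniqueness check before handing out &mut access.
fn bump_shared(values: &mut Arc<Vec<u64>>, rounds: usize) -> u64 {
    for _ in 0..rounds {
        for v in Arc::make_mut(values).iter_mut() {
            *v += 1;
        }
    }
    values.iter().sum()
}

fn main() {
    let rounds = 100_000;

    let mut plain = vec![0u64; 8];
    let start = Instant::now();
    let plain_sum = bump_plain(&mut plain, rounds);
    println!("plain:  {:?}", start.elapsed());

    let mut shared = Arc::new(vec![0u64; 8]);
    let start = Instant::now();
    let shared_sum = bump_shared(&mut shared, rounds);
    println!("shared: {:?}", start.elapsed());

    // Both paths must compute the same result.
    assert_eq!(plain_sum, shared_sum);
}
```

Timings from a loop like this are noisy and depend heavily on optimization level, so they should be read as a hint about where to add real benchmarks, not as a result.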

@IanManske
Member Author

Sure thing, here are all the numbers (on my hardware). Each row appears twice: the first is main (bed2363), the second is this PR.

Table
benchmarks                       fastest       │ slowest       │ median        │ mean          │ samples │ iters
benchmarks                       fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ load_standard_lib             12.68 ms      │ 19.11 ms      │ 12.88 ms      │ 13.33 ms      │ 100     │ 100
├─ load_standard_lib             12.54 ms      │ 22.36 ms      │ 12.76 ms      │ 13.36 ms      │ 100     │ 100
├─ decoding_benchmarks                         │               │               │               │         │
├─ decoding_benchmarks                         │               │               │               │         │
│  ├─ json_decode                              │               │               │               │         │
│  ├─ json_decode                              │               │               │               │         │
│  │  ├─ (100, 5)                255.2 µs      │ 338.5 µs      │ 263.8 µs      │ 266.3 µs      │ 100     │ 100
│  │  ├─ (100, 5)                269.1 µs      │ 349.5 µs      │ 272.2 µs      │ 277.5 µs      │ 100     │ 100
│  │  ╰─ (10000, 15)             75.91 ms      │ 85.35 ms      │ 76.87 ms      │ 77.21 ms      │ 100     │ 100
│  │  ╰─ (10000, 15)             74.73 ms      │ 80.12 ms      │ 75.79 ms      │ 76.03 ms      │ 100     │ 100
│  ╰─ msgpack_decode                           │               │               │               │         │
│  ╰─ msgpack_decode                           │               │               │               │         │
│     ├─ (100, 5)                140.1 µs      │ 220.3 µs      │ 141.2 µs      │ 143.7 µs      │ 100     │ 100
│     ├─ (100, 5)                144.1 µs      │ 234.8 µs      │ 147.1 µs      │ 148.9 µs      │ 100     │ 100
│     ╰─ (10000, 15)             37.69 ms      │ 41.94 ms      │ 37.92 ms      │ 38.26 ms      │ 100     │ 100
│     ╰─ (10000, 15)             39.57 ms      │ 50.68 ms      │ 39.9 ms       │ 40.25 ms      │ 100     │ 100
├─ encoding_benchmarks                         │               │               │               │         │
├─ encoding_benchmarks                         │               │               │               │         │
│  ├─ json_encode                              │               │               │               │         │
│  ├─ json_encode                              │               │               │               │         │
│  │  ├─ (100, 5)                92.25 µs      │ 104.9 µs      │ 93.2 µs       │ 93.91 µs      │ 100     │ 100
│  │  ├─ (100, 5)                98.33 µs      │ 125 µs        │ 99.35 µs      │ 100.5 µs      │ 100     │ 100
│  │  ╰─ (10000, 15)             24.93 ms      │ 27.03 ms      │ 25.13 ms      │ 25.25 ms      │ 100     │ 100
│  │  ╰─ (10000, 15)             26.55 ms      │ 28.5 ms       │ 26.68 ms      │ 26.89 ms      │ 100     │ 100
│  ╰─ msgpack_encode                           │               │               │               │         │
│  ╰─ msgpack_encode                           │               │               │               │         │
│     ├─ (100, 5)                61.55 µs      │ 72.32 µs      │ 62.44 µs      │ 63.11 µs      │ 100     │ 100
│     ├─ (100, 5)                62.13 µs      │ 71.26 µs      │ 62.84 µs      │ 63.69 µs      │ 100     │ 100
│     ╰─ (10000, 15)             16.09 ms      │ 18.3 ms       │ 16.17 ms      │ 16.25 ms      │ 100     │ 100
│     ╰─ (10000, 15)             16.21 ms      │ 18.47 ms      │ 16.28 ms      │ 16.41 ms      │ 100     │ 100
├─ eval_benchmarks                             │               │               │               │         │
├─ eval_benchmarks                             │               │               │               │         │
│  ├─ eval_default_config        2.677 ms      │ 3.009 ms      │ 2.718 ms      │ 2.739 ms      │ 100     │ 100
│  ├─ eval_default_config        2.737 ms      │ 3.09 ms       │ 2.756 ms      │ 2.769 ms      │ 100     │ 100
│  ╰─ eval_default_env           523.2 µs      │ 613 µs        │ 540.4 µs      │ 541.1 µs      │ 100     │ 100
│  ╰─ eval_default_env           522.2 µs      │ 629.8 µs      │ 536.1 µs      │ 538.2 µs      │ 100     │ 100
├─ eval_commands                               │               │               │               │         │
├─ eval_commands                               │               │               │               │         │
│  ├─ each                                     │               │               │               │         │
│  ├─ each                                     │               │               │               │         │
│  │  ├─ 1                       80.9 µs       │ 146.8 µs      │ 127.5 µs      │ 125.1 µs      │ 100     │ 100
│  │  ├─ 1                       93.21 µs      │ 142.6 µs      │ 127.9 µs      │ 125.1 µs      │ 100     │ 100
│  │  ├─ 5                       286.3 µs      │ 354.3 µs      │ 338.5 µs      │ 332.4 µs      │ 100     │ 100
│  │  ├─ 5                       296.4 µs      │ 351.2 µs      │ 338.2 µs      │ 333.4 µs      │ 100     │ 100
│  │  ├─ 10                      544.8 µs      │ 632.9 µs      │ 587.1 µs      │ 585.6 µs      │ 100     │ 100
│  │  ├─ 10                      556 µs        │ 702.5 µs      │ 575 µs        │ 585.7 µs      │ 100     │ 100
│  │  ├─ 100                     4.877 ms      │ 5.491 ms      │ 5.296 ms      │ 5.273 ms      │ 100     │ 100
│  │  ├─ 100                     4.977 ms      │ 5.432 ms      │ 5.244 ms      │ 5.247 ms      │ 100     │ 100
│  │  ╰─ 1000                    51.18 ms      │ 53.01 ms      │ 52.13 ms      │ 52.05 ms      │ 100     │ 100
│  │  ╰─ 1000                    43.79 ms      │ 52.92 ms      │ 52.18 ms      │ 52.02 ms      │ 100     │ 100
│  ├─ for_range                                │               │               │               │         │
│  ├─ for_range                                │               │               │               │         │
│  │  ├─ 1                       82.45 µs      │ 149.1 µs      │ 132 µs        │ 129.1 µs      │ 100     │ 100
│  │  ├─ 1                       80.59 µs      │ 139.7 µs      │ 128.4 µs      │ 124.3 µs      │ 100     │ 100
│  │  ├─ 5                       306.5 µs      │ 382.5 µs      │ 338.1 µs      │ 334.9 µs      │ 100     │ 100
│  │  ├─ 5                       301.4 µs      │ 350.3 µs      │ 338.4 µs      │ 333.5 µs      │ 100     │ 100
│  │  ├─ 10                      536 µs        │ 608.9 µs      │ 572.3 µs      │ 581 µs        │ 100     │ 100
│  │  ├─ 10                      507.8 µs      │ 626.4 µs      │ 571.2 µs      │ 581.7 µs      │ 100     │ 100
│  │  ├─ 100                     4.751 ms      │ 5.372 ms      │ 5.085 ms      │ 5.077 ms      │ 100     │ 100
│  │  ├─ 100                     4.614 ms      │ 5.192 ms      │ 5.085 ms      │ 5.071 ms      │ 100     │ 100
│  │  ╰─ 1000                    49.17 ms      │ 52.42 ms      │ 49.99 ms      │ 50.07 ms      │ 100     │ 100
│  │  ╰─ 1000                    49.24 ms      │ 52.98 ms      │ 50.08 ms      │ 50.2 ms       │ 100     │ 100
│  ├─ interleave                               │               │               │               │         │
│  ├─ interleave                               │               │               │               │         │
│  │  ├─ 100                     382.7 µs      │ 640.4 µs      │ 400.6 µs      │ 410 µs        │ 100     │ 100
│  │  ├─ 100                     354.2 µs      │ 741 µs        │ 391.7 µs      │ 400.7 µs      │ 100     │ 100
│  │  ├─ 1000                    3.06 ms       │ 3.666 ms      │ 3.147 ms      │ 3.187 ms      │ 100     │ 100
│  │  ├─ 1000                    2.944 ms      │ 3.491 ms      │ 3.037 ms      │ 3.063 ms      │ 100     │ 100
│  │  ╰─ 10000                   29.76 ms      │ 35.91 ms      │ 30.44 ms      │ 30.96 ms      │ 100     │ 100
│  │  ╰─ 10000                   28.12 ms      │ 37.91 ms      │ 29.88 ms      │ 30.6 ms       │ 100     │ 100
│  ├─ interleave_with_ctrlc                    │               │               │               │         │
│  ├─ interleave_with_ctrlc                    │               │               │               │         │
│  │  ├─ 100                     377.1 µs      │ 495.1 µs      │ 396.7 µs      │ 403.3 µs      │ 100     │ 100
│  │  ├─ 100                     371.8 µs      │ 516.8 µs      │ 391.6 µs      │ 399.3 µs      │ 100     │ 100
│  │  ├─ 1000                    3.042 ms      │ 3.701 ms      │ 3.144 ms      │ 3.172 ms      │ 100     │ 100
│  │  ├─ 1000                    2.945 ms      │ 3.646 ms      │ 3.08 ms       │ 3.101 ms      │ 100     │ 100
│  │  ╰─ 10000                   29.54 ms      │ 36.69 ms      │ 30.31 ms      │ 30.83 ms      │ 100     │ 100
│  │  ╰─ 10000                   29.1 ms       │ 37.24 ms      │ 30.14 ms      │ 30.73 ms      │ 100     │ 100
│  ├─ par_each_1t                              │               │               │               │         │
│  ├─ par_each_1t                              │               │               │               │         │
│  │  ├─ 1                       97.59 µs      │ 206.2 µs      │ 143.5 µs      │ 143.8 µs      │ 100     │ 100
│  │  ├─ 1                       87.1 µs       │ 303.7 µs      │ 140.9 µs      │ 143.6 µs      │ 100     │ 100
│  │  ├─ 5                       314.2 µs      │ 549.7 µs      │ 423.6 µs      │ 437.7 µs      │ 100     │ 100
│  │  ├─ 5                       304.6 µs      │ 529.5 µs      │ 416.7 µs      │ 426.9 µs      │ 100     │ 100
│  │  ├─ 10                      567.9 µs      │ 762.6 µs      │ 654.8 µs      │ 654.5 µs      │ 100     │ 100
│  │  ├─ 10                      625.4 µs      │ 815.2 µs      │ 738.8 µs      │ 726.5 µs      │ 100     │ 100
│  │  ├─ 100                     5.262 ms      │ 5.631 ms      │ 5.487 ms      │ 5.47 ms       │ 100     │ 100
│  │  ├─ 100                     5.195 ms      │ 5.565 ms      │ 5.463 ms      │ 5.446 ms      │ 100     │ 100
│  │  ╰─ 1000                    51.92 ms      │ 53.05 ms      │ 52.83 ms      │ 52.82 ms      │ 100     │ 100
│  │  ╰─ 1000                    52.03 ms      │ 53.03 ms      │ 52.76 ms      │ 52.74 ms      │ 100     │ 100
│  ╰─ par_each_2t                              │               │               │               │         │
│  ╰─ par_each_2t                              │               │               │               │         │
│     ├─ 1                       100.2 µs      │ 420.9 µs      │ 148.9 µs      │ 152.3 µs      │ 100     │ 100
│     ├─ 1                       94.22 µs      │ 262.2 µs      │ 149.5 µs      │ 155.2 µs      │ 100     │ 100
│     ├─ 5                       205.2 µs      │ 343.7 µs      │ 256.9 µs      │ 259.6 µs      │ 100     │ 100
│     ├─ 5                       191.6 µs      │ 371.9 µs      │ 258.5 µs      │ 262.1 µs      │ 100     │ 100
│     ├─ 10                      340.4 µs      │ 599.7 µs      │ 492.1 µs      │ 468.6 µs      │ 100     │ 100
│     ├─ 10                      316.2 µs      │ 483.2 µs      │ 391.9 µs      │ 390.3 µs      │ 100     │ 100
│     ├─ 100                     2.717 ms      │ 2.912 ms      │ 2.844 ms      │ 2.825 ms      │ 100     │ 100
│     ├─ 100                     2.689 ms      │ 2.911 ms      │ 2.794 ms      │ 2.804 ms      │ 100     │ 100
│     ╰─ 1000                    25.66 ms      │ 26.91 ms      │ 26.52 ms      │ 26.5 ms       │ 100     │ 100
│     ╰─ 1000                    26.14 ms      │ 26.62 ms      │ 26.52 ms      │ 26.49 ms      │ 100     │ 100
├─ parser_benchmarks                           │               │               │               │         │
├─ parser_benchmarks                           │               │               │               │         │
│  ├─ parse_default_config_file  2.25 ms       │ 2.613 ms      │ 2.284 ms      │ 2.301 ms      │ 100     │ 100
│  ├─ parse_default_config_file  2.217 ms      │ 4.562 ms      │ 2.29 ms       │ 2.522 ms      │ 100     │ 100
│  ╰─ parse_default_env_file     404.6 µs      │ 502.5 µs      │ 418.9 µs      │ 421.9 µs      │ 100     │ 100
│  ╰─ parse_default_env_file     403.8 µs      │ 550.4 µs      │ 452.4 µs      │ 448.4 µs      │ 100     │ 100
├─ record                                      │               │               │               │         │
├─ record                                      │               │               │               │         │
│  ├─ create                                   │               │               │               │         │
│  ├─ create                                   │               │               │               │         │
│  │  ├─ 1                       35.12 µs      │ 68.91 µs      │ 35.82 µs      │ 36.54 µs      │ 100     │ 100
│  │  ├─ 1                       35.09 µs      │ 87.57 µs      │ 35.86 µs      │ 37.88 µs      │ 100     │ 100
│  │  ├─ 10                      52.47 µs      │ 83.25 µs      │ 53.38 µs      │ 53.88 µs      │ 100     │ 100
│  │  ├─ 10                      51.87 µs      │ 79.7 µs       │ 53.03 µs      │ 53.7 µs       │ 100     │ 100
│  │  ├─ 100                     211.8 µs      │ 262 µs        │ 216.9 µs      │ 220.7 µs      │ 100     │ 100
│  │  ├─ 100                     215.5 µs      │ 241.7 µs      │ 218.2 µs      │ 219.3 µs      │ 100     │ 100
│  │  ╰─ 1000                    1.843 ms      │ 2.753 ms      │ 1.879 ms      │ 1.929 ms      │ 100     │ 100
│  │  ╰─ 1000                    1.859 ms      │ 2.514 ms      │ 1.889 ms      │ 1.952 ms      │ 100     │ 100
│  ├─ flat_access                              │               │               │               │         │
│  ├─ flat_access                              │               │               │               │         │
│  │  ├─ 1                       19.68 µs      │ 42.5 µs       │ 20.22 µs      │ 20.7 µs       │ 100     │ 100
│  │  ├─ 1                       16.88 µs      │ 39.06 µs      │ 17.37 µs      │ 17.72 µs      │ 100     │ 100
│  │  ├─ 10                      20.77 µs      │ 44.65 µs      │ 21.41 µs      │ 21.83 µs      │ 100     │ 100
│  │  ├─ 10                      20.94 µs      │ 40.34 µs      │ 21.56 µs      │ 21.97 µs      │ 100     │ 100
│  │  ├─ 100                     27.66 µs      │ 48.13 µs      │ 28.39 µs      │ 28.9 µs       │ 100     │ 100
│  │  ├─ 100                     25.44 µs      │ 45.56 µs      │ 25.89 µs      │ 26.27 µs      │ 100     │ 100
│  │  ╰─ 1000                    107.8 µs      │ 147.1 µs      │ 110.3 µs      │ 111.6 µs      │ 100     │ 100
│  │  ╰─ 1000                    110.6 µs      │ 136.9 µs      │ 112.8 µs      │ 116.3 µs      │ 100     │ 100
│  ╰─ nest_access                              │               │               │               │         │
│  ╰─ nest_access                              │               │               │               │         │
│     ├─ 1                       17.27 µs      │ 36.62 µs      │ 17.73 µs      │ 18.13 µs      │ 100     │ 100
│     ├─ 1                       17.1 µs       │ 50.16 µs      │ 17.5 µs       │ 18 µs         │ 100     │ 100
│     ├─ 2                       20.4 µs       │ 45.6 µs       │ 21.12 µs      │ 21.5 µs       │ 100     │ 100
│     ├─ 2                       20.92 µs      │ 40.2 µs       │ 21.53 µs      │ 21.85 µs      │ 100     │ 100
│     ├─ 4                       19.16 µs      │ 43.9 µs       │ 19.59 µs      │ 20.31 µs      │ 100     │ 100
│     ├─ 4                       18.62 µs      │ 41.13 µs      │ 19.18 µs      │ 19.55 µs      │ 100     │ 100
│     ├─ 8                       24.3 µs       │ 44.55 µs      │ 24.9 µs       │ 25.18 µs      │ 100     │ 100
│     ├─ 8                       20.53 µs      │ 40.12 µs      │ 21 µs         │ 21.51 µs      │ 100     │ 100
│     ├─ 16                      29.13 µs      │ 61.84 µs      │ 34.7 µs       │ 40.53 µs      │ 100     │ 100
│     ├─ 16                      24.03 µs      │ 46.2 µs       │ 24.54 µs      │ 24.94 µs      │ 100     │ 100
│     ├─ 32                      36.34 µs      │ 83.06 µs      │ 37.76 µs      │ 40.86 µs      │ 100     │ 100
│     ├─ 32                      34.24 µs      │ 57.97 µs      │ 34.8 µs       │ 35.15 µs      │ 100     │ 100
│     ├─ 64                      43.99 µs      │ 71.47 µs      │ 47.38 µs      │ 47.62 µs      │ 100     │ 100
│     ├─ 64                      42.51 µs      │ 67.28 µs      │ 45.58 µs      │ 45.84 µs      │ 100     │ 100
│     ╰─ 128                     67.8 µs       │ 94.23 µs      │ 70.66 µs      │ 70.96 µs      │ 100     │ 100
│     ╰─ 128                     64.23 µs      │ 90.69 µs      │ 65.58 µs      │ 67.05 µs      │ 100     │ 100
╰─ table                                       │               │               │               │         │
╰─ table                                       │               │               │               │         │
   ├─ create                                   │               │               │               │         │
   ├─ create                                   │               │               │               │         │
   │  ├─ 1                       41.09 µs      │ 73.85 µs      │ 41.78 µs      │ 42.36 µs      │ 100     │ 100
   │  ├─ 1                       41.45 µs      │ 104.6 µs      │ 46.58 µs      │ 47.43 µs      │ 100     │ 100
   │  ├─ 10                      60.93 µs      │ 92.67 µs      │ 61.93 µs      │ 62.69 µs      │ 100     │ 100
   │  ├─ 10                      63.64 µs      │ 119.5 µs      │ 65.26 µs      │ 68.09 µs      │ 100     │ 100
   │  ├─ 100                     248.2 µs      │ 283.2 µs      │ 250.4 µs      │ 251.8 µs      │ 100     │ 100
   │  ├─ 100                     256.8 µs      │ 292.6 µs      │ 259.3 µs      │ 260.7 µs      │ 100     │ 100
   │  ╰─ 1000                    2.132 ms      │ 2.714 ms      │ 2.167 ms      │ 2.195 ms      │ 100     │ 100
   │  ╰─ 1000                    2.198 ms      │ 2.633 ms      │ 2.217 ms      │ 2.247 ms      │ 100     │ 100
   ├─ get                                      │               │               │               │         │
   ├─ get                                      │               │               │               │         │
   │  ├─ 1                       25.43 µs      │ 60.33 µs      │ 26.04 µs      │ 26.79 µs      │ 100     │ 100
   │  ├─ 1                       28.07 µs      │ 57.56 µs      │ 28.82 µs      │ 29.33 µs      │ 100     │ 100
   │  ├─ 10                      27.14 µs      │ 56.2 µs       │ 27.9 µs       │ 28.49 µs      │ 100     │ 100
   │  ├─ 10                      26.81 µs      │ 54.2 µs       │ 27.69 µs      │ 28.24 µs      │ 100     │ 100
   │  ├─ 100                     40.74 µs      │ 73.85 µs      │ 42 µs         │ 42.9 µs       │ 100     │ 100
   │  ├─ 100                     42.19 µs      │ 71.54 µs      │ 42.84 µs      │ 43.56 µs      │ 100     │ 100
   │  ╰─ 1000                    184.8 µs      │ 234.3 µs      │ 186.7 µs      │ 188.2 µs      │ 100     │ 100
   │  ╰─ 1000                    191.8 µs      │ 260.2 µs      │ 194.9 µs      │ 196.9 µs      │ 100     │ 100
   ╰─ select                                   │               │               │               │         │
   ╰─ select                                   │               │               │               │         │
      ├─ 1                       23.97 µs      │ 67.6 µs       │ 24.87 µs      │ 25.6 µs       │ 100     │ 100
      ├─ 1                       23.83 µs      │ 68.98 µs      │ 24.68 µs      │ 25.3 µs       │ 100     │ 100
      ├─ 10                      32.55 µs      │ 59.61 µs      │ 33.6 µs       │ 34.73 µs      │ 100     │ 100
      ├─ 10                      29.05 µs      │ 55.55 µs      │ 30.1 µs       │ 30.68 µs      │ 100     │ 100
      ├─ 100                     75.28 µs      │ 107.1 µs      │ 76.76 µs      │ 77.35 µs      │ 100     │ 100
      ├─ 100                     76.44 µs      │ 112.7 µs      │ 79.01 µs      │ 79.85 µs      │ 100     │ 100
      ╰─ 1000                    524.7 µs      │ 673.2 µs      │ 547.8 µs      │ 554.4 µs      │ 100     │ 100
      ╰─ 1000                    544.7 µs      │ 616.5 µs      │ 551.5 µs      │ 554.8 µs      │ 100     │ 100

Looks like there is a little bit of overhead for mutation (e.g., table > get traverses records mutably to std::mem::take a column value). The nested record access is a little faster for this PR as expected. Otherwise, everything else seems to be roughly equal.
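
The mutation overhead mentioned above can be pictured with a small standard-library sketch. It assumes a simplified `Value` enum and a record stored as `Arc<Vec<(String, Value)>>`. Both are hypothetical stand-ins, not nushell's actual types, and `take_column` is an illustrative helper.

```rust
use std::sync::Arc;

// Hypothetical, simplified value type; the real Value has many more variants.
#[derive(Clone, Debug, Default, PartialEq)]
enum Value {
    #[default]
    Nothing,
    Int(i64),
    Text(String),
}

// Hypothetical record: a shared, clone-on-write list of (column, value) pairs.
type Record = Arc<Vec<(String, Value)>>;

// Moves a column's value out of the record, leaving Value::Nothing behind.
// Arc::make_mut is where the extra cost sits: it checks that the handle is
// unique and copies the whole Vec if it is not.
fn take_column(record: &mut Record, name: &str) -> Option<Value> {
    Arc::make_mut(record)
        .iter_mut()
        .find(|(col, _)| col == name)
        .map(|(_, val)| std::mem::take(val))
}

fn main() {
    let mut record: Record = Arc::new(vec![
        ("name".to_string(), Value::Text("nu".to_string())),
        ("size".to_string(), Value::Int(3)),
    ]);

    let taken = take_column(&mut record, "size");
    assert_eq!(taken, Some(Value::Int(3)));
    // The slot is still there, but now holds the default value.
    assert_eq!(record[1].1, Value::Nothing);
    assert_eq!(take_column(&mut record, "missing"), None);
}
```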

@YizhePKU
Contributor

YizhePKU commented Apr 27, 2024

An alternative to this proposal would be to use persistent data structures, such as the ones implemented in rpds or im. Compared to ecow, they use structural sharing to reduce the number of elements that need cloning. As a bonus, they offer map-like data structures, which should enable faster access than the linear search we're doing right now.

Just curious, but why were we not using a map-like data structure in the first place?
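
To make "structural sharing" concrete, here is a minimal hand-rolled persistent list built on `Rc`. It only illustrates the idea and is not the API of rpds or im. The `List` type and the `prepend` and `to_vec` helpers are hypothetical.

```rust
use std::rc::Rc;

// A minimal persistent (immutable) singly linked list.
#[derive(Debug)]
enum List {
    Nil,
    Cons(i64, Rc<List>),
}

// Returns a new list with `head` in front. The existing tail is shared by
// bumping its reference count; none of its elements are copied.
fn prepend(head: i64, tail: &Rc<List>) -> Rc<List> {
    Rc::new(List::Cons(head, Rc::clone(tail)))
}

// Collects the elements front to back, for inspection.
fn to_vec(list: &Rc<List>) -> Vec<i64> {
    let mut out = Vec::new();
    let mut cur = list;
    while let List::Cons(value, next) = &**cur {
        out.push(*value);
        cur = next;
    }
    out
}

fn main() {
    let base = prepend(2, &prepend(3, &Rc::new(List::Nil)));
    let a = prepend(1, &base);
    let b = prepend(10, &base);

    assert_eq!(to_vec(&a), [1, 2, 3]);
    assert_eq!(to_vec(&b), [10, 2, 3]);
    // `base`, the tail of `a`, and the tail of `b` are one allocation.
    assert_eq!(Rc::strong_count(&base), 3);
}
```

By contrast, a clone-on-write vector copies its whole buffer the first time a shared handle is mutated, which is the difference being pointed out here.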

@devyn
Contributor

devyn commented Apr 27, 2024

I remember seeing in a comment that we were using im at some point. Based on what I remember hearing second-hand, because I wasn't a part of this at the time, we chose to go for the Vec-based implementation because most of the data nushell works on involves fairly small records that are often very short-lived. Just doing string comparison against < 10 short string keys is already quite fast without any extra optimization, and justifying the extra overhead of maintaining a more complex data structure would require many more keys being compared. You can look at O(n) vs O(log(n)) and say the latter is definitely better, but it really depends on how big you expect n to get and how quick the individual steps are by comparison.

Immutable data structures also often have tradeoffs that make them perform worse for many operations, so there have to be benefits from the immutability (e.g. benefits gained from partial reuse of the data, on a large tree) to make up for it.
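
The two lookup strategies being compared can be sketched side by side. This assumes a record modeled as a slice of `(String, i64)` pairs and uses the standard library's `BTreeMap` as a generic map-like alternative. The helper names are hypothetical, and the sketch makes no claim about which one is faster for a given record size.

```rust
use std::collections::BTreeMap;

// Linear scan over (key, value) pairs: O(n) string comparisons, but the
// data is contiguous and there is no tree structure to maintain.
fn get_linear<'a>(record: &'a [(String, i64)], key: &str) -> Option<&'a i64> {
    record.iter().find(|(k, _)| k == key).map(|(_, v)| v)
}

// Ordered-map lookup: O(log n) comparisons, with extra bookkeeping on insert.
fn get_map<'a>(record: &'a BTreeMap<String, i64>, key: &str) -> Option<&'a i64> {
    record.get(key)
}

fn main() {
    let pairs = vec![
        ("name".to_string(), 1),
        ("size".to_string(), 2),
        ("modified".to_string(), 3),
    ];
    let map: BTreeMap<String, i64> = pairs.iter().cloned().collect();

    // Both strategies agree on hits and misses.
    for key in ["name", "size", "modified", "missing"] {
        assert_eq!(get_linear(&pairs, key), get_map(&map, key));
    }
}
```

Note that a `BTreeMap` iterates in key order rather than insertion order, so a map-based record would also need some way to preserve column order.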

@YizhePKU
Contributor

> Immutable data structures also often have tradeoffs that make them perform worse for many operations, so there have to be benefits from the immutability (e.g. benefits gained from partial reuse of the data, on a large tree) to make up for it.

That's true. Benchmarking is the only way to know, but it's an option worth considering.


4 participants