Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(profiling): keep string cache data alive longer #2668

Merged
merged 15 commits into from
Jun 12, 2024

Conversation

morrisonlevi
Copy link
Collaborator

@morrisonlevi morrisonlevi commented May 20, 2024

Description

PROF-9796

At the high-level, there are two wins here:

  1. The strings are no longer allocated with the system allocator, which was demonstrated to be slightly more efficient for both memory and CPU in libdatadog.
  2. The string set is no longer thrown away at the start of every request. Instead, we allow it to live to the next request if the total memory used is under some threshold.

Details

This uses the new allocators in libdatadog to build a StringSet, and drops the old string table from the code. Like the new StringTable in libdatadog v9, the string data for the StringSet is held in an arena. However, it doesn't track insert order, nor hand out ids. It returns references to interned data instead, like a regular set.

We store ThinStr<'a>s in the runtime cache, as usizes. This bit is unsafe because the lifetime of the cache doesn't fit nicely in the borrow checker's semantics. However, the implementation guarantees that the data lives long enough.

The run time cache is empty on each request, but the table data is still around. So we'll re-establish links to the cache to the data, but we don't have to rebuild the set structure as much compared to the previous implementation

If the allocators powering the string set report memory over 2 MiB at the end of a request, then the string set will start over. This particular number was chosen because it is the same size used for the chunk size of the chain allocator powering the set. The idea is that if we have more than one chunk, then maybe we're too big. We also don't want to look like a memory leak, and sizes over 2 MiB may start to look like we're leaking memory as we slowly use more and more.

Reviewer checklist

  • Test coverage seems ok.
  • Appropriate labels assigned.

@morrisonlevi morrisonlevi requested review from a team as code owners May 20, 2024 22:22
@morrisonlevi morrisonlevi marked this pull request as draft May 20, 2024 22:22
@morrisonlevi morrisonlevi changed the title feat: keep string cache data alive longer feat(profiling): keep string cache data alive longer May 20, 2024
@github-actions github-actions bot added the profiling Relates to the Continuous Profiler label May 20, 2024
@pr-commenter
Copy link

pr-commenter bot commented May 20, 2024

Benchmarks

Benchmark execution time: 2024-06-12 14:33:43

Comparing candidate commit c72a081 in PR branch levi/cache-longer with baseline commit 4bbc100 in branch master.

Found 1 performance improvements and 0 performance regressions! Performance is the same for 26 metrics, 9 unstable metrics.

scenario:walk_stack/1

  • 🟩 wall_time [-469.453ns; -466.840ns] or [-3.738%; -3.718%]

We still have to re-establish the link in the run time cache, but the
string set itself will keep the sets and data alive.
@morrisonlevi
Copy link
Collaborator Author

morrisonlevi commented May 28, 2024

Now that the copy/paste issue and the unrelated sigsegvs have been fixed, this is looking much better. I plan to do some manual testing to be more confident, but I'm marking this as ready for review.

We should probably release a libdatadog version before merge, though, and use that rather than the git revision hash.

@morrisonlevi morrisonlevi marked this pull request as ready for review May 28, 2024 17:14
A slow ramp up to 4 MiB could _look_ like a memory leak. However, a
slow ramp to 2 MiB is probably going to look like it's within normal
operating ranges.
@morrisonlevi morrisonlevi merged commit 9f4a6a5 into master Jun 12, 2024
583 of 588 checks passed
@morrisonlevi morrisonlevi deleted the levi/cache-longer branch June 12, 2024 23:11
@github-actions github-actions bot added this to the 1.2.0 milestone Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
profiling Relates to the Continuous Profiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants