Enrich Merlin telemetry with GC info #1680

Closed
3 tasks
pitag-ha opened this issue Sep 21, 2023 · 1 comment
Comments

@pitag-ha
Member

For our performance analysis, it would be very useful to enhance the Merlin telemetry.

Current telemetry

Currently, the Merlin telemetry only contains time information. That timing information is broken down into the various Merlin phases: time spent during the reader phase, the preprocessor phase, the typer phase etc.

Proposed telemetry additions

It would be extremely useful to add memory and general GC information to that. Concretely, when running an ocamlmerlin query, I'd love to see information about

  • New memory allocation, i.e. the sum of memory allocated throughout the query execution.
  • Total memory usage, i.e. how much memory was used in total by ocamlmerlin, including the cached memory from before. This can be captured in different ways. I think the total memory usage at its peak would be enough.
  • (Possibly more info about the GC's work, e.g. for how long it has been running during the query.)

This new telemetry should be broken down by Merlin phase in the same way as the timing telemetry.
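As a rough illustration of where this data could come from, OCaml's standard `Gc` module already exposes the relevant counters (`Gc.quick_stat` returns `minor_words`, `major_words`, `promoted_words`, and `top_heap_words`). A minimal sketch of a per-phase wrapper might look as follows; note that `with_gc_telemetry`, the `~record` callback, and the phase label are hypothetical names for this sketch, not part of Merlin's actual telemetry code:

```ocaml
(* Sketch: wrap a phase, recording words allocated during it and the
   peak heap size seen so far. [with_gc_telemetry] and [record] are
   hypothetical; only the [Gc] calls are real stdlib API. *)
let with_gc_telemetry ~record phase_name f =
  let words st =
    (* Promoted words are counted in both the minor and major
       counters, so subtract them once to avoid double counting. *)
    st.Gc.minor_words +. st.Gc.major_words -. st.Gc.promoted_words
  in
  let before = Gc.quick_stat () in
  let result = f () in
  let after = Gc.quick_stat () in
  record ~phase:phase_name
    ~allocated_words:(words after -. words before)
    ~top_heap_words:after.Gc.top_heap_words;
  result
```

`Gc.quick_stat` is cheap (it does not force a heap traversal), so calling it at phase boundaries should not distort the timing telemetry it sits next to.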

Motivation for telemetry additions

In general, for performance analysis, it's useful to have insight into the memory footprint. I have three concrete use cases in mind:

  1. I'd like to optimize the time/space trade-off. See Merlin's time-space trade-off (#1636). This is the most important one.
  2. I'd like to know if the time performance outliers we sometimes observe have some correlation with memory bursts.
  3. Adding memory data to our continuous benchmarks might help us catch newly introduced memory leaks and memory performance regressions.

Discussion on CLI details

Do we want this enriched telemetry always, or only behind some opt-in? As a heads-up: I'm also enriching the telemetry in other ways, so always emitting the full telemetry would be quite verbose.

@pitag-ha
Member Author

Implemented in #1717
