Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add gsym support #636

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

sfc-gh-sgiesecke
Copy link

Fixes #632

@google-cla
Copy link

google-cla bot commented Jun 17, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no label Jun 17, 2021
@sfc-gh-sgiesecke
Copy link
Author

I'd like to add tests for that, but for this to work in CI, we would need to

  • install an up-to-date llvm-gsymutil there (from LLVM 13.0.0, which has not yet been released)
  • run llvm-gsymutil --convert on the test binary to produce a GSYM file
    I am not sure if that's feasible. WDYT?

@nolanmar511
Copy link
Contributor

nolanmar511 commented Jun 17, 2021

Per our CONTRIBUTING.md, two of the main things that are considered when deciding if a PR for a new feature should be accepted:

  • The tests added in the PR (to quote "All contributions should include automated tests for the change. We are continuously improving pprof automated testing and we can't accept changes that are not helping that direction.")
  • The ongoing maintenance cost relative to how broadly useful the change is to pprof users (to quote "We will also likely refuse to accept changes that have fairly limited audience but will require us to commit to maintain them for foreseeable future. This includes support for specific platforms, making internal pprof APIs public, etc.")

These things are important for us to consider because we have relatively limited capacity to support and maintain the pprof CLI. And untested code is particularly difficult to maintain (it takes a long time to identify that there's an issue, and that makes it much harder to root cause the issue).

I do not currently know if this feature is sufficiently broadly useful to accept into pprof. I do know that we're fairly reluctant to accept any PR which does not include tests. The tests within binutils demonstrate how we test other symbolization-related code. We do have a number of binaries and such checked into the testdata there. I believe internal/binutils/testdata/build_binaries.go is used to generate those, and I'd expect for it to be clear how anyone could update that testdata, but we don't require that the testdata there is generated in the unit testing environment (which might make installing and running llvm-gsymutil --convert more feasible).

@sfc-gh-sgiesecke
Copy link
Author

Thanks for the quick feedback!

These things are important for us to consider because we have relatively limited capacity to support and maintain the pprof CLI. And untested code is particularly difficult to maintain (it takes a long time to identify that there's an issue, and that makes it much harder to root cause the issue).

Sure, I am completely in line with that.

I do not currently know if this feature is sufficiently broadly useful to accept into pprof.

Given the emphasis that's put on the fact that addr2line is very slow (which it is, the binutils addr2line even more so than the LLVM version), it seems this should be pretty useful for many users of pprof? Note that while llvm-gsymutil is part of LLVM, it converts any DWARF debug information (e.g. produced by gcc), so its use is not limited to use cases where LLVM/clang is used for building.

I do know that we're fairly reluctant to accept any PR which does not include tests. The tests within binutils demonstrate how we test other symbolization-related code. We do have a number of binaries and such checked into the testdata there. I believe internal/binutils/testdata/build_binaries.go is used to generate those, and I'd expect for it to be clear how anyone could update that testdata, but we don't require that the testdata there is generated in the unit testing environment (which might make installing and running llvm-gsymutil --convert more feasible).

llvm-gsymutil is the utility both used for converting debug info to GSYM and for querying the resulting GSYM file. It may make sense to add a GSYM file to the repository (probably it does indeed for reproducability), but it won't spare making llvm-gsymutil available.

An alternative might be to provide a mock llvm-gsymutil for testing. WDYT?

@sfc-gh-sgiesecke
Copy link
Author

@nolanmar511 Have you got any suggestion on how to proceed?

@nolanmar511
Copy link
Contributor

nolanmar511 commented Jun 25, 2021

I would have some hesitations about the maintainability of a test case which uses a mock for llvm-gsymutil when there are no tests which confirm the integration WAI, since that means that there is a mock that needs to be maintained and (more notably, from my perspective) tests will all continue to silently pass if something changes in llvm-gsymutil which breaks pprof.

I'm also a bit concerned about adding support in pprof for something which is not yet released. I have a couple of reasons for this: first, I have a lower confidence that something which is unreleased will have a stable API (and, if the API isn't stable, that increases maintenance costs); next, for something which is unreleased, it'll be harder to understand if it has a broad audience (so, it's not as easy to know if support for llvm-gsymutil has a sufficiently broad audience among pprof users to justify the ongoing maintenance costs associated with adding support for it).

@aalexand would likely have a more definite opinion here.

@sfc-gh-sgiesecke
Copy link
Author

I think it would be a reasonable course of action to defer merging this PR until the next LLVM major version which should be due in October. I can implement tests that will only run with a custom-built LLVM until LLVM 13 is available.

Until then, we can use a patched version. I'd rather avoid the need for a permanent fork though.

if support for llvm-gsymutil has a sufficiently broad audience among pprof users

Not sure how to determine this. Part of this is kind of a hen-and-egg problem: As long as there's insufficient tool support for using GSYM files, there's no point in using llvm-gsymutil to produce them. I mentioned above why I think such support would be useful for a broad audience. If you think my reasoning is wrong, please give me some hint.

@@ -3,6 +3,7 @@ on:
push:
branches:
- master
- add-gsym-support
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a part of this PR.

@aalexand
Copy link
Collaborator

@sfc-gh-sgiesecke it would be nice if llvm-addr2line would transparently support the gsym format instead. It doesn't look right that tools like pprof need to dispatch on different file formats supported by LLVM.

@sfc-gh-sgiesecke
Copy link
Author

@sfc-gh-sgiesecke it would be nice if llvm-addr2line would transparently support the gsym format instead. It doesn't look right that tools like pprof need to dispatch on different file formats supported by LLVM.

I agree it would be nice to add support for GSYM to llvm-addr2line. I don't know why GSYM lookup wasn't added there in the first place. Probably having a single tool for handling GSYM files vs. having a single tool for address lookup is a hard choice.

But I guess that wouldn't resolve the need for some explicit support in pprof, since I can't imagine this would be supported transparently. The GSYM data isn't part of the binary, but always a separate file, and there's no debug-link mechanism like for DWARF. I guess it would be considered a breaking change to change the default to first look for a GSYM file before possible falling back to DWARF. So if pprof were to support that, it would need to be able to pass some extra command line option to addr2line. I am not 100% sure, but I don't think this is configurable right now?

@aalexand
Copy link
Collaborator

This seems similar to how *.dwp files are supported. There could be a default of looking up next to the binary or something. In any case, this is something that many tools will need to solve and I don't think the tools should deal with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for llvm-gsymutil
3 participants