Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

problem tracking child after libc::fork #80

Open
s1341 opened this issue Jul 26, 2022 · 15 comments
Open

problem tracking child after libc::fork #80

s1341 opened this issue Jul 26, 2022 · 15 comments

Comments

@s1341
Copy link

s1341 commented Jul 26, 2022

It seems that there are issues following libc::forks, no output file is created for the child process.

@koute
Copy link
Owner

koute commented Jul 26, 2022

This is expected and is not a bug.

Commenting out this line will probably make it work.

@s1341
Copy link
Author

s1341 commented Jul 26, 2022

Ok. Thanks. Perhaps it should be noted somewhere?

@koute
Copy link
Owner

koute commented Jul 26, 2022

Well, ideally it'd probably be the best to just have an env var to configure this behavior.

You don't want it to follow fork by default since that would in a lot of cases generate a lot of garbage files when the application you're profiling temporarily calls into subcommand, but I suppose in some cases you'd want to enable it.

@s1341
Copy link
Author

s1341 commented Jul 26, 2022

I can make a pr to add an env var. What is the format of existing env vars?

@koute
Copy link
Owner

koute commented Jul 26, 2022

I can make a pr to add an env var. What is the format of existing env vars?

https://github.com/koute/bytehound/blob/master/preload/src/opt.rs#L121

@s1341
Copy link
Author

s1341 commented Jul 26, 2022

hrm. Disabling that line did not work.

@koute
Copy link
Owner

koute commented Jul 27, 2022

Okay, wait, by "child after libc::fork" did you actually mean "after fork + exec" (which what was I assumed), or do you really mean after fork only? It will indeed not work if you want to profile forked processes without any exec. That is currently unsupported and would be significantly more complicated to implement than just uncommenting a single line.

@s1341
Copy link
Author

s1341 commented Jul 27, 2022

yeah. I mean fork without exec :(

What would I need to do to get it working for that?

@koute
Copy link
Owner

koute commented Jul 27, 2022

The whole internal state would have to be reset and essentially reinitialized: the queues and caches would have to be cleared, the global locks all forcibly recreated to make sure they're unlocked, the processing thread would have to be spawned again, a new output file would have to be opened, the shadow stack for fast unwinding would have to be flushed, etc. In theory it should be possible, but it's probably going to be quite a bit of work.

@s1341
Copy link
Author

s1341 commented Jul 27, 2022

Can you think of a quick workaround?

If not, I guess I could look into doing this work...

@s1341
Copy link
Author

s1341 commented Aug 1, 2022

Can you give me a little guidance on exactly what I need to do in order to support this scenario? Happy to make a PR, but I don't have time to review all the code...

@koute
Copy link
Owner

koute commented Aug 8, 2022

There's no workaround.

Unfortunately I don't really have the time to give a more detailed guidance, but if you want to try your hands at this feel free to do so. This is, however, a very tricky thing to get right. Here's what I'd do:

  1. Start by writing a test for this (see the integration-tests directory).
  2. Add a new environment variable to control this behavior.
  3. Go to where we hook into fork and after the fork has been called when running as a child most of the global state has to be reset and reinitialized (with a few exceptions, e.g. the options don't have to be reinitialized, etc.). Search for every static variable that we have. Now this is tricky because essentially fork might have been called at any time, so at a time some of the global locks might have been held, some of the state might have been in the process of being modified, etc. So for every piece of global state you'd need to decide whether it has to be reinitialized, and if so make sure it is reinitialized correctly in every circumstance. And what makes this tricky is that this is essentially impossible to test for in an deterministic manner, so while an integration test would ensure that this works in the most basic case it wouldn't exhaustively guarantee that it's correct.

@s1341
Copy link
Author

s1341 commented Aug 8, 2022

Ok.

Regarding 3: what about hooking fork and ensuring that the state is all in a 'good state' (i.e. nothing is being modified etc.) while we do the actual fork? Also: might it not be an idea to refactor such that there is only a single static variable?

@koute
Copy link
Owner

koute commented Aug 8, 2022

what about hooking fork and ensuring that the state is all in a 'good state' (i.e. nothing is being modified etc.) while we do the actual fork?

That's not possible unless we'd have a single global lock, but that would slow down the profiling to a crawl so it's not a realistic solution.

might it not be an idea to refactor such that there is only a single static variable?

That's also not entirely possible. Nevertheless, it wouldn't really change much, because you'd still have to reinitialize everything in an appropriate matter, regardless of whether it's a single static variable or two.

@s1341
Copy link
Author

s1341 commented Aug 8, 2022

Ok. I'll take a look at this in the next couple of days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants