Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread sampling hanging with debug CRT? #177

Open
uucidl opened this issue May 12, 2021 · 3 comments
Open

Thread sampling hanging with debug CRT? #177

uucidl opened this issue May 12, 2021 · 3 comments

Comments

@uucidl
Copy link

uucidl commented May 12, 2021

Hi,

I know it's silly to profile a program linked with the debug CRT, but I do sometimes use Remotery simply to see what's going on in our app, and that means sometimes in Debug. I noticed when upgrading to the latest Remotery that with the thread sampler on, I end up with deadlocks.

When this happens I'm in CheckForStallingSamples and which allocates (via rmtMalloc) ; we usually end up with a NtWaitForAlertByThreadId (from ucrtbased.dll!__acrt_lock, called from ucrtbased.dll!malloc).

Once this happens to our main thread, we basically hang.

For now I'm going to disable the thread sampler by default when building our debug builds, however I feel this should at least be documented in Remotery's readme. Or an even better solution.

@dwilliamson
Copy link
Collaborator

So if the thread that's been interrupted has any form of lock that Remotery can try to acquire, this will fail. That's totally not ideal and really, Remotery shouldn't be doing that much work while a thread is suspended, anyway.

I consider this a genuine bug and not behaviour to be aware of.

The whole sample tree needs to be copied when a stalling sample is detected so I think the solution here is to have an object allocator specific to the sampling thread that is pre-allocated with enough samples for a worst-case sample tree.

@uucidl
Copy link
Author

uucidl commented May 12, 2021

Glad to hear! Yeah it feels more healthy not to use the CRT allocator in that code path.

By the way, I also noticed something when reviewing that part of the code. Above the call to CheckForStallingSamples you mention that SuspendThread is async and that the guarantee it has completed is that we called GetThreadContext before. However, the call to rmtGetUserModeThreadContext only happens if sample_count == 0 , at least that's what I understood.

@dwilliamson
Copy link
Collaborator

Yes, there is a chance for this to fail. It's miniscule and hasn't shown in testing yet but the chance is there, nonetheless.

As far as I can tell, there is no problem with moving this into the branch but this is multi-threaded programming, i.e. I'm probably wrong. The ideal route appears to be to suspend the thread only when sample_count is zero but that looks like it will require some testing.

tom-seddon added a commit to tom-seddon/b2 that referenced this issue Jun 25, 2021
Remotery is now off by default, as it's of minority interest and
performance seems slightly better without.

Remotery's thread sampler is off by default too - same rationale.
See also Celtoys/Remotery#177.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants