Q: Generic long running events? #213

JCash · 2022-06-02T10:11:28Z

I wish to be able to log events that span multiple frames. e.g. the loading of a resource in the game.

Example:
https://i.stack.imgur.com/odD1S.png

By the looks of it in the Remotery example image (int the Readme.md), the "Processor timeline" is something related to that, but I'm not sure?

Is this something that is currently supported?
If not, what would be the recommended way to do it (assuming it fits within the scope/vison of Remotery)

/Mathias

dwilliamson · 2022-06-02T10:29:37Z

It does support this but requires the thread sampler to be active in order to achieve this. It does so by:

Suspending individual threads.
Inspecting all active sample trees within a ThreadProfiler for each thread.
Verifying if these sample trees are "stalling" (CheckForStallingSamples).
Sends the uncompleted tree to the viewer as a "partial tree."

This only works on Windows as it requires implementations for:

rmtGetCurrentThreadId
rmtSuspendThread
rmtResumeThread

It is currently intermingled with the thread sampler, which you do not need for this stalling functionality. So two things would need to be done to get this working on other platforms:

Implement the above functions.
Allow stalled sample checking to function without thread sampling being active (which is far more complex).

dwilliamson · 2022-06-02T10:39:16Z

Worth mentioning that suspending the thread is not the most ideal way to check for in-progress sample trees. It's just the most convenient place for the check right now. The best solution would be a custom thread that wakes up every 30ms or so, checks for stalling trees and somehow copies them safely for communication as a "partial tree."

The current variable treeBeingModified is only safe to use because the thread that uses it is suspended. It's not a generic synchronisation primitive.

JCash · 2022-06-02T10:54:34Z

Thanks!
That's good to know, and it does indeed sound a bit complex.

I've mentioned another approach earlier, by "manually" adding info to remotery (if possible) (See #193).

E.g. sending a start event, with name+time+threadid, and then later on, sending the corresponding end event.

How feasible would that be do you think? (Or does it have the same restrictions you outlined above?)

dwilliamson · 2022-06-02T11:08:53Z

Right now Remotery builds a sample tree on the thread being sampled and only sends the complete sample tree to the viewer/log once the root sample is closed.

The question you need to ask is: are there any other sensible periods when sending that tree is possible? On the thread being instrumented, the only entry points we have are when you push and pop samples. In this case you can't use Pop because the long-running event hasn't been closed yet, so you must use Push. And you can't really send the entire tree on each Push as traffic would be huge. So you need some kind of stochastic model which wouldn't always give you the results you want.

Right now, on Windows, this works well but it's not the solution I want because it's not cross-platform and interferes with the debugger. I see the following possible options, with increasing complexity:

Port the thread suspend stalling checks to new platforms.
Add a new solution that uses a custom thread to check sampling status in a lockless manner and copy trees.
- Mostly easy as the tree is only ever added to, but has corner cases with aggregate/recursive samples.
Make instrumentation only drop events into a queue and not construct a tree on-the-fly.
- Remotery thread would be responsible for building the tree before sending.
- The Remotery thread would be able to check its active thread queues each time it wakes up for stalling samples.
Make instrumentation only drop events into a queue and not construct a tree on-the-fly.
- Events are streamed untouched to the viewer.
- The viewer is then responsible for building the tree.
- No need for a stalling sample checker as events are streamed live.

I like 4 the best but it relies on Javascript performance building the tree. I'm in the middle of a viewer performance overhaul and can't effectively measure the impact of 4 yet, but it will become clear soon. I expect both Chrome/Firefox to do well optimising the hot paths but I can't be sure.

Also with number 4, there is no sample tree on the server anymore so the Sample Tree API will break. Number 3 wouldn't have this issue.

Number 2 is probably the easiest win right now as I don't believe 1 possible on all platforms.

dwilliamson · 2022-06-02T11:11:18Z

I have plans to do 3 or 4 so that Remotery can be used for micro-benchmarking as well as the stuff we use it for. When I get around to that is another issue.

dwilliamson · 2022-06-02T11:18:51Z

An option for forks could be: add an API function that manually checks for stalling samples on the thread the function is called from and send the partial tree if it's stalling. Very low-tech but would get you what you want with minimal changes.

JCash · 2022-06-02T11:57:53Z

Thanks again for your full answers!
It sounds like number 4 is close to what I had in mind as well.

dwilliamson · 2022-06-02T12:11:53Z

Actually, just to confirm, because long-running events are supported on all platforms: is this question just about that? Because I'm answering the question about long-running events not showing up in the viewer until they complete.

You just rmt_BeginCPUSample and rmt_EndCPUSample as normal, ensuring they are the root of their trees. They will get sent when they close.

dwilliamson · 2022-06-02T12:21:30Z

Another option that would work for you: we add a new sample flag, RMTSF_SendToViewer (or something), that sends the partial tree to the viewer when you close that sample. That way you can apply this to your loading samples and get them sent to the viewer on close, without having to make them root samples.

JCash · 2022-06-02T12:37:15Z

Actually, just to confirm, because long-running events are supported on all platforms: is this question just about that?

I think so yes.

You just rmt_BeginCPUSample and rmt_EndCPUSample as normal, ensuring they are the root of their trees. They will get sent when they close.

I'm not entirely sure what I would need to do here?
Code-wise, i'm in the middle of the callstack, with multiple rmt_BeginCPUSample's active.

dwilliamson · 2022-06-02T12:42:38Z

I might be able to sort something out...

dwilliamson · 2022-06-02T15:01:15Z

Can you try fadba34? I haven't tested it.

Mark whatever loading samples you have as RMTSF_SendOnClose and they should get sent. The flag comments are important for extra context.

JCash · 2022-06-02T15:54:23Z

Thanks!
I tried it but I couldn't make it work (perhaps I did it wrong):

In its simplest form, it crashes due to the recursive depth (87000) in the SampleTree_CopySample function.

rmt_BeginCPUSample(TestName, RMTSF_SendOnClose);
    dmTime::Sleep(10000);
rmt_EndCPUSample();

Also, I think I need to clarify further what I'm after code wise.
A little bit of pseudo code:

void UpdatePendingResources(Context* ctx)
{
    rmt_BeginCPUSample(__FUNCTION__, RMTSF_Aggregate);

    for (item in ctx->NewWork()) // 0..x number of items
    {
        // start the profile scope for each unique resource ("file1.txt", "file2.txt" etc)
        uint32_t profile_scope_id = rmt_BeginLongSample(item->Path(), some_flags);
        item->StartWork(profile_scope_id);
    }

    for (item in ctx->ProcessWork())
    {
        // 0..x number of items will be done per frame
        // Closing the scopes will not be done in the same order as they were started.

        if (item->IsDone())
        {
            // close the profile scope of this item
            uint32_t profile_scope_id = item->GetProfileScope();
            rmt_EndLongSample(profile_scope_id);
        }
    }

    rmt_EndCPUSample();
}

dwilliamson · 2022-06-02T15:58:40Z

Oh, yes... I'm not sure how I'd even render that kind of information unless each resource had its own timeline as everything overlaps. You'd get situations where some resources would completely hide other resources underneath it. The implementation is certainly not set up right now as well to close samples out of order.

dwilliamson · 2022-06-02T16:12:15Z

You'd need something like this

JCash · 2022-06-02T16:32:02Z

Yeah, I was a bit uncertain if there was some support for it.
Sorry for taking up your time.

I'd chalk this down to being a feature request then :)

dwilliamson · 2022-06-02T19:05:52Z

I will remove the new flag for now.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Q: Generic long running events? #213

Q: Generic long running events? #213

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022 •

edited

dwilliamson commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022 •

edited

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022 •

edited

dwilliamson commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022 •

edited

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

Q: Generic long running events? #213

Q: Generic long running events? #213

Comments

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022 • edited

dwilliamson commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022 • edited

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022 • edited

dwilliamson commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022 • edited

dwilliamson commented Jun 2, 2022

JCash commented Jun 2, 2022

dwilliamson commented Jun 2, 2022

dwilliamson commented Jun 2, 2022 •

edited

dwilliamson commented Jun 2, 2022 •

edited

JCash commented Jun 2, 2022 •

edited

dwilliamson commented Jun 2, 2022 •

edited