Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[status-] decrease latency of status() when other threads run #2370

Merged
merged 1 commit into from May 23, 2024

Conversation

midichef
Copy link
Contributor

@midichef midichef commented Apr 4, 2024

While benchmarking, I noticed that vd.status() is slow to finish when vd is running other very busy threads (taking ~150ms). If there are no busy threads, it's much faster, (taking ~2ms).

By recording the run time in memory, I found that it's only a matter of latency. That is, inspect.stack() takes a lot of clock time to finish, around 150 milliseconds, but very little of that time is spent in the thread. I don't know why, but I assume stack() pauses to let the busy threads run.

It turns out inspect.stack() is slow because it reads source files. And it can be made much faster by not asking for lines of source file context.

Here is a comparison of the latency, before and after this PR.

Before:
Using inspect.stack(), while loading a 2 million line tsv file takes 150-190ms per call to status():
quit.vdj.txt

seq 2000111 > 2m.tsv
# load 2 million line tsv file then quit
vd -p quit.vdj
status() time spent in inspect.stack(), total time:
mean clock time:      0.152629 seconds
status() time spent in inspect.stack(), in thread:
mean time in thread:  0.002206 seconds

After:
using inspect.stack(0), it takes only 40-50ms:

vd -p quit.vdj
status() time spent in inspect.stack(), total time:
mean clock time:      0.051474 seconds
status() time spent in inspect.stack(), in thread:
mean time in thread:  0.001276 seconds

See https://stackoverflow.com/a/17407257
Inspect is slow to look up context lines, so it's a bit faster
to use context=0. When other busy threads run, context=1 can
have high latency, more than 100ms. In that situation, context=0
causes much less latency.
Copy link
Owner

@saulpw saulpw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, thanks @midichef !

@anjakefala
Copy link
Collaborator

Wow, the speed increase is incredible.

@anjakefala anjakefala merged commit a6a1be8 into saulpw:develop May 23, 2024
13 checks passed
@ajkerrigan
Copy link
Collaborator

Chiming in late here just to say thanks for making and explaining this change. As a drive-by observer this was a fun TIL 👏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants