Handles multiple chunks in inter-thread communication #55

Open
greymd opened this issue Mar 11, 2023 · 1 comment
Labels
enhancement, help wanted

Comments

@greymd (Owner) commented Mar 11, 2023

teip's performance degrades rapidly when it handles a large number of small chunks.

$ yes | tr -d \\n | fold -w 1024 | TEIP_HIGHLIGHT="<{}>" teip -og . | head -n 1
<y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y><y>...
$ yes | tr -d \\n | fold -w 1024 | TEIP_HIGHLIGHT="<{}>" teip -og . | pv >/dev/null
.0MiB 0:00:05 [3.59MiB/s] [   <=>   ]
$ yes | tr -d \\n | fold -w 1024 | TEIP_HIGHLIGHT="<{}>" teip -og '.{64}' | pv >/dev/null
30MiB 0:00:04 [31.7MiB/s] [   <=>   ]

I believe this performance degradation is caused by the large amount of inter-thread communication.
In fact, futex occupies a large ratio of the execution time.

$ yes | tr -d \\n | fold -w 1024 | head -n 1024 | TEIP_HIGHLIGHT="<{}>" strace -cf teip -og .
︙
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 92.52    0.582339           7     85927     34598 futex
  5.85    0.036801          36      1026           write
  0.67    0.004226           9       485           sched_yield
  0.49    0.003090          24       130           read
  0.35    0.002197           9       248           brk
  0.06    0.000409         102         4           mmap
  0.01    0.000083          10         8           rt_sigprocmask
  0.01    0.000055          11         5           rt_sigaction
  0.01    0.000047           8         6           sigaltstack
  0.01    0.000039          13         3           munmap
  0.00    0.000028           9         3           mprotect
  0.00    0.000020          20         1           clone
  0.00    0.000016          16         1           poll
  0.00    0.000014          14         1           getrandom
  0.00    0.000013          13         1           ioctl
  0.00    0.000013          13         1           arch_prctl
  0.00    0.000010          10         1           set_tid_address
  0.00    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ----------------
100.00    0.629400                 87852     34598 total

To mitigate the issue, it would be a good idea to reduce the number of futex calls.
Currently, PipeIntercepter sends a single chunk with tx.send each time.

https://github.com/greymd/teip/blob/v2.2.0/src/pipeintercepter.rs#L23-L53
https://github.com/greymd/teip/blob/v2.2.0/src/pipeintercepter.rs#L235
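For reference, the current pattern looks roughly like the following. This is a simplified, hypothetical sketch, not the actual PipeIntercepter code (see the links above): every chunk crosses the channel with its own send, so the receiver thread may be woken for every single chunk.

use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<String>();

    // Receiver thread: potentially one futex wake-up per received chunk.
    let consumer = thread::spawn(move || {
        for chunk in rx {
            print!("{}", chunk);
        }
    });

    // Sender: one tx.send per tiny chunk (simplified version of what
    // PipeIntercepter currently does).
    for _ in 0..1_000 {
        tx.send("y".to_string()).unwrap();
    }
    drop(tx);
    consumer.join().unwrap();
}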

This part can be improved.
If PipeIntercepter buffers chunks and sends them to the other thread in batches,
the number of futex calls will be reduced and performance should improve significantly in situations like the one above; see the sketch below.
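A minimal sketch of the proposed batching, assuming a hypothetical BATCH_SIZE and that chunks can simply be collected into a Vec before being sent (the real change would have to fit the existing chunk types and error handling in PipeIntercepter):

use std::sync::mpsc;
use std::thread;

const BATCH_SIZE: usize = 256; // assumed batch size, to be tuned

fn main() {
    let (tx, rx) = mpsc::channel::<Vec<String>>();

    // Receiver thread: woken once per batch instead of once per chunk.
    let consumer = thread::spawn(move || {
        for batch in rx {
            for chunk in batch {
                print!("{}", chunk);
            }
        }
    });

    let mut buf: Vec<String> = Vec::with_capacity(BATCH_SIZE);
    for _ in 0..1_000 {
        buf.push("y".to_string());
        if buf.len() >= BATCH_SIZE {
            // One channel operation (and at most one futex wake-up) per batch.
            tx.send(std::mem::take(&mut buf)).unwrap();
        }
    }
    if !buf.is_empty() {
        tx.send(buf).unwrap(); // flush the remainder before closing
    }
    drop(tx);
    consumer.join().unwrap();
}

The batch size trades latency for throughput; some flush-on-end-of-input (and possibly flush-on-timeout) logic would be needed so that small inputs are not delayed.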

@greymd added the enhancement and help wanted labels Mar 11, 2023
@greymd (Owner, Author) commented Mar 11, 2023

Added the help wanted label.
Currently, I do not have enough resources to address this issue.
Any PRs are welcome.
