Performance optimizations list #307

Open
alexey-yarmosh opened this issue Mar 28, 2023 · 0 comments
Labels
info Useful information inside performance

Here are the current performance values:

| Measurement | Median T, initial (ms) | Max T, initial (ms) | Median T, after clustering+fS (ms) | Max T, after clustering+fS (ms) |
| --- | --- | --- | --- | --- |
| 100-probes-5-rps-240-duration | 117.9 | 793 | 172.5 | 519 |
| 100-probes-6-rps-240-duration | 267.8 | 2094 | 183.1 | 504 |
| 100-probes-7-rps-240-duration | 497.8 | 46646 | 190.6 | 877 |
| 100-probes-8-rps-240-duration | 2186.8 | 35850 | 194.4 | 1012 |
| 100-probes-10-rps-240-duration | | | 202.4 | 1396 |
| 100-probes-15-rps-240-duration | | | 2059.5 | 16607 |
| 100-probes-20-rps-240-duration | | | 7407.5 | 32402 |

At 15-20 rps we see constant probe disconnects in the logs:

  • Probe disconnected. (reason: transport close): the API failed to send a ping in time
  • Probe disconnected. (reason: ping timeout): the probe failed to respond with a pong in time
  • Probe disconnected. (reason: transport error)

Here are all the performance tuning ideas so far. They differ in ease of implementation and expected gain, but we are listing all of them:

  • find the optimal number of cluster workers (currently the dedicated server's CPU is not overloaded, while the Node.js process is)
  • use uWS instead of socket.io (it should be more performant and handle high-load I/O effectively)
  • tune UV_THREADPOOL_SIZE (the libuv thread pool is used for fs and dns operations)
  • cache the IPv4 check during validation
  • stream the response to the user (an mtr measurement with 500 probes === 3MB of JSON)
  • change the HTTP framework to e.g. fastify (optimized HTTP validation and logging)
  • move some of the logic from Node.js to Redis Lua scripts (e.g. marking a measurement as 'finished' based on the 'probes_awaiting' field value)
  • deal with the I/O generated by New Relic's metrics and logs (at least measure how much load it generates)
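
To experiment with the number of cluster workers, the count could be made configurable instead of hard-coded. This is only a minimal sketch: `WORKER_COUNT` is a hypothetical environment variable, and `resolveWorkerCount` and `startCluster` are illustrative names, not existing code in this project.

```javascript
const cluster = require('node:cluster');
const os = require('node:os');

// WORKER_COUNT is a hypothetical env variable, not an existing setting.
// Falls back to the CPU count when the value is missing or invalid.
function resolveWorkerCount(env, cpuCount) {
  const requested = Number.parseInt(env.WORKER_COUNT ?? '', 10);
  return Number.isInteger(requested) && requested > 0 ? requested : cpuCount;
}

// Forks workers from the primary process. runWorker is the caller's
// function that starts the HTTP/WebSocket server inside each worker.
function startCluster(runWorker) {
  if (!cluster.isPrimary) {
    runWorker();
    return;
  }

  const count = resolveWorkerCount(process.env, os.cpus().length);
  for (let i = 0; i < count; i++) {
    cluster.fork();
  }
}
```

With something like this, the load test could be rerun with different `WORKER_COUNT` values and the medians compared against the table above.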
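
The "cache the IPv4 check" idea might look roughly like the sketch below. It assumes the check is a pure function of the input string and uses Node's built-in `net.isIPv4` as a stand-in for whatever check the validation currently runs. `isIPv4Cached`, `ipv4Cache` and the size limit are made up for illustration, and whether the cache pays off would need to be measured.

```javascript
const net = require('node:net');

// Assumed upper bound on cached entries, chosen arbitrarily for the sketch.
const MAX_CACHE_ENTRIES = 10000;
const ipv4Cache = new Map();

// Hypothetical helper: memoizes the result of the IPv4 check per input string.
function isIPv4Cached(value) {
  const cached = ipv4Cache.get(value);
  if (cached !== undefined) {
    return cached;
  }

  const result = net.isIPv4(value);

  // Crude memory bound: drop the oldest entry (Map keeps insertion order).
  if (ipv4Cache.size >= MAX_CACHE_ENTRIES) {
    ipv4Cache.delete(ipv4Cache.keys().next().value);
  }

  ipv4Cache.set(value, result);
  return result;
}
```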
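
For the "stream the response" idea, one possible shape is to serialize the results one by one instead of calling `JSON.stringify` on the whole measurement. The `{ id, results }` object below is an assumed, simplified shape, not the real measurement schema, and the function names are hypothetical.

```javascript
const { Readable } = require('node:stream');

// Hypothetical generator: yields the JSON response piece by piece, so the
// full multi-megabyte string never has to be built in memory at once.
function* measurementJsonChunks(measurement) {
  yield `{"id":${JSON.stringify(measurement.id)},"results":[`;

  let first = true;
  for (const result of measurement.results) {
    // Each probe result is serialized on its own, comma-separated.
    yield (first ? '' : ',') + JSON.stringify(result);
    first = false;
  }

  yield ']}';
}

// Wraps the generator in a Readable that can be piped to an HTTP response.
function measurementJsonStream(measurement) {
  return Readable.from(measurementJsonChunks(measurement));
}
```

In a plain `http` handler this could be used as `measurementJsonStream(measurement).pipe(res)`. How it would plug into the current framework is left open here.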