feat: improve perf of groupMessagesForPartition (~600x in production use-case) #1576

benvan · 2023-05-13T00:08:45Z

This PR is the result of diagnosing high CPU usage on our ECS cluster.

It rewrites the implementation in src/producer/groupMessagesPerPartition.js to run in O(n) time, where n is the number of messages.
In practice, this results in a significant performance improvement.
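
For readers who want a picture of what a linear-time grouping looks like, here is a minimal sketch. It is not the code from this PR: the function name `groupByPartition`, the parameter shape, and the toy partitioner are assumptions made for illustration. The idea is to keep one array per partition and append to it in place, so each message is touched once.

```javascript
// Hypothetical sketch, not the code from this PR.
// Groups messages by partition in a single pass: one array per partition,
// appended to in place, so the total work is proportional to messages.length.
function groupByPartition({ topic, partitionMetadata, messages, partitioner }) {
  if (partitionMetadata.length === 0) {
    return {}
  }

  const grouped = {}
  for (const message of messages) {
    const partition = partitioner({ topic, partitionMetadata, message })
    if (grouped[partition] === undefined) {
      grouped[partition] = []
    }
    grouped[partition].push(message) // amortized O(1) append
  }
  return grouped
}

// Toy partitioner (assumption): key length modulo the partition count.
const partitionMetadata = [{ partitionId: 0 }, { partitionId: 1 }]
const toyPartitioner = ({ partitionMetadata, message }) =>
  message.key.length % partitionMetadata.length

const result = groupByPartition({
  topic: 'example-topic',
  partitionMetadata,
  messages: [{ key: 'a' }, { key: 'bb' }, { key: 'c' }],
  partitioner: toyPartitioner,
})
console.log(result)
```

Appending in place avoids building a new array for every message, which is one common way a grouping loop ends up quadratic.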

In our use-case, we regularly have blocks of many thousands of messages (20k was the scenario that led to this diagnostic).

I'll lead with the goods:

The following runs execute groupMessagesPerPartition with the specified number of messages.

The tests were run on an ECS Fargate instance with 2 vCPUs allocated.

Running for 10 messages:
old: 0.93ms
new: 0.49ms
improvement: 1.9x

Running for 50 messages:
old: 0.75ms
new: 0.20ms
improvement: 3.7x

Running for 100 messages:
old: 0.63ms
new: 0.21ms
improvement: 2.9x

Running for 500 messages:
old: 1.13ms
new: 0.085ms
improvement: 13.3x

Running for 1000 messages:
old: 4.73ms
new: 0.097ms
improvement: 48.6x

Running for 10000 messages:
old: 135.59ms
new: 3.33ms
improvement: 40.6x

Running for 20000 messages:
old: 1946.62ms
new: 3.23ms
improvement: 602.0x

Running for 50000 messages:
old: 16917.32ms
new: 0.97ms
improvement: 17415.6x
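
A comparison like the one above can be reproduced with a small timing harness. The sketch below is an assumption about how one might do that, not the script used for these numbers: `groupWithCopy`, `groupInPlace`, `timeMs`, and `compare` are hypothetical names, and the two grouping functions are simplified stand-ins for a quadratic and a linear implementation.

```javascript
const { performance } = require('perf_hooks')

// Hypothetical stand-ins, not the code from this PR.
// Copies the accumulated array for every message: O(n^2) overall.
const groupWithCopy = (messages, partitionOf) =>
  messages.reduce((acc, m) => {
    const p = partitionOf(m)
    return Object.assign(acc, { [p]: [...(acc[p] || []), m] })
  }, {})

// Appends to the existing array in place: O(n) overall.
const groupInPlace = (messages, partitionOf) => {
  const acc = {}
  for (const m of messages) {
    const p = partitionOf(m)
    if (acc[p] === undefined) acc[p] = []
    acc[p].push(m)
  }
  return acc
}

// Returns elapsed wall-clock milliseconds for a single call to fn.
const timeMs = fn => {
  const start = performance.now()
  fn()
  return performance.now() - start
}

// Times both implementations once on `count` synthetic messages.
const compare = count => {
  const messages = Array.from({ length: count }, (_, i) => ({ value: i }))
  const partitionOf = m => m.value % 3 // assumption: 3 partitions
  const oldMs = timeMs(() => groupWithCopy(messages, partitionOf))
  const newMs = timeMs(() => groupInPlace(messages, partitionOf))
  return { count, oldMs, newMs }
}

for (const count of [10, 100, 1000, 10000]) {
  const { oldMs, newMs } = compare(count)
  console.log(`Running for ${count} messages:`)
  console.log(`old: ${oldMs.toFixed(2)}ms`)
  console.log(`new: ${newMs.toFixed(2)}ms`)
}
```

Each size is timed with a single call, so results for small message counts are likely to be dominated by JIT warm-up and timer noise; repeating each run and taking a median would give steadier numbers.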

feat: improve perf of groupMessagesForPartition

fa724d8

siimsams approved these changes Jun 20, 2023

Saeger approved these changes Aug 14, 2023
