[💡 FEATURE REQUEST]: Metrics for autoscaling workers #1741
Labels: C-feature-accepted (Category: Feature discussed and accepted)
butschster added the C-feature-request label (Category: feature requested, but need to be discussed) on Sep 29, 2023
rustatian added the C-feature-accepted label (Category: Feature discussed and accepted) and removed the C-feature-request label on Sep 29, 2023
rustatian changed the title from "[💡 FEATURE REQUEST]: Metrics for autoscaling in RoadRunner Jobs Plugin" to "[💡 FEATURE REQUEST]: Metrics for autoscaling workers" on Oct 13, 2023
Plugin
JOBS
I have an idea!
I've been working with the Jobs plugin and I greatly appreciate the queue service it provides for PHP applications. It handles communication with the supported queue brokers efficiently, which makes job processing much easier.
With the upcoming release featuring the ability to manage the number of workers in the jobs worker pool using RPC, I believe there is an opportunity to enhance the functionality further by introducing metrics that can aid in the auto-scaling of workers. This would allow for more dynamic and efficient resource allocation, especially during fluctuating workloads.
Here are some metrics I suggest adding for better monitoring and auto-scaling:
Broker Queue Length:
The number of queued tasks on the broker side awaiting processing. This can indicate when there is a buildup of tasks requiring more workers.
Local Queue Length:
The number of tasks that RR has fetched from the broker but has not yet processed.
Current Processing Rate (tasks/second):
The rate at which tasks are being processed. A real-time measurement of tasks being consumed per second can give insights into the current workload being handled by the workers.
Worker Utilization:
This metric could represent the percentage of active time versus idle time for each worker. High utilization may indicate a need for more workers.
Processing Time:
The average time it takes to process a task. This could help in determining the efficiency of the task processing setup.
Task Failure Rate:
The rate at which tasks are failing. A sudden increase in this metric might indicate a problem with the system or the tasks being processed.
Task Retries:
The number of retries required to process tasks. This could highlight issues with specific tasks or the processing environment.
Queue Broker Latency:
The time taken from when a task is placed on the broker queue to when it's picked up by an RR worker. High latency may indicate a bottleneck in fetching tasks from the queue.
These metrics will offer a clearer insight into the workload and processing efficiency, enabling auto-scaling of workers to be more accurate and responsive.
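To make a few of these definitions concrete, here is a rough sketch of how the derived metrics could be computed from raw counters sampled over one observation window. Everything in it is hypothetical: the `JobsSnapshot` type, its field names, and the choice to express the failure rate as a fraction of handled tasks are my assumptions for illustration, not an existing RoadRunner API.

```python
from dataclasses import dataclass


@dataclass
class JobsSnapshot:
    """Hypothetical raw counters collected over one observation window."""
    window_seconds: float  # length of the observation window
    workers: int           # number of workers in the pool during the window
    completed: int         # tasks that finished successfully
    failed: int            # tasks that ended in failure
    busy_seconds: float    # total time all workers spent executing tasks


def processing_rate(s: JobsSnapshot) -> float:
    """Tasks handled per second, counting both successes and failures."""
    return (s.completed + s.failed) / s.window_seconds


def worker_utilization(s: JobsSnapshot) -> float:
    """Fraction of the available worker time that was spent busy (0.0 to 1.0)."""
    return s.busy_seconds / (s.workers * s.window_seconds)


def average_processing_time(s: JobsSnapshot) -> float:
    """Mean seconds of worker time per handled task; 0.0 if nothing was handled."""
    handled = s.completed + s.failed
    return s.busy_seconds / handled if handled else 0.0


def failure_rate(s: JobsSnapshot) -> float:
    """Share of handled tasks that failed; 0.0 if nothing was handled."""
    handled = s.completed + s.failed
    return s.failed / handled if handled else 0.0
```

For example, with 4 workers over a 60-second window that handled 110 successful and 10 failed tasks while being busy for a combined 180 seconds, this gives 2 tasks/second, 75% utilization, 1.5 seconds per task, and a failure share of about 8%.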
For instance, if there are no tasks in the queue on the broker side, it might be beneficial to decrease the number of worker processes. Conversely, during peak workloads with a high number of tasks in the queue, increasing the number of worker processes would be prudent.
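As an illustration of that scaling rule, here is a minimal sketch of how an external autoscaler might turn two of the proposed metrics into a worker count. The function name, the parameters, the default limits, and the "drain the backlog within a target time" heuristic are all assumptions of mine; applying the result through the RPC worker management is left out entirely.

```python
import math


def desired_workers(broker_queue_length: int,
                    tasks_per_second_per_worker: float,
                    target_drain_seconds: float = 60.0,
                    min_workers: int = 1,
                    max_workers: int = 32) -> int:
    """Estimate how many workers are needed to drain the broker queue in time.

    Hypothetical heuristic: each worker can handle
    tasks_per_second_per_worker * target_drain_seconds tasks within the
    target window, so the backlog is divided by that capacity and the
    result is clamped to the configured pool limits.
    """
    if tasks_per_second_per_worker <= 0:
        raise ValueError("tasks_per_second_per_worker must be positive")
    capacity_per_worker = tasks_per_second_per_worker * target_drain_seconds
    needed = math.ceil(broker_queue_length / capacity_per_worker)
    # Never drop below the minimum or grow past the maximum pool size.
    return max(min_workers, min(max_workers, needed))
```

With an empty broker queue this falls back to `min_workers`, and with a backlog of 1200 tasks at 2 tasks/second per worker it asks for 10 workers. A real autoscaler would probably also want some smoothing so that the pool size does not flap between polls.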
Please feel free to share your thoughts, concerns, or any additional metrics that might be useful.