When a worker pod is killed, no mechanism for retrying task #765
Comments
I face the same issue here.
Do we have any updates or workarounds for this?
We're using a Redis broker with a DynamoDB backend, and we've noticed that when a worker pod is terminated ungracefully while a task is still running, the task stays in the STARTED state. It seems that Machinery has no timeout after which it re-queues tasks that have been in the STARTED state for a long time. This seems like a critical feature for fault tolerance.
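To make the request concrete, here is a minimal sketch of the kind of timeout sweep described above. It is not Machinery's API: the `Task` struct, `requeueStale`, the `publish` callback, and the 30-minute timeout are all hypothetical names and values chosen for illustration, and the in-memory slice stands in for whatever the result backend stores.

```go
package main

import (
	"fmt"
	"time"
)

// Task is a hypothetical task-state record (NOT Machinery's own type).
type Task struct {
	ID        string
	State     string    // e.g. "PENDING", "STARTED", "SUCCESS"
	StartedAt time.Time // when a worker picked the task up
}

// requeueStale finds tasks that have been in STARTED for longer than timeout,
// hands each one to publish (a stand-in for re-sending it to the broker), and
// resets its state to PENDING. It returns the IDs that were re-queued.
func requeueStale(tasks []Task, now time.Time, timeout time.Duration,
	publish func(Task) error) ([]string, error) {
	var requeued []string
	for i := range tasks {
		t := &tasks[i]
		if t.State != "STARTED" || now.Sub(t.StartedAt) <= timeout {
			continue // not started, or still within the allowed run time
		}
		if err := publish(*t); err != nil {
			return requeued, fmt.Errorf("re-queue %s: %w", t.ID, err)
		}
		t.State = "PENDING"
		requeued = append(requeued, t.ID)
	}
	return requeued, nil
}

// runDemo runs one sweep over three sample tasks with a 30-minute timeout.
func runDemo() []string {
	now := time.Date(2022, 1, 1, 12, 0, 0, 0, time.UTC)
	tasks := []Task{
		{ID: "task-1", State: "STARTED", StartedAt: now.Add(-2 * time.Minute)},
		{ID: "task-2", State: "STARTED", StartedAt: now.Add(-45 * time.Minute)},
		{ID: "task-3", State: "SUCCESS", StartedAt: now.Add(-3 * time.Hour)},
	}
	publish := func(t Task) error { return nil } // assumed broker call; no-op here
	ids, _ := requeueStale(tasks, now, 30*time.Minute, publish)
	return ids
}

func main() {
	fmt.Println(runDemo()) // prints [task-2]
}
```

A sweep like this gives at-least-once execution: a worker that is slow but still alive could finish a task after it has been re-queued, so tasks would need to be idempotent or the timeout would need to exceed the longest expected run time.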