Stuck on PLANNING #791
Comments
I am also experiencing this frequently now on FreeBSD 14.0. I am fairly sure it only started recently, possibly with the update to FreeBSD 14.0. I haven't seen it on my Linux-based systems yet, so it may be related to that FreeBSD update.
We are also seeing this issue on FreeBSD 14.0. Restarting zrepl works.
This might be "better" than a full restart, as a workaround:
JFYI, in my fork I implemented a timeout and it helped me a lot. Also, I used a cron spec to configure the jobs that share the same ZFS datasets, so zrepl fires them at different times and they do not overlap.
At least for me this seems to be related to the number of parallel size estimation steps. It no longer happened to me after I did this:

replication:
  concurrency:
    size_estimates: 1
    #size_estimates: 4
    steps: 10

Edit to add: nope… changing that value solved it on one server, but on a different server it actually made things worse.
Lately it happens to me more often that the process gets stuck in the PLANNING phase for some filesystems:
If I do a signal reset and then a wake-up, replication starts again. Sometimes it then works, while other times it gets stuck again after a while (apparently on random filesystems, not always the same ones).
How can I help debug this?
And, as a safeguard, would it be possible to have a watchdog? For example, if a phase hasn't finished within 10 minutes, abort it and consider it failed for this round.