
Stuck on PLANNING #791

Open
lapo-luchini opened this issue May 20, 2024 · 6 comments

@lapo-luchini
Contributor

It has been happening more and more often lately that the process gets stuck in some PLANNING phase:
[screenshot of the status output showing a job stuck in PLANNING]
If I do a signal reset and then a wakeup, it starts again; sometimes it works, while other times it gets stuck again after a while (apparently on random filesystems, not always the same ones).
How can I help debug this?
And, as a safeguard, would it be possible to have a watchdog… like, if a phase hasn't ended within 10 minutes, abort it and consider it failed for this round?
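
For reference, the reset-then-wakeup sequence described above uses zrepl's signal subcommand (here <job> is a placeholder for the affected job's name):

  sudo zrepl signal reset <job>
  sudo zrepl signal wakeup <job>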

@A1bi

A1bi commented Jun 4, 2024

I am also experiencing this frequently now on FreeBSD 14.0. I'm pretty sure it only started recently, maybe with the update to 14.0; I haven't seen it on my Linux-based systems yet, so it may be related to that FreeBSD update.

@kapsel

kapsel commented Jun 4, 2024

We are also seeing this issue; we are on FreeBSD 14.0 too. Restarting zrepl works around it.
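(On FreeBSD, with the port's rc script, that should be something like sudo service zrepl restart.)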

@lapo-luchini
Contributor Author

This might be a "better" workaround than a full restart:
sudo zrepl signal reset <name>
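A subsequent sudo zrepl signal wakeup <name> then starts a new replication attempt, as described in the opening comment.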

@dsh2dsh
Contributor

dsh2dsh commented Jun 5, 2024

Just FYI, in my fork I implemented a timeout and it has helped me a lot. Also, using cron specs I configured the jobs that touch the same ZFS datasets so that zrepl fires them at different times and they don't overlap.
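
For illustration, a minimal sketch of that staggering idea using cron-style snapshotting specs — the fork's actual mechanism may differ, the keys below follow the cron snapshotting type from the upstream zrepl docs, and the job bodies are elided:

  jobs:
    - name: backup-a
      # ... job type, connection, filesystems for the first job ...
      snapshotting:
        type: cron
        prefix: zrepl_
        cron: "0 * * * *"    # fires at the top of every hour
    - name: backup-b
      # ... second job touching the same datasets ...
      snapshotting:
        type: cron
        prefix: zrepl_
        cron: "30 * * * *"   # fires at half past, so the two never start together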

@lapo-luchini
Contributor Author

Wow @dsh2dsh, there's a lot of work in your fork! Why doesn't it show up on github.com as a fork?
(That makes it more difficult to inspect the diff between the two projects.)
@problame any chance any of that work will be included?
It would be nice to have a new official release. :)

@lapo-luchini
Contributor Author

lapo-luchini commented Jun 11, 2024

At least for me this seems to be related to the number of parallel size-estimation steps; it hasn't happened again since I did this:

  replication:
    concurrency:
      size_estimates: 1
      #size_estimates: 4
      steps: 10

Edit to add: nope… changing that value solved it on one server, but on a different server it actually made things worse.
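
For context, as I understand the zrepl docs: size_estimates caps how many per-filesystem send-size estimations run concurrently during planning, while steps caps how many replication steps run concurrently. Both change how much parallel load planning puts on zfs at once, which may be why tuning them merely shifts the problem around.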
