Pruning old snapshots #2
Open · ngharo opened this issue Feb 23, 2020 · 8 comments · May be fixed by #3
Labels: enhancement (New feature or request)

ngharo (Contributor) commented Feb 23, 2020

I modeled a program closely after borg backup's prune command to clean up old zackup snapshots on my server.

I'm considering adding a zackup prune command. It would read the config for daily, weekly, monthly and yearly "keep counts" and prune snapshots accordingly. I want to make sure that this is a feature you all would accept before I start working on the integration.

I can elaborate on more details if interested.

corny (Member) commented Feb 24, 2020

Oh yes, please provide some details.

dmke (Member) commented Feb 25, 2020

@ngharo, yes, this is very much a missing feature; it fell off my TODO list.

I'd like the retention policy to be configurable per host, with defaults inherited from the global config (pretty much how the rsync and ssh config propagates).

I don't know whether the zackup prune command should have CLI flags to configure the retention on the fly... I believe it should execute the truncation/pruning according to the plan laid out by the global/host config. (I foresee accidentally deleting the wrong data when juggling command-line arguments, which is really bad when that data is already the backup...)

When thinking about this feature earlier, I wrote down some obstacles somewhere... Let me report back here once I've looked through my notes at work (tomorrow).

ngharo (Contributor, Author) commented Feb 26, 2020

> I'd like the retention policy to be configurable per host, with defaults inherited from the global config (pretty much how the rsync and ssh config propagates).

Agreed! That is something I want to do.

The config I envisioned would look like:

```yaml
ssh:
  ...
rsync:
  ...
retention:
  yearly: 5
  monthly: 6
  weekly: 4
  daily: 7
```

Each number describes the number of snapshots to keep at each given interval.
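
For concreteness, a minimal sketch of how that section could map onto a Go struct (the field names, the pointer-based inheritance, and the package layout are my assumptions, not zackup's actual config code):

```go
package config

// RetentionConfig holds the number of snapshots to keep per interval.
// Pointers distinguish "unset on this host, inherit from the global
// config" from an explicit 0 ("keep none at this interval").
type RetentionConfig struct {
	Yearly  *int `yaml:"yearly"`
	Monthly *int `yaml:"monthly"`
	Weekly  *int `yaml:"weekly"`
	Daily   *int `yaml:"daily"`
}

// inherit fills any unset host-level field from the global defaults.
func (r *RetentionConfig) inherit(global *RetentionConfig) {
	if r.Yearly == nil {
		r.Yearly = global.Yearly
	}
	if r.Monthly == nil {
		r.Monthly = global.Monthly
	}
	if r.Weekly == nil {
		r.Weekly = global.Weekly
	}
	if r.Daily == nil {
		r.Daily = global.Daily
	}
}
```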

> I don't know whether the zackup prune command should have CLI flags to configure the retention on the fly... I believe it should execute the truncation/pruning according to the plan laid out by the global/host config. (I foresee accidentally deleting the wrong data when juggling command-line arguments, which is really bad when that data is already the backup...)

Also agreed. I think there should be one source of truth: the config. The prune command would be for people not running zackup as a daemon. When running as a daemon, like BackupPC does, it would probably make sense to prune automatically when idle, maybe right after backups complete.

Originally, I wanted to port BackupPC's exponential expiry over, but I'm having problems grokking it, and I'm fairly new to Go. Even as a user I find it a little confusing, and I'm not sure it's worth the effort compared to a simplified approach with plain "keep" counts (again, modeled after borg backup's pruning).

dmke (Member) commented Feb 26, 2020

> Each number describes the number of snapshots to keep at each given interval.

Ah, that's a nicer definition than mine: I had envisioned some kind of time bucket list in the form of

```yaml
retention:
- { interval:  "24h", keep:  7 } # 7 daily backups
- { interval:  "48h", keep: 14 } # 14 bi-daily backups
- { interval:   "7d", keep:  4 } # 4 weekly backups
- { interval:  "30d", keep: 12 } # 12 monthly backups
# for the "rest", either:
- { interval: "360d", keep: ∞ }  # keep the rest with one-year gaps
# or:
- { interval: "360d", keep: 10 } # 10 yearly backups, delete anything older
```

where interval is fed into a time.ParseDuration equivalent which also interprets 1d as 24h, allowing for arbitrary buckets. Your predefined buckets make both the configuration and the implementation much easier, though.
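
A tiny sketch of such a parser (parseInterval is a made-up name; Go's own time.ParseDuration doesn't accept a d suffix, hence the wrapper):

```go
package config

import (
	"strconv"
	"strings"
	"time"
)

// parseInterval parses durations like "48h" or "30d". A trailing "d"
// is interpreted as 24h; everything else is handed to
// time.ParseDuration unchanged.
func parseInterval(s string) (time.Duration, error) {
	if strings.HasSuffix(s, "d") {
		days, err := strconv.ParseFloat(strings.TrimSuffix(s, "d"), 64)
		if err != nil {
			return 0, err
		}
		return time.Duration(days * 24 * float64(time.Hour)), nil
	}
	return time.ParseDuration(s)
}
```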

Sidenote

This also allows upgrading the definition (should this ever be needed), as your config example can be easily re-modeled as

```yaml
retention:
- { interval: "365d", keep: 5 } # == yearly: 5
- { interval:  "30d", keep: 6 } # == monthly: 6
- { interval:   "7d", keep: 4 } # == weekly: 4
- { interval:  "24h", keep: 7 } # == daily: 7
```

> it would probably make sense to prune automatically when idle, maybe right after backups complete.

I concur. Creating new snapshots and deleting old ones in parallel while an rsync is running sounds like a lot of load on the ZFS ARC, which should be avoided.


Two notes I have found:

  1. How are the retention buckets stacked?

They can either be consecutive (i.e. bucket i+1 starts after bucket i ends), or they can all start simultaneously. The latter is easier to implement, but (using your config from above) leads to the phenomenon that the weekly: 4 bucket is effectively only 3 weeks long, because the first week is already occupied by the daily: 7 bucket. The former shifts each bucket further back in time (the yearly: 5 bucket would then cover a range of more than 5½ years):

[drawing: bucket-stacking]

(This is just a matter of definition and documentation; there's no right or wrong here. A sketch of both conventions follows below.)
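
To make the two conventions concrete, here is a rough sketch (all names are hypothetical; this is not zackup code):

```go
package prune

import "time"

type bucket struct {
	interval time.Duration // width of one slot, e.g. 24h for "daily"
	keep     int           // number of slots in this bucket
}

// simultaneousStart anchors every bucket at "now"; earlier buckets
// shadow the beginning of later ones, so weekly: 4 effectively covers
// only ~3 weeks when daily: 7 occupies the first week.
func simultaneousStart(now time.Time, b bucket) time.Time {
	return now.Add(-time.Duration(b.keep) * b.interval)
}

// consecutiveStarts lets bucket i+1 begin where bucket i ends; every
// bucket covers its full advertised span, shifted further into the past.
func consecutiveStarts(now time.Time, buckets []bucket) []time.Time {
	starts := make([]time.Time, len(buckets))
	end := now
	for i, b := range buckets {
		start := end.Add(-time.Duration(b.keep) * b.interval)
		starts[i] = start // bucket i covers [start, end)
		end = start       // the next bucket begins where this one ends
	}
	return starts
}
```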

  2. How do we handle rotating a snapshot from one bucket to the next?

This is a purely algorithmic problem: matching a list of snapshots (each with its creation timestamp) against the bucket list. I've matched a drawing to your configuration (same color scheme as above):

[drawing: bucket-aging]

  • Here, we start with 6 daily backups (a, b, c, d, e and f).
  • 1d later we create backup g. The oldest daily backup (a) is not yet in the weekly bucket.
  • That happens the next day (2d on the y axis), where a "rolls into" the next bucket.
  • At that point the first weekly bucket is empty, so a stays.
  • On day 3, we create backup i, and b rolls into the first weekly bucket (which is still occupied by a), so b gets deleted.
  • This continues until day 9, where a rolls into the 2nd weekly bucket and frees the 1st bucket for h.

I might have overlooked something, but this should also cover the case when backups are created more than once daily (the scale is just smaller).

Rolling from the weekly-bucket into the monthly-bucket applies the same principle.

It should also gracefully handle the case where a backup is missing (which would show up as a "hole" in the drawing).
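
Here is a single-bucket sketch of that matching under invented names (a multi-bucket version would roll snapshots that fall past the last slot into the next bucket instead of deleting them):

```go
package prune

import (
	"sort"
	"time"
)

// plan returns the snapshot creation times to delete, given the slot
// width (e.g. 7 days for a weekly bucket) and the number of slots.
func plan(now time.Time, created []time.Time, slot time.Duration, keep int) (del []time.Time) {
	// Oldest first, so the first snapshot seen per slot is the one
	// that "rolled in" earliest; it occupies the slot and survives.
	sort.Slice(created, func(i, j int) bool { return created[i].Before(created[j]) })

	occupied := make(map[int]bool, keep)
	for _, c := range created {
		idx := int(now.Sub(c) / slot) // slot 0 = the most recent interval
		switch {
		case idx >= keep:
			// Older than the whole bucket: expire (or, with multiple
			// buckets, roll into the next one).
			del = append(del, c)
		case occupied[idx]:
			// Slot already holds an older snapshot, like b being
			// deleted on day 3 in the drawing above.
			del = append(del, c)
		default:
			occupied[idx] = true // this snapshot stays
		}
	}
	return del
}
```

Missing backups fall out naturally here: a missing day simply leaves its slot unoccupied, and nothing extra gets deleted.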

ngharo linked a pull request Feb 28, 2020 that will close this issue
ngharo (Contributor, Author) commented Feb 28, 2020

Wow! Thanks for the feedback.

Let me know what you think of #3 so far. You can see how it doesn't allow arbitrary time durations from the user, and how all buckets start simultaneously. It's really stupid simple (maybe too simple...). It's a straight port of how borg backup does pruning; I thought it was a really clever use of time string formatting.
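
For reference, the borg trick is roughly this: format each snapshot's timestamp with a period-specific layout, then keep the newest snapshot per distinct formatted value, up to the configured count. A rough Go sketch of the idea (names invented here, not lifted from #3):

```go
package prune

import (
	"fmt"
	"time"
)

// periodKey formats t so that all snapshots from the same period
// collapse onto the same string.
func periodKey(t time.Time, period string) string {
	switch period {
	case "daily":
		return t.Format("2006-01-02")
	case "weekly":
		y, w := t.ISOWeek()
		return fmt.Sprintf("%d-W%02d", y, w)
	case "monthly":
		return t.Format("2006-01")
	default: // yearly
		return t.Format("2006")
	}
}

// keepers walks snapshots newest-first and keeps the first snapshot of
// each of the `keep` most recent periods; everything else is prunable.
func keepers(newestFirst []time.Time, period string, keep int) map[time.Time]bool {
	kept := make(map[time.Time]bool, keep)
	seen := make(map[string]bool, keep)
	for _, t := range newestFirst {
		if len(seen) >= keep {
			break
		}
		if k := periodKey(t, period); !seen[k] {
			seen[k] = true
			kept[t] = true
		}
	}
	return kept
}
```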

dmke added the enhancement label Mar 2, 2020
dmke changed the title from "Feature: Pruning old snapshots" to "Pruning old snapshots" Mar 2, 2020
dmke (Member) commented Mar 11, 2020

@ngharo, how's it coming? Do you need help?

ngharo (Contributor, Author) commented Mar 24, 2020

Hey @dmke. I haven't had a lot of time to sit down and focus on this. Crazy days we're living in. Hope to get back into it soon.

Hope you and yours are doing well.

dmke (Member) commented Mar 24, 2020

Crazy days indeed. Don't worry too much about this project; it's not important at all.
