-
Notifications
You must be signed in to change notification settings - Fork 662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
appender: prune old files at startup #2966
base: master
Are you sure you want to change the base?
Conversation
The prune_old_logs function only gets called when we rollover the appender. The problem is that at startup we create an initial log file without rolling over, if e.g., the executable then exits before rolling over for the first time, we never call prune_old_logs. E.g.: If we have a tracing-appender that rolls over every minute and we have a max_files parameter of 2. Running the program (which prints a single log line and exits) every minute, will result in a new log file everytime without deleting the old ones. Fixes: tokio-rs#2937
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah cool, I've got an app that's affected by this bug — it's behind a scheduling proxy that starts and kills it according to demand, and it never gets to live for an entire log rollover cycle.
User expectations and correctness
So here's the range of cases, with problematic or unexpected behaviors in bold. "Long" lifetime means the app lives for many rotation periods in a row, "short" means it lives for less than a full rotation period.
App lifetime | First startup in a Rotation |
Subsequent startups in same Rotation |
Awake during rollover |
---|---|---|---|
Long (current) | Files can grow to max + 1 |
n/a | Files pruned to max |
Short (current) | Files can grow unbounded | Files unchanged | n/a |
Long (this PR) | Files pruned to max | n/a | Files pruned to max |
Short (this PR) | Files pruned to max | Files pruned to max - 1 |
n/a |
This is because:
- Current behavior never prunes unless a rollover hits while the app is running.
- This PR always prunes on startup.
- The
prune_old_logs
method unconditionally ensures that there's one fewer than max files, because it assumes it's being guarded by a "should roll over" check and we're about to create a new file and go back to max. (But for an app that might start and die many times within the sameRotation
period, that assumption is only correct for the first startup within a given Rotation.)
On consideration, I think it's fine to occasionally prune to max - 1
— after all, it's a maximum, not a guaranteed count. However, this needs an edit to documentation so users can calibrate their expectations — if you must retain N days of logs, then max = N + 1!
Performance concerns
Pruning in Inner::new()
will technically work, but it increases the amount of blocking filesystem work that has to be done when creating the writer.
What do you think about deferring the initial prune until it performs its first write after creation? That way, blocking-sensitive applications (like an edge function that needs fast cold startup time) can still take advantage of this crate's NonBlockingWriter to push that pruning cost off the critical path.
It looks like everything's already set up to do that properly, IF Inner::should_rollover
happens to return an affirmative on the first check. I think you can provoke this by changing line 531 of Inner::new()
to be let next_date = rotation.round_date(&now);
— the inner will still start out with the correct writer filename, but refresh_writer
will get called on the first write and will immediately re-open that correct file after doing a prune.
This of course will need a comment in the code to explain why the "next" date is starting out in the past.
Thanks for this thorough writeup!
This is kind of where my approach has a flaw. Because the prune_old_logs function expects to be called before refresh_writer (which creates a new file) it truncates the files to n-1 (as it is written in a comment in the function). BUT as you said, if we document this I would be fine with it.
This will call create_writer twice and do this (for two log lines):
instead of this:
And for a max_log_files of 2 it will never create more than 1 file. |
Yeah, gotta admit the extra thrash from doing create_writer twice on the first write isn't ideal. 🤨 I feel like there's got to be an elegant design for this, but ah well. At any rate, I'd accept "purge on startup" as long as it was accompanied by a docs change. Any performance hit is likely negligible. |
@nfagerlund Added a commit with the documentation update. I don't really like my explanation and wording, so please correct me! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure thing 👍🏼
Co-authored-by: Nick Fagerlund <[email protected]>
Thanks! |
The prune_old_logs function only gets called when we rollover the appender. The problem is that at startup we create an initial log file without rolling over, if e.g., the executable then exits before rolling over for the first time, we never call prune_old_logs.
E.g.: If we have a tracing-appender that rolls over every minute and we have a max_files parameter of 2. Running the program (which prints a single log line and exits) every minute, will result in a new log file everytime without deleting the old ones.
Fixes: #2937