Improperly tagged log file due to race condition #185

Open
Bowbaq opened this issue Nov 1, 2016 · 6 comments

@Bowbaq
Contributor

Bowbaq commented Nov 1, 2016

Given a configuration like this:

files:
  - path: /path/to/log/specific.log
    name: fancy_tag
  - path: /path/to/log/*.log

If /path/to/log/specific.log does not exist on startup, the following race condition can happen:
[Image: race condition]

The /path/to/log/specific.log file ends up matching the catch-all glob, and the tag defaults to the filename (i.e. specific). On subsequent iterations of globFiles, the file is marked as already being tailed, so the tag is never updated.

I'm not entirely sure what the correct behavior should be. @snorecone thoughts?

@snorecone
Contributor

Thanks for this @Bowbaq !

I think the globs should be resolved and de-duplicated on startup. #11 is the next issue I had planned to address, which would make the glob behavior more robust. I think solving this issue should be part of that. If you feel so inclined as to give it a shot, that would be great!

@Bowbaq
Contributor Author

Bowbaq commented Nov 8, 2016

If globs are only resolved on startup, how does that work when a file gets created at a matching location after the daemon starts? It seems like it would get ignored, which reduces the usefulness of globs quite drastically.

@johlym
Contributor

johlym commented Nov 8, 2016

@snorecone polling should take care of that ^, yeah?

@snorecone
Contributor

Sorry, I wasn't very clear. I mean to say the globs should be resolved and de-duplicated on startup and every run of the file poller. If an explicit path is given, that tag should take precedence over any other tag given for a glob pattern.

@Bowbaq
Contributor Author

Bowbaq commented Nov 8, 2016

The problem I'm running into is between two glob patterns though. One of them is more "specific" than the other, but I think that'd be hard to determine programmatically.

One possible solution is to say that globs that appear earlier in the config file have precedence over globs appearing later in the config file.
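
As a rough illustration of the "earlier entries win" rule, here is a minimal Go sketch. `FileSpec` and `tagFor` are hypothetical names invented for this example and are not taken from the project's code; the sketch also assumes that a missing tag falls back to the file name without its extension, as described earlier in this thread.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// FileSpec is a hypothetical stand-in for one entry under `files:`.
type FileSpec struct {
	Pattern string // explicit path or glob pattern
	Tag     string // empty means "default to the file name"
}

// tagFor returns the tag of the first spec, in config order, whose
// pattern matches path. Later specs are never consulted once one matches.
func tagFor(specs []FileSpec, path string) (string, bool) {
	for _, s := range specs {
		ok, err := filepath.Match(s.Pattern, path)
		if err != nil || !ok {
			continue // malformed or non-matching pattern: try the next spec
		}
		if s.Tag != "" {
			return s.Tag, true
		}
		// Assumed default: file name without its extension.
		base := filepath.Base(path)
		return strings.TrimSuffix(base, filepath.Ext(base)), true
	}
	return "", false
}

func main() {
	specs := []FileSpec{
		{Pattern: "/path/to/log/specific.log", Tag: "fancy_tag"},
		{Pattern: "/path/to/log/*.log"},
	}
	for _, p := range []string{"/path/to/log/specific.log", "/path/to/log/other.log"} {
		tag, _ := tagFor(specs, p)
		fmt.Println(p, "=>", tag)
	}
}
```

Because the tag is computed from the path and the full list of specs rather than from whichever glob happened to discover the file first, the result does not depend on when the file appears on disk.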

Another problem is that even if the tag was properly resolved on the next poll, you'd get ~1 poll period's worth of logs tagged with the wrong tag, and the rest with the good tag. A possible solution is to re-send whatever was previously mis-tagged (this would have a small overhead of duplicated logs).

@snorecone snorecone added this to the 0.20 milestone Nov 23, 2016
@snorecone
Contributor

@Bowbaq if you could define what the specificity would be for file globs when determining the tag, this would be totally fixable without the problem of:

Another problem is that even if the tag was properly resolved on the next poll, you'd get ~1 poll period's worth of logs tagged with the wrong tag, and the rest with the good tag. A possible solution is to re-send whatever was previously mis-tagged (this would have a small overhead of duplicated logs).

When I say:

I mean to say the globs should be resolved and de-duplicated on startup and every run of the file poller. If an explicit path is given, that tag should take precedence over any other tag given for a glob pattern.

I mean that instead of iterating through the file globs once, it should be done twice: the first time to de-duplicate and determine tags, and the second time to start watching.
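
A minimal Go sketch of that two-pass approach might look like the following. All names here (`FileSpec`, `resolveTags`, `startWatching`, `fakeGlob`) are hypothetical and are not the project's actual API. The glob function is injected so the example runs without touching the filesystem; real code would presumably pass `filepath.Glob` instead. The default tag (file name without extension) is also an assumption based on the behavior described in this thread.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// FileSpec is a hypothetical stand-in for one entry under `files:`.
type FileSpec struct{ Pattern, Tag string }

// isGlob reports whether pattern contains glob metacharacters.
func isGlob(pattern string) bool { return strings.ContainsAny(pattern, "*?[") }

// defaultTag assumes the fallback tag is the file name without extension.
func defaultTag(path string) string {
	base := filepath.Base(path)
	return strings.TrimSuffix(base, filepath.Ext(base))
}

// resolveTags is pass one: expand every spec and keep exactly one tag per
// path. An explicit path overrides any glob; among globs, the first wins.
func resolveTags(specs []FileSpec, glob func(string) ([]string, error)) (map[string]string, error) {
	tags := make(map[string]string)
	for _, s := range specs {
		matches, err := glob(s.Pattern)
		if err != nil {
			return nil, err
		}
		for _, path := range matches {
			tag := s.Tag
			if tag == "" {
				tag = defaultTag(path)
			}
			if _, seen := tags[path]; !seen || !isGlob(s.Pattern) {
				tags[path] = tag
			}
		}
	}
	return tags, nil
}

// startWatching is pass two: begin tailing each file only after every tag
// is settled. Paths are sorted so the order is deterministic.
func startWatching(tags map[string]string, watch func(path, tag string)) {
	paths := make([]string, 0, len(tags))
	for p := range tags {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		watch(p, tags[p])
	}
}

// fakeGlob simulates globbing over an in-memory list of existing files.
func fakeGlob(existing []string) func(string) ([]string, error) {
	return func(pattern string) ([]string, error) {
		var out []string
		for _, p := range existing {
			ok, err := filepath.Match(pattern, p)
			if err != nil {
				return nil, err
			}
			if ok {
				out = append(out, p)
			}
		}
		return out, nil
	}
}

func main() {
	existing := []string{"/path/to/log/specific.log", "/path/to/log/other.log"}
	specs := []FileSpec{
		{Pattern: "/path/to/log/*.log"}, // catch-all listed first on purpose
		{Pattern: "/path/to/log/specific.log", Tag: "fancy_tag"},
	}
	tags, err := resolveTags(specs, fakeGlob(existing))
	if err != nil {
		panic(err)
	}
	startWatching(tags, func(path, tag string) { fmt.Println("watch", path, "as", tag) })
}
```

Running both passes on every poll means a file created after startup still gets its explicit tag, since no file is tailed before all specs have been consulted.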
