[ignore]: Support absolute paths in global ignore files (via add_ignore). #2366

Closed
tmccombs opened this issue Dec 4, 2022 · 3 comments
Labels
wontfix A feature or bug that is unlikely to be implemented or fixed.

Comments

@tmccombs

tmccombs commented Dec 4, 2022

Describe your feature request

It is sometimes desirable to add a global ignore rule using an absolute path. Say for example I always wanted to ignore everything in /home/myuser/.cache.

.gitignore doesn't have a mechanism to do this, because git doesn't really care about anything outside of the git repository (at least for the context of .gitignore). However, for more general tools such as ripgrep (and fd) it does make sense to want to ignore something based on an absolute path, at least for global ignore files (such as ~/.config/fd/ignore).

I see a couple ways of addressing this:

  1. Add some new syntax which is only supported in "global" ignore files (or at least not git ignore files) to indicate that a pattern should be matched against the full absolute path of the file. Maybe if the pattern starts with "//"?
  2. If the pattern starts with a "/" in a global ignore file, then treat it as an absolute path pattern. This would be backwards compatible, unless there was a new API for adding an ignore file that had this behavior (and maybe ripgrep wouldn't use this, or if it did, it would use a new flag for it?).

See sharkdp/fd#1150 (comment)

@BurntSushi
Owner

Say for example I always wanted to ignore everything in /home/myuser/.cache.

Could you please say why you can't use a .ignore file? e.g.,

$ echo '/.cache' > /home/myuser/.ignore

Also, ripgrep will ignore .cache by default because it is a hidden directory.

If the pattern starts with a "/" in a global ignore file, then treat it as an absolute path pattern. This would be backwards compatible

It is not backwards compatible. Starting a glob pattern in a gitignore file with / anchors the pattern to the current working directory of the search. That's true in global gitignore files just as much as any other.

Overall, I don't see a compelling use case here. And even if there were a good use case, it would have to be very compelling because this is a rather complex feature to implement in an already complex part of ripgrep. Namely, if you have an absolute glob pattern, then you also need to turn every file path you're searching into an absolute path. That has a non-trivial cost, which means ripgrep would want to avoid turning everything into an absolute path if it didn't have to. Which means it would have to scan gitignore files for absolute patterns before starting the search. Which is just not great.
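
To make the cost argument above concrete, here is a minimal sketch of the "scan first, absolutize only if needed" flow. It is not how ripgrep or the ignore crate work today. The `//` prefix is only the hypothetical syntax floated in this issue, and `ABSOLUTE_SIGIL`, `needs_absolute_paths`, `absolutize`, and `prepare_paths` are made-up names for illustration.

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical marker for "match against the absolute path".
/// This is only the syntax suggested in this issue; neither ripgrep
/// nor the ignore crate understands it.
const ABSOLUTE_SIGIL: &str = "//";

/// Pre-scan step: does any pattern request absolute matching?
fn needs_absolute_paths(patterns: &[&str]) -> bool {
    patterns.iter().any(|p| p.starts_with(ABSOLUTE_SIGIL))
}

/// Make a path absolute purely lexically by joining it onto `cwd`.
/// No symlink resolution and no `..` normalization is attempted.
fn absolutize(cwd: &Path, path: &Path) -> PathBuf {
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        cwd.join(path)
    }
}

/// Convert candidate paths only when the pre-scan says it is necessary.
fn prepare_paths(patterns: &[&str], paths: &[&str]) -> io::Result<Vec<PathBuf>> {
    if !needs_absolute_paths(patterns) {
        // Fast path: keep the paths exactly as they were given.
        return Ok(paths.iter().map(|p| PathBuf::from(*p)).collect());
    }
    // Slow path: every candidate path pays for the conversion.
    let cwd = env::current_dir()?;
    Ok(paths.iter().map(|p| absolutize(&cwd, Path::new(*p))).collect())
}
```

With only relative patterns such as `/.cache`, `prepare_paths` returns the paths untouched. A single `//`-prefixed pattern forces the conversion for every path, which is the overhead described above.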

@BurntSushi closed this as not planned on Dec 4, 2022
@BurntSushi added the wontfix label on Dec 4, 2022
@tmccombs
Author

tmccombs commented Dec 5, 2022

Could you please say why you can't use a .ignore file? e.g.,

I can think of a few reasons:

  1. You don't have write access to the necessary directory. Say you wanted to ignore /var/log/noisy-app, but didn't have root access on the system.
  2. You want your ignore rules in a more centralized location, possibly to make it easier to check it in to a git repo or similar.
  3. You don't want to clutter the parent folder with a hidden file.

But these are just hypotheses. See below.

It is not backwards compatible

Sorry, that was a typo. I meant to say that it is backwards incompatible. 

Starting a glob pattern in a gitignore file with / anchors the pattern to the current working directory of the search. That's true in global gitignore files just as much as any other.

First of all, "/" in gitignore usually anchors to the location of the .gitignore file. But for global gitignores it anchors to the root of the current git repository. The current working directory doesn't matter beyond determining the current git repo. Anchoring to the current directory doesn't actually seem very useful to me. If I have a rule to ignore "/foo/bar" in my global ignore file, and there is a file at "/a/b/foo/bar/c/d", then the current behavior is that if I run rg in "/a/b" it will be ignored, but if I run it from "/a" or "/a/b/foo/", then it won't be ignored. Maybe there are use cases for that, but that seems less useful than being able to say that I want to ignore "/a/b/foo/bar" regardless of where I run rg from. Or even have a pattern like "/**/foo/bar".
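
The "/a/b/foo/bar/c/d" example can be sketched with plain path operations. This is a toy model that treats the pattern as a literal path prefix anchored at the search root. It is not the ignore crate's matcher, and `anchored_match` is a made-up name.

```rust
use std::path::Path;

/// Returns true if `file` falls under the anchored pattern `pattern`
/// (for example "/foo/bar") when the pattern is anchored at `search_root`.
/// Only literal path prefixes are handled; real gitignore globs are richer.
fn anchored_match(search_root: &Path, pattern: &str, file: &Path) -> bool {
    // Strip the leading "/" so the pattern becomes relative to the root.
    let relative_pattern = pattern.trim_start_matches('/');
    match file.strip_prefix(search_root) {
        // The file is inside the search root: compare component-wise.
        Ok(relative_file) => relative_file.starts_with(relative_pattern),
        // The file is outside the search root entirely.
        Err(_) => false,
    }
}
```

Under this model, the file is matched when the search root is "/a/b" but not when it is "/a" or "/a/b/foo", which mirrors the behavior described above.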

Also, of the two, I would prefer having a distinct syntax for absolute patterns (maybe some sigil at the beginning that indicates the remainder should be matched against the absolute path; I suggested "//" because that seems unlikely to appear at the beginning of an existing pattern). I included the option of a backwards incompatible change to how patterns starting with "/" are matched mainly for completeness.

Namely, if you have an absolute glob pattern, then you also need to turn every file path you're searching into an absolute path. That has a non-trivial cost, which means ripgrep would want to avoid turning everything into an absolute path if it didn't have to. Which means it would have to scan gitignore files for absolute patterns before starting the search. Which is just not great.

That's a legitimate concern. I don't really see a way to avoid increasing the complexity, and if the increase would be significant, maybe it isn't worth it. But I would like to point out a few things:

  • I am not proposing that this would be used for gitignore files. This would only be for non-git ignore files that are treated as global, such as ~/.config/fd/ignore for fd, or maybe a file passed to ripgrep with --ignore-file (or possibly via a new flag).
  • This is primarily a request to have an API that would allow doing this with the ignore crate used as a library, even if ripgrep itself doesn't make use of it.
  • Given the limited scope mentioned above, the ignore files would need to be parsed before starting the search anyway.

Finally, I haven't personally run into a need for this. However, this has come up a few times for fd. Here and here (actually that one would impact the overrides API, but similar idea) and here. And since fd uses the ignore crate, it would be rather difficult to support the feature there without something changing in the ignore crate.

That last one actually points out something interesting. If the starting path passed in to WalkBuilder::new is an absolute path, then patterns written as absolute paths do match, and the matching files are ignored. And IMO at least, the fact that ignore matches ignore patterns differently depending on whether the path is passed in as "." or "/home/myuser" is rather unexpected.

@BurntSushi
Owner

BurntSushi commented Dec 5, 2022

And IMO at least, the fact that ignore matches ignore patterns differently depending on whether the path is passed in as "." or "/home/myuser" is rather unexpected.

It is consistent with how greps behave. Notice, for example, that if you use an absolute file path as an input, then ripgrep will emit absolute file paths in the search result output. Indeed, paths are dealt with almost purely symbolically. This is also the reason why absolute glob patterns don't work when one provides a relative file path.
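
As a rough illustration of what "symbolically" means here, the sketch below compares a pattern against the path exactly as it was supplied, with no lookup of the current directory. It is a simplified model using a literal prefix instead of a glob, and `symbolic_match` is a made-up name, not something in the ignore crate.

```rust
use std::path::Path;

/// Purely symbolic check: compare the path exactly as the walker was
/// given it, with no current_dir() lookup and no canonicalization.
/// A literal prefix stands in for a real glob to keep the sketch small.
fn symbolic_match(absolute_pattern: &str, path_as_given: &Path) -> bool {
    path_as_given.starts_with(absolute_pattern)
}
```

Given the pattern "/home/myuser/.cache", the path "/home/myuser/.cache/pip/log" matches, while "./.cache/pip/log" does not, even if both name the same file on disk.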

Bottom line here is that I've been saying that the ignore crate needs to be completely rewritten for a long time now. Its current state is basically not much better than a proof of concept. I just haven't had the time or bandwidth to come back around to the project. When I do, I'll consider issues like this one, but I would say it's unlikely to happen. It's a somewhat niche concern, although I confess it would be nice if all possible sensible things would "just work."
