rg allocates too much memory with: rg --files --ignore-file ~/.ultimate-gitignore
#2750
This appears to be a regression from ripgrep 13:
My bet is that it's related to rust-lang/regex#1116. And that in turn is probably related to the regex engine rewrite that landed in ripgrep 14.
Ah, thanks Andrew. It may be worth noting that: Repeat runs of: But repeat runs of: That is odd. both
Please tick this box to confirm you have reviewed the above.
What version of ripgrep are you using?
How did you install ripgrep?
sudo pacman -S ripgrep
(Arch Linux package manager)
What operating system are you using ripgrep on?
Arch Linux on WSL2
Describe your bug.
The ripgrep process uses 7.1 GiB of memory and is really slow (before being killed by the OS for allocating too much RAM) when you run
rg --files --ignore-file ~/.ultimate-gitignore
where .ultimate-gitignore is a huge .gitignore file: 16332 lines (304KB), or 9298 entries once empty lines and comments are removed.
This "Ultimate .gitignore" file is created by running
cat * > .ultimate-gitignore
in this directory: github.com/toptal/gitignore/templates
What are the steps to reproduce the behavior?
No, the corpus is not too big: it is only 2 files and a .git directory. I made sure to test on a small corpus, but my original use case was to run this command anywhere on disk, e.g. in the
$HOME
directory.
You can get these 2 files and the .git directory by cloning this repo: wis/killall-for-Windows
Again, the command is:
rg --files --ignore-file ~/.ultimate-gitignore
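For anyone reproducing this, it may help to confirm how many effective patterns the combined ignore file holds before running rg. The following is a minimal sketch, assuming that blank lines and lines whose first character is `#` are the only ones that do not count as entries. `count_entries` is a hypothetical helper name, not part of ripgrep or git.

```shell
#!/bin/sh
# count_entries: hypothetical helper (not part of ripgrep or git).
# Prints the number of lines in an ignore file that are neither blank
# (empty or whitespace-only) nor comments (first character is '#').
# Assumption: an escaped leading hash ('\#') is a real pattern and is counted.
count_entries() {
    # -v inverts the match, -c prints only the count of remaining lines.
    grep -v -E -c '^[[:space:]]*$|^#' "$1"
}
```

Running `wc -l ~/.ultimate-gitignore` and `count_entries ~/.ultimate-gitignore` gives two numbers that can be compared with the 16332 lines and 9298 entries reported above. The exact values depend on the state of the templates directory at the time the file is built.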
What is the actual behavior?
command:
output:
The command runs for 10 seconds, during which the process allocates 7.1 GiB and averages 50% CPU utilization.
What is the expected behavior?
List all the files in the current directory, excluding the ones that match the patterns in the
.ultimate-gitignore
file.
EDIT: I tried using
fd
, as Andrew recommends here:
but it also suffers from the same issue: the process allocates too much memory and gets killed by the OS.
I am starting to think this issue is fundamentally unfixable, given the nature of ripgrep's (and fd's) implementation.
I don't think you can load and compile this many patterns into memory and still have this guarantee from the regex crate:
Either ripgrep keeps using this fast regex implementation, which seems to me to trade memory for speed,
or ripgrep is able to run this command, or any command with this many patterns.
It's like:
—pick one
EDIT2: I read the section on memory in the manpage, as recommended by Andrew in this comment here:
...and I then ran the command above with the
-j1
flag, as recommended in the manpage.
The command worked: it finished and exited successfully, but it was quite slow, even on a directory as small as the one mentioned above, with 2 files and 1 directory in it.
It took 0.9 seconds to finish on average, whereas
rg --files
takes 0.025 seconds (25 milliseconds) on average.
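Averages like the ones above could be measured with something along these lines. This is only a sketch: it assumes GNU `date` (for the `%N` nanosecond format) and `awk` are available, and `avg_seconds` is a hypothetical helper name, not something shipped with ripgrep.

```shell
#!/bin/sh
# avg_seconds: hypothetical helper (not part of ripgrep).
# Usage: avg_seconds RUNS COMMAND [ARGS...]
# Runs COMMAND the given number of times, discarding its output, and prints
# the mean wall-clock time per run in seconds with millisecond precision.
# Assumption: GNU date, which supports %N (nanoseconds).
avg_seconds() {
    runs=$1
    shift
    start=$(date +%s.%N)
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s.%N)
    # awk does the floating-point arithmetic that POSIX sh lacks.
    awk -v s="$start" -v e="$end" -v n="$runs" \
        'BEGIN { printf "%.3f\n", (e - s) / n }'
}
```

For example, `avg_seconds 10 rg --files -j1 --ignore-file ~/.ultimate-gitignore` can be compared against `avg_seconds 10 rg --files`. The loop overhead is included in the result, so very short commands will read slightly high.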