perf: alias scanning in files #408

Closed
wants to merge 7 commits into from

jazanne commented Nov 3, 2023

Improve performance of alias scanning in files in two ways:

  1. Cache the results of doublestar.FilepathGlob(glob) so that consecutive calls with the same pattern can reuse them
  2. Swap out Go's standard regexp package for https://github.com/wasilibs/go-re2
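
A minimal sketch of the caching idea in change 1, not the code from this PR: a small wrapper memoizes the matches for each pattern, so a repeated pattern skips the filesystem walk. The names globFunc, globCache, newGlobCache, and defaultCache are hypothetical, and the standard library's filepath.Glob stands in for doublestar.FilepathGlob (depending on the doublestar version, that function may need a small closure to match this signature).

```go
package main

import (
	"path/filepath"
	"sync"
)

// globFunc is the (assumed) shape of the underlying glob call:
// one pattern in, the matching paths out.
type globFunc func(pattern string) ([]string, error)

// globCache memoizes glob results per pattern. Hypothetical type,
// not taken from the PR.
type globCache struct {
	mu      sync.Mutex
	glob    globFunc
	results map[string][]string
}

func newGlobCache(glob globFunc) *globCache {
	return &globCache{glob: glob, results: make(map[string][]string)}
}

// Glob returns the cached matches for pattern, calling the underlying
// glob function only the first time a pattern is seen.
func (c *globCache) Glob(pattern string) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if matches, ok := c.results[pattern]; ok {
		return matches, nil
	}
	matches, err := c.glob(pattern)
	if err != nil {
		// Errors are not cached, so a later call retries the pattern.
		return nil, err
	}
	c.results[pattern] = matches
	return matches, nil
}

// defaultCache uses filepath.Glob as a stand-in for the doublestar call.
var defaultCache = newGlobCache(filepath.Glob)
```

A caller would use defaultCache.Glob(pattern) wherever it previously called the glob function directly. The sketch assumes files do not change during a single scan, since cached matches would otherwise go stale, and it holds the lock during the walk, which keeps the code simple at the cost of serializing concurrent lookups.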

With only change 1, the tool still ran for over 30 min in one repo (I stopped it before it finished); adding change 2 cut the time to under 10 min. Similarly, with only change 2, the run took over 20 min (I stopped it before it finished).

This isn't a total fix for scanning performance, which still depends on the size of the repo and the wildcard patterns used, but it certainly helps.

UPDATE:
Pulling back on the re2 usage since the build fails. We can revisit that later.

jazanne linked an issue Nov 3, 2023 that may be closed by this pull request
jazanne marked this pull request as ready for review November 3, 2023 18:02
jazanne requested review from a team as code owners November 3, 2023 18:02
@@ -0,0 +1 @@
EVEN_WILDER = 'wildFlag'

😂

ld-kyee left a comment

Can't say I understand entirely how this all works but the code seems 💯

mmrj left a comment

Approving to allow work to proceed. The Docs team is the code owner of *.md files in this repo, so it is auto-tagged here, but I did not review the vendor *.md files carefully.


jazanne commented Nov 13, 2023

Unfortunately the build is failing now that we're using re2, so I'm going to revert that change.

jazanne marked this pull request as draft February 13, 2024 14:42
jazanne closed this Mar 19, 2024
jazanne deleted the jwhite/globby branch March 19, 2024 17:25

Successfully merging this pull request may close these issues.

Tool gets stuck when defining glob filepattern on big repo