[REQUEST] Rich Should Accept Highlights as re.compiled re.Patterns and Use them Internally #3345
Comments
You're only changing when the regexes are compiled. Either you do it the first time you use it, or you do it at import time. Once compiled, there is going to be a negligible difference between the two approaches. I wouldn't want the builtin highlighters to use the pre-compiling approach, because startup time for CLIs is a concern. But if you want to PR the change to |
Hi Will, Thanks for taking the time to review the issue and make a comment. The whole time I was doing the writeup I kept trying to figure out what I was missing and the startup for CLIs is definitely it. That makes complete sense. I'll make the PR for |
Rich should take advantage of the potential speed increases from compiled regular expressions, using the `re.compile` function in the stdlib `re` module.

I have created a fork here: https://github.com/PyWoody/rich/tree/re_compiled that has the changes in place for demoing.

Using the EmailHighlighter example from the docs, a new Highlighter instance could be created like so:
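A minimal sketch of what such a highlighter could look like, assuming the email regex from the Rich docs example. To keep the snippet runnable with only the standard library, the class is a plain stand-in; in Rich it would presumably subclass `rich.highlighter.RegexHighlighter`. The `find_spans` helper is hypothetical and not part of Rich's API; it only illustrates that a compiled pattern can be handed to `re.finditer` where a string would normally go.

```python
import re


class EmailHighlighter:
    """Stand-in for a rich.highlighter.RegexHighlighter subclass (assumption)."""

    base_style = "example."
    # Compiled once, when the class is defined, instead of passing a raw string.
    highlights = [re.compile(r"(?P<email>[\w-]+@([\w-]+\.)+[\w-]+)")]


def find_spans(highlighter, text):
    """Hypothetical helper: yield (style, start, end) for each named group."""
    for pattern in highlighter.highlights:
        # re.finditer accepts either a str or an already compiled re.Pattern.
        for match in re.finditer(pattern, text):
            for name in match.groupdict():
                start, end = match.span(name)
                if start != -1:  # skip named groups that did not participate
                    yield f"{highlighter.base_style}{name}", start, end


spans = list(find_spans(EmailHighlighter(), "Send funds to money@example.org"))
print(spans)
```

Only the `highlights` entry differs from the string-based version: the regex is wrapped in `re.compile`.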
Note, the above example will already work in the default version because `re.finditer` accepts either a `re.Pattern` or a string and resolves it to a `re.Pattern`, as shown here: https://github.com/python/cpython/blob/3.12/Lib/re/__init__.py#L219, but it does not save it for re-use. The `_compile` function in `re` will do some caching automatically, as shown here: https://github.com/python/cpython/blob/3.12/Lib/re/__init__.py#L280, but it will be called every single time `rich.text.Text.highlight_regex` is called, versus just saving the compiled version yourself.

The more regular expressions a Highlighter uses, the more `re.Pattern`s will be cached, further allowing speed increases. For instance, the `rich.highlighter.ISO8601Highlighter`, found updated here: https://github.com/PyWoody/rich/blob/re_compiled/rich/highlighter.py#L144, has a considerable speed increase compared to the default version.

The major caveat will be for custom Highlighters that use strings exclusively. There will be a marginal speed decrease in these situations, as each call will need an `isinstance` check and an on-demand `re.compile`. This is evident in the `highlight_regex` method of the `rich.text.Text` class, found updated here: https://github.com/PyWoody/rich/blob/re_compiled/rich/text.py#L615. In my testing, the decrease was small enough that it was difficult to distinguish from the noise.

The net-net is basically: using `re.compile` for default Highlighters is a free win, people that want to use `re.compile` in their custom highlighters get the speed boost, and existing Highlighters out in the wild, or people that want to use strings exclusively, only receive a marginal speed decrease.
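The trade-off described above can be sketched with the standard library alone. `iter_matches` is a hypothetical stand-in for the regex lookup inside `Text.highlight_regex`, not the fork's actual code: strings pay for an `isinstance` check plus an on-demand `re.compile`, while compiled patterns are used directly.

```python
import re
from typing import Iterator, Union


def iter_matches(re_highlight: Union[str, re.Pattern], plain: str) -> Iterator[re.Match]:
    """Hypothetical stand-in for the regex lookup inside Text.highlight_regex."""
    if isinstance(re_highlight, str):
        # String input: compiled on demand (re's internal cache still applies).
        re_highlight = re.compile(re_highlight)
    # Compiled input: used as-is, with no per-call lookup through re's cache.
    return re_highlight.finditer(plain)


DATE = r"(?P<date>\d{4}-\d{2}-\d{2})"
SAMPLE = "2024-01-02 to 2024-03-04"

# Both call styles produce the same matches; only the per-call overhead differs.
from_str = [m.group("date") for m in iter_matches(DATE, SAMPLE)]
from_pattern = [m.group("date") for m in iter_matches(re.compile(DATE), SAMPLE)]
print(from_str == from_pattern)
```

Existing string-based highlighters keep working unchanged under this shape, which is why the cost to them is limited to the extra type check.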