I'm trying to formulate a generic Tree-sitter injection query that mimics PyCharm's language injection comments. These are Python comments of the following form:
# language=<language name>
When located directly above a string literal, they specify the language that is to be injected into the string. For example, in the snippet below, the string literal assigned to the variable a will be highlighted as a Jinja2 template.
# language=jinja
a = """
{% for i in list %}
{{ i }}
{% endfor %}
"""
If I only cared about injecting Jinja, this Tree-sitter query would suffice:
But there are several other languages I'd like to be able to inject, such as HTML, XML, regex, and SQL. The most obvious approach would be to duplicate the query above for each language, replacing 'jinja' with the corresponding language name.
In maintainability terms, though, that doesn't spark joy. Unless I generate my queries from a template, I'd be duplicating a lot of query code that might need minor tweaks later on. Also, if in theory I wanted to make all languages injectable, then I'd have hundreds of injection queries. (I haven't tested if this would have an impact on performance.)
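For what it's worth, the "generate my queries from a template" route could look roughly like the sketch below. The query text in the template is a hypothetical reconstruction, not the exact query from this post: the node types (`comment`, `expression_statement`, `assignment`, `string`) and the capture name `@language-hint` are assumptions based on the `#eq?` line shown further down, and would need checking against the actual Python grammar.

```python
# Hypothetical template: node types and capture names are assumptions,
# not the original query from this post.
QUERY_TEMPLATE = """\
((comment) @language-hint
 .
 (expression_statement
   (assignment right: (string) @injection.content))
 (#eq? @language-hint "# language={lang}")
 (#set! injection.language "{lang}"))
"""

# The languages mentioned above; extend as needed.
LANGUAGES = ["jinja", "html", "xml", "regex", "sql"]


def render_injection_queries(languages):
    """Return one injection query per language, joined into a single file body."""
    return "\n".join(QUERY_TEMPLATE.format(lang=lang) for lang in languages)


if __name__ == "__main__":
    # Print the generated queries; redirect to an injections file if desired.
    print(render_injection_queries(LANGUAGES))
```

This keeps a single source of truth for the query shape, but it still produces one query per language, so it doesn't answer the question of whether hundreds of queries hurt performance.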
Instead, I'd much rather be able to express my goal using a single generic query. This is where I'm stuck.
Ideally, instead of this...
(#eq? @language-hint "# language=jinja")
... I could do something like this...
(#match? @language-hint "# language=([a-z]+)")
... and in #set! injection.language refer back to the language name captured by the regex.
However, I can scrap this idea almost immediately, with 'backrefs to capture groups' appearing nowhere in the Tree-sitter docs. Onto the next best thing.
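To make concrete what such a back-reference would have to compute, here is the same idea in plain Python. This is only an illustration of the desired behaviour, not something Tree-sitter queries can do; the function name `language_from_hint` is made up for the example.

```python
import re

# Same pattern as in the wished-for #match? predicate above.
LANGUAGE_HINT = re.compile(r"# language=([a-z]+)")


def language_from_hint(comment_text):
    """Return the language named in a hint comment, or None if it is not a hint."""
    match = LANGUAGE_HINT.fullmatch(comment_text.strip())
    # Group 1 is the part a "backref" would feed into injection.language.
    return match.group(1) if match else None


if __name__ == "__main__":
    print(language_from_hint("# language=jinja"))
    print(language_from_hint("# just a regular comment"))
```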
As always, let's have a look at how Neovim does it. I've come across several discussion threads where the solution seems to be to use the directive offset!, which lets one capture a substring of a node's content, e.g. the name of the language specified in the language injection comment.
With offset!, my language-specific query can be made generic:
Unfortunately, I can find no reference to this directive in the Tree-sitter Rust bindings or in the Helix codebase. It would appear that it, along with several others, is specific to Neovim.
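As far as I understand it, offset! shifts the start and end of a captured node's range by row and column deltas. The sketch below mimics only the column part on a single-line node's text, to show why it would be enough here. The helper `apply_column_offset` is hypothetical and is not how Neovim implements the directive.

```python
def apply_column_offset(node_text, start_col_delta, end_col_delta=0):
    """Trim a single-line node's text by column deltas, as a range offset would.

    A positive start delta drops leading characters; a negative end delta
    drops trailing ones. Multi-line nodes are out of scope for this sketch.
    """
    end = len(node_text) + end_col_delta
    return node_text[start_col_delta:end]


PREFIX = "# language="
comment = "# language=jinja"

# Skipping the fixed-length prefix leaves just the language name.
language = apply_column_offset(comment, len(PREFIX))

if __name__ == "__main__":
    print(language)
```

Because the `# language=` prefix has a fixed length, a constant start offset is all that's needed to turn the comment node into the language name, which is what makes the query generic.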
My questions:
Has anyone else tackled this problem?
Have I maybe missed something obvious? (Bear in mind, I'm pretty new to Tree-sitter; this is all trial and error)
And finally, does anyone see a future where some of those sweet Neovim-specific TS directives could be ported to Helix?