Keyword extraction breaks my grammar #2711
-
While experimenting with a general tree-sitter grammar for log files, I've come to the point where I want to extract nodes whether they are within parentheses or not.
A simple grammar that does exactly what I want is this one:

    rules: {
      log_file: $ => repeat(
        choice(
          $.apple,
          $.apple_parenthesis,
        )
      ),
      apple: $ => "apple",
      apple_parenthesis: $ => seq('(', $.apple, ')'),
    }

That's great, but if I add some words with "apple" in them, they also get selected. For instance, the "bad_apple" in this log also gets selected:
If I understand well, the way to emulate "word_bound"/

    word: $ => $.word,
    rules: {
      log_file: $ => repeat(
        choice(
          $.apple,
          $.apple_parenthesis,
          $.word,
        )
      ),
      apple: $ => "apple",
      apple_parenthesis: $ => seq('(', $.apple, ')'),
      word: $ => /\S+/,
    }

With this
Obviously, something like choice("apple", "(apple)") will select what I want, but my goal is to access the inner apple for syntax highlighting. Could you please help me select only the good apples?
-
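Hi! I spent a little time working on this and I think I figured it out :). I think the issue was your use of `\S` for `word`, as that seems to grab the parentheses in `(apple)`. I used a regex for `word` below to match any non-whitespace character besides '(' and ')', but you could also potentially use something like `word: $ => /[a-zA-Z0-9_]+/`.

    module.exports = grammar({
      name: 'apple',
      word: $ => $.word,
      rules: {
        log_file: $ => repeat(
          choice(
            $.apple,
            $.apple_parens,
            $.word
          )
        ),
        apple: $ => "apple",
        apple_parens: $ => seq('(', $.apple, ')'),
        word: $ => /[^\s()]+/
      }
    });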
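To make the difference between the candidate `word` patterns concrete, here is a small plain-JavaScript sketch. It does not call tree-sitter at all. It only uses sticky regexes to show what each pattern would consume at a given position, on the assumption that the lexer prefers the longest token it can match there. The `patterns` object and the `matchAt` helper are hypothetical names made up for this illustration.

```javascript
// Candidate patterns for the `word` token discussed in this thread.
// The `y` (sticky) flag makes exec() match only at `lastIndex`,
// which roughly mimics a lexer trying a token at one fixed position.
const patterns = {
  nonWhitespace: /\S+/y,         // the original attempt
  noParens: /[^\s()]+/y,         // excludes '(' and ')'
  identifier: /[a-zA-Z0-9_]+/y,  // stricter alternative
};

// Hypothetical helper: return the text `regex` matches starting exactly
// at `index`, or null if it cannot match there.
function matchAt(regex, text, index) {
  regex.lastIndex = index;
  const m = regex.exec(text);
  return m ? m[0] : null;
}

for (const input of ["(apple)", "bad-apple"]) {
  for (const [name, regex] of Object.entries(patterns)) {
    console.log(
      JSON.stringify(input),
      name,
      "at 0:", matchAt(regex, input, 0),
      "| at 1:", matchAt(regex, input, 1)
    );
  }
}
```

With `/\S+/`, the pattern consumes all of `(apple)` from position 0 and `apple)` from position 1, so a standalone `apple` never appears between the parentheses. The other two patterns cannot start on `(` and stop before `)`, which leaves exactly `apple`. They differ on input like `bad-apple`: `/[^\s()]+/` takes the whole string as one word, while `/[a-zA-Z0-9_]+/` stops at the hyphen.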
-
Thanks for your help! I also ended up with the same solution on my side; see this grammar.