Replies: 1 comment
-
Tree-sitter can rescan a token with the internal lexer if an external scanner declines to scan it by returning false. See #922 for more details. Could this feature be used to build the notification logic for your use cases?
-
In my grammar, I parse a number of context-sensitive constructs from an external scanner. These constructs interact with "normal" language constructs defined in the grammar itself in various ways; usually, the "normal" constructs delimit the context-sensitive ones (for example, parentheses of various sorts, or `IF expr THEN` where `expr` can be a context-sensitive expression, and so forth). So I end up with a design where I have to keep the grammar parser state and the external scanner state in sync. I've been handling this by moving various keywords associated with the start or end of "normal" constructs into the external scanner, but I've found this bloats the parser size by quite a bit; in one dramatic example, moving a single two-character keyword token from my grammar to my external scanner instantly bloated the parser from 30 MB to 56 MB.

One possible solution is to add another API to external scanners, so in addition to Create/Destroy/Serialize/Deserialize/Scan, there's also a Notify method with the following API:
and in the grammar.js, there is a `notify` field with a list of tokens (probably restricted to terminals):

where after the parser encounters one of those terminal symbols, it calls the Notify function on the external scanner with the `int32_t token` parameter set to the index of that symbol in the `notify` list. The external scanner can then cast `token` to an enum to figure out which token was encountered, then modify its state accordingly.

Anyway, this is just an idea that's been bouncing around my head, as I often run across places where I could use it. It was inspired by this bug in my grammar, although I would also use it in other places (1, 2). What do people think?
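
To make the idea concrete, here is a minimal C sketch of what the scanner side of such a hook might look like. Everything in it is hypothetical: Notify is a proposed hook, not an existing Tree-sitter API, and the function name `tree_sitter_mylang_external_scanner_notify`, the `Scanner` struct, and the enum members are made up for illustration. Only the `int32_t token` parameter and the cast-to-enum step come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: the order is assumed to mirror a `notify` list in
 * grammar.js such as ["IF", "THEN", "(", ")"]. */
enum NotifyToken {
    NOTIFY_IF,
    NOTIFY_THEN,
    NOTIFY_LPAREN,
    NOTIFY_RPAREN,
};

/* Hypothetical scanner state that has to stay in sync with the parser. */
typedef struct {
    bool in_condition;    /* true between IF and THEN */
    uint32_t paren_depth; /* number of currently open parentheses */
} Scanner;

/* Hypothetical hook, NOT part of Tree-sitter's real API. Under the
 * proposal, the parser would call it after consuming a terminal from
 * the `notify` list, passing that terminal's index in the list. */
void tree_sitter_mylang_external_scanner_notify(void *payload, int32_t token) {
    Scanner *scanner = (Scanner *)payload;

    switch ((enum NotifyToken)token) {
        case NOTIFY_IF:
            scanner->in_condition = true;
            break;
        case NOTIFY_THEN:
            scanner->in_condition = false;
            break;
        case NOTIFY_LPAREN:
            scanner->paren_depth++;
            break;
        case NOTIFY_RPAREN:
            /* Guard against underflow on unbalanced input. */
            if (scanner->paren_depth > 0) {
                scanner->paren_depth--;
            }
            break;
        default:
            /* Unknown index: leave the state untouched. */
            break;
    }
}
```

With something like this, the keywords could stay in the grammar, and the Scan function would only need to consult `in_condition` and `paren_depth` when deciding whether a context-sensitive token is valid.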