Exclusive literal tokens for external scanners #922
In the current implementation, if an external scanner doesn't return a token and that token is defined in the grammar.js externals as a string or a regexp, a fallback mechanism kicks in: the internal scanner of tree-sitter's lexer rescans that piece of input and can match the token itself. In some situations this is undesired behavior:
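Examples
In the following grammar, the opening token is declared as the named external $.interpolationStart, so the fallback does not apply to it:
module.exports = grammar({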
  externals: $ => [
    $.interpolationStart, // "${"
    "}",
  ],
  rules: {
    interpolation: $ => seq($.interpolationStart, $.body, "}"),
    body: $ => ...,
  }
});
In the next grammar, the externals are declared as string literals, so if the external scanner returns false, the internal lexer's fallback can still match them:
module.exports = grammar({
  externals: $ => [
    "${",
    "}",
  ],
  rules: {
    interpolation: $ => seq("${", $.body, "}"),
    body: $ => ...,
  }
});
Such behavior lets you write less code in the external scanner for the cases where some conditions are not met.
What I suggest is to introduce a special marker in the grammar for such simple literal tokens, for example:
module.exports = grammar({
  externals: $ => [
    exclusive("${"), // exclusive(token) is a proposed DSL function that marks a token
                     // by setting an additional flag in the JSON representation of the grammar.
    "}",
  ],
  rules: {
    interpolation: $ => seq("${", $.body, "}"),
    body: $ => ...,
  }
});
This would make it possible to keep using the literal representation of such tokens throughout the grammar rules, which keeps the grammar easier to understand.
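To make the idea a bit more concrete, here is a rough sketch of what such a marker could look like. Everything in it is an assumption for illustration: `exclusive` does not exist in tree-sitter's DSL, the `exclusive` flag is not a recognized field of the grammar JSON, and `mayFallBack` is only a toy model of the fallback decision, not tree-sitter's actual lexer logic. The only part taken from the existing format is that literals appear in the grammar JSON as `STRING` and `PATTERN` nodes.

```javascript
// Hypothetical helper, NOT part of tree-sitter's DSL. It wraps a literal in the
// STRING/PATTERN shape used by the grammar JSON and adds an assumed `exclusive` flag.
function exclusive(token) {
  if (token instanceof RegExp) {
    return { type: "PATTERN", value: token.source, exclusive: true };
  }
  if (typeof token === "string") {
    return { type: "STRING", value: token, exclusive: true };
  }
  throw new TypeError("exclusive() expects a string or a RegExp");
}

// Toy model of the fallback decision (an assumption, not the real lexer):
// when the external scanner declines, the internal lexer may retry only
// literal externals that are not flagged as exclusive.
function mayFallBack(external) {
  const isLiteral = external.type === "STRING" || external.type === "PATTERN";
  return isLiteral && external.exclusive !== true;
}

// Externals from the proposal: "${" is exclusive, "}" keeps today's fallback.
const externals = [exclusive("${"), { type: "STRING", value: "}" }];
const fallbackAllowed = externals.map(mayFallBack);
```

In this sketch the flag only affects the external token itself, so rules such as seq("${", $.body, "}") could keep using the plain literal.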
Replies: 2 comments 2 replies
@maxbrunsfeld could you please tell me what you think about the above suggestion?
To me, it does not seem worthwhile to add a new API for this purpose. As you said, you can already express this behavior using your first example. I think each API function that we add increases the potential for confusion, so I have tried to avoid any functions that aren't strictly required to implement a certain parsing behavior.