Commit

feat: allow external scanner to use the logger in printf-like style
rooney committed Mar 23, 2024
1 parent 68d8e60 commit 9fa2970
Showing 3 changed files with 8 additions and 5 deletions.
2 changes: 1 addition & 1 deletion docs/section-3-creating-parsers.md
@@ -853,7 +853,7 @@ This function is responsible for recognizing external tokens. It should return `
* **`uint32_t (*get_column)(TSLexer *)`** - A function for querying the current column position of the lexer. It returns the number of codepoints since the start of the current line. The codepoint position is recalculated on every call to this function by reading from the start of the line.
* **`bool (*is_at_included_range_start)(const TSLexer *)`** - A function for checking whether the parser has just skipped some characters in the document. When parsing an embedded document using the `ts_parser_set_included_ranges` function (described in the [multi-language document section][multi-language-section]), the scanner may want to apply some special behavior when moving to a disjoint part of the document. For example, in [EJS documents][ejs], the JavaScript parser uses this function to enable inserting automatic semicolon tokens in between the code directives, delimited by `<%` and `%>`.
* **`bool (*eof)(const TSLexer *)`** - A function for determining whether the lexer is at the end of the file. The value of `lookahead` will be `0` at the end of a file, but this function should be used instead of checking for that value because the `0` or "NUL" value is also a valid character that could be present in the file being parsed.
- **`void (*log)(const TSLexer *, const char * message)`** - A function for logging. The `message` will be appended to the debug log, which is viewable through e.g. `tree-sitter parse --debug` or the browser's console after checking the `log` option in the [Playground](./playground).
- **`void (*log)(const TSLexer *, const char *format, ...)`** - A `printf`-like function for logging. The formatted message is appended to the debug log, which is viewable with, for example, `tree-sitter parse --debug`, or in the browser's console after checking the `log` option in the [Playground](./playground).

The third argument to the `scan` function is an array of booleans that indicates which external tokens are currently expected by the parser. You should only look for a given token if it is valid according to this array. At the same time, you cannot backtrack, so you may need to combine certain pieces of logic.
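As an illustration of the `printf`-style `log` call described in the list above, here is a minimal sketch of how a scanner might use it. It does not include the real tree-sitter header: `MockLexer`, `mock_log`, `last_log`, and `scan_heredoc` are hypothetical names invented for this example, and the mock only mirrors the shape of the `log` field.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Hypothetical stand-in for the one TSLexer field used here; a real scanner
// would include "tree_sitter/parser.h" instead of declaring this struct.
typedef struct MockLexer MockLexer;
struct MockLexer {
  void (*log)(const MockLexer *, const char *format, ...);
};

// Hypothetical capture buffer so the example can show what was logged.
static char last_log[128];

// Mock logger: formats the variadic arguments into last_log.
static void mock_log(const MockLexer *self, const char *format, ...) {
  (void)self;
  va_list args;
  va_start(args, format);
  vsnprintf(last_log, sizeof last_log, format, args);
  va_end(args);
}

// Hypothetical scanner helper: logs what it is looking for and where,
// passing the values as format arguments instead of pre-building a string.
static bool scan_heredoc(MockLexer *lexer, unsigned column, const char *tag) {
  assert(lexer->log != NULL);
  lexer->log(lexer, "heredoc: looking for tag '%s' at column %u", tag, column);
  return tag[0] != '\0';
}
```

With the previous single-string signature, the scanner would have had to format such a message into its own buffer before calling `log`.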

9 changes: 6 additions & 3 deletions lib/src/lexer.c
@@ -3,6 +3,7 @@
#include "./subtree.h"
#include "./length.h"
#include "./unicode.h"
#include <stdarg.h>

#define LOG(message, character) \
if (self->logger.log) { \
@@ -284,11 +285,13 @@ static bool ts_lexer__is_at_included_range_start(const TSLexer *_self) {
}
}

static void ts_lexer__log(const TSLexer *_self, const char *message) {
static void ts_lexer__log(const TSLexer *_self, const char *fmt, ...) {
Lexer *self = (Lexer *)_self;
va_list args;
va_start(args, fmt);
if (self->logger.log) {
snprintf(self->debug_buffer, TREE_SITTER_SERIALIZATION_BUFFER_SIZE, "%s",
message);
vsnprintf(self->debug_buffer, TREE_SITTER_SERIALIZATION_BUFFER_SIZE, fmt, args);
self->logger.log(self->logger.payload, TSLogTypeLex, self->debug_buffer);
}
va_end(args);
}
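Forwarding `printf`-style arguments into a fixed-size buffer is usually done by passing the `va_list` to `vsnprintf` and calling `va_end` only after formatting has finished. The following standalone sketch shows that pattern along with the truncation that results when the message exceeds the buffer. `SketchLexer`, `LogCallback`, `sketch_log`, `copy_message`, and the deliberately tiny `DEBUG_BUFFER_SIZE` are hypothetical and are not taken from the tree-sitter sources.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

// Hypothetical size, far smaller than a real debug buffer, to make
// truncation easy to observe.
#define DEBUG_BUFFER_SIZE 16

// Hypothetical callback type: receives the already formatted message.
typedef void (*LogCallback)(void *payload, const char *message);

typedef struct {
  LogCallback log;
  void *payload;
  char debug_buffer[DEBUG_BUFFER_SIZE];
} SketchLexer;

static void sketch_log(SketchLexer *self, const char *fmt, ...) {
  assert(fmt != NULL);
  if (!self->log) return;  // nothing to do without a callback
  va_list args;
  va_start(args, fmt);
  // vsnprintf consumes the va_list and writes at most
  // DEBUG_BUFFER_SIZE - 1 characters plus the terminating NUL.
  vsnprintf(self->debug_buffer, sizeof self->debug_buffer, fmt, args);
  va_end(args);  // release the list only after it has been consumed
  self->log(self->payload, self->debug_buffer);
}

// Hypothetical callback: copies the message into a caller-owned buffer,
// which must hold at least DEBUG_BUFFER_SIZE bytes.
static void copy_message(void *payload, const char *message) {
  strcpy((char *)payload, message);
}
```

Calling `sketch_log(&lexer, "token %d of %d in scope", 3, 10)` with a 16-byte buffer delivers only the first 15 characters to the callback, so long messages are cut off, not overflowed.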
2 changes: 1 addition & 1 deletion lib/src/parser.h
@@ -47,7 +47,7 @@ struct TSLexer {
uint32_t (*get_column)(TSLexer *);
bool (*is_at_included_range_start)(const TSLexer *);
bool (*eof)(const TSLexer *);
void (*log)(const TSLexer *, const char *);
void (*log)(const TSLexer *, const char *, ...);
};

typedef enum {