TreeCursor::goto_first_child_for_byte cannot find nodes after ERROR nodes #3270

Open
VonTum opened this issue Apr 9, 2024 · 0 comments

Problem

I encountered this problem while working on the front end of the compiler for my own language (https://github.com/pc2/tree-sitter-sus). I'm using the Rust bindings.

It appears that the function TreeCursor::goto_first_child_for_byte cannot find nodes after ERROR nodes.

Steps to reproduce

Start with a grammar that contains at least the following:

rules: {
    source_file: $ => repeat(field('item', $.module)),

    module: $ => ....

    single_line_comment: $ => /\/\/[^\n]*/,
    multi_line_comment: $ => /\/\*[^\*]*\*+([^\/\*][^\*]*\*+)*\//,
    ....
},
extras: $ => [
    /\s+/,
    $.single_line_comment,
    $.multi_line_comment
]

Some code that illustrates how I'm attempting to use the function:

// Enter the byte range of a node coming after an ERROR node
let desired_span: Range<usize> = ...;
println!("DESIRED: {:?}", desired_span);
let mut cursor = tree.walk();
// First display the root's children and their byte ranges
assert!(cursor.goto_first_child());
loop {
    let node = cursor.node();
    println!("{}: {:?}", node.kind(), node.byte_range());
    if !cursor.goto_next_sibling() {
        break;
    }
}
// Return to the root so the lookup starts from the same parent
assert!(cursor.goto_parent());
// This unwrap fails, even though it should find the node we input at desired_span
let _ = cursor.goto_first_child_for_byte(desired_span.start).unwrap();

Running this code produces the output below. (The outer code loops through all modules, so the third module is the first one to fail. Adding or removing errors in the program file changes which module breaks; it is always the one right after the ERROR node.)

DESIRED: 192..652
module: 1..56
module: 58..166
ERROR: 168..186
module: 192..652
module: 655..895
multi_line_comment: 898..936
ERROR: 938..975
module: 976..1071
multi_line_comment: 
....

In fact, if I replace cursor.goto_first_child_for_byte with this:

let desired_span: Range<usize> = ...;
let mut cursor = tree.walk();
assert!(cursor.goto_first_child());
// Walk the siblings manually until the node with the desired range is found
loop {
    let node = cursor.node();
    if node.byte_range() == desired_span {
        break;
    }
    assert!(cursor.goto_next_sibling());
}
//cursor.goto_parent();
//let _ = cursor.goto_first_child_for_byte(span.into_range().start).unwrap();

it works as I would expect cursor.goto_first_child_for_byte to work.

PS: It appears extras can break it in the same way. I added a comment before an earlier module, and the lookup for the module right after that comment fails as well:

DESIRED: 66..174
module: 1..56
single_line_comment: 58..65
module: 66..174
ERROR: 176..194
module: 200..660
module: 663..903
multi_line_comment: 906..944
ERROR: 946..983
module: 984..1079

Expected behavior

I would expect TreeCursor::goto_first_child_for_byte to find the first child for that byte, regardless of any ERROR nodes or extra nodes encountered along the way.
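
To make the expectation concrete, here is a minimal sketch that models it without the tree-sitter crate. It assumes "first child for a byte" means the first child whose byte range ends after the given offset, which is my reading of the documentation and not a statement about the library's internals. The Child struct and the first_child_for_byte function are hypothetical stand-ins; the spans are copied from the first output listing above.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a child node: its kind and its byte range.
struct Child {
    kind: &'static str,
    span: Range<usize>,
}

/// Returns the index of the first child whose byte range ends after `byte`.
/// Assumption: this mirrors the intended semantics of
/// TreeCursor::goto_first_child_for_byte; it is not the library's code.
fn first_child_for_byte(children: &[Child], byte: usize) -> Option<usize> {
    children.iter().position(|child| child.span.end > byte)
}

fn main() {
    // Children of the root node, as printed in the first listing.
    let children = [
        Child { kind: "module", span: 1..56 },
        Child { kind: "module", span: 58..166 },
        Child { kind: "ERROR", span: 168..186 },
        Child { kind: "module", span: 192..652 },
        Child { kind: "module", span: 655..895 },
        Child { kind: "multi_line_comment", span: 898..936 },
        Child { kind: "ERROR", span: 938..975 },
        Child { kind: "module", span: 976..1071 },
    ];

    let desired_span: Range<usize> = 192..652;
    let index = first_child_for_byte(&children, desired_span.start)
        .expect("a child should cover the start of the desired span");

    // The ERROR node at 168..186 ends before byte 192, so it is skipped
    // like any other earlier sibling.
    assert_eq!(children[index].kind, "module");
    assert_eq!(children[index].span, desired_span);
    println!("found {} at index {}", children[index].kind, index);
}
```

Under this reading, an ERROR node or an extra that ends before the requested byte is just another earlier sibling and should not stop the search.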

Tree-sitter version (tree-sitter --version)

tree-sitter 0.22.2

Cargo.toml:

[dependencies]
tree-sitter = "~0.22.2"

Operating system/version

TUXEDO OS 2 x86_64
