[Feature request] Pratt parsing with spans #507

stefnotch · 2023-08-22T11:23:54Z

When using the new pratt parsing feature, how would I create a syntax tree which includes spans?

e.g. How does one get spans that include the operators? And how does one get the spans of the operators themselves?

Lines 709 to 719 in 01b96cd

 let parser = atom 

 .pratt(operator) 

 .with_postfix_ops(choice(( 

 // Because we defined '+' and '-' as left associative operators, 

 // in order to get these to function as expected, their strength 

 // must be higher, i.e. they must bind tighter 

 postfix(just('!'), 2, |lhs| Expr::Factorial(Box::new(lhs))), 

 // Or weirdness happens 

 postfix(just('$'), 0, |lhs| Expr::Value(Box::new(lhs))), 

 ))) 

 .map(|x| x.to_string());

stefnotch · 2023-08-22T11:24:48Z

I did try map_with_span, but that gives me a PrattOpOutput. I'm not sure if I can do anything with it.

On that note, why is type InfixBuilder<E> = fn(lhs: E, rhs: E) -> E; a fn? Most other chumsky APIs accept a Fn, which makes some tasks much easier.

Edit: I see, it's because getting a Fn to work in that case is tricky #464 (comment)
Would it be possible to extend the InfixBuilder to also have access to the infix token parser result? Same for prefix and postfix parsers. Then it would at least be possible to add some info to the token parser, and pass it into the infix builder.

stefnotch · 2023-08-22T21:39:17Z

This is my attempt.

https://github.com/zesterer/chumsky/compare/main...stefnotch:pratt-with-op?expand=1

I hope it's roughly the right direction, but one thing that's missing is the ability to write build functions that take an operator that has a different type. For example, here the build function takes a char as the operator, and two Exprs as the children. Supporting this probably requires passing another generic type through all the functions.

Zij-IT · 2023-08-23T06:42:42Z

I’ll give it a look today to see what I can do to get it to give you the op spanned. A quick and dirty way to do it would be to use the span’s of the expressions to calculate the span of the operator, as the operator’s span is the space between the two expr’s (though whitespace makes that not 100% true)

stefnotch · 2023-08-23T07:09:04Z

@Zij-IT

I think there's a certain value in having access to the entire parsed operator. For example, when parsing sufficiently complex mathematical expressions, then there is Knuth's up arrow notation.
https://en.m.wikipedia.org/wiki/Knuth%27s_up-arrow_notation
Which has an operator with an exponent. That would be most easily parsed by writing a "up arrow followed by exponent" parser for the operator.

Zij-IT · 2023-08-23T08:04:56Z

@Zij-IT

I think there's a certain value in having access to the entire parsed operator. For example, when parsing sufficiently complex mathematical expressions, then there is Knuth's up arrow notation. https://en.m.wikipedia.org/wiki/Knuth%27s_up-arrow_notation Which has an operator with an exponent. That would be most easily parsed by writing a "up arrow followed by exponent" parser for the operator.

Excellent point. I’ll give it a shot tonight and see what I can do!

Zij-IT · 2023-08-24T16:30:55Z

So, here is some API questions that I have. How does this look?

let parser = atom 
    .pratt(choice((
        left_infix(just('-'), 0, |lhs, op, rhs| Expr::Sub(Box::new(lhs), Box::new(rhs))),
        left_infix(just('+'), 0, |lhs, op, rhs| Expr::Add(Box::new(lhs), Box::new(rhs))),
    ))) 
    .with_postfix_ops(
        // For postfix the `op` is on the right
        postfix(just('!'), 0, |lhs, op| Expr::Factorial(Box::new(lhs))),
    ) 
    .with_prefix_ops(
        // For prefix the `op` is on the left
        prefix(just('-'), 0, |op, rhs| Expr::Negate(Box::new(rhs))),
    )
    .map(|x| x.to_string());

Or do we follow str::split_once and str::rsplit_once and keep the op placement consistent so that both use fn(Op, Expr) -> Expr

stefnotch · 2023-08-24T18:22:27Z

That looks like a good API. I'd keep the |lhs, op| design, if the infix operator's callback is |lhs, op, rhs|.

If it's |op, lhs|, then I'd expect the infix operator's callback to also look roughly like that. That's actually what I did above, mostly because I wanted to have an easy time implementing it. There I have an API that looks like left_infix(just('-'), 0, |op, [lhs, rhs]| Expr::Sub(Box::new(lhs), Box::new(rhs))). That made using the current Mode API very easy.
https://github.com/zesterer/chumsky/compare/main...stefnotch:pratt-with-op?expand=1#diff-350f258948b06701ad68bdfb8528d5fdc35c29e5d6a4bdbd0c7673eea00a7e3fR365-R366

stefnotch · 2023-08-27T19:22:35Z

Did you have any luck with a proper implementation so far, or are some parts trickier than expected? Is there anything I could do to help?

In my little experiments with extending the Pratt parsing implementation, I never managed to figure out what exactly Expr in impl<'a, P, Expr, I, O, E> ParserSealed<'a, I, PrattOpOutput<PrefixBuilder<Expr>>, E> was used for, which sadly limited my understanding of that bit of code.

stefnotch mentioned this issue Aug 28, 2023

Pass operator to Pratt parsing functions #514

Closed

zesterer mentioned this issue Aug 29, 2023

Pratt parsing with Fn traits #515

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature request] Pratt parsing with spans #507

[Feature request] Pratt parsing with spans #507

stefnotch commented Aug 22, 2023 •

edited

stefnotch commented Aug 22, 2023 •

edited

stefnotch commented Aug 22, 2023

Zij-IT commented Aug 23, 2023

stefnotch commented Aug 23, 2023

Zij-IT commented Aug 23, 2023

Zij-IT commented Aug 24, 2023

stefnotch commented Aug 24, 2023 •

edited

stefnotch commented Aug 27, 2023

[Feature request] Pratt parsing with spans #507

[Feature request] Pratt parsing with spans #507

Comments

stefnotch commented Aug 22, 2023 • edited

stefnotch commented Aug 22, 2023 • edited

stefnotch commented Aug 22, 2023

Zij-IT commented Aug 23, 2023

stefnotch commented Aug 23, 2023

Zij-IT commented Aug 23, 2023

Zij-IT commented Aug 24, 2023

stefnotch commented Aug 24, 2023 • edited

stefnotch commented Aug 27, 2023

stefnotch commented Aug 22, 2023 •

edited

stefnotch commented Aug 22, 2023 •

edited

stefnotch commented Aug 24, 2023 •

edited