[FEEDBACK] Isolating quoted patterns on the outside adds a lookahead to the syntax #787

Open
eemeli opened this issue May 13, 2024 · 3 comments
Labels
Preview-Feedback Feedback gathered during the technical preview syntax Issues related with MF Syntax

Comments

@eemeli
Collaborator

eemeli commented May 13, 2024

An observation from implementing bidi isolation as proposed in #781, but which also applies to the currently proposed design for bidi usability:

Isolating quoted patterns on the outside adds LRI, RLI & FSI to the set of characters (currently { and .) that could start a quoted message with no declarations, as in \u2066{{hello}}\u2069.

This doesn't make the syntax ambiguous as the {{ isn't valid in a simple-message, but it does add a lookahead of one token to the parser.

The same lookahead is also required in variant, to determine whether a \u2066 starts a quoted key, or a quoted-pattern.
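
A minimal sketch of that lookahead, assuming the isolates sit outside the braces. The function name is hypothetical and the code is not taken from any MF2 implementation; leading whitespace and declarations are ignored to keep it short.

```python
# Bidi isolate openers: LRI, RLI, FSI.
OPEN_ISOLATES = ("\u2066", "\u2067", "\u2068")


def starts_quoted_pattern(src: str) -> bool:
    """Return True if src opens a quoted pattern when isolates go outside.

    Simplified on purpose: leading whitespace and declarations are not handled.
    """
    if src.startswith("{{"):
        return True
    if src.startswith(OPEN_ISOLATES):
        # The isolate alone is not decisive, because it could also be
        # ordinary text at the start of a simple message. Peek past it
        # to see whether "{{" follows.
        return src.startswith("{{", 1)
    return False


print(starts_quoted_pattern("\u2066{{hello}}\u2069"))
print(starts_quoted_pattern("\u2066hello\u2069"))
```

The second branch is the extra peek: the decision cannot be made on the first character alone.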

The simplest change to avoid this lookahead would probably be to place the open-isolate and close-isolate between the braces, as in {\u2066{hello}\u2069}. In this position, it would also match what's proposed for expression and markup.

@eemeli eemeli added syntax Issues related with MF Syntax Preview-Feedback Feedback gathered during the technical preview labels May 13, 2024
@aphillips
Member

Putting the isolate between the pattern quotes would mean that there are two sequences for opening/closing. And it is harder for tools to insert (or remove) the isolates. It's a cognitive burden on everyone, although admittedly it's clever.

Note that the isolates (unless inside of a literal) are ignorable and can be stripped from the message.

@eemeli
Collaborator Author

eemeli commented May 13, 2024

> Putting the isolate between the pattern quotes would mean that there are two sequences for opening/closing.

This is also the case with isolates outside the quotes. The current proposal has:

  • {{, \u2066{{, \u2067{{, or \u2068{{ for opening and
  • }} or }}\u2069 for closing the pattern;

I'm suggesting that we instead use

  • {{, {\u2066{, {\u2067{, or {\u2068{ for opening and
  • }} or }\u2069} for closing the pattern.
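
Written out as data, the two sets of delimiters listed above look like this. The constant and function names are made up for illustration; only the sequences themselves come from the lists.

```python
# Bidi isolate openers (LRI, RLI, FSI) and the closing PDI.
ISOLATE_OPENERS = ("\u2066", "\u2067", "\u2068")
PDI = "\u2069"

# Current proposal: isolates outside the braces.
CURRENT_OPEN = ["{{"] + [iso + "{{" for iso in ISOLATE_OPENERS]
CURRENT_CLOSE = ["}}", "}}" + PDI]

# Suggested alternative: isolates between the braces.
SUGGESTED_OPEN = ["{{"] + ["{" + iso + "{" for iso in ISOLATE_OPENERS]
SUGGESTED_CLOSE = ["}}", "}" + PDI + "}"]


def first_chars(sequences):
    """Collect the distinct first characters of a list of delimiters."""
    return {seq[0] for seq in sequences}


print(len(CURRENT_OPEN), len(SUGGESTED_OPEN))
print(len(first_chars(CURRENT_OPEN)), len(first_chars(SUGGESTED_OPEN)))
```

Both designs have four openers and two closers. What differs is the first character: the suggested openers all begin with `{`, while the current ones can begin with `{` or any of the three isolates.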

> And it is harder for tools to insert (or remove) the isolates.

Both solutions are just as easy or hard to deal with. As MF2 may include e.g. |{{}}| as a valid quoted literal, a proper MF2 parser is required to apply any such changes.
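
To illustrate why plain text substitution isn't enough, here is a deliberately naive sketch. The helper name and the sample message are hypothetical; the message is assumed to be a quoted pattern containing the quoted literal `|{{}}|` inside an expression.

```python
LRI = "\u2066"


def naive_isolate(message: str) -> str:
    """Prefix every "{{" with LRI using plain string replacement.

    Deliberately naive: it has no idea which "{{" is syntax and which
    is content inside a quoted literal.
    """
    return message.replace("{{", LRI + "{{")


message = "{{Hello {|{{}}|}}}"
result = naive_isolate(message)

# The literal's contents were changed along with the pattern opener.
print("|{{}}|" in message)
print("|{{}}|" in result)
```

The substitution also rewrites the `{{` inside the literal, so the literal's value is no longer the one the author wrote. The same problem arises with either placement of the isolates.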

@aphillips
Copy link
Member

I think the difference is (especially if we make the pairing optional!) that the open and close isolates can just be ignored in the current design. With optional pairing, we can push the isolate characters back into the s production. Anyway, let's discuss.
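
One possible reading of "push the isolate characters back into the s production", sketched as a whitespace skipper. The function name and the `allow_isolates` flag are hypothetical, and the whitespace set is an assumption about what the `s` production covers, not a quote from the spec.

```python
# Assumed whitespace characters for the s production (an assumption).
WHITESPACE = " \t\r\n\u3000"
# LRI, RLI, FSI and PDI.
ISOLATES = "\u2066\u2067\u2068\u2069"


def skip_s(src: str, pos: int, allow_isolates: bool = True) -> int:
    """Return the index of the first character at or after pos that is not skippable.

    With allow_isolates=True, unpaired isolates are skipped exactly like
    whitespace, so the parser never has to match an opener to a closer.
    """
    skippable = WHITESPACE + (ISOLATES if allow_isolates else "")
    while pos < len(src) and src[pos] in skippable:
        pos += 1
    return pos


src = "\u2066 {{hi}}"
print(skip_s(src, 0))
print(skip_s(src, 0, allow_isolates=False))
```

Under this reading, an isolate before `{{` is consumed along with any surrounding whitespace, and the parser then sees `{{` directly.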
