Regular Expressions in Emojicode #177

Open
joeskeen opened this issue Jan 14, 2021 · 6 comments

@joeskeen
Contributor

⭐️ Proposed change

Most languages have some kind of built-in support or library for using regular expressions. It would be great to see this feature in Emojicode.

🤔 Rationale

Much of the recreational coding I've tried to do in Emojicode has been programming puzzles, many of which are easier to solve with regular expressions than with other methods.

🕺 Example

I'm not sure what it would look like: it could be string-based, or it could have its own literal syntax as in JavaScript.

@joeskeen
Contributor Author

joeskeen commented Dec 4, 2021

I've been thinking more about this lately, and I wonder whether we could implement a package for regular expressions rather than building them into the language.

There are a few guides out there for writing a RegEx engine.

We could also port an existing RegEx engine from another language.

Another (perhaps simpler) option would be to make a package that wraps/links to a C++ implementation.
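
To give a feel for how small a native engine can start out, here is a minimal sketch in Python (not Emojicode) of a backtracking matcher in the style of Rob Pike's well-known tiny matcher. It is only an illustration under narrow assumptions: it supports literal characters, `.`, `*`, `^` and `$`, and nothing else. The function names are illustrative and are not taken from any of the guides mentioned above.

```python
def match(pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text."""
    if pattern.startswith("^"):
        # Anchored: only try the very start of the text.
        return match_here(pattern[1:], text)
    # Unanchored: try every start position, including the empty suffix.
    return any(match_here(pattern, text[i:]) for i in range(len(text) + 1))


def match_here(pattern: str, text: str) -> bool:
    """Return True if pattern matches at the beginning of text."""
    if not pattern:
        return True
    if len(pattern) >= 2 and pattern[1] == "*":
        return match_star(pattern[0], pattern[2:], text)
    if pattern == "$":
        return text == ""
    if text and pattern[0] in (".", text[0]):
        return match_here(pattern[1:], text[1:])
    return False


def match_star(char: str, pattern: str, text: str) -> bool:
    """Match zero or more occurrences of char, then the rest of pattern."""
    i = 0
    while True:
        # Try the rest of the pattern after consuming i copies of char.
        if match_here(pattern, text[i:]):
            return True
        if i < len(text) and char in (".", text[i]):
            i += 1
        else:
            return False


print(match("^he.*o$", "hello"))
print(match("a$", "ab"))
```

A full engine adds groups, alternation, character classes and escapes on top of this, which is where most of the parsing work goes.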

@thbwd
Member

thbwd commented Dec 5, 2021

Sure, regular expressions don't require language support. Although a literal syntax for regular expressions is nice, it isn't necessary. Since the C++ standard library has regular expression support, wrapping that should be a straightforward way to implement this.

@sodiboo

sodiboo commented Jun 18, 2022

emogex? https://gist.github.com/Terrain2/e0336c76e0a62b2ae537dfb6ffe12935

@joeskeen
Contributor Author

I'm currently playing around with implementing an Emojicode-native regular expressions library. I'm following this guide: https://kean.blog/post/lets-build-regex and taking inspiration from emogex (from the comment above) and emojex, while trying to make it feel as natural and Emojicode-native as possible. I'm in the very early stages of writing the parser in Emojicode, and I'll share more when I have it. I'm currently blocked by #204.

@joeskeen
Contributor Author

joeskeen commented Nov 24, 2022

I've taken another stab at it, and I'm happy to say I have a very early working alpha of regular expressions for Emojicode. You can check it out here: https://gist.github.com/joeskeen/98c9f0e9d04cd6f32d27015e1b88b589. Please note that its feature set is incomplete compared with the regex support in some other languages, but it does handle many of the standard regex use cases.

All special characters in this implementation of regex are emoji, and each emoji was chosen to align with a similar concept in Emojicode (there's a table explaining the syntax at the bottom of the Gist). Here's my sample usage test file for anyone who would like to try it out:

📜 🔤regex.🍇🔤

🏁🍇
    🔤🍇he🤜❌❌🔡👐❌❌🔢👐❌❌🍇🤛🔘o a🤜cat👐dog🤛🍺❌❌⚫🍬b🍺R🔘❌❌🔡❌❌🔢🍺❌❌🍉🍿0123🍆⚪🔘AAA🍉🔤 ➡️ pattern
    🔤he🍇0eo adogcatdogdog bbbbbbbbbbbbj009🍉2~~AAA🔤 ➡️ string

    🆕🔭❗ ➡️ regex
    😀 🔤Searching...
    string   '🧲string🧲' 
    pattern  '🧲pattern🧲'🔤 ❗
    👀regex pattern string❗ ➡️ result
    ↪️ 👌result❓ 🍇
        😀 🔤Success at index 🧲🍺🐽result❓🧲🔤 ❗
    🍉 🙅 🍇
        😀 🔤No match found🔤 ❗
    🍉
🍉

For you RegEx buffs out there, the pattern I'm using is equivalent to:

^he(\w|\d|\🍇)*o a(cat|dog)+\s?b+R*\w\d+\🍉[0123].*AAA$

This outputs:

Searching...
    string   'he🍇0eo adogcatdogdog bbbbbbbbbbbbj009🍉2~~AAA' 
    pattern  '🍇he🤜❌🔡👐❌🔢👐❌🍇🤛🔘o a🤜cat👐dog🤛🍺❌⚫🍬b🍺R🔘❌🔡❌🔢🍺❌🍉🍿0123🍆⚪🔘AAA🍉'
Success at index 0
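
For readers without an Emojicode toolchain, the correspondence between the two notations can be sketched in Python. The emoji-to-operator mapping below is an assumption inferred only from the example pattern and its stated equivalent in this comment. It is not copied from the Gist's syntax table, so it may be incomplete or differ in details. `to_standard_regex` is a hypothetical helper name, and the result is matched with Python's standard `re` module rather than the Emojicode package.

```python
import re

VS16 = "\ufe0f"  # optional emoji variation selector, ignored when comparing


def _norm(text: str) -> str:
    return text.replace(VS16, "")


# ASSUMPTION: mapping inferred from the single example in this thread,
# not taken from the Gist's syntax table.
OPERATORS = {_norm(k): v for k, v in {
    "🍇": "^", "🍉": "$",
    "🤜": "(", "🤛": ")", "👐": "|",
    "🔘": "*", "🍺": "+", "🍬": "?",
    "🍿": "[", "🍆": "]", "⚪": ".",
}.items()}
CLASSES = {_norm(k): v for k, v in {"🔡": r"\w", "🔢": r"\d", "⚫": r"\s"}.items()}
ESCAPE = _norm("❌")


def to_standard_regex(emoji_pattern: str) -> str:
    """Translate an emoji pattern into Python `re` syntax (hypothetical helper)."""
    out = []
    chars = iter(_norm(emoji_pattern))
    for ch in chars:
        if ch == ESCAPE:
            nxt = next(chars, None)
            if nxt is None:
                raise ValueError("pattern ends with a dangling escape")
            # Escape + class emoji is a character class; escape + anything
            # else is that character taken literally.
            out.append(CLASSES.get(nxt, re.escape(nxt)))
        elif ch in OPERATORS:
            out.append(OPERATORS[ch])
        else:
            out.append(re.escape(ch))
    return "".join(out)


# Pattern as printed in the output above (single ❌, after string unescaping).
emoji_pattern = "🍇he🤜❌🔡👐❌🔢👐❌🍇🤛🔘o a🤜cat👐dog🤛🍺❌⚫🍬b🍺R🔘❌🔡❌🔢🍺❌🍉🍿0123🍆⚪🔘AAA🍉"
subject = "he🍇0eo adogcatdogdog bbbbbbbbbbbbj009🍉2~~AAA"

found = re.search(to_standard_regex(emoji_pattern), subject)
if found is not None:
    print(f"Success at index {found.start()}")
else:
    print("No match found")
```

Under that inferred mapping, Python's `re` finds the match at the same index as the Emojicode run above. Note that the pattern in the Emojicode source doubles each ❌ because ❌ is also the escape character inside Emojicode string literals.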

I would like to write a ton of unit tests, then refactor it a bit so that the code for each feature is grouped together (rather than each feature having one line in every method). Once I get that working as intended, I'd like to add more features (like capturing groups) and propose it for inclusion in the package listing at https://www.emojicode.org/docs/packages/.

I would welcome any feedback anyone has!

A HUGE thanks to "clumsy computer" on YouTube for his live-stream implementation of a regex engine in Python. I gained the understanding by coding along with him in TypeScript, got it working the way I wanted, then translated it into Emojicode.

@joeskeen
Contributor Author

I wrote 271 unit tests and fixed a handful of bugs! I'm feeling pretty good about the currently implemented functionality. (I updated the gist with the bug fixes and the unit tests.)
