Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect or inconsistent behavior with sπŸ”‘ πŸ”β— method #208

Open
joeskeen opened this issue Nov 27, 2022 · 1 comment Β· May be fixed by #209
Open

Incorrect or inconsistent behavior with sπŸ”‘ πŸ”β— method #208

joeskeen opened this issue Nov 27, 2022 · 1 comment Β· May be fixed by #209

Comments

@joeskeen
Copy link
Contributor

joeskeen commented Nov 27, 2022

Consider the following unit tests:

πŸ“¦ testtube 🏠

🏁 ➑️ πŸ”’ πŸ‡
  ↩️ πŸ‘”πŸ†•πŸ¦”β—οΈβ—οΈ
πŸ‰

πŸ‡ πŸ¦” πŸ§ͺ πŸ‡
    
    βœ’οΈ ❗️ 🏁 πŸ‡
        πŸ”’πŸ‘‡ πŸΊπŸ”πŸ”€ABCDEFGHIπŸ”€ πŸ”€FπŸ”€β— 5 πŸ”€'F' should be at index 5 of 'ABCDEFGHI'πŸ”€β—
        πŸ”‘πŸ‘‡ πŸ”ͺπŸ”€ABCDEFGHIπŸ”€ 1 3❗ πŸ”€BCDπŸ”€ πŸ”€substring of 'ABCDEFGHI' from index 0 and length 3 should be 'BCD'πŸ”€β—

        πŸ”’πŸ‘‡ πŸΊπŸ”πŸ”€πŸΏABCπŸ†cπŸ”€ πŸ”€πŸ†πŸ”€β— 4 πŸ”€'πŸ†' should be at index 5 of '🍿ABCπŸ†c'πŸ”€β—
        πŸ”‘πŸ‘‡ πŸ”ͺπŸ”€πŸΏABCπŸ†cπŸ”€ 1 3❗ πŸ”€ABCπŸ”€ πŸ”€substring of '🍿ABCπŸ†c' from index 0 and length 3 should be 'ABC'πŸ”€β—
    πŸ‰
πŸ‰

All these tests should pass, but this is the output:

❌ Failed 'πŸ†' should be at index 5 of '🍿ABCπŸ†c' but it is 7 
4 assertions, 1 failures

From this message, it is apparent that the sπŸ”‘ πŸ”β— method is returning the index in terms of UTF-8 bytes, not grapheme index. This is inconsistent with the sπŸ”‘ πŸ”ͺ❗method, which is using grapheme index (as shown in the tests).

This inconsistency makes the sπŸ”‘ πŸ”β—method confusing at best and useless at the worst case when strings are outside of the ASCII range.

Ideally the workaround for this issue would be to first convert the string into a list via sπŸ”‘ πŸŽΆβ—, then find the index with s🍨 πŸ”β—, but unfortunately that does not exist. I intend to submit a pull request for that method shortly.

Edit: you can't implement πŸ”β— on s🍨🐚βšͺπŸ† since you can't compare two βšͺs with πŸ™Œ or anything else, and you can't cast the βšͺ to protocol πŸ˜› since it's generic 😒. Looks like the only way to do this is fix the sπŸ”‘πŸ”β— method.

@joeskeen joeskeen changed the title Incorrect or inconsistent behavior with sπŸ”’ πŸ”β— method Incorrect or inconsistent behavior with sπŸ”‘ πŸ”β— method Nov 27, 2022
joeskeen added a commit to joeskeen/emojicode that referenced this issue Nov 27, 2022
sπŸ”‘ πŸ”β— now returns grapheme index instead of UTF-8 index
@thbwd
Copy link
Member

thbwd commented Nov 30, 2022

I agree, this is very inconsistent and should be corrected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants