Allow emoji domain names #420

Open
wants to merge 2 commits into
base: master


mhlz

@mhlz mhlz commented Jul 8, 2023

Problem

Currently, the Ruby version of this library does not recognize links to domains that include emojis, even though browsers support those domains. Text that includes "https://🌈🌈🌈.st" is not accepted as containing a valid URL. The problem comes from idn-ruby and libidn2, which do not recognize emoji characters as valid in domain names, even though such domains are registrable and work fine in browsers (after being converted to Punycode).

Solution

I replaced idn-ruby with another Ruby gem that implements the Punycode conversion directly in Ruby, without native dependencies, and then re-added some of the validation that libidn2 performed so that the conformity test suite passes again.
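For readers unfamiliar with the conversion step, here is a minimal sketch of what a pure-Ruby Punycode encoder does, following the algorithm in RFC 3492. It is not the code of the gem used in this PR: the module name `PunycodeSketch` and the helper `to_ascii` are hypothetical, and the sketch deliberately skips the normalization and validation steps that a full IDNA implementation performs.

```ruby
# Illustrative only: a bare RFC 3492 Punycode encoder.
# PunycodeSketch and to_ascii are hypothetical names, not the gem's API.
module PunycodeSketch
  BASE = 36
  TMIN = 1
  TMAX = 26
  SKEW = 38
  DAMP = 700
  INITIAL_BIAS = 72
  INITIAL_N = 128

  # Map a digit value 0..35 to "a".."z", "0".."9".
  def self.digit(d)
    (d < 26 ? d + 97 : d - 26 + 48).chr
  end

  # Bias adaptation function from RFC 3492, section 6.1.
  def self.adapt(delta, num_points, first)
    delta = first ? delta / DAMP : delta / 2
    delta += delta / num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) / 2
      delta /= BASE - TMIN
      k += BASE
    end
    k + (BASE - TMIN + 1) * delta / (delta + SKEW)
  end

  # Encode a single label (no dots) without the "xn--" prefix.
  def self.encode(label)
    cps = label.codepoints
    basic = cps.select { |c| c < INITIAL_N }
    out = basic.pack("U*")
    out << "-" unless basic.empty?

    n = INITIAL_N
    delta = 0
    bias = INITIAL_BIAS
    h = basic.length

    while h < cps.length
      # Next smallest code point that has not been handled yet.
      m = cps.select { |c| c >= n }.min
      delta += (m - n) * (h + 1)
      n = m
      cps.each do |c|
        delta += 1 if c < n
        next unless c == n

        # Emit delta as a variable-length integer.
        q = delta
        k = BASE
        loop do
          t = (k - bias).clamp(TMIN, TMAX)
          break if q < t
          out << digit(t + (q - t) % (BASE - t))
          q = (q - t) / (BASE - t)
          k += BASE
        end
        out << digit(q)
        bias = adapt(delta, h + 1, h == basic.length)
        delta = 0
        h += 1
      end
      delta += 1
      n += 1
    end
    out
  end

  # Convert each non-ASCII label of a domain to its "xn--" form.
  def self.to_ascii(domain)
    domain.split(".").map { |label|
      label.ascii_only? ? label : "xn--" + encode(label)
    }.join(".")
  end
end

puts PunycodeSketch.to_ascii("bücher.example")
puts PunycodeSketch.to_ascii("🌈🌈🌈.st")
```

The encoder itself has no notion of which code points are allowed, which is why an emoji label converts without complaint and why the domain validation has to be handled separately.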

Result

Text that includes "https://🌈🌈🌈.st" will now be correctly identified as containing a link. Note that other checks in this library currently still prevent a bare "🌈🌈🌈.st" from being parsed as a link. While I would like to make that work as well, I felt that would be too big a change.

mhlz added 2 commits July 8, 2023 22:28
idn-ruby uses libidn2, which, unfortunately, does not recognize
domain names that include emojis such as 🌈🌈🌈.st, even though they are
valid and work in pretty much any modern browser.

Additionally, the JS implementation of twitter-text already
recognizes links such as https://🌈🌈🌈.st as valid link entities.

Replacing idn-ruby with another Ruby gem that implements the Punycode
conversion and adding some new validation to the is_valid_domain
function allows https://🌈🌈🌈.st to be returned as a valid link
entity, without accepting any of the invalid links that are tested in the
conformity test suite.
@CLAassistant

CLAassistant commented Jul 8, 2023

CLA assistant check
All committers have signed the CLA.
