UTF8 characters cause valid links to be detected as broken #234
Comments
I have the same problem with websites in Chinese and Thai.
I have the same problem with grave and acute accents, which are very common in Latin languages and appear in other languages too. For example: https://www.iswatersafetodrink.in/Italy/Cantù
Can I do anything so that the first step, "needs confirmation", can be dropped?
I prepared a test case at https://github.com/matkoniecz/broken-link-checker-local-utf8
blc https://matkoniecz.github.io/broken-link-checker-local-utf8 -r
See https://matkoniecz.github.io/broken-link-checker-local-utf8/ - both links work, but the one with UTF-8 characters gets BLC_UNKNOWN/HTTP_undefined errors
Sorry if this is my misunderstanding, but as I understand it, UTF-8 de facto works in links
UTF-8 may be handled differently internally, but browsers seem 100% fine with links including letters like https://en.wikipedia.org/wiki/Ogonek
Sanity check: https://stackoverflow.com/questions/22357509/can-urls-have-utf-8-characters
Even DNS supports UTF-8 characters (with some workarounds and restrictions): https://en.wikipedia.org/wiki/Internationalized_domain_name
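To illustrate the "internally different" part: non-ASCII characters in a path are normally sent percent-encoded as UTF-8 bytes, and internationalized hostnames are converted to Punycode. The following is only a minimal Python sketch of that conversion, not how broken-link-checker works; `to_ascii_url` is a hypothetical helper name, and `bücher.example` is a made-up hostname used purely for illustration.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Return an ASCII-only form of a URL (hypothetical helper, sketch only).

    Assumptions: the URL has no userinfo part, and any '%' already in it
    starts a valid percent-escape that should be left alone.
    """
    parts = urlsplit(url)

    # Hostname: convert to Punycode via Python's built-in "idna" codec.
    host = parts.hostname or ""
    if host:
        host = host.encode("idna").decode("ascii")
    if parts.port is not None:
        host = f"{host}:{parts.port}"

    # Path, query, fragment: percent-encode non-ASCII characters as UTF-8.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")

    return urlunsplit((parts.scheme, host, path, query, fragment))


if __name__ == "__main__":
    print(to_ascii_url("https://www.iswatersafetodrink.in/Italy/Cantù"))
    print(to_ascii_url("https://bücher.example/"))
```

Under these assumptions, the accented URL from the comment above becomes `https://www.iswatersafetodrink.in/Italy/Cant%C3%B9`, which is an ordinary ASCII URL that any HTTP client should be able to request.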
Replaces LukasHechenberger/broken-link-checker-local#50