PugiXML incorrectly handles char8_t in C++20 #378

Open
ytimenkov opened this issue Oct 29, 2020 · 3 comments

Comments

@ytimenkov

In C++20 there are std::u8string and char8_t to make UTF-8 type-safe.

Unfortunately, when calling xml_attribute::set_value(u8"a string"), for example, the compiler chooses the set_value(bool) overload, which led to a surprise when all attributes became just "true".

Which makes me think that providing an overload which accepts bool is too wide: it accepts any pointer, at least. It would be nice to constrain it somehow.

I'm not sure whether separate handling is needed when pugi::char_t is wchar_t, or whether it is enough to provide such an overload only when it is char.
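The behaviour described above can be reproduced without pugixml. The following is a minimal sketch: `attribute_sketch`, `utf8_unit`, `store_utf8` and `check_bool_overload_wins` are hypothetical names, and the class keeps only two of the overloads involved. Before C++20 it falls back to `unsigned char` as a stand-in, since that type also has no implicit conversion to `const char*`.

```cpp
#include <cassert>
#include <string>

#if defined(__cpp_char8_t)
typedef char8_t utf8_unit;        // C++20: the element type of u8"..." literals
#else
typedef unsigned char utf8_unit;  // pre-C++20 stand-in so the sketch still builds
#endif

// Hypothetical stand-in for pugi::xml_attribute, reduced to two overloads.
struct attribute_sketch {
    std::string stored;

    bool set_value(const char* rhs) { stored = rhs; return true; }
    bool set_value(bool rhs) { stored = rhs ? "true" : "false"; return true; }
};

// const utf8_unit* does not convert to const char*, but every pointer
// converts to bool, so set_value(bool) is the only viable overload.
std::string store_utf8(const utf8_unit* text) {
    attribute_sketch attr;
    attr.set_value(text);
    return attr.stored;
}

// Self-check: the text is lost and the attribute holds "true".
void check_bool_overload_wins() {
    static const utf8_unit text[] = {'a', ' ', 's', 't', 'r', 'i', 'n', 'g', 0};
    assert(store_utf8(text) == "true");
}
```

The call compiles without any error, which is why the problem only shows up in the written document.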

@ytimenkov
Author

I also thought that it may be a good idea to just use char8_t for pugi::char_t, since UTF-8 is used internally anyway.

I think if the consumer could define PUGIXML_TEXT and PUGIXML_CHAR directly instead of relying on PUGIXML_WCHAR_MODE, things would just work (or there could be more knobs...)

@zeux
Owner

zeux commented Oct 29, 2020

> Which makes me think that providing an overload which accepts bool is too wide: it accepts any pointer, at least. It would be nice to constrain it somehow.

This is sensible; it can probably be achieved using a private overload with a const void* argument, as you can't use enable_if or similar constructs due to compatibility requirements. Separately, char8_t as char_t won't work because of reliance on some CRT functions like strcmp.
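A rough sketch of the private-overload idea, assuming nothing about how pugixml would actually implement it: `guarded_attribute` and `accepts` are hypothetical names. The trick relies on overload ranking, where converting a pointer to `const void*` is preferred over converting it to `bool`, so the private overload wins and the call is rejected as inaccessible. The detection trait uses C++11 only so that the effect can be checked without writing a call that fails to compile; the guard itself needs nothing newer than C++98.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for pugi::xml_attribute; not pugixml's actual class.
class guarded_attribute {
public:
    bool set_value(const char*) { return true; }
    bool set_value(bool) { return true; }

private:
    // Declared but never defined: any other object pointer prefers this
    // overload over set_value(bool), and the call is then rejected as
    // inaccessible instead of silently storing "true".
    bool set_value(const void*);
};

// Detection helper (C++11), used here only to observe which calls compile.
template <typename T, typename = void>
struct accepts : std::false_type {};

template <typename T>
struct accepts<T, decltype(void(std::declval<guarded_attribute&>().set_value(std::declval<T>())))>
    : std::true_type {};

static_assert(accepts<const char*>::value, "narrow strings are still accepted");
static_assert(accepts<bool>::value, "bool is still accepted");
static_assert(!accepts<const unsigned char*>::value, "other pointers are rejected");
#if defined(__cpp_char8_t)
static_assert(!accepts<const char8_t*>::value, "u8 strings are rejected, not turned into true");
#endif
```

With such a guard, `set_value(u8"a string")` in C++20 would become a compile error rather than a silent "true".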

@ytimenkov
Author

> Separately, char8_t as char_t won't work because of reliance on some CRT functions like strcmp.

Oh, I didn't look that deep, but it feels like a case for std::char_traits<char_t>::compare...
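To illustrate the std::char_traits idea, here is a minimal sketch of a strcmp-like comparison that works for any character type with a `std::char_traits` specialization. `compare_strings` and `check_compare_strings` are hypothetical names, not pugixml functions, and this says nothing about whether such a change would fit pugixml's compatibility constraints.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of pugixml: a strcmp-like three-way
// comparison of null-terminated strings built on std::char_traits.
template <typename CharT>
int compare_strings(const CharT* lhs, const CharT* rhs) {
    typedef std::char_traits<CharT> traits;
    const std::size_t lhs_len = traits::length(lhs);
    const std::size_t rhs_len = traits::length(rhs);
    const std::size_t common = lhs_len < rhs_len ? lhs_len : rhs_len;

    // compare() needs an explicit count, unlike strcmp.
    const int prefix = traits::compare(lhs, rhs, common);
    if (prefix != 0) return prefix;

    // Equal prefixes: the shorter string orders first.
    return lhs_len < rhs_len ? -1 : (lhs_len > rhs_len ? 1 : 0);
}

void check_compare_strings() {
    assert(compare_strings("name", "name") == 0);
    assert(compare_strings(u"ab", u"abc") < 0);
    // u8 literals are char before C++20 and char8_t from C++20 on;
    // the template handles both.
    assert(compare_strings(u8"b", u8"a") > 0);
}
```

Unlike strcmp, this version walks both strings to find their lengths before comparing, so it trades some speed for being generic.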

I guess I should continue sticking to reinterpret_cast for now, just use it carefully :)
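One way to keep the reinterpret_cast in a single place is a small wrapper. This is a sketch only: `as_narrow`, `utf8_unit` and `check_as_narrow` are hypothetical names, and `unsigned char` again stands in for `char8_t` before C++20. Reading the bytes through `const char*` is allowed by the aliasing rules, so the cast itself is well-defined.

```cpp
#include <cassert>
#include <string>

#if defined(__cpp_char8_t)
typedef char8_t utf8_unit;        // C++20: the element type of u8"..." literals
#else
typedef unsigned char utf8_unit;  // pre-C++20 stand-in so the sketch still builds
#endif

// Hypothetical helper: views UTF-8 code units as plain char without copying.
inline const char* as_narrow(const utf8_unit* text) {
    return reinterpret_cast<const char*>(text);
}

void check_as_narrow() {
    static const utf8_unit text[] = {'a', 'b', 0};
    assert(std::string(as_narrow(text)) == "ab");
}
```

In C++20, and assuming pugi::char_t is char (PUGIXML_WCHAR_MODE not defined), a call such as `attr.set_value(as_narrow(u8"a string"))` would then pick the string overload instead of set_value(bool).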
