Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support unicode in paths #998

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

xfc1939
Copy link

@xfc1939 xfc1939 commented Dec 21, 2023

Fixes #984

@codecov-commenter
Copy link

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (931323d) 64.14% compared to head (0932ea5) 64.32%.
Report is 8 commits behind head on master.

Files Patch % Lines
src/logging.cc 80.00% 3 Missing and 4 partials ⚠️
src/utilities.cc 60.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #998      +/-   ##
==========================================
+ Coverage   64.14%   64.32%   +0.18%     
==========================================
  Files          17       17              
  Lines        3327     3330       +3     
  Branches     1125     1126       +1     
==========================================
+ Hits         2134     2142       +8     
+ Misses        766      760       -6     
- Partials      427      428       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sergiud
Copy link
Collaborator

sergiud commented Jan 3, 2024

Thanks for the PR. I need some time to look through the changes.

In the meantime, could you please rebase onto current head and resolve the conflicts?

@sergiud sergiud linked an issue Jan 3, 2024 that may be closed by this pull request
Copy link
Collaborator

@sergiud sergiud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I had a careful look at the implementation, I see two major issues:

  1. Switching to std::wstring under UNICODE definition will break existing user code. This is something we definitely want to avoid. Support for wide strings should be transparent.
  2. Many of the functions are duplicated with respect to wide strings. For maintenance purposes, the std::basic_string code should be unified using templates to encapsulate the logic into a single function or method.

src/glog/logging.h.in Outdated Show resolved Hide resolved
src/glog/logging.h.in Outdated Show resolved Hide resolved
src/googletest.h Outdated Show resolved Hide resolved
src/logging.cc Show resolved Hide resolved
@xfc1939
Copy link
Author

xfc1939 commented Jan 11, 2024

Now I have modified the DefaultDir() function by used template. Could you give me some suggestions.

template<class CharT>
static const std::basic_string<typename std::enable_if_t<std::is_same_v<CharT, wchar_t>, CharT>> DefaultLogDir() {
        std::wstring env = getenvw((wchar_t*)TEXT("GOOGLE_LOG_DIR"));
        if (!env.empty()) {
            return env;
        }
        env = getenvw((wchar_t*)TEXT("TEST_TMPDIR"));
        return env;
}

template<class CharT>
static const std::basic_string<typename  std::enable_if_t<std::is_same_v<CharT, char>, CharT>> DefaultLogDir() 
{
    const char* env;
    env = getenv("GOOGLE_LOG_DIR");
    if (env != nullptr && env[0] != '\0') {
        return env;
    }
    env = getenv("TEST_TMPDIR");
    if (env != nullptr && env[0] != '\0') {
        return env;
    }
    return "";
}

template<class CharT>
static const std::basic_string<typename  std::enable_if_t<!std::is_same_v<CharT, char> && !std::is_same_v<CharT, wchar_t>, CharT>> DefaultLogDir()
{
    return {};
}

@sergiud
Copy link
Collaborator

sergiud commented Jan 11, 2024

In the code above, DefaultDir sill uses duplicated logic while the only difference is the use of getenv for ANSI and Unicode strings. It is therefore more meaningful to provide a unified overload for the corresponding functions.

Something like this should do the job:

namespace details {
using std::getenv; // char variant
const wchar_t* getenv(const wchar_t* name) {
    return _wgetenv(name);
}
} // namespace internal

template<class Ch, std::size_t N>
std::basic_string<Ch> DefaultLogDir(const std::array<const Ch*, N>& names) {
    for (const Ch* name : names) {
        if ((const Ch* var = details::getenv(name)) != nullptr) {
            return var;
        }
    }
    return {};
}

Now you only need to specialize the literals:

template<class Ch>
struct LogDirEnvVar;

template<>
struct LogDirEnvVar<char> {
    constexpr static std::array<const char*, 2> names() noexcept {
        return {"GOOGLE_LOG_DIR", "TEST_TMPDIR"};
    }
};

template<>
struct LogDirEnvVar<wchar_t> {
    constexpr static std::array<const wchar_t*, 2> names() noexcept {
        return {L"GOOGLE_LOG_DIR", L"TEST_TMPDIR"};
    }
};

Eventually, you can define

template<class Ch>
decltype(auto) DefaultLogDir() {
    return DefaultLogDir(LogDirEnvVar<Ch>::names());
}

and use it as follows:

auto dir1 = DefaultLogDir<char>();
auto dir2 = DefaultLogDir<wchar_t>();

However, I'm not sure this is really necessary here. Can't you just always use the wide string variant here?

@xfc1939
Copy link
Author

xfc1939 commented Jan 13, 2024

In the code above, DefaultDir sill uses duplicated logic while the only difference is the use of getenv for ANSI and Unicode strings. It is therefore more meaningful to provide a unified overload for the corresponding functions.

Something like this should do the job:

namespace details {
using std::getenv; // char variant
const wchar_t* getenv(const wchar_t* name) {
    return _wgetenv(name);
}
} // namespace internal

template<class Ch, std::size_t N>
std::basic_string<Ch> DefaultLogDir(const std::array<const Ch*, N>& names) {
    for (const Ch* name : names) {
        if ((const Ch* var = details::getenv(name)) != nullptr) {
            return var;
        }
    }
    return {};
}

Now you only need to specialize the literals:

template<class Ch>
struct LogDirEnvVar;

template<>
struct LogDirEnvVar<char> {
    constexpr static std::array<const char*, 2> names() noexcept {
        return {"GOOGLE_LOG_DIR", "TEST_TMPDIR"};
    }
};

template<>
struct LogDirEnvVar<wchar_t> {
    constexpr static std::array<const wchar_t*, 2> names() noexcept {
        return {L"GOOGLE_LOG_DIR", L"TEST_TMPDIR"};
    }
};

Eventually, you can define

template<class Ch>
decltype(auto) DefaultLogDir() {
    return DefaultLogDir(LogDirEnvVar<Ch>::names());
}

and use it as follows:

auto dir1 = DefaultLogDir<char>();
auto dir2 = DefaultLogDir<wchar_t>();

However, I'm not sure this is really necessary here. Can't you just always use the wide string variant here?

Thanks for your suggestions; it's a very nice thing for me. Regarding your question, I don't have much experience with platforms other than Windows, but I will try to do it.

@xfc1939 xfc1939 closed this Jan 29, 2024
@sergiud sergiud reopened this Jan 29, 2024
@sergiud
Copy link
Collaborator

sergiud commented Jan 29, 2024

@xfc1939 Please do not close the PRs to create a new one. Simply update your branch and push the changes. Thank you.

@sergiud sergiud changed the title feature: Support unicode in paths #995 feature: upport unicode in paths Jan 29, 2024
@sergiud sergiud changed the title feature: upport unicode in paths support unicode in paths Jan 29, 2024
@xfc1939 xfc1939 force-pushed the feature/support_unicode_paths branch from 0932ea5 to 7102881 Compare January 30, 2024 15:06
@xfc1939
Copy link
Author

xfc1939 commented Jan 30, 2024

@xfc1939 Please do not close the PRs to create a new one. Simply update your branch and push the changes. Thank you.

all right, I have updated my branch and sumited here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Unicode in paths
3 participants