Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature](datatype) add BE config to allow zero date #34961

Merged
merged 3 commits into from
May 22, 2024

Conversation

kaijchen
Copy link
Contributor

@kaijchen kaijchen commented May 16, 2024

Proposed changes

Add a BE config allow_zero_date, which defaults to false.

If set to false, 0000-00-00 will be converted to NULL (old behavior).
If set to true, 0000-00-00 will be converted to 0000-01-01.

Further comments

If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...

Co-authored-by: Gabriel [email protected]

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@kaijchen
Copy link
Contributor Author

run buildall

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

return from_date_str_base(date_str, len, scale, &local_time_zone);
const cctz::time_zone& local_time_zone, int scale /* = -1*/,
bool convert_zero) {
return from_date_str_base(date_str, len, scale, &local_time_zone, convert_zero);
}
template <typename T>
bool DateV2Value<T>::from_date_str_base(const char* date_str, int len, int scale,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: function 'from_date_str_base' has cognitive complexity of 118 (threshold 50) [readability-function-cognitive-complexity]

bool DateV2Value<T>::from_date_str_base(const char* date_str, int len, int scale,
                     ^
Additional context

be/src/vec/runtime/vdatetime_value.cpp:2000: +1, including nesting penalty of 0, nesting level increased to 1

    while (ptr < end && isspace(*ptr)) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2000: +1

    while (ptr < end && isspace(*ptr)) {
                     ^

be/src/vec/runtime/vdatetime_value.cpp:2003: +1, including nesting penalty of 0, nesting level increased to 1

    if (ptr == end || !isdigit(*ptr)) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2003: +1

    if (ptr == end || !isdigit(*ptr)) {
                   ^

be/src/vec/runtime/vdatetime_value.cpp:2008: +1, including nesting penalty of 0, nesting level increased to 1

    while (pos < end && (isdigit(*pos) || *pos == 'T')) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2008: +1

    while (pos < end && (isdigit(*pos) || *pos == 'T')) {
                     ^

be/src/vec/runtime/vdatetime_value.cpp:2008: +1

    while (pos < end && (isdigit(*pos) || *pos == 'T')) {
                                       ^

be/src/vec/runtime/vdatetime_value.cpp:2018: +1, including nesting penalty of 0, nesting level increased to 1

    if (pos == end || *pos == '.' ||
    ^

be/src/vec/runtime/vdatetime_value.cpp:2018: +1

    if (pos == end || *pos == '.' ||
                                  ^

be/src/vec/runtime/vdatetime_value.cpp:2020: +2, including nesting penalty of 1, nesting level increased to 2

        if (digits == 4 || digits == 8 || digits >= 14) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2020: +1

        if (digits == 4 || digits == 8 || digits >= 14) {
                                       ^

be/src/vec/runtime/vdatetime_value.cpp:2022: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/runtime/vdatetime_value.cpp:2032: +1, including nesting penalty of 0, nesting level increased to 1

    while (ptr < end && isdigit(*ptr) && field_idx < MAX_DATE_PARTS) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2032: +1

    while (ptr < end && isdigit(*ptr) && field_idx < MAX_DATE_PARTS) {
                                      ^

be/src/vec/runtime/vdatetime_value.cpp:2035: +1

        bool scan_to_delim = (!is_interval_format) && (field_idx != 6);
                                                   ^

be/src/vec/runtime/vdatetime_value.cpp:2036: +2, including nesting penalty of 1, nesting level increased to 2

        while (ptr < end && isdigit(*ptr) && (scan_to_delim || field_len--)) { // field_len <= 7
        ^

be/src/vec/runtime/vdatetime_value.cpp:2036: +1

        while (ptr < end && isdigit(*ptr) && (scan_to_delim || field_len--)) { // field_len <= 7
                                          ^

be/src/vec/runtime/vdatetime_value.cpp:2036: +1

        while (ptr < end && isdigit(*ptr) && (scan_to_delim || field_len--)) { // field_len <= 7
                                                            ^

be/src/vec/runtime/vdatetime_value.cpp:2041: +2, including nesting penalty of 1, nesting level increased to 2

        if (field_idx == 6) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2042: +3, including nesting penalty of 2, nesting level increased to 3

            if constexpr (is_datetime) {
            ^

be/src/vec/runtime/vdatetime_value.cpp:2047: +4, including nesting penalty of 3, nesting level increased to 4

                if (scale >= 0) {
                ^

be/src/vec/runtime/vdatetime_value.cpp:2058: +5, including nesting penalty of 4, nesting level increased to 5

                    if (reminder >= 5 * normalizer) {
                    ^

be/src/vec/runtime/vdatetime_value.cpp:2063: +5, including nesting penalty of 4, nesting level increased to 5

                    if (temp_val == int_exp10(7)) {
                    ^

be/src/vec/runtime/vdatetime_value.cpp:2066: +1, nesting level increased to 5

                    } else {
                      ^

be/src/vec/runtime/vdatetime_value.cpp:2072: +4, including nesting penalty of 3, nesting level increased to 4

                while (ptr < end && isdigit(*ptr)) {
                ^

be/src/vec/runtime/vdatetime_value.cpp:2072: +1

                while (ptr < end && isdigit(*ptr)) {
                                 ^

be/src/vec/runtime/vdatetime_value.cpp:2075: +1, nesting level increased to 3

            } else {
              ^

be/src/vec/runtime/vdatetime_value.cpp:2083: +2, including nesting penalty of 1, nesting level increased to 2

        if (temp_val > 999999L) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2089: +2, including nesting penalty of 1, nesting level increased to 2

        if (field_idx == 6) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2093: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/runtime/vdatetime_value.cpp:2099: +2, including nesting penalty of 1, nesting level increased to 2

        if (ptr == end) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2105: +2, including nesting penalty of 1, nesting level increased to 2

        if (UNLIKELY((field_idx > 2 ||
        ^

be/src/vec/runtime/vdatetime_value.cpp:2107: +1

                     && time_zone_begins(ptr, end))) {
                     ^

be/src/vec/runtime/vdatetime_value.cpp:2105: +1

        if (UNLIKELY((field_idx > 2 ||
                                    ^

be/src/vec/runtime/vdatetime_value.cpp:2108: +3, including nesting penalty of 2, nesting level increased to 3

            if (local_time_zone == nullptr) {
            ^

be/src/vec/runtime/vdatetime_value.cpp:2111: nesting level increased to 3

            auto get_tz_offset = [&](const std::string& str_tz,
                                 ^

be/src/vec/runtime/vdatetime_value.cpp:2114: +4, including nesting penalty of 3, nesting level increased to 4

                if (!TimezoneUtils::find_cctz_time_zone(str_tz, given_tz)) {
                ^

be/src/vec/runtime/vdatetime_value.cpp:2125: +3, including nesting penalty of 2, nesting level increased to 3

            } catch ([[maybe_unused]] Exception& e) {
              ^

be/src/vec/runtime/vdatetime_value.cpp:2132: +2, including nesting penalty of 1, nesting level increased to 2

        if (field_idx == 2 && *ptr == 'T') {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2132: +1

        if (field_idx == 2 && *ptr == 'T') {
                           ^

be/src/vec/runtime/vdatetime_value.cpp:2140: +2, including nesting penalty of 1, nesting level increased to 2

        if (field_idx == 5) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2141: +3, including nesting penalty of 2, nesting level increased to 3

            if (*ptr == '.') {
            ^

be/src/vec/runtime/vdatetime_value.cpp:2146: +4, including nesting penalty of 3, nesting level increased to 4

                if constexpr (is_datetime) {
                ^

be/src/vec/runtime/vdatetime_value.cpp:2148: +1, nesting level increased to 4

                } else {
                  ^

be/src/vec/runtime/vdatetime_value.cpp:2151: +1, nesting level increased to 3

            } else if (isdigit(*ptr)) {
                   ^

be/src/vec/runtime/vdatetime_value.cpp:2159: +2, including nesting penalty of 1, nesting level increased to 2

        while (ptr < end && (ispunct(*ptr) || isspace(*ptr))) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2159: +1

        while (ptr < end && (ispunct(*ptr) || isspace(*ptr))) {
                         ^

be/src/vec/runtime/vdatetime_value.cpp:2159: +1

        while (ptr < end && (ispunct(*ptr) || isspace(*ptr))) {
                                           ^

be/src/vec/runtime/vdatetime_value.cpp:2160: +3, including nesting penalty of 2, nesting level increased to 3

            if (isspace(*ptr)) {
            ^

be/src/vec/runtime/vdatetime_value.cpp:2161: +4, including nesting penalty of 3, nesting level increased to 4

                if (((1 << field_idx) & allow_space_mask) == 0) {
                ^

be/src/vec/runtime/vdatetime_value.cpp:2165: +3, including nesting penalty of 2, nesting level increased to 3

            if (*ptr == '-') {
            ^

be/src/vec/runtime/vdatetime_value.cpp:2174: +1, including nesting penalty of 0, nesting level increased to 1

    if (!is_interval_format) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2177: +1, including nesting penalty of 0, nesting level increased to 1

    for (; field_idx < MAX_DATE_PARTS; ++field_idx) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2181: +1, including nesting penalty of 0, nesting level increased to 1

    if (year_len == 2) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2182: +2, including nesting penalty of 1, nesting level increased to 2

        if (date_val[0] < YY_PART_YEAR) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2184: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/runtime/vdatetime_value.cpp:2189: +1, including nesting penalty of 0, nesting level increased to 1

    if (num_field < 3) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2192: +1, including nesting penalty of 0, nesting level increased to 1

    if (is_invalid(date_val[0], date_val[1], date_val[2], 0, 0, 0, 0)) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2193: +2, including nesting penalty of 1, nesting level increased to 2

        if (date_val[0] == 0 && date_val[1] == 0 && date_val[2] == 0 && convert_zero) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2193: +1

        if (date_val[0] == 0 && date_val[1] == 0 && date_val[2] == 0 && convert_zero) {
                                                                     ^

be/src/vec/runtime/vdatetime_value.cpp:2196: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/runtime/vdatetime_value.cpp:2205: +1, including nesting penalty of 0, nesting level increased to 1

    if constexpr (!is_datetime) {
    ^

be/src/vec/runtime/vdatetime_value.cpp:2206: +2, including nesting penalty of 1, nesting level increased to 2

        if (sec_offset) {
        ^

be/src/vec/runtime/vdatetime_value.cpp:2208: +3, including nesting penalty of 2, nesting level increased to 3

            if (!tmp.check_range_and_set_time(date_val[0], date_val[1], date_val[2], date_val[3],
            ^

be/src/vec/runtime/vdatetime_value.cpp:2212: +3, including nesting penalty of 2, nesting level increased to 3

            if (!tmp.date_add_interval<TimeUnit::SECOND>(
            ^

be/src/vec/runtime/vdatetime_value.cpp:2221: +1, including nesting penalty of 0, nesting level increased to 1

    if (!check_range_and_set_time(date_val[0], date_val[1], date_val[2], date_val[3], date_val[4],
    ^

be/src/vec/runtime/vdatetime_value.cpp:2226: +1, including nesting penalty of 0, nesting level increased to 1

    return sec_offset ? date_add_interval<TimeUnit::SECOND>(
                      ^

dataroaring
dataroaring previously approved these changes May 16, 2024
Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 16, 2024
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

PR approved by anyone and no changes requested.

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label May 16, 2024
@kaijchen
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.71% (9005/25219)
Line Coverage: 27.37% (74499/272154)
Region Coverage: 26.62% (38517/144718)
Branch Coverage: 23.44% (19652/83846)
Coverage Report: http://coverage.selectdb-in.cc/coverage/4ef1fb254624593f88995a4ad28b18133653a656_4ef1fb254624593f88995a4ad28b18133653a656/report/index.html

@liaoxin01
Copy link
Contributor

Please add a regression test.

@dataroaring
Copy link
Contributor

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.71% (9009/25228)
Line Coverage: 27.37% (74523/272254)
Region Coverage: 26.63% (38556/144805)
Branch Coverage: 23.45% (19675/83894)
Coverage Report: http://coverage.selectdb-in.cc/coverage/73b64113eaa9aa7687cf9ff43ae6175f17c1a68d_73b64113eaa9aa7687cf9ff43ae6175f17c1a68d/report/index.html

@dataroaring
Copy link
Contributor

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.70% (9014/25250)
Line Coverage: 27.35% (74544/272527)
Region Coverage: 26.61% (38562/144923)
Branch Coverage: 23.44% (19672/83938)
Coverage Report: http://coverage.selectdb-in.cc/coverage/ff5a83e0b5372189aa01c32616fd6b1a01e26d45_ff5a83e0b5372189aa01c32616fd6b1a01e26d45/report/index.html

Copy link
Contributor

@liaoxin01 liaoxin01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 21, 2024
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dataroaring dataroaring merged commit 9537c07 into apache:master May 22, 2024
25 of 28 checks passed
yiguolei pushed a commit that referenced this pull request May 23, 2024
M1saka2003 pushed a commit to M1saka2003/doris that referenced this pull request May 24, 2024
dataroaring pushed a commit that referenced this pull request May 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants