Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf #1473

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Perf #1473

wants to merge 2 commits into from

Conversation

cgzones
Copy link
Member

@cgzones cgzones commented May 6, 2024

  • Annotate mbstowcs_nonfatal() with restrict

    The arguments to our mbrtowc(3) wrapper should not alias, since they also must not for mbrtowc(3).

    Might help some compilers optimizing the code.

  • Linux: use fast integer parsing for /proc//stat

    /proc//stat is parsed for every displayed process on every iteration containing several numbers to parse.
    Use out own implementations since strto(u)l(3) came up during profiling.

    @BenBE what was the reason for adding the parameter maxlen to those functions?

The arguments to our mbrtowc(3) wrapper should not alias, since they
also must not for mbrtowc(3).

Might help some compilers optimizing the code.
@BenBE BenBE added the enhancement Extension or improvement to existing feature label May 6, 2024
@BenBE
Copy link
Member

BenBE commented May 6, 2024

The reason for the maxlen argument had mostly two aspects:

  • Defense in depth regarding avoiding buffer overruns
  • Unless the input is all zero, every string above a certain length will overflow.

/proc/<pid>/stat is parsed for every displayed process on every
iteration containing several numbers to parse.  Use out own
implementations since strto(u)l(3) came up during profiling.
@@ -78,11 +78,68 @@ static FILE* fopenat(openat_arg_t openatArg, const char* pathname, const char* m
return fp;
}

static long fast_strtol_dec(char** str, int maxlen) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about just calling fast_stroull_dec() and downcast the result for this one?

static long fast_strtol_dec(char** str, int maxlen) {
   long long result = fast_strtoll_dec(str, maxlen);
   assert(result <= LONG_MAX);
   assert(result >= LONG_MIN);
   return (long)result;
}

result *= 10;
result += **str - '0';
(*str)++;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "while" loop can be converted to a fast_strtoull_dec(str, maxlen) and then cast the result from unsigned to signed.

static inline uint64_t fast_strtoull_dec(char** str, int maxlen) {
register uint64_t result = 0;

if (!maxlen)
--maxlen;
maxlen = 20; // 18446744073709551615
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documenting the previous default makes no sense for this line. If at all we want documentation it would explain, that this is "assume max reasonable length if none given". I'm for dropping this comment though …

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, the standard strtoull() permits almost unlimited number of leading zeros in the string. While it won't certainly happen in parsing any Linux procfs file, having an arbitrary length limit in a strtoull() implementation would technically violate ISO C.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is less about ISO C compliance, but more about squeeze out cycles. fast_strtoull is not meant to comply to ISO C, but to parse the data we actually encounter in realistic situations. Otherwise we'd have to support locales and a full load of other stuff (different radix prefixes) we don't need …

}

if (!maxlen)
maxlen = 20; // −9223372036854775808
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If at all, document the why not the magic value …

And the why is basically "because C does not have default arguments" …

}

if (!maxlen)
maxlen = 20; // −9223372036854775808
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar here: Drop that comment.

unsigned long result = 0;

if (!maxlen)
maxlen = 20; // 18446744073709551615
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: Drop this comment …

Comment on lines +108 to +112
while (maxlen-- && **str >= '0' && **str <= '9') {
result *= 10;
result += **str - '0';
(*str)++;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same replacement of this while loop could be done here too …

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Extension or improvement to existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants