Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

faster print time (< 10ns) #2931

Closed
Theramar opened this issue Nov 5, 2023 · 6 comments
Closed

faster print time (< 10ns) #2931

Theramar opened this issue Nov 5, 2023 · 6 comments

Comments

@Theramar
Copy link

Theramar commented Nov 5, 2023

Taken from fmtlog (emphasis mine)

The results on a linux server with "Intel(R) Xeon(R) Gold 6144 CPU @ 3.50GHz" is:

Message fmtlog Nanolog spdlog
staticString 6.4 ns, 7.08 M/s 6.5 ns, 33.10 M/s 156.4 ns, 6.37 M/s
stringConcat 6.4 ns, 6.05 M/s 7.5 ns, 14.20 M/s 209.4 ns, 4.77 M/s
singleInteger 6.3 ns, 6.22 M/s 6.5 ns, 50.29 M/s 202.3 ns, 4.94 M/s
twoIntegers 6.4 ns, 4.87 M/s 6.6 ns, 39.25 M/s 257.2 ns, 3.89 M/s
singleDouble 6.2 ns, 5.37 M/s 6.5 ns, 39.62 M/s 225.0 ns, 4.44 M/s
complexFormat 6.4 ns, 2.95 M/s 6.7 ns, 24.30 M/s 390.9 ns, 2.56 M/s

Note that the throughput of Nanolog is not comparable here because it outputs to binary log file instead of human-readable text format, e.g. it saves an int64 timestamp instead of a long formatted date time string.

How can fmtlog achieve such low and stable latency? Two key optimization techniques are employed inspired by Nanolog:

One is allocating a single producer single consumer queue for each logging thread, and have the background thread polling for all these queues. This avoids threads contention and performance will not deteriorate when thread number increases. The queue is automatically created on the first log msg of a thread, so queue is not created for threads that don't use fmtlog. The thread queue has a default size of 1 MB(can be changed by macro FMTLOG_QUEUE_SIZE), and it takes a little time to allocate the queue. It's recommended that user actively calls fmt::preallocate() once the thread is created, so even the first log can have low latency.

What happens when the queue is full? By default, fmtlog simply dump addtional log msgs and return. Alternatively, front-end logging can be blocked while the queue is full by defining macro FMTLOG_BLOCK=1, then no log will be missing. User can register a callback function when log queue is full by fmtlog::setLogQFullCB(cb, userData), by which user can be aware that the consumer(polling thread) is not keeping up. Normally, the queue being full is seldom a problem, but incautious user could leave log statements that are invoked in an unexpected high frequency, e.g. a tcp client spamming with "Connection refused" errors without a connection retry delay. To handle this problem in an elegant way, fmtlog provides a log macro which limits frequency of this log: FMTLOG_LIMIT and 4 shortcuts logil, logwl and logel respectively, user needs to pass the mininum interval in nanosecond as the first argument, e.g.

logil(1e9, "this log will be displayed at most once per second");

The other optimization is that static information of a log(such as format string, log level and location) is saved in a table at its first call, and fmtlog simply pushes the index of the static info table entry with dynamic arguments to the queue, minimizing the msg size. In addition, fmtlog defines a decoding function for each log statment, which is invoked in fmtlog::poll()when the log msg is popped from the queue.

However, these decoding functions bloat program size with each function consuming around 50 bytes. In addition, the static infomation entry also consumes 50-ish bytes runtime memory for each log statement. Such memory overhead may not be worthwhile for those infrequent and latency insensitive logs(e.g. program initialization info), thus fmtlog provides user with another log macro which disables this optimization: FMTLOG_ONCE and of couse shortcuts: logdo, logio, logwoand logeo. FMTLOG_ONCE will not create a static info table entry, nor add a decoding function: it pushes static info along with formatted msg body onto the queue. Note that passing argument by pointer is not supported by `FMTLOG_ONCE`.

For those who prefer to further optimize memory usage by filtering log at compile time, macro FMTLOG_ACTIVE_LEVEL is applied with a default value FMTLOG_LEVEL_INF, meaning debug logs will simply be discarded at compile time. Note that FMTLOG_ACTIVE_LEVEL only applies to log shortcut macros, e.g. logi, but not FMTLOG. Similarly, runtime log level filtering can be disabled by defining macro FMTLOG_NO_CHECK_LEVEL, which will increase performance and reduce generated code size a bit.

Is there a way in spdlog to achieve similar speed? Or at least access the same set of features.

@gabime
Copy link
Owner

gabime commented Nov 5, 2023

spdlog async is much slower than synchronous spdlog. It is interesting to compare to the synchronous version of spdlog.

@Theramar
Copy link
Author

Theramar commented Nov 5, 2023

Surely it would be nice however IIRC spdlog does not offer an binary encoding/decoding (see Nanolog) mechanism which can increase performance

@Theramar
Copy link
Author

Theramar commented Nov 6, 2023

@gabime the benchmark shows synchronous spdlog, see MengRao/fmtlog#84

@Theramar
Copy link
Author

Theramar commented Nov 8, 2023

any updates?

@tt4g
Copy link
Contributor

tt4g commented Nov 8, 2023

I am not familiar with how fmtlog works, but it does not appear to support enhancements by developers.
spdlog is extensible in that developers can define subclasses of sink.
Instead, there is the overhead of calling virtual functions at runtime, which can have a significant impact on print speed (and thus limits compile-time optimization).

Most of fmtlog's optimizations are not likely to be imported into spdlog, but a PR that improves print speed would be welcome.

@hanzz2007
Copy link

hanzz2007 commented Nov 29, 2023

image

seems fmtlog compare with sync spdlog while itself use async mode inside. So that is unfair

@gabime gabime closed this as completed May 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants