Configuration option to send Buffer in newest-to-oldest order #15208

Open

kj4tmp opened this issue Apr 22, 2024 · 4 comments

Labels
feature request (Requests for new plugin and for new features to existing plugins), help wanted (Request for community participation, code, contribution), size/m (2-4 day effort)

Comments

kj4tmp commented Apr 22, 2024

Use Case

The Telegraf buffer is sometimes used to temporarily accommodate high write rates to an InfluxDB instance.

Sometimes it is preferable to see the newest data in InfluxDB first and backfill the older data once the production rate decreases.

Expected behavior

I expected to see a configuration option for the buffer to allow it to flush the newest data first instead of the oldest data.

Actual behavior

The oldest data is always flushed first.

Additional info

No response

kj4tmp added the feature request label Apr 22, 2024
kj4tmp changed the title from "Configuration to send Buffer in newests to oldest order" to "Configuration option to send Buffer in newest-to-oldest order" Apr 22, 2024
powersj (Contributor) commented Apr 22, 2024

Hi,

The telegraf buffer is sometimes used to temporarily accommodate high write rates to an influxdb instance.

How long is your output down such that you actually see the impact of FIFO occur? Can you provide some logs giving an example? I am initially hesitant to treat the buffer like a stack versus a queue without understanding if your stack would ever flush in your situation.

FWIW, the running outputs request a batch from the buffer, and the buffer returns a slice of at most batch-size metrics to write.
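
For reference, the knobs involved here are the standard agent settings; the values below are just the documented defaults, not a recommendation:

[agent]
  ## Metrics are sent to outputs in batches of at most this many metrics
  ## (this is the "batch" the running output requests from the buffer).
  metric_batch_size = 1000

  ## Maximum number of unwritten metrics cached per output; when the buffer
  ## fills, the oldest metrics are dropped first. This is the FIFO buffer
  ## discussed in this issue.
  metric_buffer_limit = 10000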

powersj added the waiting for response label Apr 22, 2024
kj4tmp (Author) commented Apr 23, 2024

Our application requires accommodating bursts of writes exceeding 1 million field writes/second (roughly the typical limit of a single-node InfluxDB OSS instance) for up to 20 minutes, while also reporting live down-sampled data. This can be accomplished with a Telegraf buffer of about 30 GB (on our machine).

The FIFO behavior of the Telegraf buffer means that if we do nothing, the data in InfluxDB will be about 10 minutes behind by the end of the 20-minute burst, violating the live down-sampled data reporting requirement.

One option to satisfy this live + buffering requirement is to use multiple InfluxDB output plugins and pipe the live down-sampled data to a dedicated output plugin instance using tag-based routing: an aggregators.final instance tags the down-sampled stream, and a processor strips the "_final" suffix back off the aggregator output (the output-side routing is sketched after the config below).

###############################################################################
#                       AGGREGATOR PLUGINS                                    #
###############################################################################

# 10 Hz Aggregator
[[aggregators.final]]
  # alias = "10-Hz-Aggregator"
  ## The period on which to flush & clear the aggregator.
  period = "0.1s"

  ## If true, the original metric will be dropped by the
  ## aggregator and will not get sent to the output plugins.
  # drop_original = false

  ## The time that a series is not updated until considering it final. Ignored
  ## when output_strategy is "periodic".
  # series_timeout = "5m"

  ## Output strategy, supported values:
  ##   timeout  -- output a metric if no new input arrived for `series_timeout`
  ##   periodic -- output the last received metric every `period`
  output_strategy = "periodic"
  [aggregators.final.tags]
    aggregator = "10-Hz-Aggregator"

###############################################################################
#                       PROCESSOR PLUGINS                                     #
###############################################################################

# Trim the _final stuff from the aggregator plugin
[[processors.regex]]
  ## Other configurations for the processor...

  ## Rename metric fields to strip "_final" suffix
  [[processors.regex.field_rename]]
    ## Regular expression to match on the field name
    pattern = "(.*)_final$"
    ## Replacement expression defining the name of the new field
    replacement = "${1}"

  [processors.regex.tagpass]
    aggregator = ["10-Hz-Aggregator"]

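A sketch of the output side of that routing; the URLs, organization, bucket names, and token below are placeholders and would need to match your setup:

[[outputs.influxdb_v2]]
  ## Dedicated output for the live, down-sampled stream
  alias = "live-downsampled"
  urls = ["http://127.0.0.1:8086"]
  token = "$INFLUX_TOKEN"
  organization = "example-org"
  bucket = "live"
  ## Only pass metrics tagged by the 10 Hz aggregator
  [outputs.influxdb_v2.tagpass]
    aggregator = ["10-Hz-Aggregator"]

[[outputs.influxdb_v2]]
  ## Output for the full-rate stream that is allowed to lag during bursts
  alias = "full-rate"
  urls = ["http://127.0.0.1:8086"]
  token = "$INFLUX_TOKEN"
  organization = "example-org"
  bucket = "raw"
  ## Drop the aggregated stream here so it is not written twice
  [outputs.influxdb_v2.tagdrop]
    aggregator = ["10-Hz-Aggregator"]
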
The solution could be simpler if the FIFO buffer could instead be configured as LIFO, though that may have some unforeseen consequences for users who rely on the "last write wins" aspect of how InfluxDB handles duplicate data.
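
Purely to illustrate the request (this option does not exist today; the name and placement are made up), the knob could look something like:

[agent]
  ## HYPOTHETICAL, not implemented: order in which buffered metrics are
  ## handed to outputs. "oldest_first" would keep the current FIFO behavior,
  ## "newest_first" would flush the most recent data first.
  # buffer_flush_order = "newest_first"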

telegraf-tiger bot removed the waiting for response label Apr 23, 2024
kj4tmp (Author) commented Apr 24, 2024

Here is a relevant issue: #5633

powersj (Contributor) commented Apr 25, 2024

Thanks for the background, it does help to understand the situation and the desire for this better. My initial concerns are twofold:

First, some outputs simply do not work with sending newer data first. For example, stackdriver absolutely requires data to be sent in order, oldest to newest. We have had issues come up in the past where this was not occurring. Adding an option like this, while opt-in, would prevent the usage of certain plugins, and we would need to make that very clear somehow.

Second, I do have a concern with using the buffer like a stack, in that it is possible some metrics would get lost or sit in the stack indefinitely, whereas the queue ensures nothing stays in there forever.

That said, because this is opt-in, I am not opposed to adding some sort of option to allow this behavior. We are currently working on a rework of the buffer implementation to allow us to write to files and not just to memory. I would want to make this sort of change after that work has landed.

Next steps: continue work on the buffer implementation, get the initial restructure in place, and then look to add an option for how the buffer is read.

powersj added the help wanted and size/m labels Apr 25, 2024