docs: Add URL config behavior spec #15321

Merged
merged 2 commits into from
May 16, 2024
75 changes: 75 additions & 0 deletions docs/specs/tsd-007-url-config-behavior.md
@@ -0,0 +1,75 @@
# URL-Based Config Behavior

## Objective

Define the retry and reload behavior of remote URLs that are passed as config to
Telegraf. For retry, Telegraf currently attempts to load a remote URL three
times and then exits. For reload, Telegraf cannot currently reload configs that
were loaded from a remote URL. This spec proposes user-facing options to extend
both capabilities.

## Keywords

config, error, retry, reload

## Overview

Telegraf allows for loading configurations from local files, directories, and
files via a URL. To handle situations where a configuration file is not yet
available or the network is flaky, the first proposal is to introduce a new CLI
flag: `--config-url-retry-attempts`. This flag would continue to default to
three and would specify the number of attempts made to fetch a remote URL
during the initial startup of Telegraf.

```sh
--config-url-retry-attempts=3  Number of times to attempt to obtain a remote
                               configuration via a URL during startup. Set to
                               -1 for unlimited attempts.
```

These attempts would block Telegraf from starting up completely until one
succeeds or until the attempts run out, at which point Telegraf exits.
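
The blocking retry behavior can be sketched as follows. This is only an
illustration in Python, not Telegraf's implementation: the names `fetch_url` and
`fetch_with_retries` and the `delay_seconds` pause between attempts are
assumptions made for the example, since this spec does not define a delay.

```python
import time
import urllib.request
from typing import Callable


def fetch_url(url: str) -> bytes:
    """Download the config body; urllib raises an OSError subclass on failure."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read()


def fetch_with_retries(
    fetch: Callable[[], bytes],
    attempts: int = 3,
    delay_seconds: float = 0.0,  # hypothetical pause, not part of the spec
) -> bytes:
    """Call fetch() until it succeeds or the attempts are used up.

    attempts=-1 means unlimited attempts, mirroring the proposed flag.
    """
    tried = 0
    while attempts == -1 or tried < attempts:
        tried += 1
        try:
            return fetch()
        except OSError as err:
            print(f"attempt {tried} failed: {err}")
            # Only pause when another attempt will follow.
            if attempts == -1 or tried < attempts:
                time.sleep(delay_seconds)
    # Out of attempts: the caller is expected to exit at this point.
    raise RuntimeError(f"giving up after {tried} attempts")
```

A caller would wrap the download, for example
`fetch_with_retries(lambda: fetch_url("http://localhost:8000/config.toml"))`,
and treat the final `RuntimeError` as the signal to exit.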

Once Telegraf is up and running, users can use the `--watch` flag to enable
watching local files for changes and, if changes are made, reload Telegraf with
the new configuration. For remote URLs, I propose a new CLI flag:
`--config-url-watch-interval`. This flag would set an internal timer that, each
time it fires, checks for an update to a remote URL file.

```sh
--config-url-watch-interval=0s  Time duration to check for updates to URL based
                                configuration files. Disabled by default.
```

At each interval, Telegraf would send an HTTP HEAD request to the configuration
URL. Here is an example curl HEAD request and its output:

```sh
$ curl --head http://localhost:8000/config.toml
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.12.3
Date: Mon, 29 Apr 2024 18:18:56 GMT
Content-type: application/octet-stream
Content-Length: 1336
Last-Modified: Mon, 29 Apr 2024 11:44:19 GMT
```

The proposal then is to store the `Last-Modified` value when we first obtain the
file and compare it against the value returned at each interval. There is no
need to parse the value; just store the raw string. If there is a difference,
trigger a reload.
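
A minimal sketch of the raw-string comparison, in Python for illustration only.
The class name `LastModifiedTracker` and its `changed` method are hypothetical
and do not come from Telegraf. Because the strings are compared for equality
rather than parsed as dates, any difference counts as a change, including a
timestamp that moves backwards.

```python
class LastModifiedTracker:
    """Remembers the raw Last-Modified string seen for one config URL."""

    def __init__(self, initial: str) -> None:
        # Value captured when the config file was first obtained.
        self.value = initial

    def changed(self, current: str) -> bool:
        """Compare raw strings; on a difference, remember the new value."""
        if current == self.value:
            return False
        self.value = current
        return True


tracker = LastModifiedTracker("Mon, 29 Apr 2024 11:44:19 GMT")
# Same header value at the next interval: no reload.
same = tracker.changed("Mon, 29 Apr 2024 11:44:19 GMT")
# Different header value: trigger a reload.
different = tracker.changed("Mon, 29 Apr 2024 12:00:00 GMT")
```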

If anything other than a 2xx response code is returned from the HEAD request,
Telegraf would print a warning message and retry at the next interval. Telegraf
will continue to run the existing configuration with no change.

If the `Last-Modified` value is empty, which is very unlikely, Telegraf would
ignore this configuration file and print a single warning message about the
missing field.
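
The per-interval check, including the non-2xx and missing-header cases, could
look roughly like the Python sketch below. It is an assumption-laden
illustration rather than Telegraf's code: `WatchState`, `check_url`,
`http_head`, and the returned action strings are all hypothetical names, and the
sketch records the first header value it sees instead of the one captured at
download time.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple

# (status code, raw Last-Modified value or None)
HeadResult = Tuple[int, Optional[str]]


def http_head(url: str) -> HeadResult:
    """Send a HEAD request; connection errors (URLError) propagate to the caller."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=5) as resp:
            return resp.status, resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; report the status instead.
        return err.code, None


@dataclass
class WatchState:
    last_modified: Dict[str, str] = field(default_factory=dict)
    ignored: Set[str] = field(default_factory=set)


def check_url(state: WatchState, url: str,
              head: Callable[[str], HeadResult] = http_head) -> str:
    """Run one interval check for url and return the action taken."""
    if url in state.ignored:
        return "ignored"  # already warned once, stay quiet
    status, modified = head(url)
    if not 200 <= status < 300:
        print(f"W! HEAD {url} returned {status}; keeping current config")
        return "retry-next-interval"
    if not modified:
        print(f"W! {url} has no Last-Modified value; ignoring it")
        state.ignored.add(url)
        return "ignored"
    previous = state.last_modified.get(url)
    state.last_modified[url] = modified
    if previous is not None and previous != modified:
        return "reload"
    return "unchanged"
```

A timer would call `check_url` for each remote config once per interval and
reload only when it returns `"reload"`. The `head` parameter is injectable so
the logic can be exercised without a network.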

## Relevant Issues

* Configuration capabilities to retry for loading config via URL #[8854][]
* Telegraf reloads URL-based/remote config on a specified interval #[8730][]

[8854]: https://github.com/influxdata/telegraf/issues/8854
[8730]: https://github.com/influxdata/telegraf/issues/8730