Squiggler is our solution for the automation of creation and migration of indices to the rollover pattern. It's not perfect, but it'll do.
This project was originally for ES 6/7.x, and as such has some odd decisions as new features came and went within ES and Kibana. Some things are bolted on, and large portions of this project are no longer used. It has its issues, but for a time it worked great, still more or less does.
Squiggler does the following (examples follow the fictitious some-new-index
):
-
Documents are pushed to a new index that has no rollover alias associated via Logstash, in this case it's
some-new-index
. The index will be perceived as new if:- There's no alias for
some-new-index
to a rollover index. In this case it really is new. - Even if the alias previously existed, if the index it pointed to was closed (due to age) or deleted the alias will be deleted along with it.
- There's no alias for
-
Looks for new indices;
some-new-index
is found.- Filters the new indices found against our exclusion parameters.
- Filters out anything that is actively being used in a current alias (e.g.
cat => cat-2018.08.14-000001
is left alone). - Filters out anything that looks like it used to be part of a rollover index (by regex, e.g.
dog-2018.08.14-000299
is left alone). - Filters out anything that appears to be date indexed, such as monthly/daily/hourly patterns like
dog-2018.08
,cat-2018.08.14
. - Filters out anything with a preceding dot
.
, like.kibana
,.tasks
, etc. These are generally used internally by Elasticsearch or special in their purpose.
-
For each index, a job is spawned that ...
- Creates date math indices for new patterns, in the form of
<some-new-index-{now/d}-000001>
. - Starts a reindex job from the original index
some-new-index
to the date math index ofsome-new-index-{now/d}-000001
. - After the reindex is completed, an atomic request to the aliases API is made that removes the old index
some-new-index
first, and adds a new alias ofsome-new-index
to the rollover indexsome-new-index-2018.08.14-000001
.
- Creates date math indices for new patterns, in the form of
-
Looks for rollable aliases. Submits rollover request based on the
max_age
,max_size
parameters given. The defaults will roll aliases at the 30 day mark and when they reach a 10GiB per shard mark. -
(If enabled) Looks at Kibana's API for index patterns, grabs the list of rollover aliases. Filtering the aliases of anything that is already in Kibana.
-
(If enabled) Then creates individual patterns via Kibana's API, optionally notifying a Slack webhook with the list of aliases.
- Note: technically new rollover indices that are created that already exist as an index pattern in the remote/local Kibana will not re-notify until the index pattern is missing in the destination. This is different than the older Squiggler behavior where it would notify every time a new rollover index was made which could duplicate anytime an index was closed, remade and recreated as a rollover index.
All actions can be selectively enabled or disabled, with unique exclusion parameters at each step in case you want to separate the duties to different cronjobs.
Squiggler is primarily for logging data. It is not ideal for production data due to how it moves data from the old index to the new rollover index. Since there's no nice way within Elasticsearch to cut over data to a rollover pattern, there will likely be some documents lost when reindexing from the original index to the new. This is relatively fine for logging data where it's acceptable we might not have everything nor need everything.
Rollover indices are greatly useful for controlling the number of shards and the size of the indices. This works great for logging data due to how frequent and infrequent some indices are over others. Indices that only receive ~50GiB of data per month do not need a new index for every hour every few days they receive new data. This leads to a build of incredibly small indices with so many shards. Compound this issue with many unique indices with the same issue and you will run out of heap space to store the metadata of them all.
This is not acceptable for production data where you need everything. Therefore, for production type use, it will be preferable to use timestamped indices based on the month/day/hour, potentially reindexing as a job every so often if you want to control the number of indices instead.
The above guidance is no longer an issue with the advent of some modifications to the Logstash output to establish ILM indices directly. As well as the introduction of data streams. Squiggler can still be used to create and dump index patterns with Kibana with these advances. You can see the example changes here in the ES output for logstash. Effectively this coupled with key actions within Squiggler allow ILM indices to spawn without the need to do a reindex and relink. Squiggler would more or less need to "fix" the indices depending on the settings you use in the output.
usage: squiggler.py [-h] [--for-real] [--verbose] config
positional arguments:
config
optional arguments:
-h, --help show this help message and exit
--for-real turn off dry run
--verbose verbose logging
- If you're specifying
remote_kibana
,elasticsearch
must be filled out (duh) so it can find and setup remote index patterns properly. - During parsing of the config on startup, if
elasticsearch
,local_kibama
orremote_kibana
are not accessible on first startup (if they're used) then they will raise an exception and Squiggler will fail. - Exceptions during individual actions should not affect other actions.
See sample.yaml for sample configuration and default field values.
Without verbose logging you get just a summary of the actions taken and their durations. The idea being this is what would be logged on output and ingested to Elasticsearch for aggregations. The timings and a status indicator are also pushed to Statsd (if specified in outputs).
{
"action_durations": {
"create_remote_kibana_patterns": 0.8102409839630127,
"create_rollover_indices": 0.7224798202514648,
"update_rollover_aliases": 0.5127968788146973
},
"event": "finished",
"level": "info",
"logger": "squiggler",
"timestamp": "2018-08-17T18:16:59.584366Z",
"total_duration": 2.045517683029175
}