Problem

Current relabeling actions don't allow blocking label value patterns from appearing in ANY label (as in: not known beforehand). Since high-cardinality label values may appear on any arbitrary label, which may also change over time, we want to define some well-known value patterns that we don't want to ingest, regardless of which label they appear on.
Currently we configure `drop` rules with `source_labels` listing the labels where we have already seen the pattern appear; in some cases only after it had already caused a series cardinality explosion that brought down a Prometheus server in a matter of minutes. Those rules do not guard us against the pattern appearing in any other label (both labels that already exist and ones that don't exist yet).
Over the last few years, this could have prevented multiple Prometheus meltdowns, as we noticed some repeated patterns (epoch-sized numbers/timestamps, IDs, random Docker hostnames, email addresses, etc.).
Solution
Add a new `drop`-like action that does not require specifying `source_labels`: the pattern is matched against all of a sample's labels, and the sample is dropped if any of them matches.
Example metric relabel config:
```yaml
# We don't want to see any epoch-sized, or larger, number in any label
- regex: '.*\d{10,}.*'
  action: dropifany
# We don't want to see any of our typical IDs
- regex: '.*[[:alpha:]]{3}_[[:xdigit:]]{12}.*'
  action: dropifany
```
Obviously, the above is not optimized, but it should illustrate the feature well enough. I've already prototyped this solution and it works in my simple tests.
Alternatives/expansions
- Expand the existing `drop` action's behavior when the `source_labels` list is empty, instead of introducing a new action
- Use `source_labels` as an exclusion list: "If any label not on the `source_labels` list matches the pattern, drop the sample"
  - This might be useful for use cases like: block large numbers, but exclude the histogram `le` label from the drop rule
Hello from the bug scrub: we're open to the idea; however, it could be a foot-gun and drop too much, so clear documentation must be a requirement. Also, the naming and the reuse of the action and `source_labels` need more discussion. We should start with a design doc / proposal. Said design doc could look at alternatives and related requirements: #13664, #12483.