Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid AlertmanagerConfig passes validation and results in 'group_interval cannot be zero' errors #6594

Open
1 task done
leonardo-zorzi opened this issue May 15, 2024 · 6 comments · May be fixed by #6646
Open
1 task done

Comments

@leonardo-zorzi
Copy link

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

Description

AlertmanagerConfig CR with group_interval set to 0 passes validation and results in errors on the alertmanager instances itself.
ts=2024-05-15T12:04:48.725Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config_out/alertmanager.env.yaml err="group_interval cannot be zero"

Steps to Reproduce

spec:
  route:
      groupBy: [ "..." ]
      groupWait: 0s
      groupInterval: 0s
      repeatInterval: 24h
      receiver: "a_receiver"
      routes:
        - matchers:
            - name: "alertname"
              value: "InfoInhibitor"
          receiver: "null"

I run the Operator on v0.67.1 (unfortunately the latest kube-prometheus) but the current master CRD seems to references still the same regex.

Expected Result

CR fails validation.

Actual Result

CR passed validation and gets added to the final config file.

Prometheus Operator Version

v0.67.1

Kubernetes Version

1.27

Kubernetes Cluster Type

GKE

How did you deploy Prometheus-Operator?

Other (please comment)

Manifests

kube-prometheus jsonnet

prometheus-operator log output

`ts=2024-05-15T12:04:48.725Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config_out/alertmanager.env.yaml err="group_interval cannot be zero"`

Anything else?

No response

@leonardo-zorzi leonardo-zorzi added kind/bug needs-triage Issues that haven't been triaged yet labels May 15, 2024
@codeknight03
Copy link
Contributor

codeknight03 commented May 15, 2024

This one looks very similar to #6532. I have a PR open to fix this #6585.

I'll try to see if it is a similar case and if it is then I can take this up with another PR.

@codeknight03
Copy link
Contributor

I missed that it is being validated unlike the case I referenced above but the regex being used is allowing 0 to be taken. The regex is pretty much used for all duration values. Might need to check for what all values the same case might occur.

if r.GroupInterval != "" && !durationRe.MatchString(r.GroupInterval) {
return fmt.Errorf("groupInterval %s does not match required regex: %s", r.GroupInterval, durationRe.String())
}
if r.GroupWait != "" && !durationRe.MatchString(r.GroupWait) {
return fmt.Errorf("groupWait %s does not match required regex: %s", r.GroupWait, durationRe.String())
}
if r.RepeatInterval != "" && !durationRe.MatchString(r.RepeatInterval) {
return fmt.Errorf("repeatInterval %s does not match required regex: %s", r.RepeatInterval, durationRe.String())
}

@simonpasquier simonpasquier removed the needs-triage Issues that haven't been triaged yet label May 21, 2024
@simonpasquier
Copy link
Contributor

Thanks for the report! We'd need a different regex for durations that can't be zero.

@ArthurSens
Copy link
Member

The fix would be in this regex:

// +kubebuilder:validation:Pattern:="^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"

A test case showing that we reject configs that us 0s as duration would also be good :)

@codeknight03
Copy link
Contributor

Will be adding a PR ASAP.

@simonpasquier
Copy link
Contributor

To add to what @ArthurSens said and to expand my earlier comment, the existing pattern is correct for most duration fields except for groupInterval and repeatInterval which don't allow "zero" values. Hence we'd need to use a different type with a pattern that rejects "zero" durations (though it may be impossible to craft a regex covering all edge cases).

alizademhdi added a commit to alizademhdi/prometheus-operator that referenced this issue Jun 4, 2024
…groupInterval.

the existing pattern is correct for most duration fields except for groupInterval and repeatInterval which don't allow "zero" values. In this commitwe define new pattern that rejects "zero" durations.

Fixes prometheus-operator#6594

Signed-off-by: alizademhdi <[email protected]>
alizademhdi added a commit to alizademhdi/prometheus-operator that referenced this issue Jun 4, 2024
…groupInterval.

the existing pattern is correct for most duration fields except for groupInterval and repeatInterval which don't allow "zero" values. In this commit we define new pattern that rejects "zero" durations.

Fixes prometheus-operator#6594

Signed-off-by: alizademhdi <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants