[bitnami/rabbitmq-cluster-operator] metrics-endpoint serviceMonitor references missing port #25743

Open
tolleiv opened this issue May 13, 2024 · 6 comments · May be fixed by #26601
Labels: rabbitmq-cluster-operator, stale (15 days without activity), tech-issues (The user has a technical issue about an application), triage (Triage is needed)

Comments

tolleiv commented May 13, 2024

Name and Version

bitnami/rabbitmq-cluster-operator 4.2.8

What architecture are you using?

amd64

What steps will reproduce the bug?

Release the chart with its default values and monitoring enabled.

helm template rabbitmq-cluster-operator "oci://registry-1.docker.io/bitnamicharts/rabbitmq-cluster-operator" --version 4.2.8 --set clusterOperator.metrics.serviceMonitor.enabled=true --set clusterOperator.metrics.service.enabled=true --namespace=demo

The rendered resources include a ServiceMonitor with two endpoints:

---
# Source: rabbitmq-cluster-operator/templates/cluster-operator/servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rabbitmq-cluster-operator
  ...
spec:
  ...
  endpoints:
    - port: http
    - port: metrics
---

whereas the related Deployment declares only a single port:

---
# Source: rabbitmq-cluster-operator/templates/cluster-operator/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rabbitmq-cluster-operator
  ...
spec:
    ...
    spec:
      ...
      containers:
        - name: rabbitmq-cluster-operator
          ...
          ports:
            - name: http
              containerPort: 9782
              protocol: TCP
---
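The mismatch between the two manifests above can be checked mechanically. The following is a minimal Python sketch, not part of the chart: the helper name `missing_ports` is hypothetical, and the dictionaries only mirror the fields shown in the rendered snippets. It follows the comparison made in this issue (ServiceMonitor endpoint names against container port names). Note that a ServiceMonitor endpoint `port` is resolved against the port names of the selected Service, so the sketch assumes the chart's Service exposes the same names as the container.

```python
def missing_ports(service_monitor, deployment):
    """Return ServiceMonitor endpoint port names that no container declares."""
    referenced = [
        endpoint["port"]
        for endpoint in service_monitor["spec"]["endpoints"]
        if "port" in endpoint
    ]
    declared = {
        port["name"]
        for container in deployment["spec"]["template"]["spec"]["containers"]
        for port in container.get("ports", [])
        if "name" in port
    }
    # Keep the order in which the ServiceMonitor lists its endpoints.
    return [name for name in referenced if name not in declared]


# Trimmed copies of the rendered resources from this issue (other fields omitted).
service_monitor = {
    "kind": "ServiceMonitor",
    "spec": {"endpoints": [{"port": "http"}, {"port": "metrics"}]},
}
deployment = {
    "kind": "Deployment",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "rabbitmq-cluster-operator",
                        "ports": [
                            {"name": "http", "containerPort": 9782, "protocol": "TCP"}
                        ],
                    }
                ]
            }
        }
    },
}

print(missing_ports(service_monitor, deployment))  # ['metrics']
```

With the manifests rendered by chart version 4.2.8 as quoted above, only `metrics` is reported as unmatched.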

Are you using any custom parameters or values?

clusterOperator.metrics.serviceMonitor.enabled=true
clusterOperator.metrics.service.enabled=true

What is the expected behavior?

The ServiceMonitor should only reference existing ports.

What do you see instead?

The metrics port is referenced by the ServiceMonitor but is not defined on the Deployment.

Additional information

The second port was introduced in dceeb50. Maybe a feature flag or some more documentation should be added.

@tolleiv tolleiv added the tech-issues The user has a technical issue about an application label May 13, 2024
@github-actions github-actions bot added the triage Triage is needed label May 13, 2024
@tolleiv
Author

tolleiv commented May 14, 2024

Maybe @basit9958, the author of change dceeb50, could also add their two cents on this one.

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

@github-actions github-actions bot added the stale 15 days without activity label May 30, 2024
@tolleiv
Author

tolleiv commented Jun 2, 2024

Before this gets stale, I'll try to come up with a pull request in the next few days. It would be great to get input on this one beforehand.

@javsalgar
Contributor

Hi,

I believe that the metrics endpoint references the RabbitMQ instances created by the operator. It's not clear to me whether the selector actually matches the created instances.

tolleiv added a commit to tolleiv/bitnami-charts that referenced this issue Jun 3, 2024
@tolleiv
Author

tolleiv commented Jun 3, 2024

> Hi,
>
> I believe that the metrics endpoint references the RabbitMQ instances created by the operator. It's not clear to me whether the selector actually matches the created instances.

The RabbitMQ instances do not expose a port named metrics either; their metrics port is named prometheus (see https://github.com/rabbitmq/cluster-operator/blob/HEAD/internal/resource/service.go#L240). I'm not sure what the original idea behind the change was.
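To summarize the finding so far, here is a small Python sketch that asks, for each endpoint in the ServiceMonitor, which candidate target declares a port of that name. The function `targets_exposing` and the target labels are hypothetical. The port lists contain only the names mentioned in this thread; any other ports the RabbitMQ instance Service may expose are deliberately left out.

```python
def targets_exposing(port_name, targets):
    """Return the labels of all targets that declare a port with this name."""
    return [label for label, ports in targets.items() if port_name in ports]


# Port names as reported in this issue; the lists are not exhaustive.
targets = {
    "operator Deployment": ["http"],
    "RabbitMQ instance Service": ["prometheus"],
}

for endpoint_port in ["http", "metrics"]:
    matches = targets_exposing(endpoint_port, targets)
    print(endpoint_port, "->", matches or "no target declares this port")
```

Under these assumptions, `http` resolves to the operator itself, while `metrics` matches neither the operator nor the instances it creates.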

@carrodher
Member

Thank you for opening this issue and submitting the associated Pull Request. Our team will review and provide feedback. Once the PR is merged, the issue will automatically close.

Your contribution is greatly appreciated!
