Prometheus "invalid metric name or label names" error while scraping cassandra metrics #1563

Open
okgolove opened this issue Jan 17, 2023 · 0 comments
Labels
bug Something isn't working help-wanted Extra attention is needed needs-triage

Comments

@okgolove

Bug Report

Describe the bug
Prometheus is unable to scrape Cassandra metrics: the target fails with an "invalid metric name or label names" error.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy the kube-prometheus-stack Helm chart with the following values:
prometheusOperator:
  namespaces:
    releaseNamespace: true
  serviceMonitor:
    selfMonitor: false

prometheus:
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelector: {}
    serviceMonitorNamespaceSelector: {}
    scrapeInterval: 15s
    evaluationInterval: 1m
    retention: 1d
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: gp3
          resources:
            requests:
              storage: 20Gi
    resources:
      requests:
        cpu: 200m
        memory: 2Gi
      limits:
        cpu: 1
        memory: 4Gi
  serviceMonitor:
    selfMonitor: false

grafana:
  enabled: true
  adminUser: admin
  adminPassword: ****
  serviceMonitor:
    enabled: false
  defaultDashboardsEnabled: false
  ingress:
    enabled: false
  sidecar:
    datasources:
      enabled: true
      defaultDatasourceEnabled: true

    dashboards:
      enabled: true
      label: grafana_dashboard
      labelValue: 1
      resource: configmap
      searchNamespace: ALL
      folder: /tmp/dashboards
      folderAnnotation: grafana_folder
      provider:
        foldersFromFilesStructure: true
  plugins:
    - grafana-polystat-panel
  grafana.ini: {}

coreDns:
  enabled: false
kubeApiServer:
  enabled: false
kubeControllerManager:
  enabled: false
kubeDns:
  enabled: false
kubeEtcd:
  enabled: false
kubeProxy:
  enabled: false
kubeScheduler:
  enabled: false
kubeStateMetrics:
  enabled: false
kubelet:
  enabled: false
nodeExporter:
  enabled: false
alertmanager:
  enabled: false
  serviceMonitor:
    selfMonitor: false
  2. Create a K8ssandraCluster:
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  cassandra:
    serverVersion: "4.0.3"
    serverImage: k8ssandra/cass-management-api:4.0.3
    telemetry:
      prometheus:
        enabled: true
    resources:
      requests:
        memory: 1Gi
        cpu: 300m
      limits:
        memory: 6Gi
        cpu: 1
    storageConfig:
      cassandraDataVolumeClaimSpec:
        storageClassName: gp3
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 150Gi
    config:
      jvmOptions:
        heapSize: 1Gi
    datacenters:
      - metadata:
          name: dc1
        size: 9
        racks:
          - name: r1
            nodeAffinityLabels:
              topology.kubernetes.io/zone: eu-north-1a
          - name: r2
            nodeAffinityLabels:
              topology.kubernetes.io/zone: eu-north-1b
          - name: r3
            nodeAffinityLabels:
              topology.kubernetes.io/zone: eu-north-1c

    mgmtAPIHeap: 64Mi
  3. Wait until the cluster is ready and the prometheus-operator ServiceMonitor has been created
  4. Open the Prometheus targets page
  5. See the error:
invalid metric name or label names: {__name__="mcac_hints_hints_created.10.5.20.132.7000_total", cassandra_datastax_com_cluster="demo", cassandra_datastax_com_datacenter="dc1", cluster="demo", container="cassandra", dc="dc1", endpoint="prometheus", exported_instance="10.5.15.192", instance="10.5.15.192:9103", job="demo-dc1-all-pods-service", mcac="org.apache.cassandra.metrics.hints_service.hints_created.10.5.20.132.7000", mcac_filtered="true", namespace="k8ssandra-operator", pod="demo-dc1-r1-sts-2", rack="r1", service="demo-dc1-all-pods-service"}
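
For context, here is a minimal sketch of why this sample is rejected. It assumes the classic metric name rule from the Prometheus data model documentation, `[a-zA-Z_:][a-zA-Z0-9_:]*`, with UTF-8 metric names not enabled. The helper names `is_valid_metric_name` and `invalid_chars` are hypothetical and exist only for this illustration; they are not part of Prometheus or k8ssandra.

```python
import re
from typing import List

# Classic Prometheus metric-name rule (assumption: UTF-8 names are not enabled).
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def is_valid_metric_name(name: str) -> bool:
    """Return True if the whole name matches the classic rule."""
    return METRIC_NAME_RE.fullmatch(name) is not None


def invalid_chars(name: str) -> List[str]:
    """Return the distinct disallowed characters, in order of first appearance."""
    seen: List[str] = []
    for i, ch in enumerate(name):
        # The first character may not be a digit; later characters may.
        pattern = r"[a-zA-Z_:]" if i == 0 else r"[a-zA-Z0-9_:]"
        if re.fullmatch(pattern, ch) is None and ch not in seen:
            seen.append(ch)
    return seen


# The __name__ value copied from the error message above.
reported = "mcac_hints_hints_created.10.5.20.132.7000_total"

print(is_valid_metric_name(reported))  # False
print(invalid_chars(reported))         # ['.']
```

The only offending character is the dot, which comes from the `10.5.20.132.7000` address and port embedded in the metric name.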

Expected behavior
All targets are available and metrics are stored in Prometheus.

Screenshots

Environment (please complete the following information):

  • Helm charts version info
k8ssandra-operator              k8ssandra-operator      1               2023-01-05 15:21:00.190994 +0200 EET    deployed        k8ssandra-operator-0.39.1               1.4.0
kube-prometheus-stack           k8ssandra-operator      1              2023-01-17 09:48:13.246841 +0200 EET    deployed        kube-prometheus-stack-44.1.0            v0.62.0
  • Helm charts user-supplied values
cass-operator:
  admissionWebhooks:
    enabled: false
  cassandra:
    cassandraLibDirVolume:
      size: 20Gi
      storageClass: gp3
    datacenters:
    - name: dc1
      racks:
      - affinityLabels:
          topology.kubernetes.io/zone: eu-north-1a
        name: eu-north-1a
      - affinityLabels:
          topology.kubernetes.io/zone: eu-north-1b
        name: eu-north-1b
      - affinityLabels:
          topology.kubernetes.io/zone: eu-north-1c
        name: eu-north-1c
      size: 4
    heap:
      newGenSize: 512Mi
      size: 512Mi
    resources:
      limits:
        cpu: 2
        memory: 2Gi
      requests:
        cpu: 1
        memory: 1Gi
  • Kubernetes version information:
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.7-eks-fb459a0", GitCommit:"c240013134c03a740781ffa1436ba2688b50b494", GitTreeState:"clean", BuildDate:"2022-10-24T20:36:26Z", GoVersion:"go1.18.7", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:
    EKS

Additional context

@okgolove okgolove added bug Something isn't working needs-triage labels Jan 17, 2023
@adejanovski adejanovski added the help-wanted Extra attention is needed label Jan 24, 2024
RomainAnselin added a commit to RomainAnselin/k8ssandra that referenced this issue Apr 15, 2024