
Support for ProbeMonitoring #91

Closed
dsaintilma-flinks opened this issue Nov 26, 2021 · 13 comments · May be fixed by #766
Assignees
Labels
enhancement New feature or request

Comments

@dsaintilma-flinks

Hello,

I was wondering whether adding an API endpoint for probe-style scraping with exporters like blackbox or ssl is planned?

That's all.
Thanks!

@fabxc
Contributor

fabxc commented Dec 2, 2021

We have definitely thought about that use case and want to enable it, though we don't yet have a concrete design for integrating it with the current CRDs.

Our recommendation in the meantime is to run the GMP Prometheus binary as a sidecar to the blackbox_exporter with a static config that only scrapes the blackbox exporter.
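A minimal sketch of such a static config, using the standard blackbox_exporter relabel pattern (the module name, probe target, and port 9115 are illustrative assumptions, not part of the original recommendation):

```yaml
# Prometheus config for a sidecar that only scrapes the local blackbox_exporter.
scrape_configs:
  - job_name: blackbox
    metrics_path: /probe
    params:
      module: [http_2xx]          # assumed blackbox module
    static_configs:
      - targets:
          - http://example.com/   # probe target, not the exporter
    relabel_configs:
      # Move the target into the ?target= URL parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Actually scrape the blackbox_exporter in the same pod.
      - target_label: __address__
        replacement: localhost:9115
```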

@pintohutch added the "enhancement" label Apr 5, 2022
@masterlittle

Any update on whether this is going to be picked up?

@lyanco
Collaborator

lyanco commented Mar 20, 2023

Heya, we are not currently working on this. In the meantime you can use self-deployed collection to do this: https://cloud.google.com/stackdriver/docs/managed-prometheus/setup-unmanaged

@bwplotka
Collaborator

bwplotka commented Mar 22, 2023

Another user asked for this today, so it might be a good idea to prioritise it. I am not yet sure we need a Probe CR for this, as technically it could also be an opinionated field in PodMonitoring, for simplicity.

Note that probing is technically possible with PodMonitoring with something like:

apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
(...)
spec:
  selector:
    matchLabels:
      (...)
  endpoints:
  - port: http
    scheme: http
    interval: 60s
    path: "/probe"
    params:
      module:
      - http_2xx
      target:
      - "http://target1/"
    metricRelabeling:
    - action: replace
      replacement: "http://target1/"
      targetLabel: probe_target
      

Two main limitations are:

  1. You have to create a separate PodMonitoring per target/param combination (the port has to be unique), and no dynamic parameter mangling as described here is possible.
  2. The instance label points to the blackbox exporter, not the target URL, which might be annoying (as explained here). Cause: you cannot currently relabel the instance label (you will get a `cannot relabel with action "replace" onto protected label "instance"` error). This can be mitigated with the metricRelabeling suggested above for a custom label.

@pintohutch
Collaborator

Interesting. So IIUC, the main reason we can't support the probe monitoring use case in a single PodMonitoring is that the job names are not unique unless the ports are unique within the spec.

However, if we switched to using the index of the endpoint, as prometheus-operator does, rather than the port, then this would be possible.

Aside: I'm curious, then, if probe-style monitoring works with prometheus-operator's ServiceMonitor, what does the Probe custom resource provide? A "nicer" API for probing? AFAICT it's most conventionally used for static target scraping.

@QuerQue

QuerQue commented May 17, 2023

Our recommendation in the meantime is to run the GMP Prometheus binary as a sidecar to the blackbox_exporter with a static config that only scrapes the blackbox exporter.

Hi @fabxc, I have GMP Prometheus running in a non-GKE env and I use GCP Cloud Monitoring to visualize node_exporter metrics. I also want to use the blackbox exporter. I'm not sure I understand this workaround correctly. Please take a look and let me know if that's what you recommend.

  1. The managed collection that scrapes node_exporter stays as is.
  2. A new blackbox-exporter deployment with a sidecar container and a config like below:
  spec:
    ...
    secrets:
    - gmp-test-sa
    containers:
    - name: prometheus
      env:
      - name: GOOGLE_APPLICATION_CREDENTIALS
        value: /gmp/key.json
      volumeMounts:
      - name: secret-gmp-test-sa
        mountPath: /gmp
        readOnly: true

plus a Prometheus config as a ConfigMap with

global:
  external_labels:
    project_id: PROJECT_ID
    location: REGION
    cluster: CLUSTER_NAME

And the scrape settings, of course.
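For completeness, those scrape settings could look roughly like this, following the usual blackbox_exporter pattern (the module name, probe target, and exporter address are assumptions for illustration):

```yaml
scrape_configs:
  - job_name: blackbox
    metrics_path: /probe
    params:
      module: [http_2xx]                   # assumed blackbox module
    static_configs:
      - targets: [https://my-service.example.com]  # assumed probe target
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target       # pass target as ?target= param
      - source_labels: [__param_target]
        target_label: instance             # label series with the probed URL
      - target_label: __address__
        replacement: localhost:9115        # blackbox-exporter sidecar address
```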

@pintohutch
Collaborator

Hi @QuerQue,

  1. Yes. Managed collection can be left as you've currently configured it for node-exporter scraping.
  2. Yes that workaround should work for deploying the GMP collector as a sidecar, provided your scrape config is properly set up.

In addition, it is possible to use PodMonitoring against blackbox-exporter, but it has limitations and is not ideal, as mentioned in the previous comment.

We're thinking of ways to better support this use case in the future and will leave this issue open to track.

Hope that helps!

@dnck

dnck commented Dec 7, 2023

I think we would perhaps benefit from a "Probe" monitoring resource as well. Our use case is something like the following:

  • We run a highly available blackbox exporter (replicas = 5) in our GKE cluster.
  • We can (but don't, as will be explained) configure PodMonitoring resources with endpoint params and metricRelabeling magic to make managed Prometheus scrape the targets behind the blackbox exporter. So something like:
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: foo
spec:
  selector:
    matchLabels:
      app: blackbox-exporter
  endpoints:
  - port: 9115
    path: "/probe"
    params:
      module:
        - "icmp.ping"
      target:
        - "1.2.3.4"
    interval: 120s
    timeout: 5s
    metricRelabeling:
      - action: replace
        sourceLabels:
          - job
        targetLabel: "datacenter"
        replacement: "garage"

But, if I understand correctly, each of the N blackbox-exporter replicas would be scraped, and our target would suffer under the increased load. For this reason, we run self-deployed managed collectors in our cluster. Perhaps there's a way to do this that I'm not aware of, but as of now I believe a ProbeMonitoring resource (from what it sounds like) would help us.

@lyanco
Collaborator

lyanco commented Dec 7, 2023

Adding ProbeMonitoring is not looking likely for this upcoming half, but Cloud Monitoring has uptime checks (including synthetics) that do the same thing, and the resulting time series are queryable using PromQL like anything else in Cloud Monitoring. If you're looking for a fully managed solution, perhaps take a look: https://cloud.google.com/monitoring/uptime-checks/introduction

@dnck

dnck commented Dec 12, 2023

Adding ProbeMonitoring is not looking likely for this upcoming half - but - Cloud Monitoring has uptime checks (including synthetics) that do the same thing, and the resulting time series are queryable using PromQL like anything else in Cloud Monitoring. If you're looking for a fully managed solution, perhaps take a look: https://cloud.google.com/monitoring/uptime-checks/introduction

Thanks, I'll take a look at those, but in general we already have a robust configuration system for managing Prometheus collection with multi-target exporters. Also, we like using Monarch as our TSDB. It would just be very nice if the GMP CRDs made multi-cluster, multi-cloud monitoring a bit easier, and I think a ProbeMonitoring resource might help with that. If you all are looking for contributors, I'd be happy to take a stab at it.

@pintohutch
Collaborator

We absolutely welcome contributors, @dnck! We're happy to review or collaborate on any designs.

@pintohutch
Collaborator

FYI @dnck: we have a PoC that @TheSpiritXIII and @bernot-dev put together in #766.

We'll prioritize rolling this out in the near future as a supported offering. PTAL there if you have any input!

@bernot-dev
Collaborator

After some discussion, we recommend using Uptime to meet these needs. Please let us know if you would like to revisit this in the future.

@bernot-dev self-assigned this May 15, 2024