[FEATURE] New metrics to track Cloudwatch API requests and response errors #1329

Open
1 task done
rtkwlf-nachiketrao opened this issue Mar 4, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@rtkwlf-nachiketrao

Is there an existing issue for this?

  • I have searched the existing issues

Feature description

Feature request:

Add a metric that counts every request made to the CloudWatch API, irrespective of the response, and a metric that counts request errors when no valid data is returned.
OR
Include these counts in yace_cloudwatch_requests_total and yace_cloudwatch_request_errors.

Reason:

There is no way to detect the case where IAM authentication fails or the CloudFormation stack is misconfigured. A request is made, but no response comes back, and there seems to be no counter that records this.

How this became an issue:

While testing the correctness of YACE CloudWatch API requests using yace_cloudwatch_requests_total and yace_cloudwatch_request_errors:

  • I intentionally removed the IAM role from the service account for the YACE pod.
  • The service was denied access, but yace_cloudwatch_requests_total did not increment. It seems to count "successful responses to requests" rather than "number of requests irrespective of responses received".

Having metrics that count total requests submitted and response errors would help detect such edge cases.

What might the configuration look like?

No response

Anything else?

promutil.CloudwatchAPICounter.Inc()

^ A similar counter, but one that is incremented before the request is made to the API, plus a counter for request errors.
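A minimal sketch of the idea, not YACE's actual code: the attempt counter is incremented before the call, and a separate counter is incremented whenever the call returns an error. The names `apiMetrics`, `instrument`, and `errAccessDenied` are hypothetical, and plain atomic integers stand in for the Prometheus counters so the example runs with the standard library alone.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// apiMetrics is a hypothetical stand-in for a pair of Prometheus counters.
type apiMetrics struct {
	attempts atomic.Int64 // incremented before every API call
	failures atomic.Int64 // incremented when a call returns an error
}

// instrument counts the attempt first, so a call that is denied
// (for example by IAM) is still recorded, then counts any error.
func (m *apiMetrics) instrument(call func() error) error {
	m.attempts.Add(1)
	if err := call(); err != nil {
		m.failures.Add(1)
		return err
	}
	return nil
}

// errAccessDenied simulates an authentication failure; it is not a real SDK error.
var errAccessDenied = errors.New("AccessDenied: simulated IAM failure")

func main() {
	m := &apiMetrics{}

	// One successful call and one denied call.
	_ = m.instrument(func() error { return nil })
	_ = m.instrument(func() error { return errAccessDenied })

	fmt.Printf("attempts=%d failures=%d\n", m.attempts.Load(), m.failures.Load())
}
```

With counters shaped like this, a denied request shows up as an attempt plus a failure, so a gap between the two becomes visible instead of both staying at zero.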

@rtkwlf-nachiketrao rtkwlf-nachiketrao added the enhancement New feature or request label Mar 4, 2024
@kgeckhart
Contributor

👋 #1338 isn't exactly what you are asking for, but I think it solves the problem you are having. Every CloudWatch API call should now increment either the success or the error metric, depending on the result.

Auth errors should also be present in the logs as an indicator that something is wrong.
