Intermittent errors logged after enabling telemetry #2018
Comments
Thanks @tomassommareqt. FWIW I have seen the same logs when working on this feature. I don't expect these logs to show up outside of a dev context, though. We'll investigate and fix this.
Still seen in 2.9.0, although the metrics themselves work when using the metrics writer role.
Thanks, @rojomisin. We still haven't gotten to this. I wonder if this is a race condition in OpenCensus itself.
Perhaps this is fixed in the OpenTelemetry package? https://github.com/open-telemetry/opentelemetry-go-contrib
Quite possibly. We're currently using OpenCensus because some internal tooling that uses the Proxy has a big investment in OpenCensus. But we might revisit that decision now that OpenTelemetry's metrics package is stable.
Bug Description
We are running gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.7.0 as a container next to our main http api container for connectivity to our CloudSQL instance.
After enabling telemetry using the `--telemetry-project` and `--telemetry-prefix` flags, we have repeatedly seen the following error logged:
`2023/11/04 13:58:43 Failed to export to Stackdriver: rpc error: code = Internal desc = One or more TimeSeries could not be written: Internal error encountered. Please retry after a few seconds. If internal errors persist, contact support at https://cloud.google.com/support/docs.: global{} timeSeries[0]: custom.googleapis.com/opencensus/<redacted>_cloud_sql_proxy/cloudsqlconn/refresh_success_count{opencensus_task:go-1@<redacted>,cloudsql_instance:<redacted>}; Internal error encountered. Please retry after a few seconds. If internal errors persist, contact support at https://cloud.google.com/support/docs.: global{} timeSeries[1]: custom.googleapis.com/opencensus/<redacted>_cloud_sql_proxy/cloudsqlconn/dial_latency{cloudsql_instance:<redacted>,opencensus_task:go-1@<redacted>}`
However, when inspecting the metrics we can see that they are exported as expected, so the main impact is polluted logs. It would still be interesting to understand why this error is reported.
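For anyone triaging the same log lines, here is a minimal Go sketch that pulls the gRPC status code and the affected metric types out of a line shaped like the one above. The helper names `exportErrorCode` and `affectedMetrics` are hypothetical, the regular expressions are only assumptions based on the single log line in this report, and none of this is part of the Proxy or of OpenCensus.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumed patterns, derived only from the log line quoted in this issue.
var (
	// Captures the gRPC status code, e.g. "Internal".
	codeRe = regexp.MustCompile(`rpc error: code = (\w+)`)
	// Captures the metric type that follows each "timeSeries[N]:" marker,
	// stopping at the opening brace of the label set.
	metricRe = regexp.MustCompile(`timeSeries\[\d+\]: (custom\.googleapis\.com/[^{\s]+)\{`)
)

// exportErrorCode returns the gRPC status code found in the line,
// or an empty string if the line does not contain one.
func exportErrorCode(line string) string {
	m := codeRe.FindStringSubmatch(line)
	if m == nil {
		return ""
	}
	return m[1]
}

// affectedMetrics returns the metric types listed in the line, in order.
func affectedMetrics(line string) []string {
	var out []string
	for _, m := range metricRe.FindAllStringSubmatch(line, -1) {
		out = append(out, m[1])
	}
	return out
}

func main() {
	// Shortened, illustrative line; "example" stands in for the redacted prefix.
	line := "Failed to export to Stackdriver: rpc error: code = Internal desc = " +
		"One or more TimeSeries could not be written: global{} timeSeries[0]: " +
		"custom.googleapis.com/opencensus/example_cloud_sql_proxy/cloudsqlconn/dial_latency{cloudsql_instance:x}"
	fmt.Println(exportErrorCode(line))
	for _, metric := range affectedMetrics(line) {
		fmt.Println(metric)
	}
}
```

Counting how often each metric type appears next to `code = Internal` may help show whether the failures are tied to specific time series or spread evenly across all of them.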
Example code (or command)
Stacktrace
`2023/11/04 13:58:43 Failed to export to Stackdriver: rpc error: code = Internal desc = One or more TimeSeries could not be written: Internal error encountered. Please retry after a few seconds. If internal errors persist, contact support at https://cloud.google.com/support/docs.: global{} timeSeries[0]: custom.googleapis.com/opencensus/<redacted>_cloud_sql_proxy/cloudsqlconn/refresh_success_count{opencensus_task:go-1@<redacted>,cloudsql_instance:<redacted>}; Internal error encountered. Please retry after a few seconds. If internal errors persist, contact support at https://cloud.google.com/support/docs.: global{} timeSeries[1]: custom.googleapis.com/opencensus/<redacted>_cloud_sql_proxy/cloudsqlconn/dial_latency{cloudsql_instance:<redacted>,opencensus_task:go-1@<redacted>}`
Steps to reproduce?
Environment
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: <redacted>
    spec:
      template:
        spec:
          containers:
            - name: cloudsql-proxy
              image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.7.0
              args:
                - "--auto-iam-authn"
                - "--max-sigterm-delay"
                - "25s"
                - "--structured-logs"
                - "--telemetry-project"
                - "<redacted>"
                - "--telemetry-prefix"
                - "<redacted>_cloud_sql_proxy"
                - "<connection-string-redacted>"
Additional Details
No response