This repository has been archived by the owner on Apr 19, 2024. It is now read-only.

Peer list update bug in K8s cluster #189

Open
MAXEE998 opened this issue Sep 12, 2023 · 3 comments
Labels
help wanted (Extra attention is needed)

Comments

@MAXEE998

We ran a three-replica Gubernator setup in our K8s cluster. When K8s gracefully shut down one pod, one of the remaining pods (only one, not both) kept logging this error:

level=error msg="Error in client.GetPeerRateLimits" batchTimeout=500ms category=gubernator error="rpc error: code = DeadlineExceeded desc = context deadline exceeded" queueLen=2

Apparently it did not update its peer list after the shutdown. What could be causing this?

@MAXEE998
Author

According to its log, the problematic pod keeps trying to fetch rate limits from the peer that was shut down:

rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 100.103.255.29:81: i/o timeout"
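
For readers unfamiliar with the symptom, here is a minimal sketch of what "updating the peer list" would normally involve: comparing the peers a node currently holds against a fresh discovery snapshot and dropping the ones that disappeared. This is illustration only and is not Gubernator's code; the function names `stale_peers` and `reconcile` are hypothetical, and every address except `100.103.255.29:81` (taken from the log above) is made up.

```python
def stale_peers(current, discovered):
    """Return peers held in `current` that no longer appear in `discovered`.

    Order follows `current` so the result is deterministic.
    """
    live = set(discovered)
    return [addr for addr in current if addr not in live]


def reconcile(current, discovered):
    """Return (new_peer_list, removed_peers).

    A node that skips this step keeps sending requests to removed peers,
    which matches the dial timeouts described in this issue.
    """
    removed = stale_peers(current, discovered)
    return list(discovered), removed


if __name__ == "__main__":
    # Hypothetical cluster state: three known peers, one was just shut down.
    current = ["100.103.255.29:81", "100.103.255.30:81", "100.103.255.31:81"]
    discovered = ["100.103.255.30:81", "100.103.255.31:81"]

    peers, removed = reconcile(current, discovered)
    print("active peers:", peers)
    print("peers to disconnect:", removed)
```

If the affected pod never receives, or never applies, the updated snapshot, its peer list still contains the terminated address and requests to it keep failing. Whether that is what happens here is unconfirmed.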

@thrawn01
Contributor

I don't run a k8s cluster, so I really don't have a way to test this. I rely on the community to provide support for k8s.

thrawn01 added the "help wanted" label on Sep 25, 2023

@miparnisari
Contributor

miparnisari commented Oct 26, 2023

FYI, this isn't limited to k8s. We run on ECS and see something similar. These logs seem to coincide with our deployments.

time="2023-10-25T23:47:56Z" level=error msg="error sending global hits to '10.0.37.143:9990'" category=gubernator error="Error in client.GetPeerRateLimits: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 10.0.37.143:9990: connect: connection refused\""

I need to do some research on my end to see whether the bug is in our service or in this library.
