[Bug]: New replica searches for WAL archive that doesn't exist #4412
Labels
triage (pending triage)
Comments
We are experiencing this same issue. @stevec-skyhawk did you manage to find a solution?
No, we did not find a solution. We ended up having to create a new cluster through the recovery process.
Is there an existing issue already for this bug?
I have read the troubleshooting guide
I am running a supported version of CloudNativePG
Contact Details
[email protected]
Version
1.23.0
What version of Kubernetes are you using?
1.29
What is your Kubernetes environment?
Cloud: Amazon EKS
How did you install the operator?
YAML manifest
What happened?
Status of the cluster at the time the new replica was requested (scaling up):
First Point of Recoverability: 2024-04-30T14:01:25Z
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 0000001100000682000000E2 @ 2024-04-30T20:36:13.111958Z
Last Failed WAL: 00000011.history @ 2024-04-30T18:52:16.746355Z
When the replica pod starts up, the logs suggest it's looking for a file that doesn't exist yet:
"logger":"wal-restore","msg":"WAL file not found in the recovery object store","logging_pod":"pgv2-dev-11","walName":"00000012.history"
I can also verify that this file does not exist in the pg_wal directory on the primary, so it wasn't a flushing issue.
The cnpg status output shows the replica as Standby (file based), but it never fully synchronizes.
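For anyone comparing the file names quoted above, here is a minimal sketch of how PostgreSQL WAL file names can be decoded. The helper name `decode_wal_name` is hypothetical, and the 16 MiB segment size is an assumption (it is the PostgreSQL default, but this cluster may be configured differently). It only illustrates the naming scheme and is not part of CloudNativePG.

```python
import re

# 24 hex digits: timeline (8), high part of the LSN (8), segment number (8).
WAL_SEGMENT_RE = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")
# Timeline history files are named <timeline>.history.
HISTORY_RE = re.compile(r"^([0-9A-F]{8})\.history$")


def decode_wal_name(name, wal_segment_size=16 * 1024 * 1024):
    """Decode a WAL segment or timeline history file name.

    wal_segment_size is assumed to be the 16 MiB default.
    """
    m = HISTORY_RE.match(name)
    if m:
        return {"kind": "history", "timeline": int(m.group(1), 16)}

    m = WAL_SEGMENT_RE.match(name)
    if not m:
        raise ValueError(f"not a WAL segment or history file name: {name!r}")

    timeline, log, seg = (int(g, 16) for g in m.groups())
    # First byte position covered by this segment, as a 64-bit LSN.
    start = (log << 32) + seg * wal_segment_size
    return {
        "kind": "segment",
        "timeline": timeline,
        "start_lsn": f"{start >> 32:X}/{start & 0xFFFFFFFF:X}",
    }


print(decode_wal_name("0000001100000682000000E2"))  # timeline 17, starts at 682/E2000000
print(decode_wal_name("00000011.history"))          # timeline 17
print(decode_wal_name("00000012.history"))          # timeline 18
```

Read this way, the last archived segment belongs to timeline 17 (hex 11), while the file the replica asks for, 00000012.history, would be the history file for timeline 18, one past the timeline the primary is currently archiving on.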
Instances status
Name         Database Size  Current LSN   Replication role      Status  QoS         Manager Version
pgv2-dev-1   719 GB         682/E3002900  Primary               OK      Guaranteed  1.23.0
pgv2-dev-11  719 GB         682/E40000A0  Standby (file based)  OK      Guaranteed  1.23.0
pgv2-dev-3   719 GB         682/E3002900  Standby (sync)        OK      Guaranteed  1.23.0
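To put a number on how far apart the instances in this table are, the Current LSN values can be compared as byte positions. This is a small sketch with hypothetical helper names (`parse_lsn`, `lsn_diff`); it assumes only the standard PostgreSQL text format for LSNs, two hexadecimal numbers separated by a slash.

```python
def parse_lsn(lsn):
    """Convert an LSN such as '682/E3002900' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff(a, b):
    """Return the number of bytes by which LSN a is ahead of LSN b."""
    return parse_lsn(a) - parse_lsn(b)


primary = "682/E3002900"      # pgv2-dev-1 and pgv2-dev-3
new_replica = "682/E40000A0"  # pgv2-dev-11

print(lsn_diff(new_replica, primary))  # 16766880
```

In this snapshot the new replica reports a position about 16 MB (just under one default-sized WAL segment) past the one reported by the primary and the synchronous standby, which both sit at the same LSN.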
Cluster resource
Relevant log output
Code of Conduct