Pods in the multi-cluster deployment show - Readiness probe failed: HTTP probe failed with statuscode: 500 #1560

Open
rajarahul12 opened this issue Jan 5, 2023 · 1 comment
Labels
bug Something isn't working needs-triage

rajarahul12 commented Jan 5, 2023

Bug Report

Describe the bug
I am trying to deploy K8ssandra in a multi-cluster environment by following https://docs-v2.k8ssandra.io/install/local/multi-cluster-helm/

  1. Some of the pods in the first data plane cluster are showing Readiness probe failed: HTTP probe failed with statuscode: 500
  2. I don't see a StatefulSet or any pods in the second data plane cluster.

Logs from one of the failing pods:

Defaulted container "cassandra" out of: cassandra, server-system-logger, jmx-credentials (init), server-config-init (init)
Starting Management API
Running java -Xms128m -Xmx128m -jar /opt/management-api/datastax-mgmtapi-server-0.1.0-SNAPSHOT.jar --cassandra-socket /tmp/cassandra.sock --host tcp://0.0.0.0:8080 --host file:///tmp/oss-mgmt.sock --explicit-start true --cassandra-home /opt/cassandra
INFO [main] 2023-01-05 15:37:54,594 Cli.java:344 - Cassandra Version 4.0.1
INFO [main] 2023-01-05 15:37:55,051 ResteasyDeploymentImpl.java:657 - RESTEASY002225: Deploying javax.ws.rs.core.Application: class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,055 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.LifecycleResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,056 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.K8OperatorResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,056 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.KeyspaceOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,056 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.v1.KeyspaceOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,057 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.MetadataResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,057 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.NodeOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,057 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.v1.NodeOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,057 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.TableOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,058 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.v1.TableOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,070 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.AuthResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,070 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource io.swagger.v3.jaxrs2.integration.resources.OpenApiResource from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,071 ResteasyDeploymentImpl.java:704 - RESTEASY002210: Adding provider singleton io.swagger.v3.jaxrs2.SwaggerSerializers from Application class com.datastax.mgmtapi.ManagementApplication
Started service on tcp://0.0.0.0:8080
INFO [main] 2023-01-05 15:37:55,531 ResteasyDeploymentImpl.java:657 - RESTEASY002225: Deploying javax.ws.rs.core.Application: class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,531 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.LifecycleResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,532 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.K8OperatorResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,532 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.KeyspaceOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,532 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.v1.KeyspaceOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,532 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.MetadataResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,532 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.NodeOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,533 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.v1.NodeOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,533 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.TableOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,533 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.v1.TableOpsResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,533 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource com.datastax.mgmtapi.resources.AuthResources from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,533 ResteasyDeploymentImpl.java:691 - RESTEASY002220: Adding singleton resource io.swagger.v3.jaxrs2.integration.resources.OpenApiResource from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,533 ResteasyDeploymentImpl.java:704 - RESTEASY002210: Adding provider singleton io.swagger.v3.jaxrs2.SwaggerSerializers from Application class com.datastax.mgmtapi.ManagementApplication
INFO [main] 2023-01-05 15:37:55,614 IPCController.java:130 - Starting Server
INFO [main] 2023-01-05 15:37:55,634 IPCController.java:139 - Started Server
Started service on file:///tmp/oss-mgmt.sock
INFO [nioEventLoopGroup-3-1] 2023-01-05 15:38:17,078 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:38:17,098 Cli.java:617 - address=/10.241.128.39:33801 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:38:17,111 Cli.java:617 - address=/10.241.128.39:33813 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-3] 2023-01-05 15:38:26,940 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:38:26,941 Cli.java:617 - address=/10.241.128.39:2101 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:38:31,939 Cli.java:617 - address=/10.241.128.39:49091 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-5] 2023-01-05 15:38:36,940 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:38:36,941 Cli.java:617 - address=/10.241.128.39:49093 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:38:46,940 Cli.java:617 - address=/10.241.128.39:47247 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-7] 2023-01-05 15:38:46,941 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:38:46,942 Cli.java:617 - address=/10.241.128.39:47245 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-8] 2023-01-05 15:38:56,939 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:38:56,940 Cli.java:617 - address=/10.241.128.39:45405 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:39:01,940 Cli.java:617 - address=/10.241.128.39:60695 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-10] 2023-01-05 15:39:05,849 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:39:05,850 Cli.java:617 - address=/10.241.128.39:60699 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-11] 2023-01-05 15:39:06,940 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:39:06,941 Cli.java:617 - address=/10.241.128.39:60701 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:39:16,943 Cli.java:617 - address=/10.241.128.39:26013 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-13] 2023-01-05 15:39:16,943 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:39:16,944 Cli.java:617 - address=/10.241.128.39:26011 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-14] 2023-01-05 15:39:26,938 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:39:26,940 Cli.java:617 - address=/10.241.128.39:13321 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:39:31,942 Cli.java:617 - address=/10.241.128.39:46805 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-16] 2023-01-05 15:39:36,939 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:39:36,940 Cli.java:617 - address=/10.241.128.39:46815 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-2] 2023-01-05 15:39:46,940 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:39:46,941 Cli.java:617 - address=/10.241.128.39:22223 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:39:46,942 Cli.java:617 - address=/10.241.128.39:22221 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-3] 2023-01-05 15:39:56,939 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:39:56,940 Cli.java:617 - address=/10.241.128.39:42103 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:40:01,942 Cli.java:617 - address=/10.241.128.39:10851 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-5] 2023-01-05 15:40:06,939 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:40:06,940 Cli.java:617 - address=/10.241.128.39:10853 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-7] 2023-01-05 15:40:16,940 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:40:16,941 Cli.java:617 - address=/10.241.128.39:55003 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:40:16,942 Cli.java:617 - address=/10.241.128.39:55001 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-8] 2023-01-05 15:40:26,939 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:40:26,940 Cli.java:617 - address=/10.241.128.39:58719 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:40:31,939 Cli.java:617 - address=/10.241.128.39:55537 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-3-10] 2023-01-05 15:40:34,846 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:40:34,847 Cli.java:617 - address=/10.241.128.39:55553 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-11] 2023-01-05 15:40:36,939 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:40:36,940 Cli.java:617 - address=/10.241.128.39:55569 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-13] 2023-01-05 15:40:46,940 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:40:46,941 Cli.java:617 - address=/10.241.128.39:24907 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:40:46,941 Cli.java:617 - address=/10.241.128.39:24909 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-3-14] 2023-01-05 15:40:56,938 UnixSocketCQLAccess.java:88 - Cannot create Driver CQLSession as the driver socket has not been created. This should resolve once Cassandra has started and created the socket at /tmp/cassandra.sock
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:40:56,939 Cli.java:617 - address=/10.241.128.39:61605 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:41:01,939 Cli.java:617 - address=/10.241.128.39:32769 url=/api/v0/probes/liveness status=200 OK
rahul@Rahuls-MacBook-Pro mult-cluster % kubectl get pods -n k8ssandra-operator
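
As a reading aid for the log above, here is a minimal Python sketch that tallies probe responses by endpoint and HTTP status. It is not part of K8ssandra or the Management API; `tally_probes` and `PROBE_LINE` are names made up for this illustration, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the probe access lines written by the Management API, e.g.
# "... url=/api/v0/probes/readiness status=500 Internal Server Error"
PROBE_LINE = re.compile(r"url=/api/v0/probes/(\w+) status=(\d{3})")


def tally_probes(log_text: str) -> Counter:
    """Count (probe name, HTTP status) pairs found in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PROBE_LINE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Three lines copied verbatim from the log in this issue.
sample = """\
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:38:17,098 Cli.java:617 - address=/10.241.128.39:33801 url=/api/v0/probes/readiness status=500 Internal Server Error
INFO [nioEventLoopGroup-2-1] 2023-01-05 15:38:17,111 Cli.java:617 - address=/10.241.128.39:33813 url=/api/v0/probes/liveness status=200 OK
INFO [nioEventLoopGroup-2-2] 2023-01-05 15:38:26,941 Cli.java:617 - address=/10.241.128.39:2101 url=/api/v0/probes/readiness status=500 Internal Server Error
"""

for (probe, status), n in sorted(tally_probes(sample).items()):
    print(f"{probe:<10} {status} x{n}")
```

In the full log every liveness probe returns 200 and every readiness probe returns 500, which matches the repeated message that /tmp/cassandra.sock has not been created yet: the Management API is up, but Cassandra itself has not started.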

To Reproduce
Steps to reproduce the behavior:
kubectl apply -n k8ssandra-operator -f k8cm1.yml

Here is the K8ssandraCluster YAML that I am using:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  cassandra:
    serverVersion: "4.0.1"
    storageConfig:
      cassandraDataVolumeClaimSpec:
        storageClassName: ibmc-vpc-block-general-purpose
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
    config:
      jvmOptions:
        heapSize: 512M
    networking:
      hostNetwork: true
    resources:
      requests:
        cpu: 1000m
        memory: 2Gi
      limits:
        cpu: 1000m
        memory: 2Gi
    datacenters:
      - metadata:
          name: dc1
        k8sContext: cassandra-testing-vpc/ceng1c7z05391mqecq7g
        size: 12
      - metadata:
          name: dc2
        k8sContext: cassandra-testing-vpc-dc2/ceqj6dgz0uptav6shc5g
        size: 12
    podSecurityContext:
      runAsNonRoot: true
      runAsUser: 999
      runAsGroup: 999
      fsGroup: 999
  stargate:
    size: 1
    heapSize: 512M
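
To make the capacity this manifest asks for explicit, here is a rough Python sketch of the arithmetic. The numbers are copied from the manifest; `datacenter_footprint` is a hypothetical helper written for this illustration. The `min_worker_nodes` figure rests on an assumption: with `hostNetwork: true` each Cassandra pod binds its ports directly on the node, so two Cassandra pods would not normally fit on the same worker. Stargate and system overhead are not counted.

```python
# Values copied from the K8ssandraCluster manifest in this issue.
DATACENTERS = {"dc1": 12, "dc2": 12}  # datacenter name -> size
CPU_REQUEST_MILLICORES = 1000          # resources.requests.cpu: 1000m
MEMORY_REQUEST_GIB = 2                 # resources.requests.memory: 2Gi
STORAGE_REQUEST_GIB = 5                # storage request per pod: 5Gi


def datacenter_footprint(size: int) -> dict:
    """Aggregate Cassandra resource requests for one datacenter of `size` pods."""
    return {
        "pods": size,
        "cpu_cores": size * CPU_REQUEST_MILLICORES / 1000,
        "memory_gib": size * MEMORY_REQUEST_GIB,
        "storage_gib": size * STORAGE_REQUEST_GIB,
        # Assumption: one Cassandra pod per worker node because of hostNetwork.
        "min_worker_nodes": size,
    }


for name, size in DATACENTERS.items():
    print(name, datacenter_footprint(size))
```

Under these assumptions each data plane cluster needs room for 12 Cassandra pods, that is, 12 CPU cores, 24 GiB of memory, and 60 GiB of block storage in requests alone, spread over at least 12 schedulable workers.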

Expected behavior
I would expect all the pods to reach the Ready state. I also don't see any pods in dc2.

Here is the output of kubectl describe k8cs demo -n k8ssandra-operator

Status:
  Datacenters:
    dc1:
      Cassandra:
        Cassandra Operator Progress: Updating
        Conditions:
          Last Transition Time: 2023-01-04T22:32:14Z
          Message:
          Reason:
          Status: True
          Type: Healthy
        Last Server Node Started: 2023-01-05T15:37:49Z
        Node Statuses:
          demo-dc1-default-sts-0:
            Host ID: 62806dde-2190-4b2d-a19f-847f26a94fb8
          demo-dc1-default-sts-1:
            Host ID: f09e6161-0513-4c38-a3f4-fc7e279d0d0e
          demo-dc1-default-sts-2:
            Host ID: 48163ad8-a7eb-41f1-b309-0829b91f52c8
          demo-dc1-default-sts-3:
            Host ID: 1cd3a07c-c20f-4ce9-b396-2c56a9f173f6
          demo-dc1-default-sts-4:
            Host ID: 04b47b2d-1533-4186-b45f-b59e6fa83bf2
          demo-dc1-default-sts-5:
            Host ID: de4efe74-2824-4084-bb2e-59642bafe51e
  Error: None
Events:
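
A quick way to compare this status with the requested size is to count the Host ID entries. The sketch below does that in Python; `count_started_nodes` is a hypothetical helper, and it assumes that a pod only appears with a Host ID once its Cassandra node has started at least once (an assumption, not something confirmed in this thread).

```python
import re


def count_started_nodes(describe_output: str) -> int:
    """Count 'Host ID:' entries in `kubectl describe` output, indented or not."""
    return len(re.findall(r"^\s*Host ID:\s*\S+", describe_output, flags=re.MULTILINE))


# Node Statuses copied from the describe output in this issue.
status_excerpt = """\
demo-dc1-default-sts-0:
Host ID: 62806dde-2190-4b2d-a19f-847f26a94fb8
demo-dc1-default-sts-1:
Host ID: f09e6161-0513-4c38-a3f4-fc7e279d0d0e
demo-dc1-default-sts-2:
Host ID: 48163ad8-a7eb-41f1-b309-0829b91f52c8
demo-dc1-default-sts-3:
Host ID: 1cd3a07c-c20f-4ce9-b396-2c56a9f173f6
demo-dc1-default-sts-4:
Host ID: 04b47b2d-1533-4186-b45f-b59e6fa83bf2
demo-dc1-default-sts-5:
Host ID: de4efe74-2824-4084-bb2e-59642bafe51e
"""

REQUESTED_SIZE = 12  # size of dc1 in the manifest

started = count_started_nodes(status_excerpt)
print(f"{started} of {REQUESTED_SIZE} dc1 pods report a Host ID")
```

Six of the twelve requested dc1 pods are listed, so dc1 is still rolling out, which agrees with Cassandra Operator Progress: Updating.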

Environment (please complete the following information):

  • Helm charts version info

NAME                NAMESPACE           REVISION  UPDATED                               STATUS    CHART                      APP VERSION
cert-manager        cert-manager        1         2023-01-05 03:39:37.997482 +0530 IST  deployed  cert-manager-v1.10.1       v1.10.1
k8ssandra-operator  k8ssandra-operator  1         2023-01-05 03:50:08.788214 +0530 IST  deployed  k8ssandra-operator-0.39.2  1.4.1

  • Kubernetes version information:

Client Version: v1.25.3
Kustomize Version: v4.5.7
Server Version: v1.24.9+IKS

  • Kubernetes cluster kind:

Three clusters (one control plane and two data plane clusters) were created using IBM Kubernetes Service.

@rajarahul12 rajarahul12 added bug Something isn't working needs-triage labels Jan 5, 2023
@adejanovski
Contributor

Hi @rajarahul12,

The second datacenter will not be created until all the requested pods in the first DC are up.
You'd need to check the server-system-logger container logs for the pods that fail to reach the Ready state, to see what errors are preventing Cassandra from starting up.

Are all 12 pods being scheduled, with some failing to reach Running status?
Could you provide the output of kubectl get pods?
Do you see errors in the logs of the server-system-logger container of the pods that fail to fully start?
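
For reference, a small Python sketch of how the requested log command could be assembled. The container name, namespace, and context come from this thread; `logs_command` is a hypothetical helper, and the pod name is only a placeholder that should be replaced with a pod that is not Ready. The `--context`, `--namespace`, and `--container` flags are standard kubectl options.

```python
import shlex


def logs_command(pod: str, namespace: str, context: str,
                 container: str = "server-system-logger") -> list:
    """Build the kubectl argv that fetches one container's logs from one pod."""
    return [
        "kubectl", "--context", context,
        "--namespace", namespace,
        "logs", pod,
        "--container", container,
    ]


cmd = logs_command(
    pod="demo-dc1-default-sts-0",  # placeholder: use a pod that is not Ready
    namespace="k8ssandra-operator",
    context="cassandra-testing-vpc/ceng1c7z05391mqecq7g",  # dc1 k8sContext
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` on a machine where kubectl is configured for the data plane cluster.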
