Stargate doesn't start and it remains in CrashLoopBackoff state #1624

Open
AlexsandroRotundo opened this issue Oct 5, 2023 · 1 comment
Labels: bug, needs-triage

Bug Report

Describe the bug
I'm trying to install k8ssandra-operator with Cassandra server version 4.1.x on a Kubernetes 1.26 cluster. All the pods (k8ssandra-operator, k8ssandra-operator-cass-operator, and the Cassandra nodes) start correctly except Stargate, which ends up in a CrashLoopBackOff state after many restarts.

To Reproduce
Steps to reproduce the behavior:

  1. Install k8ssandra-operator version 1.9.0 with Helm.
  2. Apply the CRD file (the K8ssandraCluster manifest listed under Environment).
  3. Wait for all the pods to reach the Running state.
  4. All the pods are running except Stargate.

Expected behavior
All the pods are running

Environment (please complete the following information):
k8ssandra-operator v 1.9.0
Cassandra v 4.1.x
Server k8s 1.26.2
Client k8s v 1.25.0
helm v 3.10

  • Helm charts user-supplied values
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: test
spec:
  cassandra:
    serverVersion: "4.1.0"
    storageConfig:
      cassandraDataVolumeClaimSpec:
        storageClassName: cassandra-storageclass
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
    config:
      jvmOptions:
        heapSize: 384Mi
#    networking:
#      hostNetwork: true
    datacenters:
      - metadata:
          name: dc1
#        k8sContext: kind-k8ssandra-0
        size: 2
    mgmtAPIHeap: 64Mi
  stargate:
    size: 1

The error experienced is:

ERROR: Bundle io.stargate.db.cassandra_4_0 [1] EventDispatcher: Error during dispatch. (io.stargate.core.activator.ServiceStartException: Unable to start persistence-cassandra-4.0)
io.stargate.core.activator.ServiceStartException: Unable to start persistence-cassandra-4.0
        at io.stargate.core.activator.BaseActivator.startServiceInternal(BaseActivator.java:187)
        at io.stargate.core.activator.BaseActivator.access$900(BaseActivator.java:42)
        at io.stargate.core.activator.BaseActivator$Tracker.startIfAllRegistered(BaseActivator.java:230)
        at io.stargate.core.activator.BaseActivator$Tracker.addingService(BaseActivator.java:206)
        at org.osgi.util.tracker.ServiceTracker$Tracked.customizerAdding(ServiceTracker.java:943)
        at org.osgi.util.tracker.ServiceTracker$Tracked.customizerAdding(ServiceTracker.java:871)
        at org.osgi.util.tracker.AbstractTracked.trackAdding(AbstractTracked.java:256)
        at org.osgi.util.tracker.AbstractTracked.track(AbstractTracked.java:229)
        at org.osgi.util.tracker.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:903)
        at org.apache.felix.framework.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:990)
        at org.apache.felix.framework.EventDispatcher.fireEventImmediately(EventDispatcher.java:838)
        at org.apache.felix.framework.EventDispatcher.fireServiceEvent(EventDispatcher.java:545)
        at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4833)
        at org.apache.felix.framework.Felix.registerService(Felix.java:3804)
        at org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:328)
        at org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:302)
        at io.stargate.core.activator.BaseActivator.startServiceInternal(BaseActivator.java:180)
        at io.stargate.core.activator.BaseActivator.start(BaseActivator.java:103)
        at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:698)
        at org.apache.felix.framework.Felix.activateBundle(Felix.java:2402)
        at org.apache.felix.framework.Felix.startBundle(Felix.java:2308)
        at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:998)
        at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:984)
        at io.stargate.starter.Starter.start(Starter.java:457)
        at io.stargate.starter.Starter.cli(Starter.java:649)
        at io.stargate.starter.Starter.main(Starter.java:690)
Caused by: java.lang.RuntimeException: Unable to gossip with any peers
        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1844)
        at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:650)
        at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:936)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:786)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:731)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:420)
        at org.apache.cassandra.service.CassandraDaemon.init(CassandraDaemon.java:657)
        at io.stargate.db.cassandra.impl.Cassandra40Persistence.initializePersistence(Cassandra40Persistence.java:182)
        at io.stargate.db.cassandra.impl.Cassandra40Persistence.initializePersistence(Cassandra40Persistence.java:91)
        at io.stargate.db.datastore.common.AbstractCassandraPersistence.initialize(AbstractCassandraPersistence.java:106)
        at io.stargate.db.cassandra.Cassandra40PersistenceActivator.createService(Cassandra40PersistenceActivator.java:158)
        at io.stargate.core.activator.BaseActivator.createServices(BaseActivator.java:243)
        at io.stargate.core.activator.BaseActivator.startServiceInternal(BaseActivator.java:175)
        ... 25 more
INFO  [main] 2023-10-03 14:32:21,475 BaseActivator.java:178 - Registering core services as io.stargate.core.metrics.api.MetricsScraper
INFO  [main] 2023-10-03 14:32:21,476 BaseActivator.java:178 - Registering core services as com.codahale.metrics.health.HealthCheckRegistry
INFO  [main] 2023-10-03 14:32:21,477 BaseActivator.java:178 - Registering core services as io.stargate.core.metrics.api.HttpMetricsTagProvider
INFO  [main] 2023-10-03 14:32:21,477 BaseActivator.java:185 - Started core services
Starting bundle io.stargate.cql
Detected service startup failure in bundle io.stargate.db.cassandra_4_0: io.stargate.core.activator.ServiceStartException: Unable to start persistence-cassandra-4.0
io.stargate.core.activator.ServiceStartException: Unable to start persistence-cassandra-4.0
        at io.stargate.core.activator.BaseActivator.startServiceInternal(BaseActivator.java:187)
        at io.stargate.core.activator.BaseActivator.access$900(BaseActivator.java:42)
        at io.stargate.core.activator.BaseActivator$Tracker.startIfAllRegistered(BaseActivator.java:230)
        at io.stargate.core.activator.BaseActivator$Tracker.addingService(BaseActivator.java:206)
        at org.osgi.util.tracker.ServiceTracker$Tracked.customizerAdding(ServiceTracker.java:943)
        at org.osgi.util.tracker.ServiceTracker$Tracked.customizerAdding(ServiceTracker.java:871)
        at org.osgi.util.tracker.AbstractTracked.trackAdding(AbstractTracked.java:256)
        at org.osgi.util.tracker.AbstractTracked.track(AbstractTracked.java:229)
        at org.osgi.util.tracker.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:903)
        at org.apache.felix.framework.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:990)
        at org.apache.felix.framework.EventDispatcher.fireEventImmediately(EventDispatcher.java:838)
        at org.apache.felix.framework.EventDispatcher.fireServiceEvent(EventDispatcher.java:545)
        at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4833)
        at org.apache.felix.framework.Felix.registerService(Felix.java:3804)
        at org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:328)
        at org.apache.felix.framework.BundleContextImpl.registerService(BundleContextImpl.java:302)
        at io.stargate.core.activator.BaseActivator.startServiceInternal(BaseActivator.java:180)
        at io.stargate.core.activator.BaseActivator.start(BaseActivator.java:103)
        at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:698)
        at org.apache.felix.framework.Felix.activateBundle(Felix.java:2402)
        at org.apache.felix.framework.Felix.startBundle(Felix.java:2308)
        at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:998)
        at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:984)
        at io.stargate.starter.Starter.start(Starter.java:457)
        at io.stargate.starter.Starter.cli(Starter.java:649)
        at io.stargate.starter.Starter.main(Starter.java:690)
Caused by: java.lang.RuntimeException: Unable to gossip with any peers
        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1844)
        at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:650)
        at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:936)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:786)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:731)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:420)
        at org.apache.cassandra.service.CassandraDaemon.init(CassandraDaemon.java:657)
        at io.stargate.db.cassandra.impl.Cassandra40Persistence.initializePersistence(Cassandra40Persistence.java:182)
        at io.stargate.db.cassandra.impl.Cassandra40Persistence.initializePersistence(Cassandra40Persistence.java:91)
        at io.stargate.db.datastore.common.AbstractCassandraPersistence.initialize(AbstractCassandraPersistence.java:106)
        at io.stargate.db.cassandra.Cassandra40PersistenceActivator.createService(Cassandra40PersistenceActivator.java:158)
        at io.stargate.core.activator.BaseActivator.createServices(BaseActivator.java:243)
        at io.stargate.core.activator.BaseActivator.startServiceInternal(BaseActivator.java:175)
        ... 25 more

The problem occurs with every 4.1.x Cassandra server version on k8ssandra-operator 1.9.0 and 1.8.1. With a 4.0.x Cassandra server version, everything works fine.
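
For comparison, here is a minimal sketch of the variant that starts correctly, assuming the only change needed is moving `serverVersion` to the 4.0 line. The value `4.0.x` is a placeholder, not a literal version: replace it with a concrete 4.0 patch release supported by the installed operator. Everything else is copied from the manifest in this report, with the commented-out lines dropped.

```yaml
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: test
spec:
  cassandra:
    # Placeholder: substitute a concrete 4.0 patch release here.
    serverVersion: "4.0.x"
    storageConfig:
      cassandraDataVolumeClaimSpec:
        storageClassName: cassandra-storageclass
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
    config:
      jvmOptions:
        heapSize: 384Mi
    datacenters:
      - metadata:
          name: dc1
        size: 2
    mgmtAPIHeap: 64Mi
  # Stargate section unchanged from the failing manifest.
  stargate:
    size: 1
```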

AlexsandroRotundo added the bug and needs-triage labels on Oct 5, 2023
@AlexsandroRotundo
Author

I suppose this bug could be related to stargate/stargate#2311.
Only a few minutes ago I read this sentence in the release notes of k8ssandra 1.9.0:
"At the time of this release, Stargate is not yet compatible with Apache Cassandra 4.1. See this issue for more details."
