
CronJob for medusa purge not in the correct namespace #1299

Closed
smutel opened this issue Apr 26, 2024 · 2 comments · Fixed by #1300
Labels: bug (Something isn't working), done (Issues in the state 'done')

Comments

@smutel (Contributor) commented Apr 26, 2024

What happened?

  • The k8ssandra operator is installed in the k8ssandra-operator namespace
  • My cluster is deployed, and its backups run, in the cluster namespace
  • A CronJob is created automatically in the k8ssandra-operator namespace. This CronJob starts a purge in the operator namespace instead of the cluster namespace
  • A MedusaTask is created by this CronJob in the wrong namespace

Did you expect to see something different?

  • The CronJob needs to start the purge in the cluster namespace

How to reproduce it (as minimally and precisely as possible):

  • Install the k8ssandra operator in the k8ssandra-operator namespace
  • Create a cluster in another namespace
  • The CronJob is created in the operator namespace
  • A MedusaTask is created in the wrong namespace (see the check below)
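
One way to confirm the symptom, assuming the operator lives in k8ssandra-operator and the cluster lives in a namespace named cluster (both names are assumptions from the steps above), is to list MedusaTasks in both namespaces:

kubectl -n k8ssandra-operator get medusatasks.medusa.k8ssandra.io   # the purge task lands here (wrong)
kubectl -n cluster get medusatasks.medusa.k8ssandra.io              # where it should be; stays empty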

Environment

  • K8ssandra Operator version: 1.15.0

  • Kubernetes version information:

    Client Version: v1.29.4
    Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
    Server Version: v1.29.2

  • Kubernetes cluster kind: AKS

  • K8ssandra Operator Logs:

    2024-04-26T09:20:20.161Z        ERROR   Reconciler error        {"controller": "medusatask", "controllerGroup": "medusa.k8ssandra.io", "controllerKind": "MedusaTask", "MedusaTask": {"name":"purge-backups-20240426092008","namespace":"k8ssandra-operator"}, "namespace": "k8ssandra-operator", "name": "purge-backups-20240426092008", "reconcileID": "2b82f272-8107-4378-a34b-81a0465d9d4e", "error": "CassandraDatacenter.cassandra.datastax.com \"dc1\" not found"}

Datacenter dc1 is in the cluster namespace, not in k8ssandra-operator, hence the "not found" error.
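
For reference, the purge task should instead be created in the namespace where the datacenter lives. A minimal sketch, assuming the cluster namespace is named cluster:

apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaTask
metadata:
  name: purge-backups-20240426092008
  namespace: cluster       # same namespace as the CassandraDatacenter dc1
spec:
  cassandraDatacenter: dc1
  operation: purge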

@smutel added the bug label Apr 26, 2024
@adziura-tcloud commented

In addition, I would propose making this CronJob optional, or at least configurable.
This is what we are doing in our current setup:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: cassandra-backup-purge
  namespace: cassandra
spec:
  schedule: "30 2 * * *"
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 1
  startingDeadlineSeconds: 10
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        metadata:
          labels:
            app: cassandra-backup-purge
        spec:
          restartPolicy: Never
          serviceAccountName: cassandra-backup-purge
          imagePullSecrets:
            - name: container-registries
          containers:
            - name: cassandra-backup-purge
              image: bitnami/kubectl:1.29
              imagePullPolicy: Always
              resources:
                limits:
                  ephemeral-storage: 20Mi
                  memory: 300Mi
                requests:
                  cpu: 100m
                  ephemeral-storage: 1Mi
                  memory: 100Mi
              command:
                - sh
                - -c
                - |
                   # Purge obsolete backups
                   kubectl apply -f - <<EOF
                   apiVersion: medusa.k8ssandra.io/v1alpha1
                   kind: MedusaTask
                   metadata:
                     name: purge-backups-$(date +%Y%m%d%H%M%S)
                     namespace: cassandra
                   spec:
                     cassandraDatacenter: dc1
                     operation: purge
                   EOF
                   # Purge obsolete (older than 30 days) backup jobs
                   for i in $(kubectl -n cassandra get medusabackupjobs.medusa.k8ssandra.io \
                       -o go-template --template '{{range .items}}{{.metadata.name}} {{.metadata.creationTimestamp}}{{"\n"}}{{end}}' \
                       | awk '$2 <= "'$(date -d'now-30 days' -u +"%Y-%m-%dT%H:%M:%SZ")'" { print $1 }'); do
                     kubectl -n cassandra delete medusabackupjobs.medusa.k8ssandra.io ${i}
                   done
                   # Purge obsolete (older than 30 days) Medusa tasks
                   for i in $(kubectl -n cassandra get medusatasks.medusa.k8ssandra.io \
                       -o go-template --template '{{range .items}}{{.metadata.name}} {{.metadata.creationTimestamp}}{{"\n"}}{{end}}' \
                       | awk '$2 <= "'$(date -d'now-30 days' -u +"%Y-%m-%dT%H:%M:%SZ")'" { print $1 }'); do
                     kubectl -n cassandra delete medusatasks.medusa.k8ssandra.io ${i}
                   done

Key points:

  • specifying the CronJob schedule explicitly
  • successfulJobsHistoryLimit: 0
  • purging obsolete (older than 30 days) backup jobs, since we use MedusaBackupSchedule for backups (sketched below)
  • purging obsolete (older than 30 days) Medusa tasks
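
For context, the MedusaBackupSchedule feeding those backup jobs would look roughly like this; a sketch, with the name, namespace, schedule, and backup type being assumptions:

apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaBackupSchedule
metadata:
  name: medusa-backup-schedule
  namespace: cassandra
spec:
  cronSchedule: "0 1 * * *"        # daily backup at 01:00
  backupSpec:
    cassandraDatacenter: dc1
    backupType: differential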

@adejanovski (Contributor) commented

Actually, the CronJob is in the right namespace, because it needs to reference the service account, which exists in the operator namespace.
But the job creates a MedusaTask in the wrong namespace, which prevents it from running properly. It also doesn't take datacenter name overrides into account, so it won't reference the DC correctly if an override (.dc.datacenterName) is used; see the sketch below.
This CronJob was obviously a bad idea, and we'll shortly create a new API to handle purge schedules.
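
To illustrate the override in question: a K8ssandraCluster can give a datacenter a Cassandra-level name distinct from its Kubernetes object name. A minimal sketch, trimmed to the relevant fields, with the cluster name, namespace, and values being assumptions:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
  namespace: cassandra
spec:
  cassandra:
    serverVersion: "4.1.4"
    datacenters:
      - metadata:
          name: dc1                # Kubernetes object name
        datacenterName: real_dc1   # Cassandra datacenter name override
        size: 3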

@adejanovski added the done label Jun 11, 2024