Automated snapshots #2

Open
3 tasks
marcusbooyah opened this issue Dec 17, 2023 · 0 comments
Labels
enhancement New feature or request

marcusbooyah commented Dec 17, 2023

There should be custom resources (CRs) for taking cluster snapshots and storing them in external storage (S3, MinIO, Azure Blob Storage, etc.).

This will probably be three resources:

  • QdrantSnapshotSchedule - creates QdrantSnapshot resources on a schedule.
  • QdrantSnapshot - takes a snapshot and uploads it to external storage.
  • QdrantSnapshotRestore - restores a cluster from a snapshot.

Snapshot files can be very large, so they need to be uploaded directly from the source without wasting bandwidth on intermediate copies. To do this we will create a snapshot-manager service, which we will deploy as a Kubernetes Job. The snapshot-manager service should handle both backup and restore operations.
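
As a rough illustration of a single service covering both operations, here is a minimal sketch of what a snapshot-manager command line could look like. The subcommand names, the flags, and the `<prefix>/<file name>` key layout are all assumptions made for this example, not a settled interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical snapshot-manager CLI with backup and restore modes."""
    parser = argparse.ArgumentParser(prog="snapshot-manager")
    sub = parser.add_subparsers(dest="mode", required=True)
    for mode, help_text in (
        ("backup", "upload a local snapshot file to external storage"),
        ("restore", "download a snapshot file from external storage"),
    ):
        cmd = sub.add_parser(mode, help=help_text)
        cmd.add_argument("--bucket", required=True)
        cmd.add_argument("--prefix", default="")
        # Path on the volume shared with the Qdrant pod (assumed layout).
        cmd.add_argument("--snapshot-path", required=True)
    return parser


def object_key(prefix: str, snapshot_path: str) -> str:
    """Storage key for a snapshot file: <prefix>/<file name> (assumed layout)."""
    name = snapshot_path.rstrip("/").split("/")[-1]
    clean_prefix = prefix.strip("/")
    return f"{clean_prefix}/{name}" if clean_prefix else name
```

Because the Job runs next to the volume that holds the snapshot, the file would go straight from that volume to the bucket without passing through the operator.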

Snapshot steps:

  1. Set QdrantCluster status to CreatingSnapshot - the QdrantCluster controller should make sure to not change cluster state when this is set.
  2. Save the cluster state to the external storage. We need this to be able to restore from a snapshot.
     ```json
     {
         "nodes": 5,
         "version": "1.8.1",
         "resources": {
             "requests": {
                 "cpu": "100m",
                 "memory": "128Mi"
             },
             "limits": {
                 "cpu": "500m",
                 "memory": "512Mi"
             }
         }
     }
     ```
  3. Create snapshots on each of the cluster nodes using the headless service.
  4. Deploy snapshot-manager as a Kubernetes Job to upload the snapshot to external storage. The Job will need to mount the same volume as where the snapshot is stored.
  5. Once the upload is complete, set QdrantCluster status to Ready.
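
The snapshot steps above can be sketched roughly as follows. This assumes the cluster runs as a StatefulSet whose pods are named `<cluster>-<ordinal>`, that the cluster domain is `cluster.local`, and that one upload Job is started per node. The helper names and callback parameters are hypothetical. The endpoint is Qdrant's REST call for creating a collection snapshot (`POST /collections/{collection_name}/snapshots`) on the default REST port 6333. Error handling is left out on purpose.

```python
from typing import Callable, List


def node_snapshot_urls(cluster: str, namespace: str, nodes: int,
                       collection: str, headless_service: str,
                       port: int = 6333) -> List[str]:
    """One snapshot endpoint per pod, addressed through the headless service."""
    return [
        f"http://{cluster}-{i}.{headless_service}.{namespace}.svc.cluster.local:{port}"
        f"/collections/{collection}/snapshots"
        for i in range(nodes)
    ]


def run_snapshot(set_status: Callable[[str], None],
                 save_state: Callable[[], None],
                 create_snapshot: Callable[[str], None],
                 start_upload_job: Callable[[int], None],
                 urls: List[str]) -> None:
    """Run the snapshot steps in order; the callbacks do the real work."""
    set_status("CreatingSnapshot")      # step 1: freeze cluster changes
    save_state()                        # step 2: persist cluster state
    for url in urls:                    # step 3: snapshot every node (POST to url)
        create_snapshot(url)
    for node_index in range(len(urls)):  # step 4: one upload Job per node (assumed)
        start_upload_job(node_index)
    set_status("Ready")                 # step 5: release the cluster
```

Passing the operations in as callbacks keeps the ordering testable without a live cluster.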

Restore steps:

  1. Set QdrantCluster status to Restoring - the QdrantCluster controller should make sure to not change cluster state when this is set.
  2. Save the current cluster state to external storage.
  3. Get the snapshot's cluster state from external storage and bring the cluster to that state.
  4. Deploy snapshot-manager as a Kubernetes Job to download the snapshot from external storage and restore it. This should be done for each node in the cluster.
  5. Get the pre-restore cluster state from external storage and return the cluster to its original state.
  6. Once the restore is complete, set QdrantCluster status to Ready.
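
To make the two cluster-state round trips in the restore steps concrete, here is a small sketch. `ClusterState` mirrors the fields of the saved state shown earlier (`nodes`, `version`, `resources`); the class name, the step names, and `restore_plan` itself are hypothetical and only illustrate the ordering.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class ClusterState:
    """Cluster state as saved alongside a snapshot."""
    nodes: int
    version: str
    resources: Dict[str, Any]

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(text: str) -> "ClusterState":
        return ClusterState(**json.loads(text))


def restore_plan(current: ClusterState,
                 snapshot: ClusterState) -> List[Tuple[str, Any]]:
    """Ordered restore steps as (action, argument) pairs."""
    steps: List[Tuple[str, Any]] = [
        ("set_status", "Restoring"),   # step 1
        ("save_state", current),       # step 2: keep the pre-restore state
        ("apply_state", snapshot),     # step 3: match the snapshot's topology
    ]
    # Step 4: one restore Job per node of the snapshot's cluster state.
    steps += [("restore_node", i) for i in range(snapshot.nodes)]
    steps += [
        ("apply_state", current),      # step 5: back to the pre-restore state
        ("set_status", "Ready"),       # step 6
    ]
    return steps
```

For a snapshot taken on 5 nodes and restored into a 3-node cluster, the plan scales to 5 nodes, restores each one, then returns to 3.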

Example QdrantSnapshot:

```yaml
apiVersion: qdrant.io/v1alpha1
kind: QdrantSnapshot
metadata:
  name: my-backup
  namespace: qdrant
spec:
  cluster: my-cluster
  collection: my-collection
  s3:
    provider: aws
    accessKey:
      name: my-k8s-secret
      key: accessKey
    secretAccessKey:
      name: my-k8s-secret
      key: secretAccessKey
    region: us-west-1
    bucket: my-bucket
    prefix: my-backup-folder
```
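
One possible way for the controller to turn a QdrantSnapshot spec into the upload Job is sketched below. The Job fields (`batch/v1`, `secretKeyRef`, `persistentVolumeClaim`, `restartPolicy`) are standard Kubernetes. The image name, the container arguments, the PVC name, the mount path, and the choice of environment variable names are assumptions for illustration only.

```python
from typing import Any, Dict


def upload_job(snapshot_name: str, namespace: str, spec: Dict[str, Any],
               pvc_name: str,
               image: str = "example/snapshot-manager:latest") -> Dict[str, Any]:
    """Build a Job manifest that mounts the snapshot volume and uploads from it."""
    s3 = spec["s3"]

    def secret_env(env_name: str, ref: Dict[str, str]) -> Dict[str, Any]:
        # Credentials stay in the Kubernetes Secret; the Job only references them.
        return {"name": env_name,
                "valueFrom": {"secretKeyRef": {"name": ref["name"], "key": ref["key"]}}}

    container = {
        "name": "snapshot-manager",
        "image": image,  # hypothetical image
        "args": ["backup", "--bucket", s3["bucket"], "--prefix", s3["prefix"]],
        "env": [
            secret_env("AWS_ACCESS_KEY_ID", s3["accessKey"]),
            secret_env("AWS_SECRET_ACCESS_KEY", s3["secretAccessKey"]),
            {"name": "AWS_DEFAULT_REGION", "value": s3["region"]},
        ],
        # Assumed mount path for the volume where Qdrant writes snapshots.
        "volumeMounts": [{"name": "qdrant-storage", "mountPath": "/qdrant/snapshots"}],
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"{snapshot_name}-upload", "namespace": namespace},
        "spec": {
            "backoffLimit": 3,
            "template": {"spec": {
                "restartPolicy": "Never",
                "containers": [container],
                "volumes": [{"name": "qdrant-storage",
                             "persistentVolumeClaim": {"claimName": pvc_name}}],
            }},
        },
    }
```

If the volume is ReadWriteOnce, the Job pod would also have to be scheduled onto the same node as the Qdrant pod that owns it, which this sketch does not handle.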

Example QdrantSnapshotSchedule:

```yaml
apiVersion: qdrant.io/v1alpha1
kind: QdrantSnapshotSchedule
metadata:
  name: my-backup
  namespace: qdrant
spec:
  schedule: "5 4 * * *"
  pause: false
  snapshot:
    cluster: my-cluster
    collection: my-collection
    s3:
      provider: aws
      accessKey:
        name: my-k8s-secret
        key: accessKey
      secretAccessKey:
        name: my-k8s-secret
        key: secretAccessKey
      region: us-west-1
      bucket: my-bucket
      prefix: my-backup-folder
```
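
The schedule `5 4 * * *` fires every day at 04:05. What the schedule controller might do each time it fires is sketched below: copy the `snapshot` template into a new QdrantSnapshot, unless `pause` is set. The timestamp-suffixed naming scheme and the function name are assumptions; cron evaluation itself is left to whatever scheduler library the operator ends up using.

```python
import copy
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def snapshot_from_schedule(schedule: Dict[str, Any],
                           now: datetime) -> Optional[Dict[str, Any]]:
    """Return the QdrantSnapshot to create for this tick, or None when paused."""
    spec = schedule["spec"]
    if spec.get("pause", False):
        return None
    # Suffix the name with a UTC timestamp so each run gets a unique resource.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return {
        "apiVersion": schedule["apiVersion"],
        "kind": "QdrantSnapshot",
        "metadata": {
            "name": f"{schedule['metadata']['name']}-{stamp}",
            "namespace": schedule["metadata"]["namespace"],
        },
        # Deep copy so later edits to the schedule do not alter created snapshots.
        "spec": copy.deepcopy(spec["snapshot"]),
    }
```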

Example QdrantSnapshotRestore:

```yaml
apiVersion: qdrant.io/v1alpha1
kind: QdrantSnapshotRestore
metadata:
  name: my-backup
  namespace: qdrant
spec:
  cluster: my-cluster
  collection: my-collection
  snapshotId: 870185a1-57aa-45c3-98f2-7c83ac0bbe76
  s3:
    provider: aws
    accessKey:
      name: my-k8s-secret
      key: accessKey
    secretAccessKey:
      name: my-k8s-secret
      key: secretAccessKey
    region: us-west-1
    bucket: my-bucket
    prefix: my-backup-folder
```
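
Before starting a restore, the controller would probably want to reject obviously invalid specs. A minimal sketch follows, assuming that all four top-level fields are required and that `snapshotId` is always a UUID as in the example. Neither rule is decided yet, and the function name is hypothetical.

```python
import uuid
from typing import Any, Dict, List


def validate_restore_spec(spec: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a QdrantSnapshotRestore spec."""
    errors: List[str] = []
    for field in ("cluster", "collection", "snapshotId", "s3"):
        if not spec.get(field):
            errors.append(f"spec.{field} is required")
    snapshot_id = spec.get("snapshotId")
    if snapshot_id:
        try:
            uuid.UUID(str(snapshot_id))  # raises ValueError on malformed input
        except ValueError:
            errors.append("spec.snapshotId must be a UUID")
    return errors
```

An empty list means the spec passed these basic checks; anything else could be surfaced in the resource status.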

marcusbooyah added the enhancement (New feature or request) label Dec 17, 2023
Projects
Status: Todo