Restoring data of failed vmstorage instance #6290

Closed
hartfordfive opened this issue May 16, 2024 · 1 comment

@hartfordfive

Let's say you have a group of 6 vmstorage instances (provisioned via the operator) and a vminsert replication factor of two for stored metrics. If you permanently lose a few vmstorage instances due to failed disks, what process should you follow with vmrestore to bring the cluster back to a healthy state? I've looked through the docs, but I'm not quite sure what to do in this situation. Also, at what frequency should vmbackup run in order to have an optimal backup schedule with minimal potential of data loss?

Thanks for your help!

Amper added the question label May 16, 2024
Amper self-assigned this May 16, 2024
@Amper
Contributor

Amper commented May 16, 2024

Hey @hartfordfive.
The restore procedure for the enterprise version is described in detail here: https://docs.victoriametrics.com/vmbackupmanager/#how-to-restore-in-kubernetes

For the open-source version you have to perform these operations manually. For instance, you can add an init container with a restore script like this:

spec:
  # ...
  vmstorage:
    initContainers:
    - name: vmrestore
      image: victoriametrics/vmrestore:<someVersion>
      command: ["sh", "-c"]
      args:
      - |
        #!/bin/sh
        set -e
        set -o pipefail
        # < here are your commands to verify whether it is necessary to perform restoration for this pod >
        if test -f "vmstorage-data/restore_complete.ignore"; then
            echo "Already restored from backup."
            exit 0
        fi
        echo "Start restoring..."
        /vmrestore-prod \
          -src="<your-backups-path>" \
          -storageDataPath="vmstorage-data" # and other parameters if necessary
        touch "vmstorage-data/restore_complete.ignore"
        echo "Restoring successfully completed."

But you'll need a way to determine when and which pod to restore (see comments in the script).
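
As a rough illustration of such a check, here is a minimal sketch of a helper that the init container script could call before running vmrestore. The function name `needs_restore` is hypothetical, and the rule it applies (a data path that is empty apart from `lost+found`, with no `restore_complete.ignore` marker, means the disk was replaced) is an assumption that may not match your environment.

```shell
#!/bin/sh
# Hypothetical helper, not part of VictoriaMetrics.
# Exits 0 when the data path looks like a fresh volume that was never restored.
# Assumption: a replaced disk shows up as an empty (or missing) directory.
needs_restore() {
    data_dir="$1"

    # Marker file written by the restore script after a successful run.
    if [ -f "$data_dir/restore_complete.ignore" ]; then
        return 1
    fi

    # A missing directory is treated the same as an empty one.
    if [ ! -d "$data_dir" ]; then
        return 0
    fi

    # Ignore lost+found, which ext4 creates on a freshly formatted volume.
    entries=$(ls -A "$data_dir" | grep -vxF 'lost+found' || true)
    [ -z "$entries" ]
}
```

In the script above, the existing marker-file test could then become `if ! needs_restore "vmstorage-data"; then echo "No restore needed."; exit 0; fi`. Note that this variant also skips the restore when the directory already contains data but no marker, which is a design choice rather than something the original script does.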

> Also, at what frequency should vmbackup run in order to have an optimal backup schedule with minimal potential of data loss?

It very much depends on your circumstances, requirements, and infrastructure (RPO, probability and frequency of disk failures, budget for traffic and backup storage, etc.).
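
To make the RPO trade-off concrete, here is a small sketch that estimates the worst-case window of data missing from the most recent complete backup. It assumes that each backup contains data up to the moment it starts and only becomes usable once it finishes, so a failure just before a backup completes falls back to the previous one. The function name and the example numbers are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: worst-case age (in seconds) of the newest usable backup.
# Assumption: a backup holds data up to its start time and is usable only
# after it completes, so the worst case is one interval plus one backup duration.
worst_case_loss_seconds() {
    interval="$1"   # seconds between backup starts
    duration="$2"   # seconds a single backup takes
    echo $(( interval + duration ))
}

# Hourly backups that each take 10 minutes.
worst_case_loss_seconds 3600 600   # prints 4200
```

If the result exceeds your RPO, the interval has to shrink or the backups have to get faster. In a replicated cluster, some of that window may still be present on the surviving vmstorage nodes, so treat this figure as an upper bound for a single node's backup.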
