Restoring data of failed vmstorage instance #6290

hartfordfive · 2024-05-16T12:12:39Z

Let's say you have a group of 6 vmstorage instances (provisioned via the operator) and a vminsert replication factor of two for stored metrics. If you permanently lose a few vmstorage instances due to failed disks, what is the process to follow with vmrestore to bring the cluster back to a healthy state? I've looked through the docs but I'm not quite sure what to do in this situation. Also, at what frequency should vmbackup run in order to have an optimal backup schedule with minimal potential for data loss?

Thanks for your help!

Amper · 2024-05-16T13:46:30Z

Hey @hartfordfive.
The restore procedure for the enterprise version is described in detail here: https://docs.victoriametrics.com/vmbackupmanager/#how-to-restore-in-kubernetes

For the open-source version you have to do these operations manually. For instance, you can add an init container with a restore script like this:

spec:
  # ...
  vmstorage:
    initContainers:
    - name: vmrestore
      image: victoriametrics/vmrestore:<someVersion>
      command: ["sh", "-c"]
      args:
      - |
        #!/bin/sh
        set -e
        set -o pipefail
        # < here are your commands to verify whether it is necessary to perform restoration for this pod >
        if test -f "vmstorage-data/restore_complete.ignore"; then
            echo "Already restored from backup."
            exit 0
        fi
        echo "Start restoring..."
        /vmrestore-prod \
          -src="<your-backups-path>" \
          -storageDataPath="vmstorage-data" # and other parameters if necessary
        touch "vmstorage-data/restore_complete.ignore"
        echo "Restoring successfully completed."

But you'll need a way to determine when and which pod to restore (see comments in the script).
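One possible way to fill in the placeholder comment in the script is sketched below. It assumes that a replaced disk shows up as an empty data directory (ignoring the `lost+found` entry that some filesystems create) and that a directory with any other content should be left alone. The function name `needs_restore` is hypothetical, and the check may need adapting to how your volumes are provisioned.

```shell
#!/bin/sh
# Hypothetical helper: decide whether this pod needs a restore.
# Exit status 0 means "restore", 1 means "skip".
needs_restore() {
    data_dir="$1"
    marker="$data_dir/restore_complete.ignore"

    # A previous init-container run already finished the restore.
    if [ -f "$marker" ]; then
        return 1
    fi

    # Assumption: a missing directory means a fresh (replaced) volume.
    if [ ! -d "$data_dir" ]; then
        return 0
    fi

    # Any entry other than lost+found means data is already present.
    if [ -n "$(ls -A "$data_dir" | grep -vxF 'lost+found')" ]; then
        return 1
    fi

    return 0
}
```

In the init container script this could be called before the restore step, for example `if ! needs_restore "vmstorage-data"; then echo "Nothing to restore."; exit 0; fi`.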

> Also, at what frequency should vmbackup run in order to have an optimal backup schedule with minimal potential of data loss?

It very much depends on your circumstances, requirements and infrastructure (RPO, the probability and frequency of disk failures, the budget for traffic and backup storage, etc.).
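As a rough illustration of how an RPO translates into a schedule, the sketch below assumes a simple model that is not taken from the VictoriaMetrics docs: if a disk fails just before a backup finishes, the last usable backup is one interval plus one backup run old. The function name `max_backup_interval` and both parameters are hypothetical, and the model ignores the second replica, which may still hold the data.

```shell
#!/bin/sh
# Hypothetical helper: longest backup interval (in seconds) that keeps
# the worst-case loss window within the RPO.
# Assumed model: worst-case loss = interval + duration of one backup run.
max_backup_interval() {
    rpo_seconds="$1"
    backup_duration_seconds="$2"

    interval=$((rpo_seconds - backup_duration_seconds))
    if [ "$interval" -le 0 ]; then
        echo "RPO is shorter than one backup run" >&2
        return 1
    fi

    echo "$interval"
}
```

For example, `max_backup_interval 3600 600` prints `3000`: with a one-hour RPO and backups that take ten minutes, a backup would need to start at least every 50 minutes under this model.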

Amper added the question label May 16, 2024

Amper self-assigned this May 16, 2024

Amper added vmbackup vmrestore labels May 16, 2024

Amper closed this as completed Jun 5, 2024
