Restoring data of failed vmstorage instance #6290
Comments
Hey @hartfordfive. For the open-source version you have to do these operations manually. For instance, you can add an init container with a restore script like this:

    spec:
      # ...
      vmstorage:
        initContainers:
          - name: vmrestore
            image: victoriametrics/vmrestore:<someVersion>
            command: ["sh", "-c"]
            args:
              - |
                #!/bin/sh
                set -e
                set -o pipefail
                # < here are your commands to verify whether it is necessary to perform restoration for this pod >
                # Skip the restore if a previous run already completed it.
                if test -f "vmstorage-data/restore_complete.ignore"; then
                  echo "Already restored from backup."
                  exit 0
                fi
                echo "Start restoring..."
                /vmrestore-prod \
                  -src="<your-backups-path>" \
                  -storageDataPath="vmstorage-data" # and other parameters if necessary
                # Leave a marker so later pod restarts do not restore again.
                touch "vmstorage-data/restore_complete.ignore"
                echo "Restoring successfully completed."

But you'll need a way to determine when and which pod to restore (see the comments in the script).
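One possible way to fill in the "verify whether it is necessary" placeholder is sketched below. This is only an illustration, not an official VictoriaMetrics recipe: `needs_restore` is a hypothetical helper name, and the rule it applies (restore only when the marker file is absent and the data directory is missing or empty apart from `lost+found`) is an assumption that may not fit every setup.

```sh
#!/bin/sh
# Hypothetical helper: decide whether a vmstorage data directory should be restored.
# Assumption: a restore is wanted only when no marker file exists AND the
# directory is missing or empty (a fresh disk). A bare "lost+found" entry,
# which some filesystems create on a new volume, is treated as empty.
needs_restore() {
  data_dir="$1"
  marker="$data_dir/restore_complete.ignore"

  # A previous restore already completed for this directory.
  if [ -f "$marker" ]; then
    return 1
  fi

  # Existing data is present: do not overwrite it.
  if [ -d "$data_dir" ] &&
     [ -n "$(ls -A "$data_dir" 2>/dev/null | grep -vxF 'lost+found')" ]; then
    return 1
  fi

  # Directory is missing or effectively empty: restore.
  return 0
}
```

Inside the init container this could be called as `if ! needs_restore "vmstorage-data"; then echo "Nothing to restore."; exit 0; fi` before the `vmrestore` step. Choosing which backup belongs to which pod still has to be handled separately.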
As for the backup frequency, it very much depends on your circumstances, requirements and infrastructure (RPO, probability and frequency of disk failures, budget for traffic and backup storage, etc.).
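To make the RPO trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes backups start at a fixed interval and that a backup only counts once its upload has finished, so the worst case for a single node is a disk failure just before the next upload completes. `worst_case_loss_seconds` is a hypothetical helper and the numbers are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: worst-case window of data missing from the latest
# complete backup of one node, in seconds.
# Assumption: loss window = backup interval + time needed to upload a backup.
worst_case_loss_seconds() {
  interval_s="$1"   # seconds between backup starts
  upload_s="$2"     # seconds a backup upload takes
  echo $(( interval_s + upload_s ))
}
```

For example, `worst_case_loss_seconds 3600 600` prints `4200`: with hourly backups that take ten minutes to upload, the restored node could be missing up to 70 minutes of data. With a replication factor of two, the copy on the surviving nodes may still cover that window, so the acceptable interval depends on how many simultaneous failures you want to tolerate.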
Let's say you have a group of 6 vmstorage instances (provisioned via the operator) and a vminsert replication factor of two for stored metrics. If you permanently lose a few vmstorage instances due to failed disks, what is the process to follow with vmrestore in order to bring the cluster back to a healthy state? I've looked through the docs but I'm not quite sure what to do in this situation. Also, at what frequency should vmbackup run in order to have an optimal backup schedule with minimal potential for data loss?
Thanks for your help!