Backup and Restore

This document explains how to create and restore data backups with Vitess. Vitess uses backups for two purposes:

  • Provide a point-in-time backup of the data on a tablet.
  • Bootstrap new tablets in an existing shard.

Prerequisites

Vitess stores data backups on a Backup Storage service, which is a pluggable interface.

Currently, we have plugins for:

  • A network-mounted path (e.g. NFS)
  • Google Cloud Storage
  • Amazon S3
  • Ceph

Vitess also supports multiple ways to generate data backups. This is called a Backup engine, which is a pluggable interface.

Currently, we have plugins for:

  • Builtin: Copy all the database files into specified storage. This is the default.
  • Percona Xtrabackup

Before you can back up or restore a tablet, you need to ensure that the tablet is aware of the Backup Storage system and Backup engine that you are using. To do so, use the following command-line flags when starting a vttablet that has access to the location where you are storing backups.

Flags

backup_storage_implementation
  Specifies the implementation of the Backup Storage interface to use. Current plugin options available are:
    • file: NFS or any other filesystem-mounted network drive.
    • gcs: Google Cloud Storage.
    • s3: Amazon S3.
    • ceph: Ceph Object Gateway S3 API.
backup_engine_implementation
  Specifies the implementation of the Backup Engine to use. Current options available are:
    • builtin: Copy all the database files into specified storage. This is the default.
    • xtrabackup: Percona Xtrabackup.
backup_storage_hook
  If set, the contents of every file to back up are sent to a hook. The hook receives the data for each file on stdin. It should echo the transformed data to stdout. Anything the hook prints to stderr will be printed in the vttablet logs. Hooks should be located in the vthook subdirectory of the VTROOT directory. The hook receives a -operation write or a -operation read parameter depending on the direction of the data processing. For instance, write would be for encryption, and read would be for decryption.
backup_storage_compress
  This flag controls whether the backups are compressed by the Vitess code. By default it is set to true. Use -backup_storage_compress=false to disable. This is meant to be used with a -backup_storage_hook hook that already compresses the data, to avoid compressing the data twice.
file_backup_storage_root
  For the file plugin, this identifies the root directory for backups.
gcs_backup_storage_bucket
  For the gcs plugin, this identifies the bucket to use.
s3_backup_aws_region
  For the s3 plugin, this identifies the AWS region.
s3_backup_storage_bucket
  For the s3 plugin, this identifies the AWS S3 bucket.
ceph_backup_storage_config
  For the ceph plugin, this identifies the path to a text file with a JSON object as configuration. The JSON object requires the following keys: accessKey, secretKey, endPoint and useSSL. The bucket name is computed from the keyspace name and shard name, so different keyspaces / shards use separate buckets.
restore_from_backup
  Indicates that, when started with an empty MySQL instance, the tablet should restore the most recent backup from the specified storage plugin.
xtrabackup_root_path
  For the xtrabackup backup engine, the directory location of the xtrabackup executable, e.g. /usr/bin.
xtrabackup_backup_flags
  For the xtrabackup backup engine, flags to pass to the backup command. These should be space separated and will be added to the end of the command.
xbstream_restore_flags
  For the xtrabackup backup engine, flags to pass to the xbstream command during restore. These should be space separated and will be added to the end of the command. These need to match the ones used for backup, e.g. --compress / --decompress, --encrypt / --decrypt.
xtrabackup_stream_mode
  For the xtrabackup backup engine, the mode to use if streaming. Valid values are tar and xbstream. Defaults to tar.
xtrabackup_user
  For the xtrabackup backup engine, the required user that xtrabackup will use to connect to the database server. This user must have all necessary privileges. For details, please refer to the xtrabackup documentation.
xtrabackup_stripes
  For the xtrabackup backup engine, if greater than 0, use data striping across this many destination files to parallelize data transfer and decompression.
xtrabackup_stripe_block_size
  For the xtrabackup backup engine, the size in bytes of each block that gets sent to a given stripe before rotating to the next stripe.
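
The backup_storage_hook contract described in the flag list (file data on stdin, transformed data on stdout, diagnostics on stderr, and a -operation write or -operation read parameter) could be satisfied by a script along the following lines. This is only a minimal sketch: the choice of gzip as the transformation, the function names run_hook and main, and the non-zero exit status on failure are assumptions made for illustration, not something Vitess prescribes.

```python
#!/usr/bin/env python3
"""Hypothetical backup storage hook: gzip on write, gunzip on read."""
import argparse
import gzip
import shutil
import sys


def run_hook(operation, src, dst):
    """Stream src to dst, compressing for "write" and decompressing for "read"."""
    if operation == "write":
        # Backup direction: transform the file contents before they are stored.
        with gzip.GzipFile(fileobj=dst, mode="wb") as out:
            shutil.copyfileobj(src, out)
    elif operation == "read":
        # Restore direction: undo the transformation applied during backup.
        with gzip.GzipFile(fileobj=src, mode="rb") as inp:
            shutil.copyfileobj(inp, dst)
    else:
        raise ValueError(f"unknown operation: {operation!r}")
    dst.flush()


def main(argv):
    parser = argparse.ArgumentParser()
    # The hook is told the direction through "-operation write|read".
    parser.add_argument("-operation", choices=["write", "read"], required=True)
    # Assumption: ignore any other parameters instead of failing on them.
    args, _unknown = parser.parse_known_args(argv)
    try:
        run_hook(args.operation, sys.stdin.buffer, sys.stdout.buffer)
    except Exception as exc:
        # Anything written to stderr ends up in the vttablet logs.
        print(f"hook failed: {exc}", file=sys.stderr)
        return 1  # assumption: signal failure with a non-zero exit status
    return 0


# Only act as a hook when invoked with arguments, so the file can also be
# imported or exercised directly without touching stdin.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Because this sketch compresses the data itself, it would be paired with -backup_storage_compress=false so that the data is not compressed twice. An encrypting hook would have the same shape, with the two branches of run_hook swapped for encrypt and decrypt calls.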

Authentication

Note that for the Google Cloud Storage plugin, we currently only support Application Default Credentials. This means that access to Cloud Storage is automatically granted by virtue of the fact that you’re already running within Google Compute Engine or Container Engine.

For this to work, the GCE instances must have been created with the scope that grants read-write access to Cloud Storage. When using Container Engine, you can do this for all the instances it creates by adding --scopes storage-rw to the gcloud container clusters create command.

Creating a backup

Run the following vtctl command to create a backup:

  vtctl Backup <tablet-alias>

If the engine is builtin, the designated tablet performs the following sequence of actions in response to this command:

  • Switches its type to BACKUP. After this step, the tablet is no longer used by vtgate to serve any query.

  • Stops replication and gets the current replication position (to be saved in the backup along with the data).

  • Shuts down its mysqld process.

  • Copies the necessary files to the Backup Storage implementation that was specified when the tablet was started. Note that if this fails, we still keep going, so the tablet is not left in an unstable state because of a storage failure.

  • Restarts mysqld.

  • Restarts replication (with the right semi-sync flags corresponding to its original type, if applicable).

  • Switches its type back to its original type. After this, it will most likely be behind on replication, and not used by vtgate for serving until it catches up.

If the engine is xtrabackup, none of the above steps are performed. The tablet can continue to serve traffic while the backup is running.
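
The ordering of the builtin-engine steps, and in particular the point that a storage failure does not stop the tablet from being brought back, can be illustrated with a small simulation. This is a sketch only: FakeTablet, builtin_backup and the copy_files callback are hypothetical names, and the code merely records the steps rather than reproducing what vttablet actually executes.

```python
class FakeTablet:
    """Hypothetical stand-in for a tablet; it only records what happens to it."""

    def __init__(self, tablet_type):
        self.tablet_type = tablet_type
        self.log = []

    def do(self, action):
        self.log.append(action)


def builtin_backup(tablet, copy_files):
    """Run the builtin-engine steps in order; copy_files is the storage upload."""
    original_type = tablet.tablet_type
    tablet.tablet_type = "BACKUP"  # vtgate no longer serves queries from it
    tablet.do("type=BACKUP")
    tablet.do("stop replication, record position")
    tablet.do("shut down mysqld")
    error = None
    try:
        copy_files()
        tablet.do("copy files to backup storage")
    except OSError as exc:
        # A storage failure is remembered, but the remaining steps still run
        # so the tablet is not left in an unstable state.
        error = exc
        tablet.do("copy failed")
    tablet.do("restart mysqld")
    tablet.do("restart replication")
    tablet.tablet_type = original_type
    tablet.do(f"type={original_type}")
    return error
```

Calling builtin_backup with a copy_files callback that raises OSError still ends with the tablet back in its original type, which is the property the step list emphasizes.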

Restoring a backup

When a tablet starts, Vitess checks the value of the -restore_from_backup command-line flag to determine whether to restore a backup to that tablet.

  • If the flag is present, Vitess tries to restore the most recent backup from the Backup Storage system when starting the tablet.
  • If the flag is absent, Vitess does not try to restore a backup to the tablet. This is the equivalent of starting a new tablet in a new shard.

As noted in the Prerequisites section, the flag is generally enabled all of the time for all of the tablets in a shard. By default, if Vitess cannot find a backup in the Backup Storage system, the tablet will start up empty. This behavior allows you to bootstrap a new shard before any backups exist.

If the -wait_for_backup_interval flag is set to a value greater than zero, the tablet will instead keep checking for a backup to appear at that interval. This can be used to ensure that tablets launched concurrently while an initial backup is being seeded for the shard (e.g. uploaded from cold storage or created by another tablet) will wait until the proper time and then pull the new backup when it’s ready.

  vttablet ... -backup_storage_implementation=file \
    -file_backup_storage_root=/nfs/XXX \
    -restore_from_backup
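
The startup decision described in this section (restore the most recent backup, start empty, or keep polling) can be summarized as a short function. This is a sketch under stated assumptions: choose_startup_action and the list_backups callback are hypothetical names, and the callback is assumed to return backup names oldest first.

```python
import time


def choose_startup_action(restore_from_backup, list_backups,
                          wait_for_backup_interval=0.0, sleep=time.sleep):
    """Return ("restore", backup_name) or ("start_empty", None)."""
    if not restore_from_backup:
        # Flag absent: no restore is attempted at all.
        return ("start_empty", None)
    while True:
        backups = list_backups()  # assumed to be in chronological order
        if backups:
            return ("restore", backups[-1])  # most recent backup
        if wait_for_backup_interval <= 0:
            # Default behavior: no backup found, so the tablet starts empty.
            return ("start_empty", None)
        # Interval set: keep checking until a backup appears.
        sleep(wait_for_backup_interval)
```

Passing the sleep function as a parameter keeps the sketch easy to exercise without real delays; with an interval of zero the loop body runs exactly once.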

Managing backups

vtctl provides two commands for managing backups:

  • ListBackups displays the existing backups for a keyspace/shard in chronological order.

      vtctl ListBackups <keyspace/shard>

  • RemoveBackup deletes a specified backup for a keyspace/shard.

      vtctl RemoveBackup <keyspace/shard> <backup name>
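
One way to combine the two commands is a retention policy that keeps only the newest few backups. The helper below is a hypothetical sketch: prune_commands is not part of Vitess, the backup names are placeholders, and the input list is assumed to be in the chronological order printed by ListBackups, oldest first. It only builds the RemoveBackup command lines; it does not run them, and any additional vtctl flags a deployment needs are left out.

```python
def prune_commands(keyspace_shard, backups, keep):
    """Build vtctl RemoveBackup argv lists for all but the newest `keep` backups."""
    if keep < 1:
        raise ValueError("keep at least one backup")
    # Everything before the last `keep` entries is considered stale.
    stale = backups[:-keep]
    return [["vtctl", "RemoveBackup", keyspace_shard, name] for name in stale]


# Placeholder names standing in for real ListBackups output.
for argv in prune_commands("<keyspace/shard>", ["old", "newer", "newest"], keep=2):
    print(" ".join(argv))
```

Returning argument lists instead of executing anything makes it possible to review the deletions before passing each list to something like subprocess.run.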

Bootstrapping a new tablet

Bootstrapping a new tablet is almost identical to restoring an existing tablet. The only thing you need to be cautious about is that the tablet specifies its keyspace, shard and tablet type when it registers itself in the topology. Specifically, make sure that the following additional vttablet parameters are set:

  -init_keyspace <keyspace>
  -init_shard <shard>
  -init_tablet_type replica|rdonly

The bootstrapped tablet will restore the data from the backup and then apply the changes that occurred after the backup by restarting replication.
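
To avoid forgetting one of these parameters, the bootstrap flags can be assembled in one place. The function below is a hypothetical helper, not a Vitess API; it assumes the file storage plugin and returns only the backup-related and init flags named in this document, leaving every other vttablet argument to the caller.

```python
ALLOWED_TYPES = ("replica", "rdonly")


def bootstrap_flags(keyspace, shard, tablet_type, backup_root):
    """Return the vttablet flags needed to bootstrap a tablet from a backup."""
    if tablet_type not in ALLOWED_TYPES:
        raise ValueError(f"tablet type must be one of {ALLOWED_TYPES}")
    return [
        "-init_keyspace", keyspace,
        "-init_shard", shard,
        "-init_tablet_type", tablet_type,
        # Assumption: backups live on a filesystem-mounted path (file plugin).
        "-backup_storage_implementation=file",
        f"-file_backup_storage_root={backup_root}",
        "-restore_from_backup",
    ]
```

Rejecting any tablet type other than replica or rdonly mirrors the two values listed above.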

Backup Frequency

We recommend taking backups regularly, e.g. by setting up a cron job.

To determine the proper frequency for creating backups, consider the amount of time that you keep replication logs and allow enough time to investigate and fix problems in the event that a backup operation fails.

For example, suppose you typically keep four days of replication logs and you create daily backups. In that case, even if a backup fails, you have at least a couple of days from the time of the failure to investigate and fix the problem.
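
The arithmetic behind this example can be made explicit with a simplified model: when a backup fails, the last good backup is already one backup interval old, and it stays usable only while the replication logs still reach back to it. The function name and the model itself are illustrative assumptions, not a formula from Vitess.

```python
def investigation_window_days(log_retention_days, backup_interval_days):
    """Days available to fix a failed backup before the last good one is too old."""
    # The last good backup is one interval old at the moment of the failure.
    window = log_retention_days - backup_interval_days
    # A non-positive result means there is no slack at all.
    return max(window, 0)


# Four days of replication logs with daily backups leaves three days of slack.
print(investigation_window_days(4, 1))
```

Under this model, keeping logs for only as long as the backup interval leaves no time to react to a failure.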

Concurrency

The backup and restore processes simultaneously copy and either compress or decompress multiple files to increase throughput. You can control the concurrency using command-line flags:

  • The vtctl Backup command uses the -concurrency flag.
  • vttablet uses the -restore_concurrency flag.

If the network link is fast enough, the concurrency matches the CPU usage of the process during the backup or restore process.
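
The general idea of processing several files at once with a bounded number of workers can be shown with an analogy in Python. This is not how Vitess implements it; compress_files and the file names are hypothetical, and the payloads are held in memory purely to keep the sketch short.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def compress_files(files, concurrency):
    """Compress each named payload using up to `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() preserves input order, so results line up with the file names.
        compressed = pool.map(gzip.compress, files.values())
        return dict(zip(files.keys(), compressed))


# Hypothetical file names with dummy contents.
result = compress_files({"a.ibd": b"x" * 10000, "b.ibd": b"y" * 10000}, concurrency=2)
print(sorted(result))
```

Raising the worker count lets more files be compressed at the same time, at the cost of more CPU, which is the same trade-off the concurrency flags expose.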