Backing Up your Data
See Training Module 9 for more detail on snapshots and backups.
It is always a good idea to keep a back-up copy of the data on your virtual machines. Instance snapshots are a great way to preserve a copy of the software, configurations and profiles of a virtual machine, but they will not back up the data stored in the secondary ephemeral storage drive or on an attached volume. Volume snapshots preserve the disk state of a volume so that it can be relaunched in the future, but they are also not an appropriate data back-up mechanism.
A data back-up strategy should include transferring data to another data storage location, such as your desktop or other research data storage.
The source for the back-up will be either the secondary 'ephemeral' storage disk or volume block storage.
The destination for the back-up may be your local computer, or a data storage server at your research organisation.
Compressed File Transfers
The simplest back-up is to compress a directory and transfer it from your VM to your back-up destination.
The general command structure to compress files:
tar -cvpzf <NameOfArchive>.tar.gz <list of your files or folders>
To copy a directory ( /mnt/data ) into a single, compressed file ( data.tar.gz ):
tar -cvpzf data.tar.gz /mnt/data
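Before relying on an archive it is worth checking that it can be read and restored. The following is a minimal sketch of that round trip, assuming a Linux shell with tar and diff available. The scratch directory and the file results.txt are hypothetical stand-ins for /mnt/data and its contents.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area for the demonstration; on a real VM the
# source would be /mnt/data instead.
work=$(mktemp -d)
mkdir -p "$work/data"
echo "sample result" > "$work/data/results.txt"

# Create the compressed archive. -C makes the stored paths relative
# to $work, so the archive contains data/... rather than a full path.
tar -cpzf "$work/data.tar.gz" -C "$work" data

# List the contents without extracting, to confirm the archive is readable.
tar -tzf "$work/data.tar.gz"

# Restore into a separate directory and compare with the original.
mkdir "$work/restore"
tar -xpzf "$work/data.tar.gz" -C "$work/restore"
diff -r "$work/data" "$work/restore/data" && echo "restore matches source"
```

Extracting into a separate directory, rather than over the original, means a test restore cannot overwrite live data.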
The compressed file can be transferred to another computer using FileZilla (for a local computer back-up), or using SCP to back up to a remote server (see the Cloud Basics article: Transferring Data):
scp <Path_To_Source_File> <Path_to_Destination>
scp ~/data.tar.gz firstname.lastname@example.org:data/directory
scp -i path/to/key ~/data.tar.gz email@example.com:data/directory
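One way to confirm that a transferred archive arrived intact is to record a checksum before the transfer and verify it at the destination. This is a sketch only, assuming the GNU coreutils sha256sum tool (standard on Ubuntu; on a Mac, shasum -a 256 is the usual equivalent). The directories src and dest are hypothetical, and a local cp stands in for the scp transfer so the example can run on one machine.

```shell
#!/bin/sh
set -eu

# Hypothetical source and destination directories for the demonstration.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dest"
cd "$work/src"
printf 'sample result\n' > results.txt
tar -czf data.tar.gz results.txt

# Record the checksum next to the archive before transferring it.
sha256sum data.tar.gz > data.tar.gz.sha256

# Stand-in for the scp transfer: copy both files to the "destination".
cp data.tar.gz data.tar.gz.sha256 "$work/dest/"

# At the destination, verify the archive against the recorded checksum.
cd "$work/dest"
sha256sum -c data.tar.gz.sha256
```

If the archive was altered or truncated in transit, sha256sum -c reports a failure and returns a non-zero exit status.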
Backing up with RSYNC
RSYNC is a utility that creates incremental back-ups: the most recent synced state of a source directory is 'mirrored' in a directory on the destination storage device. With the options used below, files deleted from the source are not removed from the destination unless the --delete option is added.
Syncing is a more efficient way of backing up data because, after the first sync, only new or modified files are transferred.
- RSYNC must be installed on the source computer (the VM) and the destination computer.
- RSYNC is pre-installed on Mac OS X
- On Ubuntu, enter
sudo apt-get install rsync
- On Windows, installation is through the Cygwin package and usage is more complicated.
The general command for creating or syncing the back-up:
rsync -av <source directory> <destination directory>
rsync -av -e "ssh -i <path-to-private-key>" <source directory> <destination directory>
To sync from the Terminal application on your local computer, run the following from the directory you are syncing into:
rsync -av ubuntu@NNN.NNN.NNN.NNN:/mnt/data/ dataCopy/    # this syncs data from the VM
rsync -av dataCopy/ ubuntu@NNN.NNN.NNN.NNN:/mnt/data/    # this will restore data to the VM
To sync from the VM command line to a remote server:
rsync -av /mnt/data/ firstname.lastname@example.org:data/directory/
and to restore:
rsync -av email@example.com:data/directory/ /mnt/data/
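The incremental behaviour described above can be tried safely on a single machine before syncing to a remote server. The following is a sketch under stated assumptions: rsync is installed, and the directories data and dataCopy are hypothetical local stand-ins created in a scratch area. The -n option performs a dry run that reports what would be transferred without copying anything.

```shell
#!/bin/sh
set -eu

# The demonstration needs rsync; skip quietly if it is not installed.
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync not found; skipping the demonstration"
    exit 0
fi

# Hypothetical scratch directories standing in for /mnt/data and the back-up.
work=$(mktemp -d)
mkdir -p "$work/data"
echo "first" > "$work/data/a.txt"

# First sync copies everything. The trailing slash on the source means
# "the contents of data/", so files land directly inside dataCopy/.
rsync -av "$work/data/" "$work/dataCopy/"

# Add a file, then preview the next sync with -n (dry run).
echo "second" > "$work/data/b.txt"
rsync -avn "$work/data/" "$work/dataCopy/"

# Run it for real; a.txt is unchanged, so only b.txt is transferred.
rsync -av "$work/data/" "$work/dataCopy/"
```

A dry run is especially useful before a restore, because syncing in the wrong direction can overwrite newer files with older copies.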
Backing up Volumes to the Object Storage
There are methods to save a copy of the data on a volume to the Nectar Object Storage. These methods vary between the Availability Zones, so please lodge a support ticket for more information that is relevant to your situation.