I posted a reply on the VMware community forums about backing up VMware virtual machines and thought it would be good to write something more thorough.
The first step in planning your backup strategy is knowing what you are trying to protect yourself from.
For the most part, this boils down to hardware failure or software corruption.
The simple solution to protect against hardware failure is a RAID 1 or higher setup. RAID 1 keeps an exact mirror of the drive. If one drive goes down, you can rebuild the mirror from the one that still works.
This, however, does not protect you from software corruption. Because mirroring is near real-time (give or take write caching delays), if your OS is corrupted, chances are both copies in the mirror will have the same corruption.
To protect against software corruption you have a variety of choices, all of which depend on your particular setup and budget. I am going to focus on the likely scenario of a home user with limited funds for hardware and software who happens to be running a Linux host. (BTW, if you are running Windows, you are throwing away a serious chunk of your hardware resources on your host OS.)
Since the files that represent your virtual machine are in use while the VM is powered on, you cannot safely copy them directly. Doing so dramatically increases the risk of corruption due to write caching in the VM, and when you bring the backup online the guest will think it did not shut down properly.
This means the files must NOT be in use when you copy them. Since a good backup is one that runs automatically, a GUI is not a good choice. The good people at VMware thought of that and provided a console tool to manage VMs called “vmrun”.
vmrun can perform most, if not all, of the tasks you can do from the GUI or the craptastic WebUI of Server 2 (I have not looked into it in great depth).
The command you are looking for in particular is:
vmrun -h https://hostname:443/sdk -T server -u user -p "password" suspend /path/to/myvm/myvm.vmx
to suspend it and
vmrun -h https://hostname:443/sdk -T server -u user -p "password" start /path/to/myvm/myvm.vmx
to bring it back up. The suspend command places your VM in a safe state to copy the files that make up its virtual environment.
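NOTE: If your password has special characters in it, be sure to quote it. Double quotes cover spaces and most punctuation; if the password contains $, a backtick, or a backslash, use single quotes instead, since the shell still expands those inside double quotes.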
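One wrinkle with the suspend/start pair is that the VM stays down if whatever runs in between fails. Here is a minimal sketch of a wrapper, assuming bash. The function name with_vm_suspended and the VMRUN, VM_HOST, VM_USER and VM_PASS variables are made up for illustration and are not part of vmrun; the vmrun flags are the ones shown above.

```shell
#!/usr/bin/env bash
# Sketch: suspend a VM, run a command, and always start the VM again.
# VMRUN, VM_HOST, VM_USER, VM_PASS and the function name are illustrative.
VMRUN=${VMRUN:-vmrun}

with_vm_suspended() {
    local vmx=$1; shift
    local auth=(-h "$VM_HOST" -T server -u "$VM_USER" -p "$VM_PASS")

    # Do nothing at all if the VM cannot be suspended.
    "$VMRUN" "${auth[@]}" suspend "$vmx" || return 1

    # Run the caller's command (for example the copy) and remember its result.
    "$@"
    local status=$?

    # Bring the VM back up whether or not the command succeeded.
    "$VMRUN" "${auth[@]}" start "$vmx" || return 1
    return "$status"
}
```

You would call it along the lines of: with_vm_suspended /path/to/myvm/myvm.vmx cp -R /path/to/myvm/. /path/to/backup/dir/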
Next we need to think about where you are going to store these files. Because this scenario is about protecting against software corruption, the backup does not have to live on a separate machine. Storing it on one, though, gives you a pretty good alternative to a RAID solution.
If you are going with a local solution, you simply want to perform a copy:
cp -R /path/to/myvm/* /path/to/backup/dir/
Now, some would be inclined to tar up the contents of the myvm directory first. You do not want to do this. Remember, your VM is suspended. Tarring or compressing the files takes time, which means you wait longer for your VM to come back up. Simply copy the files, then issue the vmrun start command to bring your VM back online.
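Since the copy is the only thing that happens while the VM is down, it helps to keep any checking cheap. The sketch below, assuming bash, copies the directory and then compares only file names and sizes rather than re-reading every byte. The names copy_vm_dir and file_sizes are hypothetical, and the destination is assumed to be an empty or fresh directory.

```shell
#!/usr/bin/env bash
# Sketch: copy a suspended VM directory and do a cheap sanity check.
# copy_vm_dir and file_sizes are illustrative names, not VMware tools.

# List every regular file under a directory with its size in bytes.
file_sizes() {
    (cd "$1" && find . -type f -exec wc -c {} \; | sort)
}

copy_vm_dir() {
    local src=$1 dest=$2
    mkdir -p "$dest" || return 1
    # "src/." also picks up hidden files, which "src/*" would skip;
    # -p keeps timestamps and permissions.
    cp -Rp "$src"/. "$dest"/ || return 1
    # Succeed only if both sides list the same files with the same sizes.
    [ "$(file_sizes "$src")" = "$(file_sizes "$dest")" ]
}
```

A size comparison will not catch every kind of damage, but it does catch a copy that was cut short by a full disk, and it adds very little to the time the VM spends suspended.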
Once you have done that, you can tar to your heart's content on the backup dir. Point tar at the copy, not at the live /path/to/myvm, since the VM is running again, and write the archive outside the directory being archived. You could do something like this:
tar -jcf /path/to/backup/myvm.backup-$(date +%Y-%m-%d-%H.%M.%S).tar.bz2 /path/to/backup/dir
This will create a bzip2-compressed tar of the entire copied directory with the date and time of the backup in the file name.
If you are storing the file on a separate box, this is the point at which you should copy it. Because the date is part of the file name, the script needs that exact name again for the transfer. Since this will be in a script, you would simply capture the date in a variable once and expand it in the file name.
BACKUPDATE=$(date +%Y-%m-%d-%H.%M.%S)
The tar command would become:
tar -jcf /path/to/backup/myvm.backup-$BACKUPDATE.tar.bz2 /path/to/backup/dir
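To make the "capture the date once" idea concrete, here is a small sketch, assuming bash and a tar with bzip2 support. The function name make_archive and its two arguments are hypothetical. It prints the archive name so a later transfer step can reuse it instead of calling date a second time.

```shell
#!/usr/bin/env bash
# Sketch: capture the timestamp once and reuse it for the archive name.
# make_archive and its arguments are illustrative names.
make_archive() {
    local copy_dir=$1 out_dir=$2
    local stamp archive
    stamp=$(date +%Y-%m-%d-%H.%M.%S)
    archive="$out_dir/myvm.backup-$stamp.tar.bz2"
    # -C makes the paths inside the archive relative to the copy directory.
    # out_dir should sit outside copy_dir so the archive does not try to
    # include itself.
    tar -jcf "$archive" -C "$copy_dir" . || return 1
    # Print the name so the caller can hand it to the transfer step.
    printf '%s\n' "$archive"
}
```

In a script this might be used as ARCHIVE=$(make_archive /path/to/backup/dir /path/to/backup), with $ARCHIVE then passed to whichever upload you choose.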
Now you can perform the copy. I would recommend FTP if you are not crossing a public network; otherwise, use SSH. I say FTP because of the performance difference: FTP has none of the encryption overhead on the network and processor. Keep in mind that it also sends your password in clear text, so keep it to a network you trust.
For FTP you would do something like:
ftp -n $HOST <<EOS
quote USER $USER
quote PASS $PASSWD
binary
put myvm.backup-$BACKUPDATE.tar.bz2
quit
EOS
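One way to make the FTP step easier to check is to build the command stream in a function and pipe it to ftp, rather than embedding it in a here-document. This is only a sketch, assuming bash. ftp_commands, FTP_HOST, FTP_USER and FTP_PASS are made-up names; using FTP_USER rather than USER also avoids overwriting the shell's own $USER variable.

```shell
#!/usr/bin/env bash
# Sketch: build the FTP command stream in a function so it can be inspected.
# ftp_commands, FTP_HOST, FTP_USER and FTP_PASS are illustrative names.
ftp_commands() {
    local user=$1 pass=$2 file=$3
    printf 'quote USER %s\n' "$user"
    printf 'quote PASS %s\n' "$pass"
    # Archives must be sent in binary mode, not ASCII.
    printf 'binary\n'
    # Change to the local directory that holds the archive, then send it.
    printf 'lcd %s\n' "$(dirname "$file")"
    printf 'put %s\n' "$(basename "$file")"
    printf 'quit\n'
}

# Usage (commented out because it needs a real FTP server):
# ftp_commands "$FTP_USER" "$FTP_PASS" "/path/to/backup/myvm.backup-$BACKUPDATE.tar.bz2" | ftp -n "$FTP_HOST"
```

Running the function on its own prints exactly what would be sent, which is handy for a dry run before pointing it at a server.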
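Or, for something a little more complex, here is an Expect script. It checks the login result, which gives you a hook for error handling.
# HOST, USER, PASSWD and BACKUPDATE must be exported by the calling shell.
set timeout 30
spawn ftp $env(HOST)
expect "Name"
send "$env(USER)\r"
expect "Password:"
send "$env(PASSWD)\r"
# 230 means logged in, 530 means the login was rejected.
expect {
    "230" { }
    "530" { puts stderr "FTP login failed"; exit 1 }
    timeout { puts stderr "FTP login timed out"; exit 1 }
}
expect "ftp>"
send "binary\r"
expect "ftp>"
set timeout -1 ;# a large upload can take a while
send "put myvm.backup-$env(BACKUPDATE).tar.bz2\r"
expect "ftp>"
send "quit\r"
expect eof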
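Going over SSH is a much simpler option in terms of syntax. The sftp client wants an interactive session or a batch file, so for a one-line upload use scp, which runs over the same SSH transport.
scp /path/to/backup/myvm.backup-$BACKUPDATE.tar.bz2 sshuser@host.example.com:/path/to/store/backup
To provide authentication, you simply set up public key authentication for SSH.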
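For an unattended script, the main thing is that the SSH transfer fails outright instead of sitting at a password prompt. Here is a rough sketch, assuming bash and OpenSSH. push_backup and the SCP variable are hypothetical names, and the key setup in the comments is a one-time manual step you would adapt to your own host.

```shell
#!/usr/bin/env bash
# Sketch: non-interactive upload over SSH with key authentication.
# push_backup and SCP are illustrative names. One-time key setup, done by hand:
#   ssh-keygen -t ed25519
#   ssh-copy-id sshuser@host.example.com
SCP=${SCP:-scp}

push_backup() {
    local archive=$1 target=$2
    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    # BatchMode makes scp fail instead of prompting for a password,
    # which is what an unattended script needs.
    "$SCP" -o BatchMode=yes "$archive" "$target"
}
```

A call might look like push_backup "/path/to/backup/myvm.backup-$BACKUPDATE.tar.bz2" sshuser@host.example.com:/path/to/store/backup, with the return value deciding whether the script goes on to clean up.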
From here, you can delete the uncompressed copy with a simple rm -rf /path/to/backup/dir (the copy, never the live /path/to/myvm). Just be sure that the compressed archive you just created is not inside the directory you are about to delete.
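That last warning is easy to turn into a guard. The sketch below, assuming bash, refuses to delete the copy if the archive sits inside it or cannot be read back. remove_copy is a made-up name, and both paths are assumed to be given in the same form (for example both absolute).

```shell
#!/usr/bin/env bash
# Sketch: delete the uncompressed copy only when the archive is safe.
# remove_copy is an illustrative name.
remove_copy() {
    local copy_dir=${1%/} archive=$2
    # Refuse if the archive lives inside the directory about to be removed.
    case "$archive" in
        "$copy_dir"/*)
            echo "archive is inside $copy_dir, not deleting" >&2
            return 1
            ;;
    esac
    # Refuse if the archive cannot be listed from start to finish.
    tar -tjf "$archive" > /dev/null || return 1
    rm -rf "$copy_dir"
}
```

Listing the archive reads it end to end, so a truncated or corrupt file is caught before the only other copy disappears.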
Throughout this, you could add in a variety of checks and error handling to make it much more robust.
Wow, this was a lot of thought. I am currently working on a script to do this semi-generically. Stay tuned.