Setting up an Automated Backup in CentOS 7 Using Duplicity and Google Drive

Published on 10th May, 2015

We should all be backing up our data, so why not do it simply?

I recently posted about Digital Ocean and their fantastic hosting services.

They give you the option to create automated backups and snapshots of your Droplets (the difference is explained here). However, they charge for backups. Now, granted it's incredibly affordable. From their site:

"The pricing for backups is 20% of the cost of the virtual server. So if you want to enable backups for a $5/mo virtual server, the cost for backups will be $1/mo."

But I, being a poor kid, thought that surely I could set this up myself. After a bit of digging, I've implemented a fairly solid and super simple backup procedure.

This tutorial assumes you're running CentOS 7. You should be able to Google equivalent explanations for other flavors of Linux.

[~]# cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)

To start with, make sure you have Python installed

[~]# python -V
Python 2.7.5

If it's not installed, go ahead and get that sorted out.
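
On CentOS 7 that should be as simple as a yum install (package names can vary by repo, so adjust as needed):

[~]# sudo yum install python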

Then install Duplicity

[~]# sudo yum install duplicity
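
If yum can't find the package, duplicity typically lives in the EPEL repository, so you may need to enable that first:

[~]# sudo yum install epel-release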

Alright. One more step to set things up. Since I'm using Google Drive as my backup destination, I need to include the Google Data APIs Python scripts. I'll lay out what I did here, but this article is what I followed.

First, see if Python has the needed ElementTree dependency

[~]# python
>>> from xml.etree import ElementTree

If you get no errors, lovely. Move on. Otherwise you'll need to install that dependency by downloading it from here and following the instructions in the above article.

Alright, now to get the Google Python stuff set up. First grab the tar and extract it, then navigate on in.

[~]# wget https://gdata-python-client.googlecode.com/files/gdata-2.0.18.tar.gz
[~]# tar zxvf gdata-2.0.18.tar.gz
[~]# chmod -R 777 gdata-2.0.18
[~]# cd gdata-2.0.18

I had to open up permissions on the folder before I could do the next bit, so that is shown above. Then just run the install script

[~]# ./setup.py install

That will do its thing. Then run the tests to make sure all is well.

[~]# ./tests/run_data_tests.py

If that all looks good then congrats. You're ready to start doing some backups.

Duplicity is very easy to use. You can back up your web directory in a single line, like so:

[~]# PASSPHRASE=MYPASSPHRASE FTP_PASSWORD=GOOGLEACCOUNTPASSWORD duplicity -v8 /my/backup/directory gdocs://MYACCOUNT@MYGOOGLEDOMAIN.com/my/destination

Boom! That's all it takes! Let me break it down a bit:

  • PASSPHRASE - This is the passphrase used to encrypt the backups/archives that duplicity generates
  • FTP_PASSWORD - This, in this case, is your Google Account password
  • MYACCOUNT@MYGOOGLEDOMAIN.com - Normally you would use a gmail.com account, but it will also work if you have a Google for Business account.

Duplicity then encrypts things all to pieces and dumps the archives on your Drive. Now, you won't be able to access or manipulate these archives directly, but that's not the goal. The end result is that you'd use duplicity (and your passphrase) to restore the backup, and the encryption ensures that nothing has been tampered with. Duplicity also keeps track of changes, so only files that have been edited get backed up, which keeps your archive sizes efficient.
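
When the day comes that you actually need those backups, the restore is roughly the same one-liner in reverse. Something like this should pull everything back down into a local directory (the restore path here is just a placeholder):

[~]# PASSPHRASE=MYPASSPHRASE FTP_PASSWORD=GOOGLEACCOUNTPASSWORD duplicity restore -v8 gdocs://MYACCOUNT@MYGOOGLEDOMAIN.com/my/destination /my/restore/directory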

I threw the above one-liner in a crontab to run every other hour, and now I have a nice backup in place.
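
If you want to do the same, the crontab entry looks something like this (every other hour, on the hour; adjust the schedule and paths to suit your setup):

0 */2 * * * PASSPHRASE=MYPASSPHRASE FTP_PASSWORD=GOOGLEACCOUNTPASSWORD duplicity -v8 /my/backup/directory gdocs://MYACCOUNT@MYGOOGLEDOMAIN.com/my/destination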

Quite nice.
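
To spot-check that the archives are actually landing in Drive, duplicity's collection-status command will list the backup chains it finds at the destination:

[~]# PASSPHRASE=MYPASSPHRASE FTP_PASSWORD=GOOGLEACCOUNTPASSWORD duplicity collection-status gdocs://MYACCOUNT@MYGOOGLEDOMAIN.com/my/destination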

A Note on MySQL

I use this to back up my web directory, but I also wanted to use it to back up all my MySQL databases. First, I created a .my.cnf file in my home directory with the following:

[mysqldump]
user=USER
password=PASSWORD
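
Since that file holds your database password in plain text, it's worth locking it down so only your user can read it:

[~]# chmod 600 ~/.my.cnf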

Then I added this line to my cron file

01 * * * * mysqldump --all-databases --skip-lock-tables > /my/duplicity/directory/alldb.sql

This means that a minute past every hour, mysqldump will run using the credentials in my .my.cnf file and dump all the databases to a file in my duplicity directory. Then duplicity will include that file in its backups. It might be better to also have Duplicity store the database dump somewhere else, just in case we want to restore ONLY the databases. The mysqldump program has a lot of options that let you dump specific databases or even tables, so you can be more granular with your backups. You can see more options for mysqldump here.
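
If you do want the databases in their own spot, one option is to dump them into a dedicated directory and point a second duplicity job at it. A rough sketch (the /my/db-dumps path and db-backups destination are just made-up examples):

15 * * * * mysqldump --all-databases --skip-lock-tables > /my/db-dumps/alldb.sql
30 * * * * PASSPHRASE=MYPASSPHRASE FTP_PASSWORD=GOOGLEACCOUNTPASSWORD duplicity -v8 /my/db-dumps gdocs://MYACCOUNT@MYGOOGLEDOMAIN.com/db-backups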

Best of luck and feel free to hit me up with any questions!
