Monday, June 10, 2013

Backing up from Plesk to S3

Recently I went looking for a solution for backing up from a Plesk server to S3: what I settled on was surprisingly simple.

I started with a simple list of criteria, but as I went looking for a solution, and as I continued to find no good ones, my list got longer. I have a tendency to be quite OK with the bare bones if I'm going to be using an existing system, but if I have to build it myself, I'm normally happy to add more features.

I started with basically "I want to be able to back up files and databases", but the solution I ended up with also gave me the following features:
- back up multiple domains
- along with files and databases, back up the actual domain configuration and mail if required
- rotate backups automatically
- define the frequency of backups and the number of backups to keep before rotating
- open source
- a simple interface right in Plesk

So what did I do?

I don't know, really, if this is a super smart way to do it, or just a cop-out, but basically I realised that hey, the Plesk backup manager already lets us do all of the above... except for the S3 part. All I ended up doing was installing s3cmd from http://s3tools.org and setting it up to sync to S3 from the location on the server where Plesk puts its backups.

So basically, users (or I) define backup rules for each domain as needed (via the Plesk UI) and then s3cmd runs with the sync option once a day.

With s3cmd located under my /root/cli-tools directory (in /root/cli-tools/s3, as the command shows), and assuming s3://example.com is the bucket I will use, the actual command in my crontab entry is as simple as:
cd /var/lib/psa/dumps; /root/cli-tools/s3/s3cmd -c /root/cli-tools/s3/s3cfg --delete-removed -H --no-progress sync domains s3://example.com/backups

UPDATE: as per a comment by Rutger below, you may actually want to use:
cd /var/lib/psa/dumps; /root/cli-tools/s3/s3cmd -c /root/cli-tools/s3/s3cfg --delete-removed -H --no-progress sync clients s3://example.com/backups
instead of, or possibly in association with, the above line.
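
If you want both directories synced from one cron job, the two commands above can be wrapped in a small script. This is only a minimal sketch: the function names, the environment variable names and the `--run` guard are made up for illustration, while the s3cmd options and paths are the same ones used in the commands above.

```shell
#!/bin/sh
# Hypothetical wrapper around the two sync commands from this post.
# Defaults mirror the paths used above; override them via the environment.
DUMPS_DIR="${DUMPS_DIR:-/var/lib/psa/dumps}"
S3CMD="${S3CMD:-/root/cli-tools/s3/s3cmd}"
S3CFG="${S3CFG:-/root/cli-tools/s3/s3cfg}"
BUCKET_PREFIX="${BUCKET_PREFIX:-s3://example.com/backups}"

# Sync one directory under DUMPS_DIR (e.g. "domains" or "clients").
sync_dir() {
    if [ ! -d "$DUMPS_DIR/$1" ]; then
        echo "skipping $1: not found in $DUMPS_DIR" >&2
        return 0
    fi
    # Run from inside the dumps directory, exactly like the cron command.
    (cd "$DUMPS_DIR" && "$S3CMD" -c "$S3CFG" --delete-removed -H --no-progress sync "$1" "$BUCKET_PREFIX")
}

# Sync both directories, stopping at the first failure.
sync_all() {
    for dir in domains clients; do
        sync_dir "$dir" || return 1
    done
}

# Only upload when asked to, so sourcing this file has no side effects.
if [ "${1:-}" = "--run" ]; then
    sync_all
fi
```

Saved somewhere like /root/cli-tools/sync-plesk-backups.sh (a hypothetical name), the crontab entry would then call that script with `--run` instead of the long one-liner.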

When I commission a new Plesk server, I just copy the s3cmd directory over, create a new bucket and I'm done.

The only downsides I see, really, are these. First, if I wanted to have a single rule for all of the domains, I couldn't. Second, I'm assuming that all of the Plesk backups have finished by the time the daily cron job runs. Not that that matters too much, as I could just bump the cron job up to hourly if I liked and I wouldn't see much difference.
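
One way to soften the "have the backups finished yet?" assumption is to skip a sync run when something in the dumps directory has changed very recently. The sketch below is a heuristic, not anything Plesk provides: the function name and the 30-minute threshold are assumptions, and it relies on the `-mmin` test available in GNU and BSD find.

```shell
# Heuristic guard (an assumption, not a Plesk feature): treat the dumps
# tree as busy if any file in it was modified in the last N minutes.
dumps_recently_modified() {
    # $1: directory to inspect; $2: threshold in minutes
    [ -n "$(find "$1" -type f -mmin "-$2" 2>/dev/null | head -n 1)" ]
}

# Example use at the top of a cron script:
# if dumps_recently_modified /var/lib/psa/dumps 30; then
#     echo "backup may still be running, skipping this sync" >&2
#     exit 0
# fi
```

With an hourly cron job, a skipped run simply gets picked up by the next one, so the cost of a false positive is at most an hour's delay.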

I think the biggest negative to this approach is that I'm pushing a full backup even if the content that was backed up hasn't actually changed. That is to say, if Plesk does a backup every day, then I push a new backup every day... even if nothing has changed since the last backup.

Anyway, I hope this helps someone. For me it was completely obvious once I realised it, but it took me an embarrassingly long time to get there.

7 comments:

  1. Looks pretty straightforward. Before I give it a shot, did you have the opportunity to do a restore yet? Was Plesk happy with the files you put back in the /var/lib/psa/dumps/ directory and was it able to do a Plesk-based restore from that file?

    Replies
    1. To be honest, I haven't tried restoring it yet. I know that's a horrible way to manage my server but that's life. Like I say, though, those backups are the backups that are built by Plesk itself, so I would be quite surprised if there was a problem.

      If I ever get a chance to check, I will... unless you reply before then :)

      Thanks for the comment, though, as it is something that should definitely be asked and answered.

  2. Great post, just what I was looking for!

    I wanted to create a backup of all clients on the machine, so I set the local backup repository in the system-wide settings. Then I used s3cmd to sync the 'clients' folder instead of the 'domains' folder.

    You will miss out on some .xml files Plesk creates in the 'dump' folder, but it does give you solid database, email and httpdocs backups.

    Replies
    1. Awesome, thanks for the tip Rutger. I haven't worked with the Plesk stuff for a while now, so maybe you can tell us: what's the difference between the 'clients' directory and the 'domains' directory?

    2. Sure thing Lincoln.

      The /domains folder only contains the backups you do manually on each domain. It's not used in the system-wide backup as far as I can see.

      The /clients folder contains the domains for each client, e.g. /var/lib/psa/dumps/clients/johnsmith/domains/mydomain.com

      One thing I discovered recently is that when you enable clients to manage their own backups via Plesk, all backups that were made via the system-wide scheduled task are available at domain level as well. I set the retention to two weeks, so for the last 14 days my clients can restore on their own. It uses XML files.
      After those two weeks I think restoring is a manual job.

    3. Awesome, thanks for the extra information Rutger. I've added a comment in the original post.
