Backup to AWS S3 Bucket

While this is not an uncommon thing to do, I couldn’t find a straightforward example for both databases and file directories. So of course, I had to write my own (albeit based on a database-only script from mittsh). For the TL;DR, just go to https://github.com/mikemcmurray/backup-to-s3

It’s written in Python using the ubiquitous boto3 library and reads its settings, source databases, and directories from a JSON configuration file. In probably less than five minutes you can have completed your first backup, and from then on you just schedule it.
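Under the hood the pattern is simple. The sketch below is not the actual script, just an illustration of the approach for the directory case; the config keys (s3_bucket, s3_key_prefix, directories) are hypothetical stand-ins for whatever the repo’s example config actually defines.

# Illustrative sketch only -- not the actual backup-to-s3.py.
# Config keys (s3_bucket, s3_key_prefix, directories) are hypothetical.
import datetime
import json
import tarfile
import boto3

with open("backup-to-s3.json") as f:
    config = json.load(f)

s3 = boto3.client("s3")
timestamp = datetime.datetime.now().strftime("%Y-%m-%d-%H-%M-%S")

for directory in config["directories"]:
    # Archive each directory with a timestamped name, e.g. var-www-2019-06-01-02-00-01.tar.gz
    archive = "/tmp/%s-%s.tar.gz" % (directory.strip("/").replace("/", "-"), timestamp)
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(directory)
    # Upload under the configured key prefix so backups stay grouped in the bucket
    key = "%s/%s" % (config["s3_key_prefix"], archive.split("/")[-1])
    s3.upload_file(archive, config["s3_bucket"], key)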

NOTE: The use of S3 incurs a cost. You are solely responsible for managing the use of that system and any costs incurred.

Installation

Copy the files in this repo or just “git clone” them to the machine that needs backing up. The following will clone the current script into a new folder.

git clone https://github.com/mikemcmurray/backup-to-s3.git backup-to-s3

Change into that new folder and install the libraries listed in the requirements.txt file, e.g. “pip install boto3 argparse --user”.

Rename and change the config file to suit your own needs. Run the script manually to make sure your config is working as expected.

If all is good, add it to your crontab to run as often as you like. Each backup file is named with the current timestamp to the second, so multiple backups each day can be identified.

Run the backup as below. Full paths are used so the command works from crontab, and the layout is based on an Ubuntu machine. The user home is /home/ubuntu in this example, as ubuntu is the default user name on AWS Ubuntu instances.

/usr/bin/python /home/ubuntu/backup-to-s3/backup-to-s3.py /home/ubuntu/backup-to-s3/backup-to-s3.json
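For example, a crontab entry like this (the 2 a.m. schedule is just an illustration) would run the backup once a night:

0 2 * * * /usr/bin/python /home/ubuntu/backup-to-s3/backup-to-s3.py /home/ubuntu/backup-to-s3/backup-to-s3.json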

You can use the AWS S3 key values in the config to split different backups into S3 key prefixes (which behave like folders) based on your server names, client accounts, etc.
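For instance, a hypothetical key layout with one prefix per client and server keeps everything in a single bucket but easy to separate:

client-a/web01/2019-06-01-02-00-01.tar.gz
client-a/db01/2019-06-01-02-00-02.tar.gz
client-b/web01/2019-06-01-02-00-03.tar.gz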

S3 and Glacier

If you have a heap of data in S3 it will start to cost you more than a coffee a day to keep there. But AWS offers cheaper, longer-term storage in another product called Glacier. The nice thing about these two products is that a lifecycle rule in the S3 bucket properties can automatically “age out” files from S3 into Glacier. So you only keep the very newest backups in S3, and the rest end up in Glacier, where a few hundred GB only costs you a coffee per month.
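You can set this up in the S3 console, or with a one-off boto3 call like the sketch below. The bucket name, prefix, and 30-day cutoff are placeholders; pick values that match your own retention needs.

# One-off sketch: transition objects under "backups/" to Glacier after 30 days.
# Bucket name, prefix, and day count are placeholders.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-backup-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "age-backups-into-glacier",
                "Filter": {"Prefix": "backups/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            }
        ]
    },
)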
