Backing up MongoDB to Amazon Glacier/S3 with Python using sh and bakthat

I wrote a simple Python script that makes backing up MongoDB to Amazon Glacier or S3 easier. It uses bakthat, a backup framework I created, and sh, a Python subprocess interface.

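The backup process is simple. The script creates a temporary directory, dumps the database with mongodump, then lets bakthat compress (and optionally encrypt) the dump, upload it to Glacier or S3, and rotate older backups.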

Restoring is as easy as downloading the backup with bakthat and restoring it with mongorestore.

Features

  • Easily switch between S3 and Glacier.
  • Automatically compress (and optionally encrypt) backups.
  • You can easily restore backups yourself.
  • Bakthat stores a custom Glacier inventory for you and makes it easy to download backups.
  • Grandfather-father-son backup rotation is supported.
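
To make the last feature concrete, here is a minimal, generic sketch of a grandfather-father-son policy. It is not bakthat's implementation: the function name gfs_keep and the retention counts (7 days, 4 weeks, 6 months) are assumptions chosen only for illustration.

```python
from datetime import date, timedelta


def gfs_keep(backup_dates, today, days=7, weeks=4, months=6):
    """Return the backup dates a simple GFS policy would keep.

    Hypothetical helper for illustration; the retention counts are assumptions.
    """
    dates = sorted(set(d for d in backup_dates if d <= today))

    # Sons: every backup taken during the last `days` days.
    keep = {d for d in dates if (today - d).days < days}

    # Fathers: the oldest backup in each of the last `weeks` calendar weeks.
    this_monday = today - timedelta(days=today.weekday())
    fathers = {}
    for d in dates:
        monday = d - timedelta(days=d.weekday())
        if (this_monday - monday).days // 7 < weeks:
            fathers.setdefault(monday, d)  # dates are sorted, so the first one wins

    # Grandfathers: the oldest backup in each of the last `months` months.
    grandfathers = {}
    for d in dates:
        age_in_months = (today.year - d.year) * 12 + (today.month - d.month)
        if age_in_months < months:
            grandfathers.setdefault((d.year, d.month), d)

    return keep | set(fathers.values()) | set(grandfathers.values())


# Example: one backup per day for 400 days, ending on 2013-06-30.
today = date(2013, 6, 30)
daily = [today - timedelta(days=i) for i in range(400)]
kept = gfs_keep(daily, today)
```

Anything not in the returned set would be a candidate for deletion. The point of the scheme is that recent backups are kept densely while older ones thin out to weekly and then monthly copies.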

Requirements

First, you need to install bakthat and sh.

$ sudo pip install bakthat sh

Then, you must configure bakthat.

$ bakthat configure

The script

I think the script is pretty self-explanatory; let me know if you have any questions.

import sh
import logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s;%(name)s;%(levelname)s;%(message)s")
from bakthat.helper import BakHelper

# Arguments for the mongodump/mongorestore commands; leave empty to back up your local mongo.
# If you use authentication, pass {"u": "user", "p": "password"}.
MONGODUMP_PARAMS = {"u": "user",
                    "p": "password",
                    "oplog": True}

# Leave blank to disable encryption
PASSWORD = "mypassword"

# glacier or s3
DESTINATION = "glacier"

# BakHelper automatically creates a temp directory
with BakHelper("mymongobak",
               destination=DESTINATION,
               password=PASSWORD,
               tags=["mongodb"]) as bh:
    logging.info(sh.mongodump(**MONGODUMP_PARAMS))

    bh.backup()
    bh.rotate()
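
If you are not familiar with sh, the sketch below approximates how it turns the keys of MONGODUMP_PARAMS into mongodump arguments: one-letter keys become short options and longer keys become long options. The helper to_cli_args is hypothetical and only mirrors sh's default behaviour as a rough approximation; it is not part of sh or bakthat.

```python
def to_cli_args(params):
    """Approximate how sh maps keyword arguments to command-line options."""
    args = []
    for key, value in params.items():
        if value is False:
            continue  # False means "leave the option out"
        if len(key) == 1:
            # Short option: "-u user", or just "-u" when the value is True.
            args.append("-" + key)
            if value is not True:
                args.append(str(value))
        else:
            # Long option: "--oplog" for True, "--name=value" otherwise.
            name = "--" + key.replace("_", "-")
            args.append(name if value is True else "%s=%s" % (name, value))
    return args


print(to_cli_args({"u": "user", "p": "password", "oplog": True}))
```

With the parameters from the script, this yields the equivalent of running mongodump -u user -p password --oplog.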

Restore a backup

Just run bakthat show to list available backups.

$ bakthat show

To download the latest full backup, just run:

$ bakthat restore mymongobak -d glacier

You should also check out the bakthat documentation.

Once the backup is downloaded, you can restore it with the mongorestore command-line utility. Don't forget the --oplogReplay option, since the dump was taken with --oplog.
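
As a rough sketch, the restore step could be scripted like this. It assumes the downloaded backup has been extracted to a directory named dump (mongodump's default output directory); the function names and the credentials in the example are hypothetical.

```python
import subprocess


def build_mongorestore_cmd(dump_dir, username=None, password=None):
    """Build a mongorestore command that replays the oplog captured by --oplog."""
    cmd = ["mongorestore", "--oplogReplay"]
    if username:
        cmd += ["-u", username]
    if password:
        cmd += ["-p", password]
    cmd.append(dump_dir)
    return cmd


def restore(dump_dir, **credentials):
    """Run mongorestore and raise CalledProcessError if it exits with an error."""
    subprocess.check_call(build_mongorestore_cmd(dump_dir, **credentials))


# Example (not executed here):
# restore("dump", username="user", password="password")
```

Building the argument list separately from running it makes it easy to print and check the command before touching a live database.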

Miscellaneous

Restoring from Glacier

Don't forget that retrieving an archive from Glacier takes 3 to 5 hours.

Not enough space on /tmp partition

If, like me, your VPS has a very small /tmp partition, you can specify an alternative temporary directory by setting the TMP environment variable before running the backup.

$ export TMP=/home/thomas

Bakthat and Python's standard library make use of this environment variable.
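
Here is a small sketch of how Python's tempfile module picks up that variable. Note that tempfile checks TMPDIR and TEMP before TMP, so the hypothetical helper below clears those two first; the directory must exist and be writable or tempfile falls back to another location.

```python
import os
import tempfile


def tempdir_with_tmp(path):
    """Return the directory tempfile chooses when only TMP points at `path`."""
    names = ("TMPDIR", "TEMP", "TMP")
    saved_env = {name: os.environ.pop(name, None) for name in names}
    saved_cache = tempfile.tempdir
    try:
        os.environ["TMP"] = path
        tempfile.tempdir = None  # drop the cached value so the lookup runs again
        return tempfile.gettempdir()
    finally:
        # Put the environment and the cached directory back as they were.
        tempfile.tempdir = saved_cache
        os.environ.pop("TMP", None)
        for name, value in saved_env.items():
            if value is not None:
                os.environ[name] = value


# Example: use an existing, writable directory as the alternative tmp location.
scratch = tempfile.mkdtemp()
chosen = tempdir_with_tmp(scratch)
```

If TMPDIR is already set on your system, it takes precedence over TMP, so you may want to export that one instead.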

Replica set considerations

According to the documentation, backing up a replica set is similar to backing up a single instance.

Conclusion

You can read more about MongoDB backups in the official MongoDB documentation.

You may also want to check out BakManager, an app that monitors your backups and notifies you when a backup doesn't happen (it works well with bakthat).

Please don't hesitate to leave feedback, criticism, or to ask whatever questions you have.

© Thomas Sileo. Powered by Pelican and hosted by DigitalOcean.