brent timothy saner on 4 Dec 2009 13:34:13 -0800


Re: [PLUG] Self-hosted online backups?


JP Vossen wrote:
> I want to set up a 
> self-hosted online backup service and copy Mom's data to my house and my 
> data to her house.  I want the data to be compressed, encrypted (both in 
> transit and at rest), have multiple copes/versions 
> (daily/weekly/monthly) and to be disk and bandwidth efficient.
> If I have to roll my own, I can and will--eventually.  Meanwhile does 
> anyone know of anything that I can self-host without a lot of DIY?
> Thanks,
> JP

okay. i'm back home now, which means i can post a proper entry (and
could read the entire thread).

let's examine what JP wants:

1.) self-hosted. this seems to be pretty high on the priority list
because he doesn't trust/like the "cloud" (and in my opinion, he has
good reason not to. there are plenty of flaws in the design from a
security standpoint, but that's an entirely different thread...).
so we're going to make the assumption that A, he has the hardware for
it, and B, he has an off-site location to put it. (indeed, he's got a
sort of dual-redundancy thing planned- a box at his house for his
mother's data and vice versa).

2.) compression. okay, fair enough. many have suggested that he apply
differential backups in addition to/in lieu of compression. always a
good idea (it makes restoration much simpler and quicker, and also
gives the flexibility, depending on implementation, of rolling back to
a certain version of a file)

3.) encryption, which many people argue negates #2; more on that later.

4.) "snapshotting"- in other words, daily/weekly/whatever

5.) disk and bandwidth efficient.
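as an aside on #2 and #4 together: the differential/snapshot idea folks
have been suggesting can be sketched with nothing but hard links (this
is the generic trick tools like rsnapshot use, NOT BoxBackup's own
mechanism- the paths here are made up for illustration):

```shell
# illustration only: fake a day's worth of data, then snapshot it
demo=$(mktemp -d)
mkdir -p "$demo/data" "$demo/snapshots"
echo "some file" > "$demo/data/notes.txt"

# day 1: full copy
cp -a "$demo/data" "$demo/snapshots/day1"

# day 2: hard-link copy (cp -al); only changed files get rewritten
# afterwards, so unchanged files cost no extra disk space
cp -al "$demo/snapshots/day1" "$demo/snapshots/day2"

# the unchanged file now has link count 2 -- two snapshots, one copy on disk
stat -c %h "$demo/snapshots/day2/notes.txt"   # -> 2
```

point being: multiple daily/weekly "copies" don't have to mean multiple
times the disk space.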

i mentioned in an earlier reply about BoxBackup [1]. let's go through
the list.


1? yep. has a client/server model, which would let him add more clients
to a location if he wishes. he runs a server on each location and a
client on each location, the client for location A looks to the server
at location B and the client for location B looks to the server at
location A. done. next.

2? yep (and yep). it compresses the files (i BELIEVE) on the client
side. the second "yep" is because it ALSO indexes, hashes, etc. and
supports differentials (and allows you to restore specific files).

3? YEP! and it DOES NOT negate #2. taken from their wiki[2]:
"The files, directories, filenames and file attributes are all encrypted.
By examining the stored files on the server, it is only possible to
determine the approximate sizes of files and the tree structure of the
disc (not names, just number of files and subdirectories in a
directory). By monitoring the actions performed by a client, it is
possible to determine the frequency and approximate scope of changes to
files and directories.

Stored files are encrypted using AES for file data and Blowfish for
metadata. This does mean that the one thing you do need to back up
off-site and look after is a 1k file containing your keys - the data on
the server is useless without it. But the key never changes once
generated, so that makes looking after it much easier.

The connections between the server and client are encrypted using TLS
(the latest version of SSL). Traffic analysis is possible to some
degree, but limited in usefulness.

An attacker will not be able to recover the backed up data without the
encryption keys. Of course, you won't be able to recover your files
without the keys either, so you must make a conventional, secure, backup
of these keys."
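that "conventional, secure, backup of these keys" bit is worth doing
right away. one quick way is to passphrase-wrap a copy with openssl and
stash it somewhere that is NOT either backup server (the demo below
uses a throwaway file standing in for the real keys file, and a
hardcoded passphrase you'd obviously never use- in real use, omit
"-pass" and type one at the prompt):

```shell
# demo with a throwaway file standing in for the real ~1k keys file
keys=$(mktemp)
echo "pretend this is the 1k keys file" > "$keys"

# encrypt a copy under a passphrase
openssl enc -aes-256-cbc -salt -in "$keys" -out "$keys.enc" -pass pass:CHANGEME

# round-trip check: decrypt and compare
openssl enc -d -aes-256-cbc -in "$keys.enc" -out "$keys.dec" -pass pass:CHANGEME
cmp "$keys" "$keys.dec" && echo "keys round-trip OK"
```

then put the .enc copy on a usb stick, burn it to a cd, whatever- just
not only on the machines being backed up.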

SO all the encryption takes place on the client. in ADDITION, it uses
an SSL-auth'd socket:
"SSL certificates are used to authenticate clients. UNIX user accounts
are not used to minimise the dependence on the configuration of the
operating system hosting the server.

A script is provided to run the necessary certification authority with
minimal effort."[2]

4? yep. it even has a special "snapshot mode" that's built specifically
around this functionality.
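if memory serves, snapshot mode is just a config flag plus a
cron-driven trigger- something along these lines (again, check the
docs[1] for the exact option names before copying any of it):

```
# in bbackupd.conf: disable the continuous "lazy" mode
AutomaticBackup = no

# in root's crontab: take a snapshot every night at 1am
0 1 * * * /usr/local/bin/bbackupctl -q sync
```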

5? yep. of course, "efficient" is a bit of a relative term, but for what
it does (keeping in mind it meets ALL the other requirements AND THEN
SOME), it uses a remarkably small amount of disk space and bandwidth.
the BIGGEST bottleneck will most likely be the I/O. and do the first
base backup locally, because that initial backup is gonna be a doozy no
matter what backup method you use.

JP- at least look over the documentation. i have a sneaking suspicion
it'd be something well within your capability to configure/install and
maintain. :)

-there are client versions for gnu/linux (duh), windows, solaris,
freebsd, etc.
-it's opensource! yay!
-it's actively developed (last RC- 0.11RC5- was released in september 09)
-the server daemon /doesn't even need to run as root/.
-it supports quotas for backups, rotating old backups, etc.
-for a list of BB as compared to some other backup methods listed in
this thread
-DOCUMENTATION! and for just a start


Philadelphia Linux Users Group