JP Vossen on 12 Dec 2010 15:06:42 -0800
Re: [PLUG] 3-way data mirror?
Date: Sun, 12 Dec 2010 00:52:00 -0500 From: JP Vossen <jp@jpsdomain.org>

I have a situation where I have 2 labs that need to keep ~150G of data in sync, but they can't talk to each other. They can talk to a 3rd machine, which is also backed up, so... What I think I'd like to do is have:

    Lab1 <--> COLO <--> Lab2

The problem is that lab1 and lab2 are both read-write, so there is a real possibility of stepping on changes. COLO is read-only. I'd try to just relay through the COLO, but that's also where the backup will happen. And I can't get the FW rules changed.

Any better ideas? Any suggestions on tools? I'd strongly prefer something already in the CentOS-5 or EPEL repos.
Stuff I forgot:

1) I have only limited control over the COLO server. I can probably get something in a repo installed, more than that is iffy. This is an important server used for other things, and I'm only allowed to use it because it's there and has connectivity and space.
2) The COLO server is 32-bit RHEL5, the 2 Lab servers are 64-bit CentOS-5.5 and CentOS-5.4 (I can upgrade that one).
3) AFAIK, the only connectivity is SSH and I can't change that.
4) The WAN links are very slow, relative to, say, FiOS.
5) The data is a mix of large (CD & DVD ISOs) and small (docs, configs, RPMs, etc.).
I also thought about DRBD, but I'm pretty sure IT would shoot that down for the COLO server. And I'm not sure if/how that would work 3-way.
Date: Sun, 12 Dec 2010 01:02:59 -0500 From: Doug Stewart <zamoose@gmail.com>

Have you looked into Unison? http://www.cis.upenn.edu/~bcpierce/unison/
I use Unison to sync my laptop & server when traveling. I considered it now, but a) forgot to mention that and b) dismissed it because I need something non-interactive. (I always use the GUI when I use it.)
But now that you mention it, I think it can run scripted over SSH, which is what I need to do. I'll need to re-read the docs on this and figure out how it handles conflicts if running non-interactively.
And 'unison227' is in EPEL for CentOS-5. :-)
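For what it's worth, a scripted run could look roughly like the sketch below. It only builds the command line; it does not run anything. The host name, paths, and the binary name are made up for illustration (the EPEL package may install the binary under a versioned name, so check that). The `-batch` and `-times` options are documented Unison preferences; in batch mode Unison asks no questions, propagates non-conflicting changes, and skips conflicts unless `-prefer` names a root to win.

```python
import shlex


def unison_batch_command(local_root, remote_host, remote_path,
                         binary="unison", prefer=None):
    """Build the argument list for a non-interactive Unison run over SSH.

    All names passed in are placeholders; 'binary' is an assumption and may
    need to be a versioned name depending on how the package installs it.
    """
    # Two slashes after the host make the remote path absolute.
    remote_root = f"ssh://{remote_host}//{remote_path.lstrip('/')}"
    cmd = [
        binary, local_root, remote_root,
        "-batch",  # ask no questions; conflicting changes are skipped
        "-times",  # propagate modification times along with contents
    ]
    if prefer is not None:
        # Resolve conflicts in favour of the named root instead of skipping.
        cmd += ["-prefer", prefer]
    return cmd


if __name__ == "__main__":
    # Each lab would run something like this against the COLO hub.
    cmd = unison_batch_command("/srv/labdata", "colo.example.com", "/srv/labdata")
    print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run()` from cron on each lab box, so that Lab1 and Lab2 each sync against COLO and never need to reach each other.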
Date: Sun, 12 Dec 2010 03:33:29 -0500 From: Brian Stempin <brian.stempin@gmail.com>

Perhaps something like this? http://fak3r.com/2009/09/14/howto-build-your-own-open-source-dropbox-clone/
Interesting. Needs more thought. I *think* I can still see changes getting stepped on if I change something on Lab1 and someone else changes the same thing on Lab2.
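To make the "stepping on changes" case concrete, here is a small sketch of a three-way comparison against the last known-synced state. It is only an illustration of the general idea, not a description of how the linked setup or any other tool mentioned in this thread works; the function name and the path-to-hash dictionaries are invented for the example.

```python
def classify(base, lab1, lab2):
    """Compare two replicas against the last-synced snapshot.

    Each argument maps a relative path to a content hash; a path that is
    missing from a mapping means the file does not exist on that side.
    """
    result = {}
    for path in sorted(set(base) | set(lab1) | set(lab2)):
        b, x, y = base.get(path), lab1.get(path), lab2.get(path)
        if x == y:
            result[path] = "in sync"    # both sides agree (or both deleted)
        elif x == b:
            result[path] = "take lab2"  # only Lab2 changed since last sync
        elif y == b:
            result[path] = "take lab1"  # only Lab1 changed since last sync
        else:
            result[path] = "conflict"   # both changed, and differently
    return result


if __name__ == "__main__":
    base = {"a.iso": "h1", "notes.txt": "h1", "cfg": "h1"}
    lab1 = {"a.iso": "h1", "notes.txt": "h2", "cfg": "h2"}
    lab2 = {"a.iso": "h1", "notes.txt": "h1", "cfg": "h3"}
    for path, action in classify(base, lab1, lab2).items():
        print(path, action)
```

Without the saved snapshot there is no way to tell "only Lab2 changed" from "both changed", which is exactly the case where a naive one-way copy would silently overwrite someone's work.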
Date: Sun, 12 Dec 2010 10:54:31 -0500 From: "K.S. Bhaskar" <bhaskar@bhaskars.com>

Might an OpenAFS file system (http://openafs.org) be an option? Perhaps with SELinux used to disable updates at COLO that are not replicated from Lab1 and Lab2?
Trickier. One of the things I forgot to note was that I have limited control over the COLO server, and the connections are probably limited to SSH only. Also, the WAN links are relatively slow.
Date: Sun, 12 Dec 2010 11:51:23 -0500 From: "Gavin W. Burris" <bug@sas.upenn.edu>

I would consider using revision control, like SVN, if your data files aren't too large, especially if the raw data doesn't change much. The problem is that changes in binary files do not benefit from diffs, with each change requiring a complete upload/sync of the file. http://subversion.apache.org/
Yup, I use CVS, SVN and BZR in various places. (I require CVS at work, but you can nest BZR under CVS for local revisions and just "publish" to CVS when done. A tad clunky, but it works. :)
However, much of the data is large binary files (CD and DVD ISOs).
Another option that would apply to future, more data intensive collaboration, would be iRODS. This is a more comprehensive solution that builds a "data grid" for distributed projects. https://www.irods.org/index.php/What_is_iRODS%3F
Wow, that's pretty cool. It sounds like overkill, and I may not be able to do it due to the above SSH and limited control issues, but I'll need to look deeper.
Thanks,
JP
----------------------------|:::======|-------------------------------
JP Vossen, CISSP            |:::======|      http://bashcookbook.com/
My Account, My Opinions     |=========|      http://www.jpsdomain.org/
----------------------------|=========|-------------------------------
"Microsoft Tax" = the additional hardware & yearly fees for the add-on
software required to protect Windows from its own poorly designed and
implemented self, while the overhead incidentally flattens Moore's Law.
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug