[ale] Two offices, one data pool

Brian Pitts brian at polibyte.com
Thu Feb 17 17:50:52 EST 2011


On 02/17/2011 12:30 PM, Michael B. Trausch wrote:
> On Thu, 2011-02-17 at 12:11 -0500, David Tomaschik wrote:
>> How current do the files need to be?  Would it be okay for the two
>> offices to have local repositories that are just synced frequently
>> with something like unison? 
> 
> That'd be great, if it were possible to do it in a manner which made it
> impossible for people to step on other people's toes.
> 
> The biggest thing is that there are a small handful of
> documents---roughly 500 of them---that have a very high (near 1)
> probability of being opened and edited at the same time in both offices.
> Because the offices' duties overlap significantly, it has to be possible
> to have the same sort of file locking semantics that a single Samba
> server provides to a single office.

Assuming you insist on a model where users open files directly -- no
document management system, no version control, etc. -- I think there
are three layers to consider.

1) The software keeping the data in sync between the two servers.

2) The software exporting data stored on the servers to the clients.

3) The software the clients use to work with the data.

Locking at the third layer would be the simplest. Does Microsoft Office
support this? I remember when I supported Adobe InDesign around 5 years
ago, the program created lock files that prevented two users from
simultaneously opening a file. It was a PITA to clean those up when the
software crashed, though.
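Just to make the lock-file idea concrete, here's a minimal sketch of how
an application can do it (the .lock naming and PID payload are my own
assumptions for illustration, not how InDesign actually does it):

```python
import errno
import os

def acquire_lock(path):
    """Try to take an advisory lock on a document by atomically creating
    a sidecar .lock file. Returns True on success, False if locked."""
    lock_path = path + ".lock"
    try:
        # O_CREAT | O_EXCL makes the create atomic: it fails with EEXIST
        # if another user already holds the lock. (Note: O_EXCL has
        # historically been unreliable over some network filesystems.)
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False
        raise
    # Record who holds the lock, to help clean up after crashes.
    os.write(fd, str(os.getpid()).encode())
    os.close(fd)
    return True

def release_lock(path):
    """Drop the lock by removing the sidecar file."""
    os.remove(path + ".lock")
```

The crash-cleanup pain I mentioned is visible right in this sketch: if
the program dies without calling release_lock(), the stale .lock file
sits there until someone deletes it by hand.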

For the second layer, it seems like you are stuck with the CIFS
protocol. The default Samba locking implementation doesn't do
distributed locking. So while locking works fine with a single server,
once you have two they won't be aware of each other's locks, and nothing
prevents users from clobbering each other's changes. However, you can
replace this locking implementation with a clustered version, CTDB.
CTDB, a distributed filesystem, and some Samba configuration tweaks will
let you have pCIFS.
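As a rough sketch of what those tweaks look like (the paths, share name,
and IPs here are made-up examples; see the CTDB setup link below for the
real procedure):

```ini
; /etc/samba/smb.conf on both servers (example values)
[global]
    clustering = yes          ; use CTDB's clustered TDBs for locking state

[shared]
    path = /mnt/cluster/shared  ; must live on the shared distributed filesystem
    read only = no

; /etc/ctdb/nodes on both servers -- private IPs of the Samba nodes (examples):
;   10.0.0.1
;   10.0.0.2
```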

For the first layer, there are several distributed, fault-tolerant file
systems for Linux that could keep the data mirrored and synced in
real time between the two servers. I hear a lot about GlusterFS, but
there are others like Ceph and MooseFS worth checking out as well.

If it turns out that GlusterFS works well for keeping your data
mirrored, you may be able to skip setting up pCIFS with Samba and CTDB.
GlusterFS claims it can export its files using CIFS.
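For example, a two-node replicated volume would look roughly like this
(hostnames and paths are made up; check the Gluster docs linked below
for the actual procedure on your version):

```shell
# Run on office1-server; office2-server must also be running glusterd.
gluster peer probe office2-server
gluster volume create docs replica 2 \
    office1-server:/export/docs office2-server:/export/docs
gluster volume start docs

# Mount the volume locally on each server, then point Samba (or
# Gluster's own CIFS export) at the mountpoint:
mount -t glusterfs localhost:/docs /mnt/docs
```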

http://wiki.samba.org/index.php/CTDB_Setup
http://ctdb.samba.org/
http://www.gluster.org/
http://europe.gluster.org/community/documentation/index.php/Gluster_3.1_CIFS_Guide

-- 
All the best,
Brian Pitts

