[mirror-admin] Having to throttle back rsync on download servers
Stephen John Smoogen
smooge at gmail.com
Thu Mar 6 15:09:05 EST 2014
On 5 March 2014 11:16, Chris Schanzle <schanzle at nist.gov> wrote:
> On 02/26/2014 11:11 AM, Stephen John Smoogen wrote:
>
>> My attempts at shared iscsi read-only storage were a while ago but ended
>> up with some amazingly corrupted data.
>>
>
> I realized later that shared read-only block devices can work only if
> *all* users of the filesystem are read-only. If you had a tight window of
> upstream modifications, in theory it would be possible to have the
> read-only clients unmount the filesystem (flushing caches), server performs
> updates & remounts read-only, then clients remount, but that is obviously
> too disruptive for what should be a continuously operational download
> server.
>
>
>
That is what I am seeing.

>> Basically we were allowing 25 rsyncs per host, which was working pretty
>> well, but recently we have 3 rsyncs which get data and 22 which are
>> slowly working through lstat data.
>>
> ...
>
> After that it is more about trying to get NFS client to cache more of the
>> metadata if possible.
>>
>
> Have you tried cranking acreg(min,max) way up to cache file attributes
> much longer? I'm thinking a drastic change from the default one minute (or
> less) to at least 15 to 30 minutes, perhaps an hour or more. Keeping
> acdir(min,max) at low defaults will allow rsync to find new directory
> entries, which rsync clients will use to find new files. However, if
> attributes are cached for too long, rsyncd may (unverified) serve stale
> stat() info for files that are modified (not created) by an update push,
> such as repodata/repomd.xml.
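[For reference, the split Chris describes might look like the following
mount command; the export path and mount point are placeholder names:]

```shell
# Cache file (regular-entry) attributes for 30-60 minutes, while
# keeping directory attribute caching at short timeouts so rsync
# notices new directory entries (and thus new files) quickly.
# server:/pub and /srv/mirror are hypothetical names.
mount -t nfs \
  -o ro,acregmin=1800,acregmax=3600,acdirmin=30,acdirmax=60 \
  server:/pub /srv/mirror
```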
>
> Since NFS clients use the kernel slab cache to store inode data cached
> from NFS servers, make sure you have enough RAM and tune
> vm.vfs_cache_pressure down to keep all possible NFS entries cached, lest
> they get pushed out before they expire or the system prefers to cache
> file data over precious inode data.
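[A sketch of that tuning; 10 is an illustrative value, not a tested
recommendation:]

```shell
# Values below the default of 100 make the VM reclaim dentry/inode
# cache entries less aggressively relative to page cache. 0 would
# never reclaim them and risks memory exhaustion; a small value
# such as 10 is a common compromise.
sysctl -w vm.vfs_cache_pressure=10

# Persist across reboots:
echo 'vm.vfs_cache_pressure = 10' >> /etc/sysctl.conf
```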
>
>
I have increased actimeo to 600 and added the nocto option to the
mounts. We will see how this works; if there are problems, we will back
off the timeout or remove the nocto option.
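[The change described above would look something like this; the export
path and mount point are placeholder names:]

```shell
# actimeo=600 sets all four attribute-cache timeouts
# (acregmin/acregmax/acdirmin/acdirmax) to 600 seconds in one
# option; nocto skips close-to-open cache revalidation, cutting
# GETATTR round trips on a read-only mirror mount.
mount -t nfs -o ro,actimeo=600,nocto server:/pub /srv/mirror
```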
>
> If increasing attribute caching doesn't work, the only other solution is
> local storage. While you claim this isn't suitable due to high disk
> failure rates, I can only guess you got a bad batch of drives or you
> pummeled them into an early grave by having too little RAM and/or didn't
> tune vfs_cache_pressure to reduce seeking for inode lookups. You could
> continue to use the NFS storage as your 'master' back-end storage on each
> download server and use rsync to keep the local storage current. Perhaps
> with a little crafty intelligence, if local storage falls over, you could
> revert to offering the NFS storage to rsyncd.
Yeah, we don't have the funding to purchase that much local storage. We
would need about 15 TB per system (x5 systems), and our budgets are
pretty constrained for a while.
--
Stephen J Smoogen.
More information about the Mirror-admin
mailing list