[mirror-admin] ERROR: chroot failed for fedora-web
Carlos Carvalho
carlos at fisica.ufpr.br
Wed Jan 14 07:51:39 EST 2009
Matt Domsch (Matt_Domsch at dell.com) wrote on 13 January 2009 19:48:
>There are 2 basic problems we have.
>
>1) when a bitflip happens, it can take a whole day before most mirrors
> have picked up the bitflip, even if they have all the content.
>
>2) a "null rsync" - e.g. resyncing when you're already in sync, takes
> 15-20 minutes. This is mostly due to the directory walk + stat()s
> happening on the "upstream" mirrors, for each client connection.
>
>I'd like to solve both.
>
>Lots of ideas were thrown around, both on this list, and at FUDCon.
>They boil down to:
>
>Triggering has both "Push" and "Polling" as methods to know "hey, now
>would be a good time to run rsync". I suspect we'll wind up
>implementing several.
Yes, that's probably necessary since it's not a requirement that
mirrors know before applying.
>Once you've figured out that "now is a good time to run rsync", what
>more can we do to speed things up?
>
>a) various kernel tunables to keep more NFS inodes and directory trees
> in cache on the server.
We do it here and it works. For a single distro a normal machine can
probably keep file list generation times within acceptable bounds.
>b) hack rsyncd to do the directory tree walk + stat()s, and cache it,
> and then use the cache for each client rsync connect. Refresh the
> cache on occasion. This avoids the full tree walk on each client connect.
This is only necessary if you want to be protected from
non-cooperating clients. If you only have nice clients, and you
provide the correct info, the clients will only pull the file and won't
ask for the disk scan. It's harder to have only cooperating clients
but there's an incentive because they too can avoid their own disk
scan. Maybe for the masters and tier 1 it's enough.
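
For illustration only, the caching idea in b) can be sketched outside rsyncd. This is not a patch to rsyncd and not anything Fedora runs; the class name `CachedTreeListing`, the `max_age_seconds` parameter and the injectable `clock` are all hypothetical. It walks the tree once, keeps the (size, mtime) of every file, and hands the same listing to every caller until it expires:

```python
import os
import time


class CachedTreeListing:
    """Cache one directory walk + stat() pass and reuse it until it expires.

    Hypothetical helper: the name and parameters are assumptions made for
    this sketch, not part of rsync or of any mirror tooling.
    """

    def __init__(self, root, max_age_seconds=300.0, clock=time.monotonic):
        self.root = root
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        self._listing = None
        self._built_at = None
        self.walk_count = 0  # number of real tree walks performed so far

    def _walk(self):
        # The expensive part: visit every directory and lstat() every file.
        listing = {}
        for dirpath, _dirnames, filenames in os.walk(self.root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                st = os.lstat(full)
                rel = os.path.relpath(full, self.root)
                listing[rel] = (st.st_size, int(st.st_mtime))
        self.walk_count += 1
        return listing

    def get(self):
        # Rebuild only when there is no cached listing or it is too old.
        now = self._clock()
        if self._listing is None or now - self._built_at > self.max_age_seconds:
            self._listing = self._walk()
            self._built_at = now
        return self._listing
```

Every client served between two refreshes costs a dictionary lookup instead of a walk. The price is that a change made just after a refresh stays invisible for up to `max_age_seconds`.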
>c) have a list of "files changed since
> $(insert-some-time-interval-here)", and use rsync --file-list to
> sync only those files that have changed.
You mean --files-from? Or is --file-list a Fedora change?
This method is more complicated and doesn't work well enough because
mirrors are often in different update situations. If you just provide
the fullfilelist they can pull it and determine what they must update
for their situation.
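
A minimal sketch of how a mirror could work out its own update from the full list, whatever state it is in. The list format is an assumption made for this example (one file per line, tab-separated size, mtime and relative path); the real fullfilelist may look different, and `parse_listing` and `plan_update` are hypothetical names:

```python
def parse_listing(text):
    """Parse an ASSUMED 'size<TAB>mtime<TAB>path' listing into a dict."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        size, mtime, path = line.split("\t", 2)
        entries[path] = (int(size), int(mtime))
    return entries


def plan_update(master, local):
    """Return (paths to fetch, paths to delete) for this mirror.

    A path is fetched when it is missing locally or when its size or
    mtime differs from the master's; it is deleted when the master no
    longer lists it.
    """
    fetch = sorted(p for p, meta in master.items() if local.get(p) != meta)
    delete = sorted(p for p in local if p not in master)
    return fetch, delete
```

Because the comparison is against the mirror's own previous state rather than a fixed time interval, a mirror that missed several updates gets the right answer without any special handling.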
>Jesse alluded to the "fullfilelist" file (part of c) above) he's
>working on, as that is really really simple to implement. It's not a
>full solution, but it's a start. He needs scripts on his side to
>update those files whenever content is changed on the master servers,
The rsync command I mentioned before is enough.
>and we want to distribute useful example scripts for mirror admins to
>run on their side to check that file, compare against the last time
>they downloaded it, to know if anything changed, and if so, rsync
>(either full or a subset).
Yes, that's what we're already doing:
time rsync filelist (from Duke): 1m19s
Wed Jan 14 08:21:21 BRST 2009
time list processing: 2s
Wed Jan 14 08:21:23 BRST 2009
time building --files-from: 1s
Wed Jan 14 08:21:24 BRST 2009
No disk scan is done here. If you provide the rsync list I'll spare
you the scanning on your side too. BTW, download.fedora.redhat.com is
really slow to do it :-(
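
The last step timed above, building the --files-from list, could look roughly like this. It is a sketch, not the script that produced those timings; the helper names are hypothetical and the source URL in the usage note is a placeholder. The command is only assembled, not executed:

```python
def write_files_from(paths, list_path):
    """Write one relative path per line, the format rsync --files-from reads."""
    with open(list_path, "w", encoding="utf-8") as handle:
        for path in paths:
            handle.write(path + "\n")


def build_rsync_command(list_path, source, destination):
    """Assemble (but do not run) an rsync invocation limited to listed files.

    Paths in the list are interpreted relative to the source directory.
    With --files-from, -a does not imply -r, so only the listed entries
    are transferred.
    """
    return ["rsync", "-a", "--files-from=" + list_path, source, destination]
```

For example, `build_rsync_command("changed.txt", "rsync://mirror.example.org/fedora/", "/srv/mirror/fedora/")` gives an argument list that can be passed to `subprocess.run`. Since every file is named explicitly, neither side has to scan its disk to find out what changed.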
>If done right, the fullfilelist can be used to know that nothing has
>changed, and using rsync to get that single file means it can be done
>very fast (thus more frequently), and we can avoid most of the "null
>rsyncs" completely.
Yes.
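
One way to turn "nothing has changed" into a cheap test is to compare a digest of the freshly downloaded list with the digest saved after the last successful sync. This is a sketch under that assumption; `needs_sync` is a hypothetical name and SHA-256 is just one reasonable choice of hash:

```python
import hashlib


def list_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the downloaded file list."""
    return hashlib.sha256(data).hexdigest()


def needs_sync(new_list: bytes, last_digest):
    """Return (sync_needed, new_digest).

    last_digest is the digest stored after the previous successful sync,
    or None on the first run, in which case a sync is always requested.
    """
    new_digest = list_digest(new_list)
    return new_digest != last_digest, new_digest
```

The stored digest should be updated only after the sync itself has succeeded, so a failed run is retried on the next poll.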
>The "handle the bitflip" problem can also be solved using the rsync
>--file-list mechanism, only the looked-for file would list only the
>dir where the bitflip happens.
Not necessary, the general method works for everything.
>If done well, then standard rsync polling will be just fine again. If
>that doesn't prove viable, then we'll still wind up implementing some
>of the trigger methods.
Pushing has other advantages. If it's optional and you provide at
least two reasonable methods, it's worth it. At least for the mirrors :-)
Where can I find the gzip man page for fedora? Yes, it's related to
this issue...