[mirror-admin] where are all the indices of the repository?

Carlos Carvalho carlos at fisica.ufpr.br
Fri Jul 10 08:57:01 EDT 2009


Axel Thimm (Axel.Thimm at ATrpms.net) wrote on 10 July 2009 10:34:
 >On Thu, Jul 09, 2009 at 08:23:07PM -0700, J.H. wrote:
 >> Carlos Carvalho wrote:
 >>>  >Why don't you want to use --delay-updates?
 >>>
 >>> Because of the disk hit. Fedora updates very often involve more than
 >>> 10,000 files, and all these renames in sequence hit the disk hard. A
 >>> few days ago an update of about 12,700 files took about 20min of
 >>> renaming, and another a few days earlier of >20,000 took more than
 >>> 33min.
 >
 >That is just 10 renames per second! There seems to be a filesystem or
 >configuration problem on your server, it should scale much higher than
 >that.

The machine is also busy with other work, so I'm trying to reduce the
load wherever I can.
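
For reference, the rate quoted above follows directly from the two
updates mentioned earlier. A quick check in Python, using only the
approximate file counts and durations given in this thread (the function
name is just illustrative):

```python
# Rename throughput implied by the two updates described in the thread.
# Counts and durations are the approximate figures quoted above.
updates = [
    ("~12,700 files in ~20 min", 12_700, 20),
    (">20,000 files in >33 min", 20_000, 33),
]


def renames_per_second(files, minutes):
    """Average number of renames per second over the whole run."""
    return files / (minutes * 60)


for label, files, minutes in updates:
    print(f"{label}: {renames_per_second(files, minutes):.1f} renames/s")
# ~12,700 files in ~20 min: 10.6 renames/s
# >20,000 files in >33 min: 10.1 renames/s
```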

 >A way around --delay-updates is to have a multi-pass rsync which first
 >transfers rpms only w/o --delete*, then transfers everything w/o
 >--delete* (repodata including rpms to avoid any racing between data
 >and metadata) and finally does a full rsync w/ --delete* options
 >(again full for avoiding racing problems). That's the way I used it
 >before rsync (on my mirror) had the delay options.

Two passes are enough: just use --delay-updates in the second one. That
pass is much smaller, so it won't hit the disk hard, and it will be
short enough to minimize the window of inconsistency.
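
The two passes could be sketched roughly as below. This is only an
illustration under stated assumptions, not the script discussed in this
thread: the upstream URL and local path are placeholders, the
include/exclude rules are one way to select rpms only, and
--delete-after is an assumed stand-in for the unspecified "--delete*"
option. The code only builds and prints the command lines; it does not
run rsync.

```python
import shlex
from typing import List


def two_pass_commands(src: str, dst: str) -> List[List[str]]:
    """Build rsync command lines for a hypothetical two-pass mirror run."""
    # Pass 1: rpms only, written directly into place, no deletion.
    # "--include=*/" lets rsync descend into every directory,
    # "--include=*.rpm" selects the packages, "--exclude=*" skips the rest.
    pass1 = ["rsync", "-a",
             "--include=*/", "--include=*.rpm", "--exclude=*",
             src, dst]
    # Pass 2: everything. The rpms are already present, so only the
    # remaining files (repodata etc.) are held back and renamed at the end.
    # --delete-after is an assumption; the thread only says "--delete*".
    pass2 = ["rsync", "-a", "--delay-updates", "--delete-after", src, dst]
    return [pass1, pass2]


if __name__ == "__main__":
    # Placeholder source and destination, not real mirror locations.
    for cmd in two_pass_commands("rsync://upstream.example/fedora/",
                                 "/srv/mirror/fedora/"):
        print(shlex.join(cmd))
```

Since the bulk of the data lands in the first pass without any delayed
renames, the second pass has few files left to rename in one burst,
which is the point of the split.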

 >If you add the disk and network costs on master and slave you will
 >find that --delay-updates is much cheaper than a manual
 >multi-rsync.

True, if rsync is used in the usual way. However, our script doesn't
scan the local disk, and it makes only a single scan upstream, so in
our case there's no loss.

If fullfilelist were done properly, we wouldn't even need to scan upstream.
