[mirror-admin] where are all the indices of the repository?
Carlos Carvalho
carlos at fisica.ufpr.br
Thu Jul 9 23:45:01 EDT 2009
J.H. (warthog19 at eaglescrag.net) wrote on 9 July 2009 20:23:
>Carlos Carvalho wrote:
>> Jon Stanley (jonstanley at gmail.com) wrote on 9 July 2009 19:16:
>> >On Thu, Jul 9, 2009 at 6:03 PM, Carlos Carvalho<carlos at fisica.ufpr.br> wrote:
>> >
>> >> mirror in the meantime. I suppose the indices are in */repodata but this
>> >> is too important to just guess, so I'd like to have an answer from
>> >> those who are well versed in the repository architecture.
>> >
>> >That's correct, but the chance for breakage that you have there (if
>> >you use --delete when syncing the packages) is that a package that is
>> >referenced by your repodata may not actually be available on your
>> >mirror - i.e. when a package is updated, the old one gets deleted from
>> >the master.
>>
>> The package sync won't delete anything, as has been discussed here on
>> May 6, when someone complained that a mirror didn't use delay-updates.
>> [note: Matt then mentioned repodata, I just wanted to be sure there
>> isn't anything else]
>>
>> >Why don't you want to use --delay-updates?
>>
>> Because of the disk hit. Fedora updates very often involve more than
>> 10,000 files, and all these renames in sequence hit the disk hard. A
>> few days ago an update of about 12,700 files took about 20min of
>> renaming, and another a few days earlier of >20,000 took more than
>> 33min. During these periods the number of transactions in the disks was
>> around 98% of the maximum. Distributing the renames during the much
>> longer download time avoids these peaks.
>
>If it helps any, kernel.org doesn't use --delay-updates and really I've
>never heard much in the way of complaints or issues with this.

rsync pulls files in sort order, so repodata comes before many of the
packages. If you pull fast, the interval between fetching repodata and
fetching everything that follows it is short, and the probability of a
mismatch is small. But if the sync takes longer, or there's still a lot
to pull after repodata, it may become a problem. Given the number of
client updates, even a small fraction of misses adds up to a big number
over time, and users will complain.

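To make the ordering concern concrete, here is a minimal sketch of one way to keep repodata from arriving ahead of its packages without --delay-updates: pull everything except the index directories first, without deleting, then run a second pass that brings in repodata and defers deletions to the end. This is only an illustration under assumptions, not a description of what any particular mirror runs. The helper name two_pass_rsync, the host mirror.example.org and the local path are all hypothetical; the rsync options used (-a, --exclude, --delete-after) are standard ones.

```python
import shlex


def two_pass_rsync(src, dst, index_dir="repodata"):
    """Build two rsync command lines (as argument lists) for a staged sync.

    Hypothetical helper for illustration only; it builds the commands
    and does not run them.
    """
    base = ["rsync", "-a"]
    # Pass 1: fetch everything except directories named like the index
    # directory, at any depth. No deletions, so packages referenced by
    # the old repodata stay on disk.
    pass1 = base + ["--exclude", f"{index_dir}/", src, dst]
    # Pass 2: fetch the rest, including repodata. --delete-after removes
    # obsolete files only once the transfer has finished.
    pass2 = base + ["--delete-after", src, dst]
    return [pass1, pass2]


if __name__ == "__main__":
    # Assumed example source and destination, not real mirror settings.
    for cmd in two_pass_rsync("rsync://mirror.example.org/fedora/",
                              "/srv/mirror/fedora/"):
        print(shlex.join(cmd))
```

With this split the renames still happen file by file during the download, so there is no burst at the end, and the new repodata only lands after the packages it refers to are already in place.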
>I would agree with the spike of disk activity, and how that can be
>very bad as well.
>
>- John 'Warthog9' Hawley