[mirror-admin] Issues with mirror

Adrian Reber adrian at lisas.de
Tue Jul 7 02:58:26 EDT 2015


On Tue, Jul 07, 2015 at 10:16:46AM +1000, Matthew Taylor wrote:
>    Hi all,
>    Looking for some advice here. We have been running our Fedora Linux
>    mirror (fedora.mirror.digitalpacific.com.au) for several months without
>    any problems, serving anywhere from 3 to 6 TB per month. Lately,
>    however, our mirror has been failing the automated crawler checks.
>    We see this in the tail of crawler log:
> 
> 2015-07-06 17:01:34,080 - WARNING - Host 1997 marked not up2date: Crawler timed out before completing.  Host is likely overloaded.
> 2015-07-06 17:01:35,373 - INFO - Ending crawl of <Host(1997 - fedora.mirror.digitalpacific.com.au)> with status 2
> 
>    Crawler log is here - [1]http://pastebin.com/LtmRLbrd
>    The thing is, the mirror server isn't overloaded. The server has ample
>    resources at hand, disk I/O and bandwidth are fine, and we've also tuned
>    Apache. The crawler seems to come from 209.132.181.102, which is
>    only ~155 ms away (AU<-->US). Our transit and peering services are
>    nowhere near their capacity.
>    MTR is here - [2]http://pastebin.com/yi0aeP1i  (source: 101.0.101.66).
>    We first saw the mirrormanager in our logs here:
> 
>      209.132.181.102 - - [07/Jul/2015:00:00:18 +1000] "HEAD /linux/ HTTP/1.1" 200 - "-" "mirrormanager-crawler/0.1 (+http://fedorahosted.org/mirrormanager)"
> 
>    And last touch here:
> 
>      209.132.181.102 - - [07/Jul/2015:03:00:17 +1000] "HEAD /linux/atomic/rawhide/objects/5e/54f962dcddd631e25e230db3928ad3986391c2c91a03001b0760254af19aa4.filez HTTP/1.1" 200 - "-" "mirrormanager-crawler/0.1 (+http://fedorahosted.org/mirrormanager)"
> 
>    Is it normal for the mirror to take 3 hours to fail?
>    We are rsyncing from a tier1 mirror
>    (rsync://mirrors.kernel.org/fedora/), and the syncs are completing
>    without any problems. report_mirror is also functioning fine.
>    If anyone is able to shed some light on this, it would be greatly
>    appreciated.
>    Thanks!

That is a good, detailed analysis, and I think I can provide the missing
details. The crawl timeout for all mirrors is indeed three hours. We
recently increased it from two to three hours.

I had a look at your mirror entry and saw that the only URL you provided
is HTTP. This makes sense from a client perspective, as that is the
protocol most clients will use. All mirror admins should keep in mind,
however, that if you also specify an rsync URL, the crawler will crawl
your mirror using rsync. This is, most of the time, much faster than
HTTP, as it requires only a single network connection. HTTP can,
depending on the keep-alive duration, require lots of network
connections.
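
To put rough numbers on that, here is a minimal Python sketch. It
assumes the crawler issues one HEAD request per file, strictly one after
another, and that each request costs one full round trip of about 155 ms
(the latency quoted above). The real crawler may behave differently, and
the function names and the 100,000-file figure are made up purely for
illustration.

```python
# Back-of-the-envelope estimate only. Assumptions (not taken from the
# crawler's code): one HEAD request per file, issued serially, each
# costing exactly one network round trip.
RTT_SECONDS = 0.155          # ~155 ms AU<->US, from the original report
TIMEOUT_SECONDS = 3 * 3600   # three-hour crawl timeout


def max_sequential_requests(timeout_s: float, rtt_s: float) -> int:
    """Upper bound on serial requests that fit into the timeout."""
    return int(timeout_s // rtt_s)


def min_crawl_seconds(file_count: int, rtt_s: float) -> float:
    """Lower bound on crawl time for file_count serial requests."""
    return file_count * rtt_s


if __name__ == "__main__":
    print("requests that fit in the timeout:",
          max_sequential_requests(TIMEOUT_SECONDS, RTT_SECONDS))
    # 100_000 is a hypothetical file count, not the size of any real tree.
    print("seconds needed for 100,000 files:",
          min_crawl_seconds(100_000, RTT_SECONDS))
```

Under those assumptions only about 69,700 requests fit into three hours,
so a tree with more files than that cannot be checked in time over
serial HTTP, no matter how lightly loaded the server is.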

By testing I discovered that your mirror also supports rsync, so I
added an rsync URL to your 'Fedora Linux' category and manually started
a crawler run. Instead of taking over three hours, the crawler finished
much faster:

INFO:crawler:Hosts(1/1):Threads(1/10):1997:fedora.mirror.digitalpacific.com.au:Ending crawl of <Host(1997 - fedora.mirror.digitalpacific.com.au)> with status 0
INFO:crawler:Hosts(1/1):Threads(0/10):0:master:0 of 1 hosts failed
INFO:crawler:Hosts(1/1):Threads(0/10):0:master:Crawler finished after 493 seconds
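
If you want to keep an eye on this yourself, a small Python sketch like
the following could pull the duration out of that last line and compare
it with the three-hour timeout. The regular expression is only an
assumption based on the output shown here, and crawl_duration is a
hypothetical helper, not part of MirrorManager.

```python
import re

# Final crawler line, copied from the output above.
LOG_LINE = ("INFO:crawler:Hosts(1/1):Threads(0/10):0:master:"
            "Crawler finished after 493 seconds")

TIMEOUT_SECONDS = 3 * 3600  # three-hour crawl timeout


def crawl_duration(line: str):
    """Return the crawl duration in seconds, or None if the line does not match."""
    match = re.search(r"Crawler finished after (\d+) seconds", line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    duration = crawl_duration(LOG_LINE)
    if duration is not None:
        share = duration / TIMEOUT_SECONDS
        print(f"{duration} s, {share:.1%} of the timeout")
```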

		Adrian