[mirror-admin] Issues with mirror
Adrian Reber
adrian at lisas.de
Tue Jul 7 02:58:26 EDT 2015
On Tue, Jul 07, 2015 at 10:16:46AM +1000, Matthew Taylor wrote:
> Hi all,
> Looking for some advice here. We have been running our Fedora Linux
> mirror (fedora.mirror.digitalpacific.com.au) for several months without
> any problems, serving anywhere from 3 to 6 TB per month. Lately,
> however, our mirror has been failing the automated crawler checks.
> We see this in the tail of crawler log:
>
> 2015-07-06 17:01:34,080 - WARNING - Host 1997 marked not up2date: Crawler timed out before completing. Host is likely overloaded.
> 2015-07-06 17:01:35,373 - INFO - Ending crawl of <Host(1997 - fedora.mirror.digitalpacific.com.au)> with status 2
>
> Crawler log is here - http://pastebin.com/LtmRLbrd
> The thing is, the mirror server isn't overloaded. The server has ample
> resources at hand, disk I/O and bandwidth are fine, and we've also
> tuned Apache. The crawler seems to come from 209.132.181.102, which is
> only ~155 ms away (AU<-->US). Our transit and peering services are
> nowhere near their capacity.
> MTR is here - http://pastebin.com/yi0aeP1i (source: 101.0.101.66).
> We first saw the mirrormanager in our logs here:
>
> 209.132.181.102 - - [07/Jul/2015:00:00:18 +1000] "HEAD /linux/ HTTP/1.1" 200 - "-" "mirrormanager-crawler/0.1 (+http://fedorahosted.org/mirrormanager)"
>
> And last touch here:
>
> 209.132.181.102 - - [07/Jul/2015:03:00:17 +1000] "HEAD /linux/atomic/rawhide/objects/5e/54f962dcddd631e25e230db3928ad3986391c2c91a03001b0760254af19aa4.filez HTTP/1.1" 200 - "-" "mirrormanager-crawler/0.1 (+http://fedorahosted.org/mirrormanager)"
>
> Is it normal for the mirror to take 3 hours to fail?
> We are rsyncing from a tier1 mirror
> (rsync://mirrors.kernel.org/fedora/), and the syncs are completing
> without any problems. report_mirror is also functioning fine.
> If anyone is able to shed some light on this, it would be greatly
> appreciated.
> Thanks!
That is a good, detailed analysis, and I think I can provide the missing
details. The crawl timeout for all mirrors is indeed three hours; we
recently increased it from two hours to three.
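
As a quick cross-check, the two access-log timestamps quoted above can be
compared against that timeout. This is only a small sketch: the helper
name elapsed_seconds is made up for illustration, and parsing the month
abbreviation with %b assumes an English (or C) locale.

```python
from datetime import datetime

# Apache access-log timestamp format, e.g. "07/Jul/2015:00:00:18 +1000".
# Note: %b is locale-dependent; this assumes English month abbreviations.
APACHE_TS = "%d/%b/%Y:%H:%M:%S %z"


def elapsed_seconds(first: str, last: str) -> float:
    """Return the number of seconds between two access-log timestamps."""
    start = datetime.strptime(first, APACHE_TS)
    end = datetime.strptime(last, APACHE_TS)
    return (end - start).total_seconds()


# First and last crawler hits, copied from the access log quoted above.
first_hit = "07/Jul/2015:00:00:18 +1000"
last_hit = "07/Jul/2015:03:00:17 +1000"

crawl_seconds = elapsed_seconds(first_hit, last_hit)
timeout_seconds = 3 * 60 * 60  # the three-hour crawler timeout

print(crawl_seconds)                    # 10799.0
print(timeout_seconds - crawl_seconds)  # 1.0
```

The last request arrived one second short of three hours after the first,
which is what a crawl that runs until the timeout would look like.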
I had a look at your mirror entry and saw that the only URL you provided
is HTTP. This makes sense from a client perspective, as that is the
protocol most clients will use. It is important for all mirrors to
remember, however, that if you also specify an rsync URL, the crawler
will crawl your mirror using rsync. This is, most of the time, much
faster than using HTTP, as it requires only a single network connection.
HTTP can, depending on the keep-alive duration, require lots of network
connections.
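
To get a feeling for why an HTTP crawl from far away can run into the
timeout, here is a rough back-of-the-envelope sketch. The ~155 ms
round-trip time and the three-hour timeout are taken from this thread.
Everything else is an assumption made for illustration: one round trip
per HEAD request, requests issued strictly one after another, and no
server processing or connection setup time. It does not describe how the
crawler is actually implemented, and the function name is hypothetical.

```python
def max_sequential_requests(timeout_ms: int, rtt_ms: int) -> int:
    """Upper bound on back-to-back requests that fit into the timeout,
    assuming each request costs exactly one network round trip."""
    return timeout_ms // rtt_ms


TIMEOUT_MS = 3 * 60 * 60 * 1000  # three-hour crawler timeout
RTT_MS = 155                     # approximate AU <-> US round-trip time

print(max_sequential_requests(TIMEOUT_MS, RTT_MS))  # 69677
```

Under these assumptions only about 70,000 files could be checked before
the timeout, and every additional connection setup reduces that number
further.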
By testing I discovered that your mirror also supports rsync, so I
added an rsync URL to your 'Fedora Linux' category and manually started
a run of the crawler. Instead of running into the three-hour timeout,
the crawler finished in 493 seconds:
INFO:crawler:Hosts(1/1):Threads(1/10):1997:fedora.mirror.digitalpacific.com.au:Ending crawl of <Host(1997 - fedora.mirror.digitalpacific.com.au)> with status 0
INFO:crawler:Hosts(1/1):Threads(0/10):0:master:0 of 1 hosts failed
INFO:crawler:Hosts(1/1):Threads(0/10):0:master:Crawler finished after 493 seconds
Adrian