<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">

<HTML>

<HEAD>

  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">

  <META NAME="GENERATOR" CONTENT="GtkHTML/3.24.5">

</HEAD>

<BODY>

One of our government systems stores the reports it generates. <BR>

<BR>

Those reports are stored outside Apache's reach. <BR>

<BR>

A specific hyperlink to an active Apache program (in this case Progress WebSpeed) results in the program spitting back the selected report inline. This gives us control over who sees which report. There is no native way to specify a static URL that fetches the report file. <BR>

<BR>

It also includes a daily hit counter, keyed on IP or userid, so we can just cut people off if their access departs from the historical trend. <BR>

<BR>
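A rough sketch of that kind of counter, in Python rather than WebSpeed. The class name, the fixed daily limit, and the in-memory storage are all assumptions for illustration, not what the WebSpeed program actually does; a real one would persist the counts and derive the limit from historical usage. <BR>
<PRE>
```python
from collections import defaultdict
from datetime import date


class DailyHitCounter:
    """Count report requests per client key (IP or userid) per calendar day.

    Hypothetical helper: the fixed limit stands in for a threshold that
    would really come from each client's historical trend.
    """

    def __init__(self, daily_limit):
        self.daily_limit = daily_limit
        self.day = None
        self.hits = defaultdict(int)

    def allow(self, client_key, today=None):
        """Record one hit; return True while the client is within its limit."""
        today = today or date.today()
        if today != self.day:
            # First request of a new day: discard the previous day's counts.
            self.day = today
            self.hits.clear()
        self.hits[client_key] += 1
        return self.daily_limit >= self.hits[client_key]
```
</PRE>
The gatekeeper would call allow() with the client's IP or userid before sending a report, and refuse the request once it returns False. <BR>
<BR>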

It's a moderate booger to write the thing that spits back the file inline, since we might have .pdf files, .html files, or text files, and as I recall we had to set the content headers appropriately for each. <BR>

<BR>
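For a feel of the content-header part, here is a minimal Python sketch, not the WebSpeed code. The directory /var/reports and the function name build_report_response are made up for the example; it guesses the Content-Type from the file extension and marks the response as inline. <BR>
<PRE>
```python
import mimetypes
from pathlib import Path

# Hypothetical location for reports, outside the web server's document tree.
REPORT_DIR = Path("/var/reports")


def build_report_response(report_name, report_dir=REPORT_DIR):
    """Return (headers, body) for one report, or raise FileNotFoundError."""
    # Keep only the final path component so "../" cannot escape report_dir.
    path = Path(report_dir) / Path(report_name).name
    if not path.is_file():
        raise FileNotFoundError(report_name)
    # Guess the type from the extension (.pdf, .html, .txt, ...).
    content_type, _encoding = mimetypes.guess_type(path.name)
    body = path.read_bytes()
    headers = {
        "Content-Type": content_type or "application/octet-stream",
        "Content-Disposition": 'inline; filename="%s"' % path.name,
        "Content-Length": str(len(body)),
    }
    return headers, body
```
</PRE>
The access check and the hit counter would run before this is called, so a refused client never gets as far as the file read. <BR>
<BR>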

Neal Rhodes<BR>

<BR>

<BR>

On Wed, 2011-01-12 at 11:50 -0600, John Heim wrote: 

<BLOCKQUOTE TYPE=CITE>

<PRE>

All,

I have a problem with an apache web server. The problem is that one of my

users has some large PDF documents available for

download. Every few weeks, our server gets bogged down when someone tries to

download these documents many thousands of times.  They download each 

document only once or twice a second but over and over and over. Eventually, 

our server gets bogged down. The documents are mostly in the 1.5MB to 2MB 

range.

I deal with it by blacklisting the IP address of the offending client. It's

always a single IP address. So it can't be a denial of service attack. If it

is, it's the lamest DoS attack ever.

Anybody have any idea why this is happening? I have looked for some kind of

loop in the HTML pages where an automated client might think these are

all different documents. I even tried downloading it myself with wget. No

problems.

Any suggestions for preventing this? I thought about forcing people to

register or putting up a CAPTCHA. But I'd rather not do those things. I'd

rather just prevent a single IP from downloading each document more than

once a day or something like that.


</PRE>

</BLOCKQUOTE>

</BODY>

</HTML>