[ale] Archive web page

DJ-Pfulio djpfulio at jdpfu.com
Wed Dec 9 12:25:27 EST 2015


On 12/09/2015 11:20 AM, Chris Fowler wrote:
> I've been dealing with a kernel bug and want to document the resolution with
> the software tree in SVN. Typically I print a web page to PDF, but I'd like
> to copy the HTML locally instead. I've used wget in the past. Is there a
> better way?
> 
> The page I want to store is
> 
> http://askubuntu.com/questions/145965/how-do-i-target-a-specific-driver-for-libata-kernel-parameter-modding
> 
> 
> I also have some more pages. My goal is to just document the pages I
> referenced in solving the issue.
> 

PDF should only be used when page layout is mandatory.  If you just need the
information (text + images), then HTML is a much better format.

wget isn't bad; lynx and curl can work too.  The real question is whether
the page needs JS to render.  Stripping JS from pages is easy - just use a
text/simple browser ... dillo comes to mind.  See the sketch below for what
I'd try first.
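A rough sketch, using the URL from your mail (the output names page.html
and page.txt are just examples, and the wget flags are a reasonable
starting point rather than the one true set):

  URL='http://askubuntu.com/questions/145965/how-do-i-target-a-specific-driver-for-libata-kernel-parameter-modding'

  # wget: save the page plus the images/CSS it needs, rewrite links so
  # the local copy renders offline, and add .html extensions
  wget --page-requisites --convert-links --adjust-extension \
       --span-hosts "$URL"

  # curl: grab just the raw HTML, following any redirects
  curl -L -o page.html "$URL"

  # lynx: render to plain text, JS-free - often enough for notes
  lynx -dump "$URL" > page.txt

The wget version is what I'd check into SVN; the lynx dump is handy when
you only care about the text.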

Avoiding JS is smart. Why let remote systems run code on yours?

