[ale] Archive web page

Wed Dec 9 15:06:29 EST 2015

Pavuk and HTTrack are your other options. HTTrack will resolve simple
JavaScript links.
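
If you would rather stay with plain HTML and drop the JS, as suggested
further down the thread, here is a rough sketch using only the Python
standard library. It fetches one page and writes a copy with the <script>
elements removed. The names strip_scripts and archive_page are my own,
hypothetical helpers, not part of any tool mentioned here. It does not
fetch images or CSS, and it leaves inline handlers such as onclick in
place, so treat it as a starting point rather than a sanitizer.

```python
import sys
import urllib.request
from html.parser import HTMLParser
from pathlib import Path


class ScriptStripper(HTMLParser):
    """Re-emit HTML while dropping <script> elements and their contents."""

    def __init__(self):
        # Keep entities such as &amp; unconverted so they can be re-emitted as-is.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a <script> element

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.skip_depth += 1
        elif not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags; a self-closing <script/> is simply dropped.
        if tag != "script" and not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.skip_depth:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_scripts(html_text):
    """Return html_text with every <script> element removed."""
    parser = ScriptStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


def archive_page(url, dest):
    """Fetch url and save a script-free copy to dest (hypothetical helper)."""
    with urllib.request.urlopen(url) as resp:
        # Assumption: fall back to UTF-8 when the server names no charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        text = resp.read().decode(charset, errors="replace")
    Path(dest).write_text(strip_scripts(text), encoding="utf-8")


if __name__ == "__main__":
    # Usage: python archive_page.py URL OUTPUT.html
    archive_page(sys.argv[1], sys.argv[2])
```

The resulting .html file is plain text, so it diffs and commits cleanly
next to the source tree in SVN.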

Justin
On Dec 9, 2015 12:28 PM, "DJ-Pfulio" <djpfulio at jdpfu.com> wrote:

> On 12/09/2015 11:20 AM, Chris Fowler wrote:
> > I've been dealing with a kernel bug and want to document the resolution
> > with the software tree in SVN. Typically I print a web page to PDF, but
> > I'd like to copy the HTML locally instead. I've used wget in the past.
> > Is there a better way?
> >
> > The page I want to store is
> >
> >
> > http://askubuntu.com/questions/145965/how-do-i-target-a-specific-driver-for-libata-kernel-parameter-modding
> >
> >
> > I also have some more pages. My goal is to just document the pages I
> > referenced in solving the issue.
> >
>
> PDF should only be used when page layout is mandatory.  If you just need
> the information (text + images), then HTML is a much better format.
>
> wget isn't bad. lynx and curl might work too.  The real issue is whether
> you need JS or not.  Stripping JS from pages is easy - just use a
> text/simple browser ... dillo comes to mind.
>
> Avoiding JS is smart. Why let remote systems run code on yours?