[ale] Grabbing a dynamic website automatically?
johncole at mindspring.com
Fri Aug 23 00:15:48 EDT 2002
Howdy!
Yes, but the problem is that the website changes every day, and I have to log
into an HTTPS site. Then I have to go through a couple of clicks/menus in
order to get the page I need.
Otherwise, this would work.
I did look over what someone else did for cookie-based wget/curl fetches
over HTTPS, but I don't see anywhere that it covers timed access, logging
in, and going through a few pages before I get to the content I need.
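
For what it's worth, curl's cookie-jar options can usually handle the
log-in-then-navigate dance. Here's a rough sketch -- the URLs and form
field names below are made up, so they'd have to be pulled from the real
site's login form (view the page source for the <form> action and the
input names):

    #!/bin/sh
    # Hypothetical example: log in over HTTPS, keep the session
    # cookie, walk through the intermediate pages, then grab the
    # one page we actually want.
    JAR=$HOME/.site-cookies

    # Log in and save whatever cookies the site hands back.
    curl -s -L -c $JAR \
         -d "username=john&password=secret" \
         https://www.example.com/login

    # Hit the in-between pages, sending the cookie jar each time
    # and saving any updates to it.
    curl -s -L -b $JAR -c $JAR -o /dev/null  https://www.example.com/menu
    curl -s -L -b $JAR -c $JAR -o page.html  https://www.example.com/report

    # Boil the HTML down to plain text, same idea as links -dump.
    links -dump page.html > $HOME/daily

(If the site uses a self-signed certificate, curl may also need -k to
skip the certificate check.)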
Thanks for the ideas though, everyone!
Thanks,
John
>At 08:50 AM 08/22/2002 -0400, you wrote:
>>Run a cronjob with Links outputting the page to a text file?
>>
>>Something like: "links -dump https://www.foo.bar/page.pl > ~/daily" done
>>at 0200, perhaps?
>>
>>--
>>Christopher R. Curzio | Quantum materiae materietur marmota monax
>>http://www.accipiter.org | si marmota monax materiam possit materiari?
>>:wq!
>>
>>Thus Spake <johncole at mindspring.com>:
>>Thu, 22 Aug 2002 08:31:36 -0400
>>
>>
>>> Howdy all!
>>>
>>> What would be the best way to grab the data off of a website that is
>>> dynamic, HTTPS, and has cookies enabled? I'm trying to capture a
>>> single page every day from a particular website automatically.
>>>
>>> (in particular I'm using Red Hat 7.2)
>>>
>>> I need the page back in text format, preferably (or I can convert it to
>>> text later as needed for insertion into a database).
>>>
>>> Thanks,
>>> John
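
P.S. For the cron half of Christopher's suggestion, the crontab entry
would look something like this (assuming the sketch above is saved as
/home/john/grab-page.sh and made executable):

    # min hour day month weekday  command
    0 2 * * * /home/john/grab-page.sh

That runs it every night at 2:00 AM; crontab -e is the usual way to
install it.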
Paypal membership: free
Donation to Freenet: $20
Never having to answer the question "Daddy, where were you when they took
freedom of the press away from the Internet?": Priceless.
http://www.freenetproject.org/index.php?page=donations