[ale] Grabbing a dynamic website automatically?

mainwizard at vei.net mainwizard at vei.net
Fri Aug 23 10:50:03 EDT 2002


----- Original Message -----
From: Geoffrey
Sent: 8/23/2002 7:18:36 AM
To: johncole at mindspring.com
Cc: ale at ale.org
Subject: Re: [ale] Grabbing a dynamic website automatically?

> 
> 
> johncole at mindspring.com wrote:
> > Howdy!
> > 
> > Yes, but the problem is that the website changes every day as I have to log
> > into an HTTPS site. Then I have to go through a couple of licks/menus in
> 
> Man I hate licks/menus, messes up my monitor screen.  Serious 
> suggestions below...
> 
> > order to get the page I need.
> > Otherwise, this would work.
> > 
> > I did look over what someone else did for cookie-based wget/curl requests
> > with HTTPS, but I don't see anywhere where it says anything about time-access
> > and logging in and going through a few pages before I get to the content I
> > need.
> 
> Here's what I've done in the past.  When you get to the page that is 
> just before the one you want to print, check the URL that calls that 
> page.  It may be that this is all you need to call the page directly.  Try 
> saving this full URL somewhere, exit your browser, and then attempt to 
> open this page with a new browser.  If it fails because you're missing a 
> cookie, then the issue is more complex.
> 
> You can manipulate cookies with both Perl and JavaScript.  The next 
> attempt would be to retain the cookie they place in your cookie file, 
> update the time/date and insert it back into your cookie file prior to 
> attempting to load the page as noted in the previous paragraph.
> 

This will not work if the cookie contains session information, such as a session ID, because the server, not the date in your cookie file, decides whether that session is still valid. The way around this is to have the script log in first and then go directly to the desired page. If the URL contains any session info, you should be able to use Perl/JavaScript to put it into the URL in the proper place.
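
As a rough illustration of "log in first, then go to the page", here is a minimal sketch in Python (not Perl, just to keep it short and standard-library only). Everything specific in it is an assumption and not from the site in question: the URLs, the form field names "username" and "password", and the query parameter name "sid" are all hypothetical, so check the real login form and URLs before relying on any of it.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical values: none of these come from the original thread.
LOGIN_URL = "https://example.com/login"
TARGET_URL = "https://example.com/reports/today"


def make_session():
    """Return (opener, jar); the opener stores and resends cookies like a browser."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login(opener, login_url, username, password):
    """POST the login form so the server issues a fresh session cookie.

    Returns the URL we ended up on after any redirects.
    """
    # The field names below are assumptions; read them off the real form.
    form = urllib.parse.urlencode({"username": username, "password": password})
    with opener.open(login_url, data=form.encode("ascii")) as resp:
        return resp.geturl()


def session_id_from(url, param):
    """Return the value of query parameter `param` in url, or None if absent."""
    values = urllib.parse.parse_qs(urllib.parse.urlsplit(url).query).get(param)
    return values[0] if values else None


def with_session_id(url, param, session_id):
    """Return url with query parameter `param` set to session_id."""
    parts = urllib.parse.urlsplit(url)
    pairs = urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
    query = [(k, v) for k, v in pairs if k != param]  # drop any stale value
    query.append((param, session_id))
    return urllib.parse.urlunsplit(parts._replace(query=urllib.parse.urlencode(query)))


def fetch(opener, url):
    """GET url through the same opener, so the session cookie is sent along."""
    with opener.open(url) as resp:
        return resp.read()


if __name__ == "__main__":
    opener, jar = make_session()
    landed = login(opener, LOGIN_URL, "myuser", "mypassword")
    # If the site carries the session in the URL, copy it into the target URL.
    sid = session_id_from(landed, "sid")
    target = with_session_id(TARGET_URL, "sid", sid) if sid else TARGET_URL
    with open("page.html", "wb") as out:
        out.write(fetch(opener, target))
```

Run from cron, something like this would save the page once a day. Sites that hide tokens in the login form or build the menus with JavaScript would need more work than this sketch does.
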
Ed.


---
This message has been sent through the ALE general discussion list.
See http://www.ale.org/mailing-lists.shtml for more info. Problems should be
sent to listmaster at ale dot org.





