[ale] sed regexp question
    Wandered Inn 
    esoteric at denali.atlnet.com
       
    Tue Jul 10 18:47:36 EDT 2001
    
    
  
Christopher Bergeron wrote:
> 
> That would only get websites that start with www; I can't predict all the
> possible names that might arise.  I do know that the URL is always encoded
> in a page as:
> 
> <A HREF="http://xxx.pornsite.com/pictures1.html/">
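# Assumes one URL per line, with HREF as the first quoted attribute on it:
grep -i 'href=' whatever.html | awk -F'"' '{print $2}'

If you know 'HREF' will always be in capitals, awk can do the matching
itself and you can skip the grep:
awk -F'"' '/HREF=/ {print $2}' whatever.html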
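
Since the question was about sed: here is a rough sketch of the same idea
in sed, still assuming one URL per line. The function names are made up
for illustration and the sample URL in the usage line is a placeholder.
The first function prints the whole quoted HREF value; the second drops
the leading http:// so you get only what sits between "http:// and the
closing quote.

```shell
# Hypothetical helpers, not from the original post.
# Both read HTML on stdin and assume at most one URL per line.

# Print the quoted HREF value, matching HREF in any letter case.
extract_hrefs() {
    sed -n 's/.*[Hh][Rr][Ee][Ff]="\([^"]*\)".*/\1/p'
}

# Same, but strip the leading http:// (| is used as the sed delimiter
# so the slashes in the URL need no escaping).
extract_hosts_paths() {
    sed -n 's|.*[Hh][Rr][Ee][Ff]="http://\([^"]*\)".*|\1|p'
}

# Usage with a placeholder URL:
printf '%s\n' '<A HREF="http://www.example.com/page1.html">' | extract_hosts_paths
```

The -n option together with the trailing p prints only the lines where the
substitution matched, so lines without a link are dropped and no separate
grep is needed.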
> 
> so, all I need to do is take everything between the "http:// and the ">
> 
> any suggestions?
> 
> would SED or GREP be better suited for this, and even better, what is the
> way to do it?!
> 
> thanks again for all the leads...
> 
> Christopher Bergeron
> Systems Administrator
> Full Line Distributors
> (770) 416-4237
> mis at fullline.com
> 
> > -----Original Message-----
> > From: I. Herman [mailto:izzmo at mediaone.net]
> > Sent: Tuesday, July 10, 2001 1:41 PM
> > To: Christopher Bergeron
> > Subject: Re: [ale] sed regexp question
> >
> >
> > what's the html file?  You can try:
> >
> > cat whatever.html | grep http | grep www
> >
> > or something like that...not sure what you are trying to do...i'm not
> > familiar w/ sed
> >
> >
> >
> 
> --
> To unsubscribe: mail majordomo at ale.org with "unsubscribe ale" in message body.
--
Until later: Geoffrey		esoteric at denali.atlnet.com
"Great spirits have always found violent opposition from mediocre minds.
The latter cannot understand it when a man does not thoughtlessly submit
to hereditary prejudices but honestly and courageously uses his
intelligence." - Albert Einstein
    
    