[ale] Need help with a regular expression please....
Keith Morris
graphicsguy at charter.net
Mon Jul 28 22:18:47 EDT 2003
Thanks so much for all your help...
I never did get the darn thing working the way I would have liked. What
I ended up doing was to replace the already-hyperlinked text with
placeholders, link the remaining keywords, then swap the placeholders
back for the original links. Not super efficient, but it works.
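
Roughly, the placeholder approach looks like the sketch below. This is a
cleaned-up illustration rather than the exact code from the CMS: the function
name link_keywords(), the keyword => URL array shape, and the NUL-delimited
token format are all assumptions made for the example. It also assumes no
keyword is made up purely of digits, since that could collide with a token.

```php
<?php
// Hypothetical helper: name, signature, and token format are assumptions.
// $links maps keyword phrase => URL, e.g. as loaded from the MySQL table.
function link_keywords(string $content, array $links): string
{
    // Longest phrases first, so "Widgets Technology Co." is linked
    // before the shorter "Widgets" gets a chance to match inside it.
    uksort($links, fn($a, $b) => strlen($b) <=> strlen($a));

    $stash = [];
    // Swap one matched <a>...</a> element for an opaque placeholder token.
    $protect = function (array $m) use (&$stash): string {
        $token = "\x00" . count($stash) . "\x00";
        $stash[$token] = $m[0];
        return $token;
    };
    $anchor = '~<a\b[^>]*>.*?</a>~is';

    // Hide any links that are already present in the content.
    $content = preg_replace_callback($anchor, $protect, $content);

    foreach ($links as $keyword => $location) {
        $html = '<a href="' . htmlspecialchars($location) . '">' . $keyword . '</a>';
        // Link the keyword, then hide the new links before the next keyword.
        $content = str_replace($keyword, $html, $content);
        $content = preg_replace_callback($anchor, $protect, $content);
    }

    // Put every stashed link back in place of its placeholder.
    return strtr($content, $stash);
}
```

Called with the two example keywords and the example $content, this should
link "Widgets Technology Co." as a whole and the standalone "Widgets"
separately, without nesting one link inside the other.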
Thanks again.
Keith
On Mon, 2003-07-28 at 22:11, John Marasco wrote:
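> IIRC grouping and character classes don't mix. Each of those bracketed
> expressions is a character class, so it matches a single character from the
> listed set, and the parentheses inside it are just literal characters. That
> is why it never matches the strings you intended. Depending on the format of
> your document, this simple alternative might work: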
>
> [^>]$keyword[^<]
>
> or a little more fancy solution
>
> [^>](<(b)>)?$keyword(<\/(b)>)?[^<]
>
> you can filter out whatever sorts of format tags you want and still look for
> a keyword sans the bounding outer tag. Regular Expressions aren't good at
> searching for missing patterns though.
>
> -----Original Message-----
> From: ale-admin at ale.org [mailto:ale-admin at ale.org]On Behalf Of David
> Corbin
> Sent: Monday, July 28, 2003 4:16 PM
> To: ale at ale.org
> Subject: Re: [ale] Need help with a regular expression please....
>
>
> Keith Morris wrote:
>
> > Hi all! I'm creating a specialized mini CMS in PHP that will store
> > content in a MySQL database. What I am trying to do is parse the
> > content and replace certain keywords with a link. The keywords and
> > associated links are kept in a MySQL table.
> >
> > Here is an example.
> >
> > $keyword = "Widgets Technology Co.";
> > $location = "http://www.widgets.com/about";
> >
> > $keyword2 = "Widgets";
> > $location2 = "http://www.widgets.com";
> >
> >
> >
> > $content = "We have the best Widgets at Widgets Technology Co.";
> >
> > I want to parse through $content looking for $keyword and replace it
> > with:
> > <a href="$location">$keyword</a> (this I can do with no problem)
> >
> > but I am going to be looping through a series of keywords (phrases)
> > sorted by length (longest to shortest) that may or may not contain
> > other defined keywords such as the values above for $keyword2 which
> > would cause nested links and other nonsense.
> >
> > so what I'm needing is a regular expression that will find the
> > $keyword (phrase) that is not already between "<a href =" and "</a>"
> > so that it will not try to relink it.
> >
> > so far, this is the regular expression that I have, but does not work
> > properly:
> >
> > [^(^\<a href=)][^(\>)]($keyword)[^(a\>)$]
>
> This expression is nowhere near what you want. You should look into
> zero-length lookahead/behind patterns. But regardless, I don't think
> you're going to find a reasonable RegEx for handling this.
>
> David
>
> >
>
>
> _______________________________________________
> Ale mailing list
> Ale at ale.org
> http://www.ale.org/mailman/listinfo/ale
--
Keith Morris <graphicsguy at charter.net>