[ale] hmm. yer never too old to trip on Grep Reg Expressions

DJ-Pfulio djpfulio at jdpfu.com
Sun Sep 25 09:12:32 EDT 2016


Good explanation, Charles, but something looks funny. I created some files to test:

 $ touch aaaa bb ccc i

# Then tried the regex:
 $ ls -1 |grep "[abc]+"
      <nothing returned>


 $ ls -1 |grep "[abc]"
 aaaa
 bb
 ccc


 $ ls -1 |grep "[abc]*"
 aaaa
 bb
 ccc
 i

# But if we use egrep (or grep -E, if you like):
 $ ls -1 |egrep "[abc]+"
 aaaa
 bb
 ccc
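
A small sketch of what seems to be going on, assuming GNU grep (the variable
names are only for illustration). In plain grep (basic regex), + is an ordinary
character, so "[abc]+" asks for one of a, b, c followed by a literal plus sign.
With -E, + means "one or more". And "[abc]*" matches every line, including i,
because * allows zero repetitions and an empty match still counts.

```shell
# Feed the same four names to grep without needing real files.
names=$(printf '%s\n' aaaa bb ccc i)

# BRE (plain grep): + is a literal plus sign, so nothing matches.
# "|| true" only keeps grep's "no match" exit status from stopping a script.
bre_plus=$(printf '%s\n' "$names" | grep '[abc]+' || true)

# ERE (grep -E, same as egrep): + means one or more of the preceding item.
ere_plus=$(printf '%s\n' "$names" | grep -E '[abc]+' || true)

# Portable BRE spelling of "one or more": the bracket once, then starred.
bre_portable=$(printf '%s\n' "$names" | grep '[abc][abc]*' || true)

# * allows zero repetitions, so even "i" matches (on an empty string).
bre_star=$(printf '%s\n' "$names" | grep '[abc]*' || true)

printf 'BRE +:\n%s\nERE +:\n%s\n' "$bre_plus" "$ere_plus"
```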

Gotta know which regex engine is being used. ;)
In perl, I've used numbers after an [] group to say exactly how many of _those
things_ I needed. That didn't work with grep/egrep. Don't know why, just know it
didn't.
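
A guess at the cause, assuming a current GNU grep (older egreps may well have
behaved differently): the Perl-style {n} count does exist in grep, but the
spelling depends on the mode. Plain grep wants the braces escaped, \{n\}, while
grep -E takes them bare as Perl does. The match is also unanchored, so a count
of 3 is found inside aaaa unless ^ and $ pin it down. Variable names below are
only for illustration.

```shell
names=$(printf '%s\n' aaaa bb ccc i)

# BRE: bare braces are literal characters, so this finds nothing here.
bre_bare=$(printf '%s\n' "$names" | grep '^[abc]{3}$' || true)

# BRE: escaped braces form the repeat count.
bre_count=$(printf '%s\n' "$names" | grep '^[abc]\{3\}$' || true)

# ERE: bare braces work, as in Perl.
ere_count=$(printf '%s\n' "$names" | grep -E '^[abc]{3}$' || true)

# Without anchors, "three in a row" is also found inside aaaa.
ere_loose=$(printf '%s\n' "$names" | grep -E '[abc]{3}' || true)

printf 'anchored: %s\nunanchored:\n%s\n' "$ere_count" "$ere_loose"
```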

 $ ls -1 |grep "[abc]c"
 ccc
# and
 $ ls -1 |grep "^[abc]c"
 ccc

I should also mention that piping ls output into grep is only there so the
matching is done by grep rather than by bash globbing. That ls option is
-1 (one), not -l (el).
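
To make the glob-versus-regex difference concrete, here is a rough sketch using
the same four test files in a scratch directory (the mktemp directory and the
variable names are my own choices for illustration). The same characters,
[abc]*, mean different things depending on whether bash or grep reads them.

```shell
# Work in a scratch directory so the test files don't clutter anything.
start_dir=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch aaaa bb ccc i

# Unquoted, the shell expands [abc]* as a glob before any command runs:
# names that start with a, b, or c.
glob_hits=$(printf '%s\n' [abc]*)

# Quoted, the pattern reaches grep intact and is read as a regex:
# every line matches, because * permits zero repetitions.
regex_hits=$(ls -1 | grep '[abc]*')

printf 'glob:\n%s\nregex:\n%s\n' "$glob_hits" "$regex_hits"

# Clean up and go back to where we started.
cd "$start_dir" && rm -rf "$tmpdir"
```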

For a few years, I had to create some very nasty regexes to match patterns in govt
documents ... so we could hyperlink the ToC and Index entries into the document
at the correct page/paragraph. Nothing like experience to teach. About 200 docs
per flight, so lots of variability.  Adobe Type 3 fonts really screwed with our
regexes since they aren't really letters to the computer. ;(

On 09/23/2016 09:01 AM, Charles Shapiro wrote:
> 
> Ah, regex golf.  Try 'def.*buff.*for.*ALTPLAN'  Use "grep -i" to ignore case. 
> Your initial regexp used *file* glob syntax, where "*" means any characters,
> any length.  In the proper formal dialects, "*" merely means any number of the
> preceding RE, and the "." means any character. Hence, "foo*" in the shell
> matches "fooa", "foob", et cetera.  But in regex, it matches only "fo", "foo",
> "fooo", et cetera. Watch out for quoting in the shell also; that's why I used
> single-quotes.  Knowing just a few REs can carry you a surprising distance.
> [abc] matches a single character: a, b, or c.  So "[abc]+" matches aaaa, bb, or
> ccc but not i.
> 
> 
> This worked for me on the following file:
> 
> define buffer snort for ALTPLAN
> DEFINE BUFFER BOOF for ALTPLAN
> FOO
> 
> !:/home/cshapiro/Mapping_Contracts/forsythco> grep -i '^def.*buf.*for ALTPLAN' foo.txt
> define buffer snort for ALTPLAN
> DEFINE BUFFER BOOF for ALTPLAN
> 
> For extra fun, try the regex golf site ( http://www.regex.alf.nu/ ).
> 
> -- CHS
> 
> 
> On Thu, Sep 22, 2016 at 8:35 PM, DJ-Pfulio <DJPfulio at jdpfu.com> wrote:
> 
>     I'd use perl. Trivial to read a file, find the lines matching any
>     complex regex you like, back up 3 lines and print the following 14 lines.
>     Don't forget to handle lines that happen inside the group to be
>     exported. Would be good to show file:linenum:LINE so it is clear -
>     perhaps highlight the actual line with << >> - idunno.
> 
>     I like Leam's regex except the leading ^ and trailing $ - these things
>     don't need to start in col-1 or end of line. Otherwise, probably
>     restrictive enough to minimize unwanted output.
> 
>     On 09/22/2016 07:30 PM, Leam Hall wrote:
>     > Why not "^def*buff*altplan$"? Then grep v out things you don't want.
>     >
>     > On 09/22/16 14:46, Neal Rhodes wrote:
>     >> So, I need to look in about a bazillion source files for variants of
>     >>
>     >>     DEFINE BUFFER SNORT FOR ALTPLAN.
>     >>     Define Buffer Blech for AltPlan.
>     >>     Def    Buff   Blurf for AltPlan.
>     >>     Def Buff Blurf for AltPlan.
>     >>     def buff blurf for altplan.
>     >>     define buff blurf for altplan.
>     >>     define                      buffer                   blorf for altplan.
>     >>     define  new shared buffer                   blorf for altplan.
>     >>
>     >> And grab 3 lines before, 10 lines afterwards, source file and line#.
>     >>
>     >> I was thinking this would do it:
>     >>
>     >>     grep -i -B 3 -A 10 -H -n -r -f buf-grep.inp * > buf.grep.out
>     >>
>     >> Where buf-grep.inp was
>     >>
>     >>     def*buff*for*ALTPLAN
>     >>
>     >>     def*buff*for*ARM
>     >>
>     >>     def*buff*for*ARMNOTE
>     >>
>     >> Alas it is not thus, and the more I study the reg exp notes the more I
>     >> see the error of my ways, and the less I see an expression that would
>     >> work.
>     >>


