[ale] article feedback...
David S. Jackson
dsj at sylvester.dsj.net
Tue May 27 14:36:53 EDT 2003
Hi,
If you have time, would you take a look at this article for a
sanity check please? I'd appreciate it. In particular, I've
transformed it from latex and dvi, and some of the tools munge
metacharacters pretty badly. I've tried to catch everything, but
in case I haven't, please beware. :-)
TIA!
PS. I was just taking a last quick look and found that ~ and \
characters got munged. Pipes also got munged, so there might be
a few of those left that I missed...
--
David S. Jackson dsj at dsj.net
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Life is divided into the horrible and the miserable.
-- Woody Allen, "Annie Hall"
Mail Help
Sometimes reading and managing mail can take a lot of time. If you're
using a good mail user agent that can take advantage of external programs
on a UNIX/Linux host, there's quite a bit of power you can employ to help
you manage your mail.
You probably already use some sort of procmail-based spam filter, such
as SpamAssassin (www.spamassassin.org) or junkfilter (junkfilter.zer0.org).
If so, you probably have to collect your spam into a spam folder, which you
then have to sift through for any friends who accidentally got filtered into
your spam bucket. I recommend using a "whitelist" and a "blacklist"
approach to your procmail filters, all in addition to your other spam filters.
You can incorporate this idea into your .procmailrc thusly:
### snip of sample .procmailrc ###
## Other important variables can be set above here...
## (See man procmailex for examples...)
PMDIR=$HOME/procmail
LOGDIR=$PMDIR/log
## Check senders against your "whitelist"
:0
* ? formail -x"From" -x"From:" | egrep -is -f $PMDIR/friends.txt
$HOME/inbox/personal
## Check senders against your "blacklist" of known spammers
:0
* ? formail -x"From" -x"From:" | egrep -is -f $PMDIR/spammers.txt
/dev/null
Usually you'll have to experiment with where in your .procmailrc to put all
your recipes. I normally put all the important recipes up front and put the
spam rules at the end of the file. The recipes above should probably be
near the beginning of your .procmailrc file, because they can actually
dispatch quite a lot of your mail early on, especially the important mail
from your friends.
Your "friends" and "spammers" files can simply be in
"user@domain.com" format, or you can use regular expressions, such as
".*@.*\.kr", which will dump anything from Korea. (Has anyone ever got
legitimate mail from a .kr domain?) There are lots of domains you can put
in this list, such as ".*@bonusoffers\.com", ".*valuerewards\.com" and so
on. (The backslash makes the dot match only a literal dot.)
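To sanity-check a list before procmail relies on it, you can run an address through the same kind of egrep test by hand. The following is only a sketch: the function name in_list and the sample entries are made up for illustration, and it assumes a grep that accepts -E, -i, -q and -f.

```shell
#!/bin/sh
# Sketch only: in_list and the sample entries below are hypothetical.

# in_list ADDRESS LISTFILE
# Succeeds if ADDRESS matches any pattern in LISTFILE, using a
# case-insensitive extended-regex match like the recipes above.
in_list() {
    printf '%s\n' "$1" | grep -E -i -q -f "$2"
}

# Build a throwaway blacklist so the example does not touch real files.
list=$(mktemp)
trap 'rm -f "$list"' EXIT
cat > "$list" <<'EOF'
.*@bonusoffers\.com
.*@.*\.kr
EOF

for addr in promo@bonusoffers.com someone@mail.example.kr friend@example.org; do
    if in_list "$addr" "$list"; then
        echo "$addr: listed"
    else
        echo "$addr: not listed"
    fi
done
```

With these sample entries the first two addresses are reported as listed and the third as not listed. One thing to watch for: a blank line in a list file is an empty pattern, and an empty pattern matches every address.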
Handy Utilities. Now that you're deleting a fair amount of mail before
you ever see it, you have to have some method of seeing what mail has been
deleted, just in case you delete legitimate mail by accident.
Assuming you have a $HOME/bin directory in your $PATH, you can put
little helper scripts there which you can call from your mail client. To see
what mail has been deleted, you can make a macro to call this little one
liner:
tac ~/procmail/log | grep -A1 dev/null | less
I simply put this in my .muttrc, since I use Mutt (a terrific MUA, by
the way. See www.mutt.org.). If you're using Mutt, you can bind macros to
certain keystrokes. For example, all my macros start with the plus sign:
"+". They always use two letters that have some sort of significance to
what the macro does. "+vm" tells Mutt to run Vi (my favorite editor) on
my .muttrc file; "+so" tells Mutt to "source" any new changes I've made to
my .muttrc file to make them active. Mutt syntax would be:
macro generic +lp "!tac ~/procmail/log | grep -A1 dev/null | less\n"
You could bind the macro to a more elaborate script by using this in your
.muttrc:
macro generic +ld "!~/bin/showdeleted.sh\n" # Look at deleted mail
The script "showdeleted.sh" could contain additional lines to further inform
you of what procmail has deleted for you.
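One possible shape for such a script is sketched below. It is an illustration, not the script from this article: the function name list_deleted and the default log path are assumptions, and it assumes the log holds procmail's usual three-line records (a "From" line, an indented "Subject:" line, and a further indented "Folder:" line).

```shell
#!/bin/sh
# Hypothetical showdeleted.sh: a sketch, assuming log records like
#   From sender@example.com  Tue May 27 14:36:53 2003
#    Subject: some subject
#     Folder: /dev/null        1234

# list_deleted LOGFILE
# Prints "sender | subject" for each message delivered to /dev/null,
# followed by a one-line count.
list_deleted() {
    awk '
        /^From /      { from = $2; subj = "(no subject)" }
        /^ Subject: / { subj = $0; sub(/^ Subject: /, "", subj) }
        /^  Folder: \/dev\/null/ { print from " | " subj; n++ }
        END           { printf "%d message(s) sent to /dev/null\n", n }
    ' "$1"
}

# The default log location is an assumption; pass another path as $1.
log=${1:-$HOME/procmail/log}
if [ -r "$log" ]; then
    list_deleted "$log" | ${PAGER:-less}
fi
```

Unlike the tac one-liner, this lists messages oldest first and pairs each sender with its subject, which makes a wrongly deleted friend easier to spot.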
Sometimes you will find a bunch of new spam has been added to your
spam folder, and you will want to add those addresses to your blacklist. I
wrote a small script called "getspamaddr.sh" which collects the addresses of
all mail in $HOME/inbox/spam and writes them to a temporary file,
$HOME/tmp/spammers.txt. The script compares each address in the
"From" field against addresses already in $HOME/procmail/spammers.txt
and $HOME/procmail/friends.txt to ensure that I don't add duplicates.
### snip of getspamaddr.sh ###
#!/bin/sh
# Static values: adjust as needed
newcount=0
regex="^From:"
spamfile=$HOME/inbox/spam
tmp_file=$TMP/tmp_addresses.txt
friends=$HOME/procmail/friends.txt
spammers=$HOME/procmail/spammers.txt
testfile=$TMP/spammers.txt # for testing only...
# Find new addresses in spam folder...
tail -n50000 ${spamfile} | grep ${regex} | \
sed 's/\(From: \).* [<]*\([^ ].*\@.*\.*\)[>]*$/\2/g' | \
sed 's/[<>]//g' | \
sed 's/\[mailto://g' | \
sed 's/\]//g' | \
sed 's/From: //g' | \
sed 's/^root root$//g' | \
sed 's/^.*=\([a-zA-Z0-9]*\@.*\..*\).*$/\1/g' | \
sort|uniq > $tmp_file
# See if address already exists in my database...
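# (Read the file by redirection rather than through a pipe, so the loop
# runs in this shell and $newcount survives for the summary below.)
while read -r address; do
    # Skip blank lines left behind by the sed filters above.
    [ -n "$address" ] || continue
    if grep -qi "$address" "$friends" || grep -qi "$address" "$spammers"; then
        echo "$address" already exists in database...
    else
        echo "$address" >> "$testfile" # testing only
        newcount=$((newcount+1))
        echo $newcount
    fi
done < "$tmp_file"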
# Output a summary of results...
echo "====================================================="
echo " Summary"
echo "====================================================="
echo
echo "$newcount entries out of $(wc -l $spammers| \
cut -c-8|sed 's/ //g') total entries"
echo
echo = = = = = = = = = = = = = = = = = = = = = = = = =
echo Any items appearing below represent duplicates
echo in your database:
echo = = = = = = = = = = = = = = = = = = = = = = = = =
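# Hide addresses seen exactly once; the trailing space keeps counts
# such as 10 or 12 from being hidden too.
sort < $testfile | uniq -c | grep -v "^ *1 " | more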
echo
echo End of duplicate listing...
echo
echo
Note that the check of an address already existing in your spammers or
friends file is probably too simplistic. For example, the check doesn't take
into account any regular expression symbols. So you could have a
".*@emailfraud.com" entry, but you'll still get the address
"abuser@emailfraud.com" listed in your summary. But output like this still
can be useful, because it tells you that something was wrong with your
procmail recipe in the first place. abuser@emailfraud.com should have been
deleted and should not have made it to your spam folder anyway. So,
duplicates in your address list can point to a problem in your procmail
recipes too, which can be helpful.
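If you would rather have the check honor regular-expression entries, one option is to turn the comparison around and use the list files as the patterns. The sketch below assumes every list entry is a valid extended regex; the function name new_addresses and the sample data are hypothetical.

```shell
#!/bin/sh
# Sketch only: new_addresses is a hypothetical helper.

# new_addresses CANDIDATES LISTFILE...
# Prints the candidate addresses that no entry in any LISTFILE matches,
# treating each list entry as a case-insensitive extended regex.
new_addresses() {
    candidates=$1
    shift
    patterns=$(mktemp)
    # Merge the lists, dropping blank lines (an empty pattern matches everything).
    cat "$@" | grep -v '^[[:space:]]*$' > "$patterns"
    if [ -s "$patterns" ]; then
        grep -E -i -v -f "$patterns" "$candidates"
    else
        cat "$candidates"
    fi
    rm -f "$patterns"
}

# Throwaway sample data, so nothing real is touched.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' '.*@emailfraud\.com' > "$tmp/spammers.txt"
printf '%s\n' 'pal@example.org' > "$tmp/friends.txt"
printf '%s\n' abuser@emailfraud.com pal@example.org stranger@example.net \
    > "$tmp/candidates.txt"

new_addresses "$tmp/candidates.txt" "$tmp/friends.txt" "$tmp/spammers.txt"
```

With this sample data only stranger@example.net is printed, because the other two candidates are already covered by a list entry. The trade-off is that you lose the hint about a misbehaving procmail recipe that the simpler check gives you.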
Still, you can try sorting your spammer addresses from the right side of
the '@' symbol by using this command to see which domains are sending
you the most spam:
sort -t @ -k 2.1 ~/tmp/spammers.txt | less
Sorting by domain will also tell you whether you have duplicate
blacklist entries from the same domain, which might better be blacklisted
with a blacklist entry for the entire domain. In other words, if you have
bob@abuser.com as well as jill@abuser.com as entries in your temporary
spammer file, perhaps it would be worthwhile to simply enter
.*@abuser.com in your blacklist instead of the individual entries. Your shell
script will have a little less work to do then.
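To get an actual count per domain rather than eyeballing the sorted list, something like the following sketch may help. The function name domain_counts is hypothetical, and it assumes a file with one user@domain address per line.

```shell
#!/bin/sh
# Sketch only: domain_counts is a hypothetical helper.

# domain_counts FILE
# Prints "count domain" lines, busiest domain first, for a file that
# holds one user@domain address per line.
domain_counts() {
    grep '@' "$1" |
        sed 's/.*@//' |
        tr '[:upper:]' '[:lower:]' |
        sort | uniq -c | sort -rn
}

# Example (the path is the temporary file used earlier in this article):
#   domain_counts ~/tmp/spammers.txt | head
```

Any domain with a count above 1 is a candidate for a single domain-wide blacklist entry.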
Closing notes. These tips are just the beginning. Once you get in the
habit of calling external scripts or macros, you'll think of lots of other uses
for them to help you in your efforts to streamline your email reading. For
example, I created a dedicated mail server in my office that does almost
nothing except fetch mail from my various accounts and sift through it for
junkmail. I get a lot of mail, so that little machine's CPU is normally
pegged.