Paul.L.Snyder on Wed, 19 Mar 2003 11:00:17 -0500



Re: [PLUG] mutt text/* auto-view hack


On 19 March, 2003, "Jeff Abrahamson" <jeff@purple.com> wrote:

>>> On Tue, Mar 18, 2003 at 11:14:39PM -0500, Jeff Abrahamson wrote:
>>>> words, 141 characters). It's attached, for your amusement. I think
>>>> it's cool to know just how many words you wrote me when I read your
>>>> email. I'm trying to think of more interesting analyses to do.
[...]
> Although, before receiving Sean's mail, I did add the "top eight
> characters" header. It's not useful, but it amuses me. Note that the
> results of auto-view actions show up in quoted replies.
>
> This has highlighted for me people who don't trim their quoting,
> because '>' becomes the most common character.

You could add something like

  grep -v '^>\|^[[:alnum:][:punct:]]\+:[[:space:]]\|^[[:space:]]*$'

into the pipeline just before the wc.  This will give you a better
idea of how much the person actually wrote, instead of simply how big
the message is with all the quotes.  The regex strips out three kinds
of lines: those that begin with > (hopefully getting most quoted
lines, unless someone's using a non-standard quote character), header
lines, and all empty or whitespace-only lines.  The header part could
be tweaked a bit.  It can allow non-US-ASCII characters in header
field names, whereas the RFC allows only ASCII 33-126, inclusive,
excluding the colon.  It will also strip any line in the message body
that has a first word ending in a colon, which is undesirable.  A
perl one-liner might be better:

  perl -ne'if(/^\s*$/){$b=1;next}if($b){print unless /^>/}'

This skips everything up to the first empty line (which should bypass
all headers), then skips any empty lines or lines beginning with >,
and passes along all others.  If you want to include blank lines in
the line count, remove the 'next' from inside the braces; note that
the blank line ending the headers is then counted as well.  I'm sure
there's an awk-ward way of doing the same thing.
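
For what it's worth, here is one rough awk sketch of the same filter,
run against a made-up sample message.  The message text and the
variable names (msg, body, inbody) are invented for illustration, and
this is untested against what mutt actually hands to the pipeline:

```shell
# Hypothetical sample message: headers, a blank separator line, then a
# body mixing quoted and unquoted lines.
msg='From: someone@example.com
Subject: test

> quoted line
my first line

> another quote
my second line'

# Skip everything up to the first blank line, then print only lines
# that are neither blank nor start with ">".
body=$(printf '%s\n' "$msg" | awk '
  /^[ \t]*$/ { inbody = 1; next }   # blank line: we are past the headers; skip it
  inbody && !/^>/                   # in the body and not quoted: print the line
')

printf '%s\n' "$body"
# Word count of what the sender actually wrote.
printf '%s\n' "$body" | wc -w
```

As with the perl version, dropping the "next" would let blank lines
through to the count.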

Hmm...I am assuming that mutt is passing along some headers - I
haven't had a chance to try this out yet.  If there are no headers,
things are much easier:

  grep -v '^>\|^[[:space:]]*$'
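
A quick way to sanity-check that simpler filter, assuming a grep that
accepts \| in basic regexes (GNU grep does).  The sample text and the
variable names (reply, kept) are made up for the example:

```shell
# Hypothetical headerless message body (sample data, not from the thread).
reply='> quoted line
reply text here

> more quoting
and one more line'

# Drop quoted and blank lines; what remains is what the sender wrote.
kept=$(printf '%s\n' "$reply" | grep -v '^>\|^[[:space:]]*$')

printf '%s\n' "$kept"
# Line, word, and byte counts of the unquoted text.
printf '%s\n' "$kept" | wc
```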

Cool stuff.

Paul


_________________________________________________________________________
Philadelphia Linux Users Group        --       http://www.phillylinux.org
Announcements - http://lists.netisland.net/mailman/listinfo/plug-announce
General Discussion  --   http://lists.netisland.net/mailman/listinfo/plug