Jeff Abrahamson on Wed, 19 Mar 2003 09:21:06 -0500
On Wed, Mar 19, 2003 at 07:09:21AM -0500, Tobias DiPasquale wrote:
> [19 lines, 152 words, 830 characters] Top characters: etoanrsh
>
> On Wed, 2003-03-19 at 02:09, sean finney wrote:
> > On Tue, Mar 18, 2003 at 11:14:39PM -0500, Jeff Abrahamson wrote:
> > > words, 141 characters). It's attached, for your amusement. I think
> > > it's cool to know just how many words you wrote me when I read your
> > > email. I'm trying to think of more interesting analyses to do.
> >
> > if you want to go really crazy with it, how about an analysis for word
> > frequency? you could keep word counts of this list in some kind of
> > giant histogram, and then have a program that tries to guess the
> > sender from the content of his/her message :)
>
> What you've described is naive Bayes. It's in use in programs like
> POPFile and Bogofilter already, bogofilter having built-in hooks into
> mutt currently.

Although, before receiving Sean's mail, I did add the "top eight
characters" header. It's not useful, but it amuses me.

Note that the results of auto-view actions show up in quoted replies.
This has highlighted for me people who don't trim their quoting,
because '>' becomes the most common character.

--
Jeff

Jeff Abrahamson <http://www.purple.com/jeff/>
GPG fingerprint: 1A1A BA95 D082 A558 A276 63C6 16BF 8C4C 0D1D AE4B

Attachment:
pgpPxUOC2Gisj.pgp
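
For anyone curious how a summary line like "[19 lines, 152 words, 830 characters] Top characters: etoanrsh" might be produced, here is a minimal sketch in Python. It is not the script attached earlier in the thread, which is not reproduced here. The function name summarize, the lowercasing, the exclusion of whitespace from the ranking, and the alphabetical tie-break are all assumptions made for illustration.

```python
from collections import Counter


def summarize(body: str, top_n: int = 8) -> str:
    """Build a one-line summary of a message body.

    Hypothetical helper: the counting rules are assumptions,
    not the behaviour of the script discussed in the thread.
    """
    lines = body.splitlines()
    words = body.split()  # whitespace-separated tokens
    # Rank characters case-insensitively, ignoring whitespace (assumption).
    counts = Counter(ch for ch in body.lower() if not ch.isspace())
    # Sort by descending count, then by character so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top = "".join(ch for ch, _ in ranked[:top_n])
    # The character total counts every character, whitespace included.
    return (
        f"[{len(lines)} lines, {len(words)} words, "
        f"{len(body)} characters] Top characters: {top}"
    )


if __name__ == "__main__":
    # Untrimmed quoting pushes '>' to the front of the ranking.
    print(summarize("> a\n> b\n> c\n"))
```

Fed a reply that consists mostly of untrimmed quoting, a sketch like this ranks '>' first, which is the effect described above.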