Toby DiPasquale on 23 Jul 2007 15:49:15 -0000

Re: [PhillyOnRails] Somewhat OT, Statistics

On Mon, Jul 23, 2007 at 10:25:20AM -0400, Matt Hughes wrote:
> There is an interesting class that Stanford and Google are jointly
> presenting "STAT 202: Statistical Aspects of Data Mining."  The class
> videos are posted here:
> And the class page here:

Yeah, this is part of what I was referring to when I said "anything
interesting" in my last post. Data mining is a specialization of
statistics with some new terminology.

E.g. in marketing, there is a technique that marketers call
"segmentation". This is essentially the act of dividing your customers
or potential customers into groups by some set of attributes. However,
this is typically done in an ad hoc manner, based on assumptions made by
the person doing the segmenting. A better approach is to use legitimate
clustering techniques, so that the data tells you what clusters
("segments") exist and you don't have to guess.
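
To make that concrete, here is a rough sketch of one such technique,
k-means (Lloyd's algorithm), in plain Python. Everything in it is made up
for illustration: the kmeans() helper, the customer tuples (annual spend,
visits per month) and the choice of k=2 are all assumptions, not anything
from the class. On real data you'd also want to scale the attributes
first and try more than one value of k.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Minimal k-means (Lloyd's algorithm) over a list of numeric tuples."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        assignments = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return assignments, centroids

# Hypothetical customers: (annual spend in dollars, visits per month).
customers = [(120, 1), (150, 2), (130, 1), (900, 8), (950, 9), (880, 7)]

labels, centers = kmeans(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

The point is that nobody told the code where the line between "low
spenders" and "high spenders" falls; the grouping comes out of the data.
You still have to pick k, so some judgment remains, just less of it.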

Toby DiPasquale
