Steve Eichert on 1 Aug 2008 07:48:53 -0700



Re: collective intelligence - bayes theorem help

  • From: "Steve Eichert" <steve.eichert@gmail.com>
  • To: philly-lambda@googlegroups.com
  • Subject: Re: collective intelligence - bayes theorem help
  • Date: Fri, 1 Aug 2008 10:48:44 -0400
  • Mailing-list: list philly-lambda@googlegroups.com; contact philly-lambda+owner@googlegroups.com
  • Reply-to: philly-lambda@googlegroups.com
  • Sender: philly-lambda@googlegroups.com

I realize they aren't independent. I was just pointing out the problem with my original, very simplified approach, which assumed they were independent and was therefore flawed.

On Fri, Aug 1, 2008 at 10:35 AM, Jonathan Tran <jonnytran@gmail.com> wrote:

> Right, and what if, in my example, state doesn't impact the probability at
> all?  If belonging to Philly Lambda is the key determining factor, then
> taking state into account only throws us out of whack.

Are you saying you flat out _know_ that state doesn't impact the
probability?  ... as in, you're making that simplification?  Or are
you asking, "What if the state doesn't really matter?  How will that
affect the calculation?"

Because I was going to say that, as Gabriel pointed out, state and
group are not necessarily independent.  ...in which case, your
assumption is incorrect.

P(B and C) = P(B)*P(C) if B and C are independent.
But P(B and C) is not generally equal to P(B)*P(C)

(This is true of any B, C, not just yours.)
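
To make the product rule concrete, here is a minimal sketch on a made-up dataset. The records, the `people` list, and the `prob` helper are all assumptions for illustration and do not come from the thread's data. B stands for "lives in NJ" and C for "is a member of the group".

```python
from fractions import Fraction

# Hypothetical toy data: (state, is_member). Invented for illustration only.
people = [
    ("NJ", True), ("NJ", True), ("NJ", True), ("NJ", False),
    ("FL", True), ("FL", False), ("FL", False), ("FL", False),
]

def prob(event):
    """Empirical probability of an event over the toy dataset."""
    return Fraction(sum(1 for p in people if event(p)), len(people))

p_b = prob(lambda p: p[0] == "NJ")                   # P(B): lives in NJ
p_c = prob(lambda p: p[1])                           # P(C): is a member
p_b_and_c = prob(lambda p: p[0] == "NJ" and p[1])    # P(B and C), counted directly

print("P(B)*P(C)  =", p_b * p_c)
print("P(B and C) =", p_b_and_c)
```

In this toy data, membership is more common among the NJ rows, so the directly counted P(B and C) comes out larger than the product P(B)*P(C). With data where state and membership were unrelated, the two numbers would agree.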

Similarly,
P(A|B and C) is not generally equal to P(A|B)*P(A|C).

If you want to put attributes together like that, you literally have
to go through your dataset and recompute the probability of following
a person given each combination of attributes, since (I'm guessing)
the attributes are not independent.  Someone might be more likely to
be in Philly Lambda if they live in NJ, as opposed to FL, and so P(B
and C) will be different than P(B)*P(C).  (But then again, maybe you
don't care about people living outside of driving range from Philly
Lambda, so as a simplification, you might say they're independent.)
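
Going through the dataset and recomputing per combination could look roughly like the sketch below. It assumes each record is a (state, member, followed) tuple. That layout, the sample rows, and the function name `follow_prob_by_combo` are hypothetical and not taken from the actual data being discussed.

```python
from collections import defaultdict

# Hypothetical records: (state, in_philly_lambda, followed). Invented data.
records = [
    ("NJ", True, True), ("NJ", True, True), ("NJ", True, False),
    ("NJ", False, False),
    ("FL", True, True),
    ("FL", False, False), ("FL", False, False), ("FL", False, True),
]

def follow_prob_by_combo(rows):
    """Estimate P(followed | state, member) by counting each combination."""
    counts = defaultdict(lambda: [0, 0])  # (state, member) -> [followed, total]
    for state, member, followed in rows:
        tally = counts[(state, member)]
        tally[0] += int(followed)
        tally[1] += 1
    return {combo: f / t for combo, (f, t) in counts.items()}

table = follow_prob_by_combo(records)
for combo, p in sorted(table.items()):
    print(combo, round(p, 3))
```

Each (state, member) pair gets its own estimate, so nothing is assumed about independence. The cost is that rare combinations have very few rows behind them, so their estimates are noisy.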