yegg on 1 Aug 2008 06:14:13 -0700
It doesn't impact the probability at all. In your case, the answer is
100%. Everyone in your sample follows Toby. So no other factors matter.
Take NJ. Everyone who has both NJ and Lambda in your sample (Aaron)
follows Toby. Once you know they are in lambda, that's the end of it.

I can understand why you don't want to, but I really think it helps to
try to write out the equations and, in so doing, define your universe
explicitly. In my original email, I defined it as the set of Twitter
users, which may or may not be what you want. But note that is way
different than the set of lambda subscribers, i.e.

P(NJ|lambda) ~ 25%?
P(NJ|twitter) ~ 3%?
P(toby|lambda) = 1
P(toby|twitter) = 144/# of twitter users

It may help you to just write out all the combinations (extending the
above) and see what you know and don't know. Then you can try to apply
Bayes' theorem and other formulas to get a sense for what is going on.

On Jul 31, 5:11 pm, Steve Eichert <steve.eich...@gmail.com> wrote:
> Right, and what if in my example, state doesn't impact the
> probability at all. If belonging to Philly lambda is the key
> determining factor then taking state into account only throws us out
> of whack.
>
> I'll have to take a look at what's available in excel, perhaps it will
> help me understand.
>
> Steve
>
> On Jul 31, 2008, at 4:15 PM, yegg <gabriel.weinb...@gmail.com> wrote:
>
> >> So there would be a 50% chance that Jonathan follows Toby given
> >> that he's from NJ. So from what I understand, in order to find the
> >> probability that Jonathan follows Toby given that he's in Philly
> >> Lambda, and he's from NJ I would multiply the probabilities of each
> >> together.
> >
> >> 1 * .5 = 50%
> >
> >> So I think that I could say that there's a 50% chance that a person
> >> from NJ and in Philly Lambda follows Toby. Is that correct given
> >> this simplistic approach, or am I doing something wrong?
> >
> > This assumes the attributes are completely independent of each other.
> > Take the case of language attributes, e.g. who uses Perl and Lisp.
> > Suppose both were 50% (of the people who follow Toby). By this logic,
> > the final probability would be 25%. But what if the exact same people
> > who use Lisp also use Perl, then the real answer would be 50% because
> > the additional attribute tells you nothing. It would only be 25% if
> > they were completely independent.
> >
> >> I don't know anything about the other stuff you mentioned (Bayes
> >> classifier, regression analysis) so I'll have to try and read a bit
> >> about them and see how I may be able to use them.
> >
> > You can do this in Excel. The help is helpful. Basic linear
> > regression is built in. To do more advanced stuff, do Tools->Add Ins,
> > add Analysis and Solver. Then you can do Tools->Data Analysis.
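
For anyone who would rather count than write equations, here is a rough Python sketch of the two points made in this thread: multiplying probabilities is only right when the attributes are independent, and the answer depends on which universe you condition on. The two Perl/Lisp populations and the second lambda member are made up for illustration, and the `prob` helper is hypothetical. Only Aaron (NJ, lambda, follows Toby) comes from the sample discussed above.

```python
from fractions import Fraction


def prob(people, event, given=lambda p: True):
    """Estimate P(event | given) by counting over an explicit universe."""
    universe = [p for p in people if given(p)]
    if not universe:
        raise ValueError("conditioning event never occurs in this universe")
    return Fraction(sum(1 for p in universe if event(p)), len(universe))


def uses_perl(p):
    return p["perl"]


def uses_lisp(p):
    return p["lisp"]


def uses_both(p):
    return p["perl"] and p["lisp"]


# Hypothetical universe A: the Perl users and the Lisp users are the same people.
overlapping = [
    {"perl": True, "lisp": True},
    {"perl": True, "lisp": True},
    {"perl": False, "lisp": False},
    {"perl": False, "lisp": False},
]

# Hypothetical universe B: knowing Perl tells you nothing about Lisp.
independent = [
    {"perl": True, "lisp": True},
    {"perl": True, "lisp": False},
    {"perl": False, "lisp": True},
    {"perl": False, "lisp": False},
]

for name, people in [("overlapping", overlapping), ("independent", independent)]:
    naive = prob(people, uses_perl) * prob(people, uses_lisp)  # assumes independence
    actual = prob(people, uses_both)                           # counted directly
    print(f"{name}: naive product = {naive}, actual joint = {actual}")

# Illustrative lambda sample in which everyone follows Toby.
# Aaron is from the thread; the second member is invented.
lambda_sample = [
    {"name": "Aaron", "state": "NJ", "follows_toby": True},
    {"name": "hypothetical member", "state": "PA", "follows_toby": True},
]

# P(toby | NJ) within the lambda universe: state adds nothing here.
p_toby_given_nj = prob(
    lambda_sample,
    event=lambda p: p["follows_toby"],
    given=lambda p: p["state"] == "NJ",
)
print("P(toby | NJ, lambda) =", p_toby_given_nj)
```

In both made-up populations the naive product is 1/4, but the directly counted joint probability is 1/2 when the two groups overlap completely, which is the Perl/Lisp situation described in the quoted message. Swapping in a different list of people (all Twitter users rather than lambda members, say) changes the universe and therefore the conditional probabilities.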