Kyle R. Burton on 31 Jul 2008 11:18:02 -0700
Two things. First, can we _please_ have more of this kind of discussion
on the list? This is an area where I have little to no background, and
I very much value this kind of discussion.

Second, Steve: based on what you learn, can you do a presentation / talk
about using conditional probability for social networking
recommendations (at least, what you've posted sounds like this)? I'm sure
our group will be forgiving about your level of experience. What you'll
have learned will be new and valuable to most of us (I'm assuming).

Thanks!

Kyle

On Thu, Jul 31, 2008 at 2:03 PM, Jonathan Tran <jonnytran@gmail.com> wrote:
>
> On Wed, Jul 30, 2008 at 9:24 PM, Steve Eichert <steve.eichert@gmail.com> wrote:
>> A = Person X will identify Person Y
>> B = Person X is in the Philly Lambda user group
>>
>> However, in order to take this approach I believe I would need to know the
>> probability that person X will identify person Y, which is what I'm trying
>> to figure out.
>
> I'm no probability expert either, but have you tried solving the
> conditional probability formula for P(A)? As in...
>
> P(A|B) = P(B|A)*P(A) / P(B)
>
> P(A) = P(A|B)*P(B) / P(B|A)
>
> ASCII text may be a little misleading. Your events are actually
> parameterized over X and Y. Normally this would be written B_X (TeX),
> as in B with subscript X, to represent the event that Person X is in
> the Philly Lambda user group.
>
> The reason I bring this up is because to figure out P(A|B), we would
> take the number of people in Philly Lambda who identify person Y, and
> divide by the number of people in Philly Lambda. And this would be
> different for each person Y.
>
> The weird thing, which I think may be the cause of your confusion, is
> that for some people, we don't know who they identify. We can't
> really compute P(A|B) as I described. So do we include them in the
> total number of people in Philly Lambda? ... You see what I mean?
> Because they are unknown for who they identify, we can't really say.
>
> A simplification might be to exclude the people who did not respond
> from the dataset. Use that dataset to compute the probabilities.
> Then predict the ones who didn't respond from that. I think this
> makes sense because it's like spam filtering. You use all the emails
> you've seen before to create the probability predictors. Then you use
> those predictors to classify new email, which you really don't know
> whether they are spam or not.
>

--
------------------------------------------------------------------------------
Wisdom and Compassion are inseparable.
        -- Christmas Humphreys
kyle.burton@gmail.com                            http://asymmetrical-view.com/
------------------------------------------------------------------------------
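
For anyone following along, here is a rough sketch in Python of the counting approach described in the quoted message: drop the people who did not respond, estimate the probabilities from the rest, and check that the rearranged formula for P(A) holds. The data and the names `responses`, `in_group`, and `estimate` are made up for illustration and are not from Steve's actual dataset. Treating "did not respond" as `None` is an assumption.

```python
from fractions import Fraction

# Hypothetical survey data: for each person X, the set of people X identified,
# or None if X did not respond (we don't know who they identify).
responses = {
    "alice": {"dave", "erin"},
    "bob": {"dave"},
    "carol": None,   # did not respond
    "dave": {"erin"},
    "erin": set(),   # responded, but identified nobody
}

# Hypothetical group membership (event B: X is in the user group).
in_group = {"alice", "bob", "carol", "dave"}


def estimate(y, responses, in_group):
    """Estimate P(A), P(B), P(A|B), P(B|A) for a fixed person Y by counting.

    A = "X identifies Y", B = "X is in the group". Non-respondents are
    excluded from every count. This sketch assumes at least one respondent
    is in the group and at least one respondent identifies Y; otherwise
    Fraction raises ZeroDivisionError.
    """
    respondents = {x for x, ids in responses.items() if ids is not None}
    a = {x for x in respondents if y in responses[x]}  # respondents who identify Y
    b = respondents & in_group                         # respondents in the group
    p_a = Fraction(len(a), len(respondents))
    p_b = Fraction(len(b), len(respondents))
    p_a_given_b = Fraction(len(a & b), len(b))
    p_b_given_a = Fraction(len(a & b), len(a))
    return p_a, p_b, p_a_given_b, p_b_given_a


p_a, p_b, p_a_given_b, p_b_given_a = estimate("dave", responses, in_group)
print("P(A|B) for dave:", p_a_given_b)
# The rearranged formula from the thread: P(A) = P(A|B)*P(B) / P(B|A)
print("identity holds:", p_a == p_a_given_b * p_b / p_b_given_a)
```

Under this simplification, the estimated P(A|B) for a given Y would be the predicted chance that a non-respondent in the group (here, "carol") identifies Y. Exact fractions are used only so the identity check is not muddied by floating-point rounding.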