I'm no probability expert either, but have you tried solving the
conditional probability formula (Bayes' rule) for P(A)? As in...
P(A|B) = P(B|A)*P(A) / P(B)
P(A) = P(A|B)*P(B) / P(B|A)
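A quick way to sanity-check the rearrangement above is to plug in numbers. This is only a sketch: the function name prior_from_bayes and the three probabilities are made up for illustration, not taken from the Philly Lambda data.

```python
from fractions import Fraction


def prior_from_bayes(p_a_given_b, p_b, p_b_given_a):
    """Solve P(A|B) = P(B|A)*P(A) / P(B) for P(A)."""
    if p_b_given_a == 0:
        # The rearranged formula divides by P(B|A), so it must be nonzero.
        raise ValueError("P(B|A) must be nonzero to solve for P(A)")
    return p_a_given_b * p_b / p_b_given_a


# Hypothetical inputs, chosen only so the arithmetic is easy to follow.
p_a_given_b = Fraction(1, 2)
p_b = Fraction(2, 5)
p_b_given_a = Fraction(4, 5)

p_a = prior_from_bayes(p_a_given_b, p_b, p_b_given_a)

# Plugging P(A) back into the original form should recover P(A|B).
recovered = p_b_given_a * p_a / p_b
```

Using Fraction instead of floats keeps the round trip exact, so recovered comes out equal to p_a_given_b with no rounding noise.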
Not yet, but I had someone else make this suggestion as well, so I'll give that a whirl.
ASCII text may be a little misleading. Your events are actually
parameterized over X and Y. Normally this would be written B_X (TeX),
as in B with subscript X, to represent the event that Person X is in
the Philly Lambda user group.
ASCII text is actually probably better for me, since throwing in all the mathematical symbols tends to only confuse the situation further.
The reason I bring this up is that to figure out P(A|B), we would
take the number of people in Philly Lambda who identify person Y, and
divide by the number of people in Philly Lambda. And this would be
different for each person Y.
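The counting described above might look roughly like this. It is a sketch under assumptions: the names (alice, bob, dave, ...), the identified_by mapping, and the function name are all hypothetical, and it assumes we know who every group member identifies.

```python
from fractions import Fraction


def p_identifies_given_member(identified_by, group_members, person_y):
    """Estimate P(A|B) for one person Y: the share of group members who identify Y."""
    if not group_members:
        raise ValueError("group has no members")
    # Count only people in the group (that is the conditioning on B).
    hits = sum(1 for x in group_members
               if person_y in identified_by.get(x, set()))
    return Fraction(hits, len(group_members))


# Hypothetical data: who each person X identifies.
identified_by = {
    "alice": {"dave"},
    "bob": {"dave"},
    "carol": {"erin"},
    "frank": {"dave", "erin"},  # frank is not in the group, so he is ignored
}
philly_lambda = {"alice", "bob", "carol"}

p_dave = p_identifies_given_member(identified_by, philly_lambda, "dave")
p_erin = p_identifies_given_member(identified_by, philly_lambda, "erin")
```

Here p_dave and p_erin come out different even though the group is the same, which is the "different for each person Y" point.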
The weird thing, which I think may be the cause of your confusion, is
that for some people, we don't know who they identify. We can't
really compute P(A|B) as I described. So do we include them in the
total number of people in Philly Lambda? ... You see what I mean?
Because we don't know who they identify, we can't really say.
This is definitely the point at which I started to get confused and lose hold of what I was trying to figure out.
A simplification might be to exclude the people who did not respond
from the dataset. Use that dataset to compute the probabilities.
Then predict the ones who didn't respond from that. I think this
makes sense because it's like spam filtering. You use all the emails
you've seen before to create the probability predictors. Then you use
those predictors to classify new emails, where you really don't know
whether they are spam or not.
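Here is a rough sketch of that simplification, assuming a non-response is recorded as None (as opposed to an empty set, which means "responded, identified nobody"). The data, the names, and the functions fit_rates and predict_nonrespondents are hypothetical. The "prediction" is just the rate estimated from respondents, which is about the simplest predictor one could use.

```python
from fractions import Fraction


def fit_rates(responses, candidates):
    """Per candidate Y, the share of respondents who identify Y.

    Members whose response is None (did not respond) are excluded
    from both the numerator and the denominator.
    """
    answered = [ids for ids in responses.values() if ids is not None]
    if not answered:
        raise ValueError("no respondents to estimate from")
    return {
        y: Fraction(sum(1 for ids in answered if y in ids), len(answered))
        for y in candidates
    }


def predict_nonrespondents(responses, rates):
    """Give each non-respondent the estimated rates as predicted probabilities."""
    return {x: dict(rates) for x, ids in responses.items() if ids is None}


# Hypothetical survey of group members; frank did not respond.
responses = {
    "alice": {"dave"},
    "bob": {"dave", "erin"},
    "carol": set(),
    "frank": None,
}

rates = fit_rates(responses, ["dave", "erin"])
predictions = predict_nonrespondents(responses, rates)
```

With this data the rates are computed over the three respondents only, and frank gets those same rates as his predicted probabilities of identifying dave and erin.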
Makes sense. I'll take a stab at this approach and see how I make out. Thanks for the thoughts.