Twins

The twins example

In a first course in Probability one gives simple examples (e.g. the Monty Hall Problem) where the formal math setup of events, conditioning, etc. helps you find the right answer. But I assert one can find just as many examples where the formal math setup can lead you astray, bearing in mind the principle:

An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem.

Let's make up a hypothetical, though not unrealistic story.

In casual conversation with a stranger in the next airplane seat, you ask "do you have children?" and the response is "yes, two boys". Based on this, what is the chance that the two boys are twins?

This is a fairly realistic conversation. Of course, if you cared about twins you could just ask, so thinking about probabilities here is artificial, but let's do it anyway. We're looking for an approximate answer, and I won't discuss all the common sense approximations being made.

First we need the empirical frequency of a birth giving twin boys, which turns out to be about 5/1000. Second, amongst 2-child families in general, the frequency of "2 boys" must be about 1/4. So we can estimate the population ratio

(number of families with twin boys, no other children)/ (number of families with two boys, no other children) = approx 2%.

(Note that the "interesting" implicit assumption here is that having twins rather than non-twins doesn't affect the frequency of having additional children.) So we interpret this ratio of frequencies as a conditional probability
P(twins | 2 boys, no other children) = approx 2%
and then give "about 2%" as the answer to the original question.
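
The arithmetic behind the 2% figure is not spelled out; one plausible reading, assuming the ratio is simply the twin-boy frequency divided by the "2 boys" frequency quoted above, is

$$
\frac{\text{frequency of twin boys}}{\text{frequency of ``2 boys''}} \approx \frac{5/1000}{1/4} = \frac{20}{1000} = 2\%.
$$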

Analysis. This answer is drastically wrong. The information we have is not "2 boys, no other children" but the specific form of the response: "yes, two boys". A person with twins might well have mentioned this fact in their response, e.g. "yes, twin boys". And a person with non-twins might have answered in a way that implied non-twins, e.g. "Sam's in college and Jim's in high school". Perhaps the best answer to the original question is

2% times p/q, where
p is the chance that a person with twins answers in such a way that you can't infer twins,
q is the chance that a person with non-twins answers in such a way that you can't infer non-twins.

Common experience is that people with twins would actually tell you, so let me guess $p = 1/8$ and $q = 1/2$. Then p/q = 1/4, and 2% times 1/4 leads to my answer "0.5%", which I would bet a small amount of money on.
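
The rule "2% times p/q" can be read as a shortcut for Bayes' rule that works when the prior is small. Here is a minimal sketch comparing the two, using the guessed values of p and q from the text. The function names are hypothetical, and treating the 2% as the prior probability of twins (given two boys and no other children) is an assumption of this sketch.

```python
def posterior_twins(prior, p, q):
    """Full Bayes' rule: P(twins | reply reveals neither twins nor non-twins)."""
    twins_and_reply = prior * p            # twins, and the reply hides it
    nontwins_and_reply = (1 - prior) * q   # non-twins, and the reply hides it
    return twins_and_reply / (twins_and_reply + nontwins_and_reply)


def shortcut(prior, p, q):
    """The approximation used in the text: prior times p/q."""
    return prior * p / q


prior = 0.02   # P(twins | 2 boys, no other children), from the text
p = 1 / 8      # guessed in the text
q = 1 / 2      # guessed in the text

exact = posterior_twins(prior, p, q)
approx = shortcut(prior, p, q)

print(f"shortcut (prior * p/q): {approx:.4%}")
print(f"full Bayes' rule:       {exact:.4%}")
```

Both figures come out at about 0.5%. The shortcut is very slightly lower because it drops the twins term from the denominator, which hardly matters when the prior is only 2%.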

Commentary. The abstract point is that, for doing probability calculations

"knowing a certain piece of information" is not always the same as "being told this piece of information by a truthful person".

In everyday life we realize this. If you ask a politician in government "has unemployment gone up?" and they answer "long-term unemployment has gone down" then you can be fairly confident that short-term unemployment has indeed gone up. The reason students will (I am willing to bet) get the present example wrong is the convention that one should abandon common sense when entering a math class.

Rohit Parikh comments: in analytic philosophy, this is an implicature: "something meant, implied, or suggested distinct from what is said".