On coincidences

                                   
We often see in news or in history, or encounter in our personal experience, events actually happening which seem too incredibly unlikely to be explicable as "just chance". A long and continuing tradition outside mainstream science claims spiritual or paranormal significance to such coincidences or "miraculous" extraordinary events. A shorter tradition of rationalists -- mathematicians and statisticians -- seeks to debunk that claim, arguing that such events are indeed explicable as "just chance" with no other significance. This message
coincidences are more common than you imagine
is conveyed in many popular science books and is invariably illustrated (there, and in textbooks) by the birthday problem. Two recent popular books (David Hand's The Improbability Principle and Joseph Mazur's Fluke) are dedicated to this topic. I will not repeat matters which have been much discussed elsewhere, but merely add a few thoughts. Incidentally, the quote in the first picture is often incorrectly attributed to Albert Einstein.

The rationalist position is simply that there are a gazillion very-unlikely-but-possible coincidences that might happen but don't. We can't be aware of them all; we just notice the occasional one that does happen. Certainly I believe this "rationalist" position, but their claims to have "proved" this are rather disingenuous, and often unconvincing to others, as this humorous tale I tell in talks demonstrates.

                 
Applying mathematics to the kind of personally observed coincidences in the Cambridge Coincidences Collection is extremely difficult, and I will illustrate the issues in the examples below. In brief, when we observe something specific we can try to estimate the chance of this specific coincidence, but there will be some indefinite large number of somewhat similar types of coincidence which we did not observe; identifying these "generic" similar coincidences raises the issue of "how wide to cast our net?" which never has a clear answer.
Example 1. There were 3 passenger jet crashes in 8 days in summer 2014 (Air Algerie July 24, TransAsia July 23, Malaysia Airlines July 17). How unusual is this?
Here the information suggests the specific coincidence "3 crashes in 8 days". One can easily model this; data shows an average of 1 such crash every 40 days, and a math calculation shows this specific coincidence will occur on average about once every 7 years. So it's not at all unlikely. But we can easily envisage variants -- 3 crashes in 1 month involving the same airline, or the same region of the world, or the same airplane model; or 3 airport shutdowns; and so on. So the generic "some coincidence involving air travel" will occur much more often than any specific case.
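The "once every 7 years" figure can be roughly reproduced with a short sketch. The text does not say which model was used, so this assumes crashes form a Poisson process with a mean gap of 40 days, and counts each crash that is followed by at least 2 more within the window. The function name and the choice of window length (7 or 8 days, depending on how "in 8 days" is read) are assumptions made here for illustration.

```python
from math import exp

def years_between_clusters(mean_gap_days=40.0, window_days=8.0):
    """Average years between '3 crashes within window_days' events.

    Assumes (not stated in the text) a Poisson process of crashes.
    A cluster is counted at each crash followed by >= 2 more crashes
    within the next window_days; overlapping clusters are counted
    separately, which matters little at these rates.
    """
    rate = 1.0 / mean_gap_days            # crashes per day
    mu = rate * window_days               # expected crashes in the window
    # P(Poisson(mu) >= 2) = 1 - P(0) - P(1)
    p_two_more = 1.0 - exp(-mu) * (1.0 + mu)
    clusters_per_day = rate * p_two_more
    return 1.0 / clusters_per_day / 365.25

for window in (7.0, 8.0):
    print(window, round(years_between_clusters(window_days=window), 1))
```

Under these assumptions the two window choices give answers on either side of 7 years, which is consistent with the figure quoted above.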

An everyday type of coincidence reported often in the Cambridge Coincidences Collection is

Example 2. Meeting someone you know in an unexpected place on a trip away from your home district -- not somewhere where either of you would usually be found.
A short article by G.J. Kirby estimates the frequency this should happen “by chance” to himself as follows.
Number of people he knows and would recognize: 212
Total number of people encountered in a typical trip: 460
Number of trips away from home district per year: 30
Adult population (U.K.): 40 million
With these figures his chance of such a "coincidence" meeting in a year is roughly \( 212 \times 460 \times 30/(40 \ \mbox{million} ) = 1/14\). Of course the first two numbers quoted are necessarily vague. And in fact any calculation of this kind is likely to be an underestimate, because the people you know tend to be similar to you and therefore are more likely to be encountered by you than a random person. And there are many "generic" variants of this kind of coincidence. A conversation with a stranger might reveal a common acquaintance; you might see a media reference to some non-famous person you once met; and so on.
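Kirby's arithmetic is easy to rerun with one's own guesses for the four inputs. The sketch below is only a restatement of the formula above; the function and parameter names are made up here. Strictly, it computes the expected number of such meetings per year, which is close to the chance of at least one when the value is small.

```python
def expected_chance_meetings(known, encountered_per_trip, trips_per_year, population):
    """Expected number of 'unexpected' meetings per year.

    Treats each person encountered on a trip as a uniformly random
    member of the population, as in the estimate quoted above.
    """
    return known * encountered_per_trip * trips_per_year / population

expected = expected_chance_meetings(
    known=212,
    encountered_per_trip=460,
    trips_per_year=30,
    population=40_000_000,
)
print(f"{expected:.4f} per year, i.e. about 1 in {1 / expected:.0f}")
```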

Example 3. U.S. District Court Judge (Washington DC) Richard Leon handled 3 cases involving the FDA and tobacco companies.
In January 2010 he prevented the Food and Drug Administration from blocking the importation of electronic cigarettes.
In February, 2012 he blocked a move by the FDA to require tobacco companies to display graphic warning labels on cigarette packages.
In July 2014 he ruled in favor of tobacco companies and invalidated a report prepared by an FDA advisory committee on menthol.
Major cases are supposedly assigned randomly to judges. A journalist asked me: is it just coincidence that one judge would pull these major cases?
Note we are not discussing the merits of the judgments, just the random assignment.

Here it is hard to pin down what even the "specific coincidence" is. The math calculation is that the chance that 3 prespecified cases go to the same judge is about \(1/17.5 \times 1/17.5 \approx 1/300 \) because there were effectively 17.5 judges (some part-time). But what does this imply?

There were over 10,000 cases in the period of interest. Imagine looking at all those cases and looking to see where there is a group of 3 cases which are "very similar" in some sense. To make up some figures, if each case had merely one "unusual feature" which appeared in only 1 case in 400, then some arithmetic shows that there would be around 1 million groups-of-3-cases with the same feature, about 3 thousand of which would be randomly assigned to the same judge. These particular numbers are made up, but make the point that an awful lot of "similar cases to the same judge" can be expected.
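One way to carry out the "some arithmetic" is sketched below. It assumes, for simplicity, exactly 10,000 cases and 400 distinct features, each shared by exactly 25 cases; the text does not spell out these details, and the function name is made up here.

```python
from math import comb

def similar_case_groups(n_cases=10_000, one_in=400, effective_judges=17.5):
    """Count same-feature groups of 3 and how many land with one judge.

    Assumes one_in distinct features, each appearing in exactly
    n_cases // one_in cases, and independent uniform assignment of
    cases to judges.
    """
    cases_per_feature = n_cases // one_in              # 25 cases share each feature
    groups = one_in * comb(cases_per_feature, 3)       # same-feature triples
    p_same_judge = 1 / effective_judges ** 2           # about 1/300
    return groups, groups * p_same_judge

groups, same_judge = similar_case_groups()
print(groups, round(same_judge))
```

This gives a little under 1 million groups and roughly 3 thousand expected to go to a single judge, in line with the made-up figures in the text.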

However, most of these similarities are uninteresting, whereas the FDA-tobacco issue is unusually interesting. A more relevant analysis would be to go through the 10,000+ cases and find out the number of groups-of-3 that were "very similar" in some sense of interest to an investigative journalist. This is some (small, I presume) number N, and the chance that some group "of interest to a journalist" was assigned entirely to the same judge (by pure chance) is roughly N/300. Now I have no idea what N is, but experience with other kinds of coincidence says that there are more occurrences and more types of "very similar in an interesting way" than one would imagine. I doubt that anyone could do a defensible quantitative analysis of this case, but to me the observed coincidence is not itself suspicious.

Example 4. There are various reasons why the holder of a winning lottery ticket might wish to (illegally) sell it to another individual rather than claim the prize, and there are reasons why the other individual might wish to buy the ticket and claim the prize. So an individual who claims many prizes might be guilty of this illegal activity, or might just have honestly bought many tickets and been very lucky. Although the basic mathematics of lotteries is simple, deciding if a few individuals out of many million players were "implausibly lucky" is complicated -- see this technical paper.

Mathematical examples

The birthday problem is the prototype for a huge number of "small universe" models of coincidences studied in discrete probability. My old Poisson Clumping Heuristic book gives a general way to think about the calculations in such models. A recent example is
What is the chance that, in an annual 128-player single elimination tournament, some 2 players meet each other in two given consecutive years?
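A quick way to get a feel for this question is a simulation, compared with a Poisson-style estimate. The sketch below rests on assumptions that are not part of the question as posed: the same 128 players enter both years, the bracket is a uniformly random ordering each year, and every match is a fair coin flip. All function names are made up for illustration.

```python
import random
from math import comb, exp

def tournament_pairs(n_players, rng):
    """Set of pairs that meet in one single-elimination tournament.

    Assumes a uniformly random bracket and fair coin-flip matches.
    n_players must be a power of 2.
    """
    alive = list(range(n_players))
    rng.shuffle(alive)
    met = set()
    while len(alive) > 1:
        winners = []
        for i in range(0, len(alive), 2):
            a, b = alive[i], alive[i + 1]
            met.add((min(a, b), max(a, b)))
            winners.append(a if rng.random() < 0.5 else b)
        alive = winners
    return met

def repeat_meeting_probability(n_players=128, trials=2000, seed=0):
    """Monte Carlo estimate: some pair meets in both of two years."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        year1 = tournament_pairs(n_players, rng)
        year2 = tournament_pairs(n_players, rng)
        if year1 & year2:
            hits += 1
    return hits / trials

def poisson_estimate(n_players=128):
    """1 - exp(-expected number of pairs meeting in both years)."""
    pairs = comb(n_players, 2)
    p_meet = (n_players - 1) / pairs      # n-1 matches spread over all pairs
    return 1 - exp(-pairs * p_meet ** 2)

print(repeat_meeting_probability(), poisson_estimate())
```

The Poisson-style estimate treats the repeated pairs as roughly independent rare events, in the spirit of the birthday problem; the simulation is a check on how far that approximation can be trusted under the assumed model.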