Beyond mere dice: thoughts on what mathematical probability tells us about the real world

Notwithstanding critiques of listicles, this introduction to the "Probability in the Real World" project is presented here as an overview of the subject via bullet points. Nothing of this kind is ever new, but existing general discussions of probability strike me as either rather nebulous philosophy or too narrowly focussed in content. So I try to give declarative statements and to cover a lot of bases. Links go to both internal and external pages. I only write pages myself when I have something to say that's not easy to find elsewhere, and here is an index to the "internal" pages written by me.

Well-known mathematical aspects of chance in everyday life

Here are a few of the many familiar ideas from popular science style books and textbooks.

Coincidences are more likely than intuition suggests. True, but note some provisos.
One cannot implement Bayes' rule intuitively: it's just too complicated to combine likelihoods and base rates correctly.
The regression fallacy incorrectly ascribes a particular cause to what is a purely statistical effect. See my favorite example involving sports teams.
The gambler's fallacy incorrectly assumes that bad luck is more likely to be followed by good luck. Nope.
Cognitive biases like the preceding ones have been widely described, e.g. in Kahneman's Thinking, Fast and Slow. So I won't give systematic discussion, just a little commentary.
The first math fact about gambling is that what matters is expectations, not just probabilities of outcomes. But introductory textbooks mislead if they don't mention the second fact (with large gains or losses, what matters is expected utility) or the third fact (in a multiplicative context like stock market investing, use the Kelly criterion).
The most useful basic math fact relevant to strategy in sports is play conservatively if ahead and/or more skillful, play boldly if behind and/or less skillful. Ironically, popular science style books like to discuss strategy in the context of game theory, which is much less relevant to actual sports. Indeed I have only ever found one accessible source of new data relevant to game theory, studied in this paper.
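
To make the three gambling facts above concrete, here is a minimal sketch for one repeated bet. The setup is an assumption chosen for illustration, not taken from any linked page: you win with probability p at odds of b-to-1, and each time you stake a fixed fraction f of your current bankroll. The function names are hypothetical. For this simple bet the Kelly fraction is f* = (bp - q)/b with q = 1 - p, and it maximizes the expected logarithm of wealth per bet.

```python
from math import log

def expected_profit(p, b):
    """Expected profit per unit staked: win b with probability p, lose 1 otherwise."""
    return p * b - (1 - p)

def kelly_fraction(p, b):
    """Fraction of bankroll that maximizes long-run growth for this simple bet."""
    return (b * p - (1 - p)) / b

def growth_rate(f, p, b):
    """Expected log-growth of wealth per bet when staking fraction f (0 <= f < 1)."""
    return p * log(1 + b * f) + (1 - p) * log(1 - f)

if __name__ == "__main__":
    p, b = 0.6, 1.0  # illustrative numbers: a 60% chance at even money
    print("expected profit per unit:", expected_profit(p, b))
    print("Kelly fraction:", kelly_fraction(p, b))
    for f in (0.1, kelly_fraction(p, b), 0.5):
        print("stake fraction", f, "growth rate", growth_rate(f, p, b))
```

With these numbers the bet has positive expectation, yet staking half the bankroll every time gives a negative long-run growth rate, whereas the Kelly fraction (about 0.2) gives a positive one. That is the sense in which expectation alone misleads in a multiplicative context.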

Lesser-known lessons from mathematical probability

To make money long-term on the stock market, you don't need to know whether the market is going to go up or go down. The bottom line of the Kelly criterion is that all you need to know is the probability distribution for how much it will go up or down -- that determines your "optimal portfolio". Of course that leaves open the tough problem of choosing a probability distribution, but an opinion-free choice is to assume the future will be statistically similar to the past.
How can one assess probabilities of unique future real-world events? There is no magic formula, but prediction tournaments show that, after recording assessed probabilities and seeing outcomes, one can objectively conclude that some people are better than others at assessing probabilities. The conceptual point is that one can do this even though no-one will ever know the true probabilities (unlikely events do happen, after all).
Consider updating probabilities as more information becomes available -- opinion polls for the winner of an upcoming election, for instance. Intuition says the probabilities should move in one direction, toward zero or one. But this "picture becomes clearer" metaphor for updating probabilities is wrong. The characteristic jagged shape of stock prices is a general feature of slowly varying assessments of likelihoods of future events, not specific to finance. So (while avoiding the "cognitive bias" of paying too much attention to more recent info) one shouldn't be reluctant to update probabilities both up and down.
Suppose you are prepared to act upon a belief that some ongoing financial market -- tulip mania or the dot-com bubble or the 2017 - ??? cryptocurrency maybe-bubble -- is a bubble, that is, the price is irrational and will return to an earlier, much lower, rational level. Then (assuming you are correct) surely you can guarantee to make money by selling short? Alas you cannot. (I'm not saying you can't make money, just that you can't guarantee to make money even if you're correct.) In brief, as explained here, there is a small chance that rational behavior will lead to bubble-like price trajectories, and (because you don't see the bubbles that never happened) you can't infer that an apparent bubble is not just an unlikely rational price move.
What is the cost of an error in estimating a probability? Although clearly context-dependent, in typical contexts other than unlikely events with large consequences, the cost of an error in estimating a probability scales as (error)^2. This throws a different light upon some freshman statistics. The error in estimating a probability from \(n\) independent observations is of order \(n^{-1/2}\), but the cost of the error is of order \(n^{-1}\).
A rare example of "something for nothing" -- getting testable numerical predictions without specific real-world input -- is the serious contender principle. In a contest with many contestants and no initial strong favorite, track over time the probabilities (from gambling odds, say) for each to win. Math says that on average 3 other contestants, in addition to the actual winner, will at some time have been given more than a 25% chance to win.
I don't write much about familiar "games of chance" because this topic is treated in many other places, but here's one comment. Because playing cards are so familiar, we overlook one of their remarkable features, that they are easy to mix (by shuffling). Other physical objects are much harder to mix (to make them almost completely random, in some sense) than we imagine. For instance a student project shows that if you put 100 business cards into a box and shake vigorously until your arms are tired, they still won't be well mixed.
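
The scaling claim above about the cost of an error can be checked exactly in one assumed setting. The sketch below takes the "cost" to be a quadratic penalty (announce q, pay (1 - q)^2 if the event happens and q^2 if it does not); that choice of penalty, the value p = 0.3, and the function names are assumptions for illustration. Under this penalty the extra expected cost of announcing q instead of the true p is exactly (q - p)^2, so averaging over the estimate k/n from n independent observations gives a cost of order 1/n while the error itself is of order n^{-1/2}.

```python
from math import comb, sqrt

def expected_penalty(q, p):
    """Expected quadratic penalty for announcing q when the true probability is p."""
    return p * (1 - q) ** 2 + (1 - p) * q ** 2

def excess_cost(q, p):
    """Extra expected penalty from announcing q rather than the true p."""
    return expected_penalty(q, p) - expected_penalty(p, p)

def error_and_cost(n, p):
    """Exact RMS error and mean excess cost of the estimate k/n, k ~ Binomial(n, p)."""
    mean_sq_error = 0.0
    mean_cost = 0.0
    for k in range(n + 1):
        weight = comb(n, k) * p ** k * (1 - p) ** (n - k)
        q = k / n
        mean_sq_error += weight * (q - p) ** 2
        mean_cost += weight * excess_cost(q, p)
    return sqrt(mean_sq_error), mean_cost

if __name__ == "__main__":
    # Quadrupling n should halve the typical error but quarter the cost.
    for n in (25, 100, 400):
        rms_error, cost = error_and_cost(n, 0.3)
        print(n, rms_error, cost)
```

The sample sizes are kept small so that the exact binomial weights stay within floating-point range; a simulation would do the same job for larger n.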

I give more instances as mathematical curiosities.
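
The serious contender principle described above can be illustrated by simulation in a toy model. Everything about the model is an assumption made for illustration: 20 contestants each start with one of 20 tokens (a 5% chance each), and at each step a randomly chosen token is handed to the owner of another randomly chosen token, until one contestant holds them all. In this model a contestant's share of the tokens equals its chance of eventually winning, and that share is a martingale, which is the only property the principle needs. Because shares move in steps of 5%, the sketch counts contestants who reach at least 25%.

```python
import random

def count_other_contenders(n_contestants=20, threshold=0.25, rng=random):
    """Run one toy contest; count losers whose win probability ever reached threshold."""
    n = n_contestants
    owner = list(range(n))   # owner[t] = contestant currently holding token t
    counts = [1] * n         # tokens held by each contestant
    peak = [1] * n           # largest number of tokens each contestant ever held
    while True:
        src = rng.randrange(n)
        dst = rng.randrange(n)
        a, b = owner[src], owner[dst]
        if a == b:
            continue         # same owner: nothing changes
        owner[dst] = a       # token dst passes from contestant b to contestant a
        counts[a] += 1
        counts[b] -= 1
        peak[a] = max(peak[a], counts[a])
        if counts[a] == n:   # contestant a now holds every token and has won
            winner = a
            break
    needed = threshold * n
    return sum(1 for c in range(n) if c != winner and peak[c] >= needed)

def average_other_contenders(runs=1000, seed=0):
    """Average, over many simulated contests, of the number of losing serious contenders."""
    rng = random.Random(seed)
    return sum(count_other_contenders(rng=rng) for _ in range(runs)) / runs

if __name__ == "__main__":
    # The principle predicts an average near 3.
    print(average_other_contenders())
```

The reasoning behind the number 3 is short: a contestant starting at 5% reaches 25% with probability 5%/25% = 1/5, so on average 20 x 1/5 = 4 contestants do so, and one of those is the winner.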

Perception of chance in everyday life

My list of popular obsessions -- topics which attract more attention and argument than they deserve -- includes the mystique of card counting and the mystique of outstandingly successful investors and hot hands.
We couldn't find academic literature concerning how people think unprompted about chance in everyday life, so we gathered data ourselves, which you can see starting from this cover page. The bottom line from this exercise was that real people's interests are very different from both textbook and popular science accounts of probability. In talks I gleefully contrast these with hypothetical examples from an elderly philosopher.
In particular we were surprised that quite a lot of thoughts concerned reflections on past events as lucky/unlucky or likely/unlikely.
Popularized in Taleb's Black Swan under the name narrative fallacy is the idea that recounting past events as a sequential story "causes us to see past events as more predictable, more expected, and less random than they actually were". Some comments.
It has often been observed that people tend to attribute their successes to skill and their failures to bad luck: see e.g. Robert H. Frank Success and Luck: Good Fortune and the Myth of Meritocracy.

Risk to Individuals: Perception and Reality

Risks, in the non-technical sense of dangers, are one of the main components of our everyday perception of chance. I give a lecture on 4 aspects:

interpreting data about specific risks
psychology -- how we perceive different risks
public policy (regulation)
communicating data on risks to the public

but I don't have anything original to say on these topics. This is a good place to remember

news is fiction (statistically speaking)

Following the man bites dog adage, news records what is unusual, and so is unrepresentative of reality. The extent of media coverage of different risks bears little relation to their actual prevalence, but instead is driven by the psychological factors which make risks seem more threatening or less threatening than they really are. See Ropeik's book How Risky is it Really? for an interesting discussion of those factors. This also colors our judgment of long-term changes in society, which are not represented in daily headlines.

Other pages related to risks are What really has a 1 in a million chance? and Solution #76 to the Fermi Paradox.

On mathematical models

Aside from very simple "games of chance" contexts, estimating probabilities in a somewhat objective way requires a mathematical model, which (roughly speaking) involves both assumptions quantifying where chance enters, and a specification of how observable quantities arise from the underlying chance and non-random ingredients. The often-quoted line all models are wrong, but some are useful is more memorable than helpful. I prefer

all models are fiction,

for reasons explained here. I like toy models -- a simplistic model which we don't pretend will give numerically accurate predictions but instead gives insight whether some mechanism might possibly explain some qualitative observation. But my discussion of use and abuse of toy models emphasizes the following point.

The fact that you can reproduce data using a mathematical model does not imply that the model indicates the actual mechanism.

Another under-appreciated point is that information mediated by human choice is hard to model. What do you learn from the 6 top headlines in today's news? It is not just these 6 facts, but also an almost infinite amount of implicit information about what did not happen. A twins example underscores the point: in the context of probabilities,

"knowing a certain piece of information" is not always the same as "being told this piece of information by a truthful person".

Finally, fields such as Decision Theory and Game Theory rely on different payoffs being translated into a common "utility" number, which (outside of "only money" contexts) seems overly difficult to do in practice.

Conceptual foundations of probability

The first four points below are articulated here at greater length.

A qualitative sense of likelihood -- the conscious recognition of some future events as likely and some as unlikely -- is part of the common sense that the human species is endowed with. Trying to define probability in terms of something simpler is as futile as trying to define time in terms of something simpler.
Even though we can sometimes confidently deem one event to be more likely than another event, it does not necessarily follow that it is possible or useful to assign numerical probabilities to events. By analogy, a broken leg is more painful than a pinprick, but we do not seek to assess it quantitatively as 10 times or 100 times as painful.
The money analogy is helpful. Textbooks and Wikipedia say that money is a medium of exchange and a store of value and a unit of account. No-one has difficulty understanding that the same thing can have different uses. But the usual treatment of Interpretations of Probability discusses Logical probability and Subjective probability and Frequency Interpretations and Propensity Interpretations as if these were alternate theologies of which only one could be true. But this is silly. In discussing money, the issue isn't what money is, the issue is what money is for. Analogously the issue is what probability is for, and having different ways of thinking about what it is for is a feature, not a bug.
Empirical studies of prediction markets provide a touchstone for rejecting extreme philosophies of Probability.
To me the fundamental conceptual question is in what real world contexts is it both practical and useful to attempt to estimate numerical probabilities? In many ways that is the underlying theme of the "Probability in the Real World" project. Note some important examples where it is not practical/useful.
Is it practical and useful to distinguish aleatoric and epistemic uncertainty?
When thinking about the meaning of chance in some particular context, it is helpful to ask yourself chance is the opposite of ???. The physics answer (opposite of deterministic) is one extreme; opposite of under one's control is another extreme. See other opposites here.
It is also worth asking why do we care about probabilities in a given situation.
Dice are greatly overused, both as a verbal metaphor for, and as a visual image for, randomness. By being "pure chance" they (and other "artifacts with physical symmetry used in games of chance") are completely unrepresentative of real-world uncertainty. Retire "dice" as the icon for randomness! To quote Taleb, the sterilized randomness of games does not resemble randomness in real life.

Some common misunderstandings

It is sometimes said to be surprising that mathematical probability was not developed earlier (by the ancient Greeks, for instance). But the opposite is true: surely probability was the first not-directly-measurable concept to be quantified.
The real significance of the Kolmogorov axioms is widely misunderstood. While Probability certainly involves some conceptually extra idea (relative to the rest of Mathematics), the issue was whether Probability required some new technical ingredient to be added to the rest of Mathematics. Kolmogorov's achievement was the realization that it didn't. Probability distributions are (technically) just measures, and random variables are (technically) just measurable functions.
There are alternatives to the Kolmogorov axioms, though none has found widespread acceptance.
The notion of "information" in mathematical probability is different from our everyday notion, and hard to apply in the context of real-world future events. See general discussion and a specific example.
The blinkered mathematician principle states
Mathematicians are good at mathematics; they are often quite good at intellectual matters unrelated to mathematics; but they are often quite bad at matters slightly related to mathematics:
blinkered in the sense of narrow focus on the mathematics without asking how it fits the real-world context. Amongst examples involving chance are completely nonsensical statements like "there is no element of chance in chess" or "gambling in a casino is foolish because you are likely to lose money".
People often think that bookmakers adjust their offered odds so that, whatever the outcome, they never lose money. This just isn't true. Here's the conceptual explanation.
Good advice to non-professional stock market investors includes "don't attempt short term market timing". Alas this is sometimes conveyed via the true but misleading observation that most stock returns are made on relatively few trading days, so it is important not to be out of the market on those days. This is no more relevant than an old commercial for the California lottery: what if your numbers won and you didn't play them? Going in and out of the market, you would miss some of the down days too. Mathematically, doing this and being "in" 70% (say) of the time is less desirable than simply keeping 70% of your money in the market all the time (same mean, higher variance). Unless you can predict the future.
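
The "same mean, higher variance" comparison in the market-timing item above can be checked by direct calculation. The sketch below is a one-period model under stated assumptions: the return distribution is made up for illustration, cash earns zero, and the decision to be in or out is made independently of that period's return (that is, without foresight). The names are hypothetical.

```python
def mean_and_variance(outcomes):
    """Mean and variance of a discrete distribution given as (probability, value) pairs."""
    mean = sum(pr * value for pr, value in outcomes)
    variance = sum(pr * (value - mean) ** 2 for pr, value in outcomes)
    return mean, variance

# Hypothetical one-period market return: +1.2% or -1.0%, equally likely.
market_returns = [(0.5, 0.012), (0.5, -0.010)]
fraction_in = 0.7

# Steady strategy: always hold 70% in the market and 30% in cash.
steady = [(pr, fraction_in * r) for pr, r in market_returns]

# Timing strategy: fully in with probability 0.7, fully out (return 0) otherwise,
# with the in/out choice independent of the market return.
timing = [(pr * fraction_in, r) for pr, r in market_returns] + [(1 - fraction_in, 0.0)]

steady_mean, steady_var = mean_and_variance(steady)
timing_mean, timing_var = mean_and_variance(timing)

if __name__ == "__main__":
    print("steady:", steady_mean, steady_var)
    print("timing:", timing_mean, timing_var)
```

In general, if the market return has mean m and variance v, the steady strategy has variance 0.49v while the timing strategy has variance 0.7(v + m^2) - 0.49m^2, which exceeds it by 0.21(v + m^2).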

On paradoxes

Many of the widely discussed paradoxes strike me as too artificial to be worth discussing. Instead I do like questions we can say in words but cannot translate into precise mathematical questions.

Fantasy settings

In a toy model of a real-world phenomenon, we acknowledge lack of realism of details of the model, and seek only to know if some proposed mechanism might possibly be true. There is a separate genre which (to me) is concerned with imaginary situations, and which I therefore call fantasy. Just like literary Fantasy (which in principle could be about anything but in practice fits some established subgenre) these examples tend to fit some category such as

Artificial math problems, impractical or pointless to implement.
Naive appeals to informationless priors.
Speculations about the nature of reality.

The purpose of the latter page is to outline my diagnostics for "what to dismiss as fantasy" before or during reading it.

On predicting the future

My general discussion here emphasizes a few points. In fiction one can speculate on some particular future for human society. To me it seems self-evident that non-fiction predictions or forecasts about the future should be expressed in probability terms, but this is surprisingly seldom done. And indeed some apparently-probabilistic graphics are not so -- see the second graphic here. Substantial data from recent prediction tournaments shows that, for short-term (< 1 year, say) geopolitics, some people are better than others at assessing probabilities, but it is hard to assess accuracy in absolute terms.
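
One way to make "some people are better than others at assessing probabilities" operational is a scoring rule. The sketch below uses the Brier score, the mean squared difference between the stated probability and the 0/1 outcome. The choice of this rule, the forecaster names, and all the numbers are assumptions made for illustration; none of it is data from an actual tournament.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("need one forecast per outcome")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)

# Hypothetical results for eight yes/no questions (1 = the event happened).
outcomes = [1, 0, 0, 1, 0, 1, 0, 0]

# Two hypothetical forecasters answering the same questions.
forecaster_a = [0.8, 0.2, 0.1, 0.7, 0.3, 0.6, 0.2, 0.1]
forecaster_b = [0.5] * 8  # always says 50%

score_a = brier_score(forecaster_a, outcomes)
score_b = brier_score(forecaster_b, outcomes)

if __name__ == "__main__":
    print("forecaster A:", score_a)
    print("forecaster B:", score_b)
```

A lower score is better, and always answering 50% scores exactly 0.25. Comparing forecasters on the same questions is meaningful, but the score gives no absolute benchmark, because the best achievable score depends on the true probabilities, which are never known.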

Slightly off topic, here is my review of a 2013 collection of 155 short essays on the theme What Should We Be Worried About? As usual these are not probability assessments, and indeed rarely even mention likely or unlikely, but as time passes one can check whether any of these worried-about futures has indeed happened.

A challenging question is: what aspects of the longer-term future, and how far ahead, is it reasonable to try to predict via numerical probabilities? Two thoughts.
(i) It is easy to make casual assertions like "No-one predicted the end of Soviet domination of Eastern Europe in the late 1980s" but it is better to view this as a 5%-chance event that in fact happened.
(ii) Contrary to the Black Swan thesis that the modern world was shaped by dramatic unexpected events, the majority of differences between today and one or two generations ago (e.g. increase in childhood obesity, increased consumption of espresso, increased proportion of occupations requiring a College education, increased visibility of pornography, and the manifest consequences of Moore's Law) are the result of ongoing slow and steady change. And these are hard to formulate predictions for.

This leads to a rather counter-intuitive thought. Regarding the world 50 years ahead, aspects like total population or extent of climate change (often regarded as hard to predict) are in fact easier to predict than aspects like proportion of world under democratic government. Finally it is fun to describe various Global Catastrophic Risks in class and discuss the question how much effort should we put into prevention or amelioration of any given risk?

Probability and Statistics

The phrase Probability and Statistics for an academic area has been in wide use for several generations. A modern view (see link in (i) below), as we enter the Age of Data Science, is

[Classical mathematical statistics] assumes that the data are generated by a given stochastic data model. [Machine learning] uses algorithmic models and treats the data mechanism as unknown.

So classical statistics asks whether data is consistent with a probability model, and answering that question involves the mathematics of probability. Discussing the relation further would require another web site.

A good "over the shoulder" look at how an academic statistician uses theory and data is Andrew Gelman's Statistical Modeling, Causal Inference, and Social Science blog. I myself do not engage in technical statistical analysis, because my world has been full of people who can do it better, so there is no technical statistical analysis on this site. Instead, here we seek to explain conceptual aspects of probability that can be illustrated by real data without needing any sophisticated analysis.

Having said that, here are a few comments on Statistics.
(i) Popular science books relate a perceived historical clash between frequentist and Bayesian statistics, but these are not the actual two faces of Statistics.
(ii) In accord with the saying "if all you have is a hammer, then everything looks like a nail", the widespread inappropriate use of tests of significance was one of the great scientific disasters of the 20th century, unfortunately continuing into the 21st. On the Bayesian side, the cult of informationless priors also has some misguided devotees.
(iii) At the freshman level, it has always bothered me that no-one asks the basic question when is it reasonable to regard data as a sample (an i.i.d. sample) from some unknown distribution? If I have numerical course scores (homeworks plus exams etc) for 50 students in a class, it does seem reasonable; if I have populations of the 50 States of the USA, it does not seem reasonable. But what are the criteria here?
(iv) And here is a skeptical quote from my late colleague David Freedman.

My own experience suggests that neither decision-makers nor their statisticians do in fact have prior probabilities. A large part of Bayesian statistics is about what you would do if you had a prior. For the rest, statisticians make up priors that are mathematically convenient or attractive. Once used, priors become familiar; therefore, they come to be accepted as "natural" and are liable to be used again; such priors may eventually generate their own technical literature. Similarly, a large part of [frequentist] statistics is about what you would do if you had a model; and all of us spend enormous amounts of energy finding out what would happen if the data kept pouring in.

Probability in Science

Probability plays a role across a broad range of Science. A glimpse of this range can be seen via my map of the world of chance categories and in the Why do we care about probabilities? page. Two of my written-up Berkeley lectures do concern science: Coding and entropy and From physical randomness to the local uniformity principle. Books on my list of non-technical books relating to Probability often touch upon science topics, though curiously none attempt a broad overview, even amongst those listed under "science topics". But I have not written web pages on science topics because I don't have any novel expository ideas or novel data.

On teaching Probability

Here and elsewhere I criticize the teaching of a first course in Probability, but I must confess up front that I can't do it better: my Real World course assumes students have already taken this first course.

As discussed here, my criticism is lack of real examples compounded by a plethora of manifestly unrealistic examples. In principle there is nothing wrong with illustrating math concepts via made-up stories, but in practice there is a moral hazard in implicitly teaching that it's OK to ignore realism of models, which may explain some abuse of toy models. My list of math probability predictions which are actually verifiable is embarrassingly short.

The issue has always been "the math takes over from the concepts". These suggestions for instructors of a very elementary course should be more widely known. My Berkeley colleague Ani Adhikari has developed a Probability course Prob 140 for Data Science students, saying

Computational power in Prob 140 allows students to solve problems that are intractable by other methods. Students also explore the standard mathematical theory graphically and by simulation, and thus develop a more firm grasp of the concepts than they might by using math alone.

This should also enable more realistic examples to be studied.

Here is my brief description of Probability treatments in introductory textbooks, popular books and Wikipedia.

Fun to do in class

Make a real-money prediction market trade based on audience consensus; show outcomes of previous bets
What really has a 1 in a million chance?
Warren Buffett's billion dollar gamble
40,000 coin tosses yield ambiguous evidence for dynamical bias
Examples of inane media comparisons for the chance of winning the lottery.
Solution #76 to the Fermi Paradox
When strong priors meet contradictory evidence