What is the significance of the Kolmogorov axioms?

It is often said that the Kolmogorov axioms provide the standard mathematical formalization of probability. This is true, but is not very informative to a non-mathematical reader, so let me try to write a useful page.

Historical background

Around 1900 the axiomatic approach to mathematics had spread well beyond its classical setting of Euclidean geometry, and the particular question of how to axiomatize Probability was highlighted as part of Hilbert's sixth problem:

Mathematical Treatment of the Axioms of Physics. The investigations on the foundations of geometry suggest the problem: To treat in the same manner, by means of axioms, those physical sciences in which already today mathematics plays an important part; in the first rank are the theory of probabilities and mechanics.

While Probability certainly involves some extra conceptual idea (relative to the rest of Mathematics), the issue was whether Probability required some new technical ingredient to be added to the rest of Mathematics. Kolmogorov's achievement was the realization that it didn't. Measure theory had recently been developed to resolve the technical conflict between the intuitive idea "every region in the plane has some area" and the fact that set theory allows arbitrary subsets of an uncountable set, not all of which can consistently be assigned an area. This conflict has no conceptual connection with Probability, but Kolmogorov realized that the technical machinery involved in its resolution (measures, measurable sets, measurable functions) could be reused as an axiomatic setting for Probability. In retrospect, because one special model within Probability is "pick a uniform random point from the unit square", it is clear that any general theory of Probability has to include measure theory, but (to reiterate) Kolmogorov's achievement was the realization that at the technical level it didn't require anything more.
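For reference, here is a sketch of the axioms in their usual modern textbook form; the notation (Omega, F, P, A_i) is the standard convention rather than anything specific to this page.

```latex
A probability space is a triple $(\Omega, \mathcal{F}, P)$, where $\Omega$ is a set
(the possible outcomes), $\mathcal{F}$ is a $\sigma$-algebra of subsets of $\Omega$
(the events), and $P : \mathcal{F} \to [0,1]$ satisfies
\begin{align*}
  &\text{(i)}   && P(A) \ge 0 \quad \text{for every } A \in \mathcal{F}, \\
  &\text{(ii)}  && P(\Omega) = 1, \\
  &\text{(iii)} && P\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) = \sum_{i=1}^{\infty} P(A_i)
                   \quad \text{for pairwise disjoint } A_1, A_2, \ldots \in \mathcal{F}.
\end{align*}
```

In other words, a probability measure is just a measure of total mass 1, which is the sense in which nothing beyond measure theory is needed at the technical level.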

With agreed axioms, mathematicians happily moved on with systematic development of theorem-proof Probability. The firm connection to the rest of theorem-proof Mathematics enabled researchers to use tools from other fields of Mathematics, particularly in the context of limit theorems. More prosaically, it is helpful to have coherent notation covering both discrete and continuous probability distributions and random variables.

Analogy: measure theory as an operating system

The operating systems macOS and Microsoft Windows are different but serve the same purposes: if you write an app that works on one, then you could modify it to work on the other. And ideally an operating system is transparent; a child can use an iPad without knowing it has an operating system. By analogy, inventing and studying a probability model is like writing an app which is implicitly running on the "Kolmogorov" operating system. As it happens there is an alternative (which never became popular) promoted by Edward Nelson in 1987 and based on non-standard analysis. But (roughly speaking) this alternative is "equivalent" in the same way that the operating systems are equivalent: if you can state and prove a theorem in one framework then you can state and prove a modification in the other framework.

The axioms and real-world probability

Why should one think the mathematical setup is relevant to real-world uncertainty? Here we turn to matters of opinion. A trite response is

casinos and insurance companies can and sometimes do go bankrupt, but not from conceptual inapplicability of the mathematics of Probability.

But showing that a theory works in some contexts is not evidence that it will work in all contexts. Mathematicians tend to regard the axioms as self-evidently true, while admitting that in practice one might not be able to apply the resulting mathematics (see below). But this reduces to saying "it works when it works", which is rather vacuous. My own opinion is that the fundamental "philosophical" question is

in what contexts is it both possible and useful to try to assign numerical probabilities to uncertain events

but I don't claim to have a good answer.

Returning to the Kolmogorov axioms, to my mind there are three issues with real-world applicability. First,

making a model requires you to prespecify all the relevant events that might happen and assign a probability to every combination of happen/not happen

and you really can't do that except in very limited contexts: consider our geopolitical forecasting examples. Second, comparing dice probabilities with geopolitical forecasting,

we are more confident about our abilities to assess probabilities accurately in some contexts than in others

and this "uncertainty about probabilities" is hard to fit into the axiomatic framework. Third,

a probability model is a description of how data is produced, not a prescription for when observed data can be regarded as "random"

whereas our everyday perception of randomness is centered on actual observations.

There has been ongoing work over many decades on genuinely different setups for thinking about uncertainty, and here are some brief comments, but in the context of modeling real-world phenomena (outside the quantum setting) I have never seen a convincing use of such alternatives.

Extras

To amplify comments above:

The Kolmogorov axioms are technically useful in providing an agreed notion of what a completely specified probability model is, within which questions have unambiguous answers. This eliminates cases like Bertrand's paradox, which is simply an ambiguously defined model. But they encourage both a false sense of security (that the act of formulating a model within the mathematical framework somehow guarantees it is a valid representation of the real-world phenomenon) and a narrowness of vision (that aspects of the real world that cannot be formulated within the framework are somehow "not probability").
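To make the Bertrand's paradox remark concrete, here is a minimal simulation sketch. The paradox asks for the chance that a "random chord" of a unit circle is longer than the side (length sqrt(3)) of the inscribed equilateral triangle. The three functions below are the three classical ways of completing the specification; the function names and the trial count are my own illustrative choices.

```python
import math
import random


def chord_random_endpoints(rng):
    """Model 1: endpoints are two independent uniform points on the circle."""
    a = rng.uniform(0.0, 2.0 * math.pi)
    b = rng.uniform(0.0, 2.0 * math.pi)
    # Chord length between points at angles a and b on the unit circle.
    return 2.0 * abs(math.sin((a - b) / 2.0))


def chord_random_radial_point(rng):
    """Model 2: chord midpoint is at a uniform distance along a radius."""
    d = rng.random()  # distance of the midpoint from the centre
    return 2.0 * math.sqrt(1.0 - d * d)


def chord_random_midpoint(rng):
    """Model 3: chord midpoint is uniform over the area of the disc."""
    # For a uniform point in the unit disc, the squared distance
    # from the centre is uniform on [0, 1).
    d_squared = rng.random()
    return 2.0 * math.sqrt(1.0 - d_squared)


def estimate(model, trials=100_000, seed=0):
    """Monte Carlo estimate of P(chord length > sqrt(3)) under a model."""
    rng = random.Random(seed)
    threshold = math.sqrt(3.0)
    hits = sum(model(rng) > threshold for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for model in (chord_random_endpoints,
                  chord_random_radial_point,
                  chord_random_midpoint):
        print(model.__name__, estimate(model))
```

The exact answers under the three models are 1/3, 1/2 and 1/4 respectively, and the estimates should land close to those values. Each function is a completely specified model with an unambiguous answer; the "paradox" lies only in the phrase "random chord" not saying which model is meant.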

Here is one illustration. Charles Dodgson (better known as Lewis Carroll) posed the "pillow problem": "If an infinite number of rods be broken: find the chance that one at least is broken in the middle." Since Kolmogorov, mathematicians interpret this in a particular way which happens to give a different answer (0) than does Dodgson's interpretation (1 - 1/e). But a mathematician who thinks 0 is the "correct" answer illustrates my "false sense of security" point: that answer depends on several particular conventions, e.g. ignoring the atomic theory of matter by modeling the rod as a continuum. Invoking the usual conventions encourages you to overlook the main issue, that the problem has no more real-world meaning than a question about fairies and unicorns.
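For readers wondering where the two answers come from, here is a sketch of both calculations. The first is the usual reconstruction of Dodgson's argument; the second assumes countably many rods, each modeled as the interval [0,1] with a continuously distributed break point. The symbols n and X_i are my notation.

```latex
\textbf{Dodgson's interpretation.} Suppose each of $n$ rods has $n$ equally likely
break points, one of which is the middle, and the rods break independently. Then
\[
  P(\text{no rod breaks in the middle}) = \Bigl(1 - \frac{1}{n}\Bigr)^{n}
  \;\longrightarrow\; e^{-1} \quad (n \to \infty),
\]
so the chance that at least one rod breaks in the middle tends to
$1 - 1/e \approx 0.632$.

\textbf{Measure-theoretic interpretation.} Let $X_i \in [0,1]$ be the break point of
rod $i$, with a continuous distribution, so that $P(X_i = \tfrac12) = 0$ for each $i$.
By countable subadditivity,
\[
  P\Bigl(\bigcup_{i=1}^{\infty} \{X_i = \tfrac12\}\Bigr)
  \;\le\; \sum_{i=1}^{\infty} P\bigl(X_i = \tfrac12\bigr) = 0 .
\]
```

The two answers differ because the two calculations take the limits in a different order: Dodgson lets the number of rods and the fineness of the break points grow together, whereas the modern convention passes to the continuum first.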