## 40,000 coin tosses yield ambiguous evidence for dynamical bias

### Background

The 2007 Diaconis-Holmes-Montgomery paper "Dynamical bias
in the coin toss" suggests that in coin-tossing there is a particular
"dynamical bias" that causes a coin to be slightly more likely to land the same way
up as it started.
In brief, whether the coin lands the same way up as it started depends
deterministically on the initial parameters of motion imparted at the instant of
tossing.
Each person's individual "tossing style" gives some probability
distribution on the initial parameters, but (unless the spread is
unrealistically small) it turns out from a careful analysis of the physics that the
resulting overall probability always works out to be 1/2 or greater, though it would
presumably vary from person to person. The basic reason is that, instead of
rotating around a horizontal axis as one might imagine, a typical tossed coin is
rotating around a tilted axis which is precessing in 3-space, and this entails a
certain degree of "memory" of the initial parameters. Combining theory with data
on initial parameters from a small number of tosses obtained via high speed
photography, Diaconis et al. gave a rough estimate of a 0.8% bias (i.e. a 50.8%
chance of landing same way up as started) for a typical tosser, and discussed a number
of possible caveats to the theory.
It is important to distinguish this subtle 3-dimensional effect ("precession bias"),
which persists when
the number of rotations is large, from a more obvious 2-dimensional bias when the number of
rotations is small ("few rotations bias" - see below).
However, no experiment with actual coin-tosses has been done to
investigate whether the predicted effect is empirically observed. Diaconis et al
noted, correctly, that to estimate the probability with a S.E. of 0.1% would
require 250,000 tosses, but this seems
unnecessarily precise. Let's work with numbers of tosses rather than percents.
With 40,000 tosses the S.E. for "number landing same way" equals 100, and the means
are 20,000 under the unbiased null and 20,320 under the "0.8% bias" alternative.
So, if the alternative were true, it's quite likely one would see a highly
statistically significant difference between the observed number and the 20,000 predicted by the null.

And 40,000 tosses works out to take about 1 hour per day for a semester ...
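
The arithmetic above can be checked in a few lines. This is only a sketch: the 1% one-sided significance level used for the power calculation is an assumption made for illustration (the text only says "highly statistically significant"), and the variable names are our own.

```python
from math import sqrt
from statistics import NormalDist

n_tosses = 40_000
p_null = 0.5    # unbiased coin
p_alt = 0.508   # the "0.8% bias" alternative

# S.E. of the count "number landing same way" under the null, and the two means.
se_count = sqrt(n_tosses * p_null * (1 - p_null))   # 100
mean_null = n_tosses * p_null                       # 20,000
mean_alt = n_tosses * p_alt                         # 20,320

# Tosses needed for the estimated probability to have an S.E. of 0.1%.
tosses_for_se = p_null * (1 - p_null) / 0.001 ** 2  # about 250,000

# ASSUMPTION (not from the text): a one-sided test at the 1% level.
alpha = 0.01
critical_count = mean_null + NormalDist().inv_cdf(1 - alpha) * se_count
se_alt = sqrt(n_tosses * p_alt * (1 - p_alt))
power = 1 - NormalDist(mean_alt, se_alt).cdf(critical_count)

print(f"S.E. of count: {se_count:.0f}")
print(f"means: {mean_null:.0f} (null), {mean_alt:.0f} (alternative)")
print(f"chance of rejecting the null if the alternative is true: {power:.2f}")
```

Under this assumed 1% rule, the chance of detecting a true 0.8% bias comes out at roughly 0.8, which is the sense in which a significant result is "quite likely".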

### The experiment

Over the Spring 2009 semester two Berkeley undergraduates, Priscilla Ku and Janet
Larwood, undertook to do the required 40,000 tosses.
After preliminary experimentation with practical issues, a specific protocol was formulated, described in detail below.
Cutting to the chase,
here is the complete data-set as a .xlsx spreadsheet
(see sheet 2).
This constitutes a potentially interesting
data-set in many ways -- one could compare numerous theoretical
predictions about pure randomness (lengths of runs, for instance) with this
empirical data. For the specific question of dynamical bias, the relevant data can
be stated very concisely:

- of 20,000 Heads-up tosses (tossed by Janet), 10,231 landed Heads
- of 20,000 Tails-up tosses (tossed by Priscilla), 10,014 landed Tails

### Analysis

A first comment is that it would have been better for each individual to have done
both "Heads up" and "Tails up" tosses (this was part of the intended protocol, but
there was a miscommunication on this point); this would separate the effect of individual tossing style from
any possible effect arising from the physical difference between Heads and Tails.
But it is very hard to imagine any such physical effect, so we presume the observed
difference (if real rather than just chance variation) is due to some aspect of
different individual tossing style.
Applying textbook statistics:

- testing the "unbiased" null hypothesis with the combined data, we get z = 2.45
  and a (1-sided) p-value < 1%
- assuming dynamical bias with possibly different individual biases, and
  testing the null hypothesis that these two individuals have the same bias, we get
  z = 2.17 and a (2-sided) p-value = 3%
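
For readers who want to reproduce these two numbers, here is a minimal sketch using only the Python standard library. The counts are those stated above; the choice of a pooled-proportion standard error for the two-sample test is an assumption on our part (using p = 1/2 in the standard error gives the same z to two decimal places).

```python
from math import sqrt
from statistics import NormalDist

n_each = 20_000
same_janet = 10_231       # Heads-up tosses that landed Heads
same_priscilla = 10_014   # Tails-up tosses that landed Tails

# Test 1: "unbiased" null on the combined data (one-sided).
n_total = 2 * n_each
same_total = same_janet + same_priscilla
z_combined = (same_total - n_total / 2) / sqrt(n_total / 4)
p_combined = 1 - NormalDist().cdf(z_combined)

# Test 2: null that both tossers share the same bias (two-sided).
# ASSUMPTION: the S.E. is computed from the pooled proportion.
p_pool = same_total / n_total
se_diff = sqrt(p_pool * (1 - p_pool) * (2 / n_each))
z_diff = (same_janet - same_priscilla) / n_each / se_diff
p_diff = 2 * (1 - NormalDist().cdf(abs(z_diff)))

print(f"combined:   z = {z_combined:.2f}, one-sided p = {p_combined:.4f}")
print(f"difference: z = {z_diff:.2f}, two-sided p = {p_diff:.3f}")
```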

We leave the statistically literate reader to draw their own conclusions. A caveat
is that the experiment did not use
"iconic tosses" (see below), and we can't really distinguish the possible
precession bias from the possible "few rotations" bias, even though there was no
visual indication of systematic difference between the two tossing styles.
Finally, for anyone contemplating repeating the experiment, we suggest getting a
larger group of people to each make 20,000 iconic tosses, for two reasons. Studying
to what extent different people might have different biases is arguably a richer
question than asking about overall existence of
dynamical bias. And if the "few rotations bias" exists then we would see it
operating in both directions for different people, whereas the predicted "precession bias" is always positive.

### Iconic tosses and the few rotations bias

We visualize an "iconic toss" done standing; the coin moves roughly vertically up,
rising a height of 2 or 3 feet, spinning rapidly, and is caught in the open hand at
around the level it was tossed.
The obvious elementary analysis of coin tossing is that a coin lands "same way up"
or "opposite way up" according to whether the number r of full rotations (r real,
because a rotation may be incomplete) is in [n - 1/4, n+1/4] or in [n + 1/4, n+3/4]
for some integer n. When the random r for a particular individual has large spread
we expect these chances to average out to be very close to 1/2; but when r has small
spread, in particular when its mean \mu is not large, one expects a "few rotations
bias" toward "same way up" if \mu is close to an integer,
or toward "opposite way up" if \mu is close to a half integer.
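
To get a feel for the size of this effect, one can put a concrete distribution on r. The sketch below assumes r is Normal with mean mu and standard deviation sigma; that distributional choice, the function name `prob_same_way_up`, and the example parameter values are all illustrative assumptions, not part of the experiment.

```python
from math import ceil, floor
from statistics import NormalDist

def prob_same_way_up(mu, sigma):
    """P(r lies in [n - 1/4, n + 1/4] for some integer n), r ~ Normal(mu, sigma).

    ASSUMPTION: the Normal model for the number of rotations r is purely
    illustrative; the elementary analysis only needs some distribution with
    the stated mean and spread.
    """
    dist = NormalDist(mu, sigma)
    # Integers further than 10 standard deviations from mu contribute negligibly.
    lo = floor(mu - 10 * sigma) - 1
    hi = ceil(mu + 10 * sigma) + 1
    return sum(dist.cdf(n + 0.25) - dist.cdf(n - 0.25) for n in range(lo, hi + 1))

# Small spread: strong bias, whose direction depends on where mu sits.
print(prob_same_way_up(3.0, 0.1))   # mu at an integer: mostly "same way up"
print(prob_same_way_up(3.5, 0.1))   # mu at a half integer: mostly "opposite way up"
# Large spread: the chances average out to very nearly 1/2.
print(prob_same_way_up(3.0, 2.0))
```

Under this model the bias falls off very quickly as the spread grows: with a standard deviation of even one full rotation the result is already indistinguishable from 1/2 for practical purposes.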

### Detailed protocol

To avoid tiredness when tossing standing up, the participants sat on the floor.
One person did a long sequence of tosses (all starting the same way up) while the other recorded the result directly
onto the spreadsheet.
Tosses where the coin was dropped were disregarded.
Dates, times and person tossing were also recorded on the spreadsheet.
The coin used was an ordinary dime.
Visually, the tosses were typically
rather low (maybe 18 inches high), rotating moderately fast, and angled rather than purely vertical.