class: blueBack ## How to Tell if an Election Has Been Hacked ### Philip B. Stark #### Department of Statistics, University of California, Berkeley #### http://www.stat.berkeley.edu/~stark @philipbstark ### .white[Nerd Nite East Bay] ### Oakland, 26 November 2018 --- .center[
] --- ### Arguments that US elections can't be hacked: .medium[ - Physical security - Not connected to the Internet - Tested before election day - Too decentralized ] --- ### Arguments that US elections can't be hacked: .medium[ .red[ - Physical security + "sleepovers," unattended equipment in warehouses, school gyms, ... + locks use minibar keys + bad/no seal protocols, easily defeated seals + no routine scrutiny of custody logs, 2-person custody rules, ... ] - Not connected to the Internet - Tested before election day - Too decentralized ] --- ### Arguments that US elections can't be hacked: - .red.strikethrough[Physical security] .red[ - Not connected to the Internet + remote desktop software + wifi, bluetooth, cellular modems, ... + removable media used to configure equipment & transport results - Zip drives (now only available used) - USB drives. Stuxnet, anyone? + parts from foreign manufacturers, including China; Chinese pop songs in flash ] - Tested before election day - Too decentralized --- ### Arguments that US elections can't be hacked: - .red.strikethrough[Physical security] - .red.strikethrough[Not connected to the Internet] .red[ - Tested before election day - Dieselgate, anyone? ] - Too decentralized --- ### Arguments that US elections can't be hacked: - .red.strikethrough[Physical security] - .red.strikethrough[Not connected to the Internet] - .red.strikethrough[Tested before election day] .red[ - Too decentralized - market concentrated: few vendors/models in use - vendors & EAC have been hacked - demonstration viruses that propagate across voting equipment - "mom & pop" contractors program thousands of machines, no IT security - changing presidential race requires changing votes in only a few counties - many weak links ] --- ## Evidence-Based Elections: 3 C's .medium[ + Voters .blue[_CREATE_] complete, durable, verified audit trail. ] -- .medium[ + LEO .blue[_CARES FOR_] the audit trail adequately to ensure it remains complete and accuirate. ] -- .medium[ + Verifiable audit .blue[_CHECKS_] reported results against the paper ] ---
--- ## Risk-Limiting Audits .medium[ + Endorsed by NASEM, PCEA, ASA, LWV, CC, VV, ... ] -- .medium[ + Large chance of requiring a full hand count of the votes, if that would show the outcome is wrong. (Full hand count corrects wrong outcomes.) ] -- .medium[ + Most efficient options: **ballot-polling** and **ballot-level comparison** ] --- .medium[ + 255 state-level pres. races, 1992–2012, 10% risk limit - BPA expected to examine **fewer than 308 ballots** for half. ] -- .medium[ + 2016 presidential election, 5% risk limit - BPA expected to examine **~700k ballots nationally** (\\(<0.5\\)%) ] --- #### Ballot-Polling v. Ballot-Level Comparison, 2 Candidates, 10% Risk Limit | Winner's share | median BPA | 90th %tile BPA | comparison audit | |:---------- | ------: | ------: | -------:| -------:| |70% | 22 | 60 | 8 | |65% | 38 | 108 | 17 | |60% | 84 | 244 | 25 | |58% | 131 | 381 | 31 | |55% | 332 | 974 | 50 | |54% | 518 | 1,520 | 63 | |53% | 914 | 2,700 | 83 | |52% | 2,051 | 6,053 | 125 | |51% | 8,157 | 24,149 | 250 | |50.5% | 32,547| 96,411 | 500 | --- ## Risk-Limiting Audits .medium[ + ~30 pilot audits in CA, CO, IN, OH, VA, DK; MI 12/2018 + CO law in effect; RI 2019; CA has pilot laws ] --  --- ## RLA pseudo-algorithm -- .large[ ``` while (!(full handcount) && !(strong evidence outcome is correct)) { audit more } ``` ] -- .large[ ``` if (full handcount) { handcount result is final } ``` ] -- .medium[Chance RLA won't correct wrong outcome \\(\le \alpha\\), pre-selected **risk limit**. "Wrong" means full handcount would belie it. ] --- .medium[ Magic shop claims coin will land heads 2/3 of the time. ] -- .medium[Can you test whether chance of heads is > 1/2?] -- .blue.medium[HHHHHHHHHH …] -- .red.medium[Is that strong evidence that the coin isn't fair?] -- .medium[What about .blue[THHHHHHTTH …]? ] --- .medium.blue[ + According to shop: - \\(\Pr(\mathrm{H}) = 2/3\\) - \\(\Pr(\mathrm{T}) = 1/3\\) ] -- .medium.red[ + If coin is fair: - \\(\Pr(\mathrm{H}) = 1/2\\) - \\(\Pr(\mathrm{T}) = 1/2\\) ] --- SPRT \\(t = 1\\) -- .medium[After each toss, ] $$ t \leftarrow t \times \frac{\Pr(\mbox{outcome if coin is fair})}{\Pr(\mbox{outcome if shop is right})}$$ -- .medium[ + If toss lands heads, \\(t \leftarrow t \times \frac{1/2}{2/3} = t \times 1/3\\) ] -- .medium[ + If toss lands tails, \\(t \leftarrow t \times \frac{1/2}{1/3} = t \times 3/2\\) ] -- .blue.medium[Theorem (Wald, 1945):] If the coin is fair, the chance that \\(t\\) ever gets below \\(p\\) is at most \\(t\\) --- ### _P_-values .blue[HHHHHHHHHH]: \\(t=(1/3)^{10} = 0.000017 \\) .blue[THHHHHHTTH]: \\(t=(3/2)^3(1/3)^7 = 0.0015 \\) --- ## Coins to ballots .medium[Box model if store is telling the truth:] ``` | 1 | | 0 1 | |________| ``` -- .medium[Box model if coin is fair:] ``` | | | 0 1 | |________| ``` --- .medium[Consider names instead of numbers: + "1" = vote for the reported winner + "0" = vote for the reported loser ] -- .medium[If winner really got 2/3 of the votes:] ``` | winner | | 0 1 | | loser winner | | 0 1 | | winner | | 0 1 | | loser winner | --> | 1 | | winner | | 1 | | loser winner | | 1 | |_________________| |_________________| ``` -- .medium[If outcome was really a tie:] ``` | loser winner | | 0 1 | | loser winner | | 0 1 | | loser winner | --> | 0 1 | | loser winner | | 0 1 | |_________________| |_________________| ``` --- ### More than two candidates, undervotes, overvotes, ... ``` | Alice Chris | | | | Bob Alice | | | | Bob Chris | | | | Donna Elle | | | | blank | | | | invalid blank | |____________________| ``` -- + .medium[test one reported (winner, loser) pair at a time] -- + .medium[condition on showing a vote for one member of the pair] -- + .medium[no multiplicity issue: conjunction, not union] ---
---
---
---
---
---
--- .blue.medium[Markov's inequality] .medium[If \\(X\\) is a non-negative random variable, then for all \\(a > 0\\),] $$ \Pr \\{ X \ge a \\} \le \mathbb{E}X/a.$$ --
-- $$\mathbb{E}X \ge (1-p) \times 0 + p \times a $$ $$ p \le \mathbb{E}X/a $$ --- .blue.medium[Kolmogorov's inequality] .medium[If \\(\\{X_j \\}\\) is a non-negative Martingale, then for all \\(a > 0\\),] $$ \Pr \\{ \sup_j X_j \ge a \\} \le \mathbb{E}X_1/a.$$ I.e., Markov's inequality applies simultaneously to every term in a non-negative Martingale. -- .medium[Likelihood ratios are non-negative Martingales wrt denominator.] --- #### "Super-simple" ballot-level comparison audits via Kolmogorov's inequality $$ n \ge \frac{ a + b o_1 + c o_2 - d u_1- e u_2}{m}$$ -- .medium[ \\(a, b, c, d\\) depend on risk limit & desired operating characteristics \\(o_1, o_2\\) one-vote and two-vote overstatements \\(u_1, u_2\\) one-vote and two-vote understatements \\(m\\) "diluted" margin (includes all ballots) ] --- ## Sampling ballots: requirements .medium[ + ballots (25% of US voters don't have) + ballot manifest + good, transparent, verifiable source of randomness - 20 public rolls of transluscent 10-sided dice + quality, replicable PRNG - SHA256-based cryptographically secure PRNG, open source $$ X_j = \mathrm{int}(\mathrm{SHA}(\mathrm{seed},j))$$ ] --- 