The State of the Art, and Opportunities for Further Advances

Department of Statistics

University of California, Berkeley

**Data uncertainties**

- Measurement error--systematic and random
- Approximations/assumptions in data reduction and data processing
- Programming errors in the data collection or data reduction
- Incorrect probability model for data errors

**Model (physical) uncertainty**

- Approximations to the forward model (*e.g.*, continuum mechanics, ray theory)
- Incompletely known forward model (*e.g.*, data-dependent forward model)
- Errors in the math

**Approximation of the solution**

- Parametrization / discretization / approximation by a finite-dimensional subspace (ideally chosen to make calculations and constraints exact)

**Approximate calculation/estimation**

- Numerical issues
- Approximate computations, *e.g.*, numerical integration, numerical PDE solvers (also treat as discretization)
- Iterative or stochastic algorithms: reproducibility of computations, sensitivity to starting guess, sensitivity to termination criterion, *etc.*
- Programming errors

I've worked on problems in cosmology, demography, earthquake prediction, electroencephalography, fishery management, geochemistry, geomagnetism, gravimetry, helioseismology, IC mask manufacture, seismic imaging, spectrum estimation, tomography, and water treatment, among others.

In my experience, the data errors are the most difficult to assess, but the approximation of the solution is the most often completely ignored.

In my experience, coding errors can evade years of testing and debugging, even in fairly straightforward problems.

In my experience, very little can be inferred from data without *a
priori* constraints---knowledge of the subject matter is crucial.

In my experience, what scientists instinctively try to estimate, and what they can estimate, are quite different.

In my experience, scientists often think they need to estimate an entire
function (*e.g.*, a distributed parameter of a PDE) to answer the scientific
question.

Unfortunately, in every problem I've seen, the bias in function estimates
is unbounded unless there are extremely restrictive *a priori* constraints on the
function.

I have never seen *a priori* constraints sufficiently
stringent to reduce the bias in estimating a function to practical levels.

In my experience, many or most scientific questions can be reduced to inference about scalars, rather than inference about functions---but it is rarely done.

I have seen examples in cosmology, geomagnetism, geochemistry, gravimetry, helioseismology, and seismic tomography, in which physically motivated constraints (together with the data) suffice to make inferences about interesting functionals of the model, properly accounting for discretization error and systematic data error.

Scientists and statisticians need each other to hone the scientific questions, to uncover useful constraints, to devise useful, practical experiments, and to understand the limitations of the data.

My greatest feelings of success: geochemistry at SDR and helioseismology with the GONG project.

SDR cooperated in designing the experiments needed to calibrate and test the estimation method (and did the experiments, including repeated experiments to characterize the uncertainty from sample preparation and instrumental noise).

GONG co-opted me into many years of development, testing, and debugging of the data reduction pipeline.

The world is linear, and data have Gaussian errors. (Even if that were true, data processing tends to corrupt things.)

Experimenters know the size of the random observational errors in their experiments.

Experimenters know the size of the systematic errors in their experiments (countless examples of incommensurable experiments).

If the data collection / forward model has limited resolution, the solution needs only limited resolution.

If you don't know something, set it to zero and proceed (the ostrich axiom).

If there is something you would like to estimate, it is estimable.

Ignorance is equivalent to a flat (uniform) prior probability distribution.
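One way to see the problem: flatness is not invariant under reparametrization, so "ignorance" about a parameter implies strong opinions about its transforms. A quick numerical sketch (the parametrization and sample size are mine, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 1.0, size=100_000)  # "ignorant" flat prior on theta
squared = theta ** 2                          # the same ignorance about theta**2?

# If flatness meant ignorance, theta**2 would also be flat on [0, 1].
# Instead its density is 1/(2*sqrt(t)), piling mass near 0:
frac_below_quarter = np.mean(squared < 0.25)
print(frac_below_quarter)  # about 0.5, not the 0.25 a flat prior would give
```

A prior that is flat in one parametrization is informative in another, so "flat" cannot mean "knows nothing."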

Reproducing results using the same or similar data and qualitatively similar algorithms (example in seismic tomography of the core-mantle boundary)

Taking the curvature of the objective functional at the putative optimizer as a measure of the overall uncertainty (examples in cosmology and helioseismology)

Reproducing a model from synthetic data, when the synthetic data derive from the same parametrization used in the estimator (examples everywhere)

Reflexive use of standard statistical tests, or use of inappropriate tests (examples in the census, seismic tomography, earthquake prediction, ... )

"Formal" errors (examples everywhere)

Using likelihoods as posterior probabilities.

Optimization, optimization, optimization.

Easiest approach: falsification of hypotheses using finite-dimensional optimization.

Math tools to probe infinite-dimensional problems:

Functional analysis, nonsmooth analysis, convex analysis

Optimization in infinite-dimensional spaces: Fenchel and Lagrangian duality, linear and nonlinear programming

Probability, and nonparametric and robust statistics

A general-purpose approach that gives conservative bounds:

- set the misfit tolerance using the sum of a bound on systematic data errors and a probabilistic bound on stochastic data errors
- bound systematic errors from mis-modeling (Cauchy-Schwarz, mixed-norm inequalities, modulus of continuity--sometimes *a posteriori* bounds are sharper)
- find extrema of the functional of interest among all models that satisfy the data "adequately" and meet the *a priori* constraints (when the problem cannot be solved exactly, sometimes it still can be bounded using duality--bracket the uncertainty)
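As a toy illustration of this recipe (the forward matrix, error levels, a priori bounds, and functional are all invented here; this is a sketch, not the method applied to any real problem), one can bracket a linear functional by linear programming:

```python
import numpy as np
from scipy.optimize import linprog
from scipy.stats import norm

rng = np.random.default_rng(1)
n_model, n_data = 30, 10

# Toy linear forward problem: d = A @ x_true + noise.
A = rng.uniform(size=(n_data, n_model)) / n_model
x_true = rng.uniform(0.0, 1.0, size=n_model)   # a priori: 0 <= x <= 1
sigma, sys_bound = 0.01, 0.005                  # assumed error levels
d = A @ x_true + rng.normal(0.0, sigma, size=n_data)

# Per-datum misfit tolerance: bound on systematic error plus a
# probabilistic bound on the stochastic part (simultaneous via Bonferroni).
alpha = 0.05
tol = sys_bound + sigma * norm.ppf(1 - alpha / (2 * n_data))

# Functional of interest: average of the model over the first third.
c = np.zeros(n_model)
c[: n_model // 3] = 1.0 / (n_model // 3)

# Models that satisfy the data "adequately": |A @ x - d| <= tol, elementwise.
A_ub = np.vstack([A, -A])
b_ub = np.concatenate([d + tol, tol - d])
bounds = [(0.0, 1.0)] * n_model                 # a priori constraints

lo = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
hi = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(lo.fun, -hi.fun)  # conservative bracket for c @ x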

The class of estimators this yields includes some statistically optimal procedures for finding confidence intervals.

The misfit measure can be tailored for the geometry of the observation functionals, the systematic errors, the constraints, and the functional to be estimated.

Often the *l*_{2} measure of misfit (as in least squares) is
not optimal---it can even be statistically inconsistent.
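A standard illustration of that inconsistency (a toy example of mine, not drawn from the applications above): estimating a location parameter under heavy-tailed errors, where the *l*_{2} estimate is the sample mean and the *l*_{1} estimate is the sample median. For Cauchy errors, the mean of *n* observations has the same distribution as a single observation, so it never converges; the median does.

```python
import numpy as np

rng = np.random.default_rng(2)
# Location estimation with heavy-tailed (Cauchy) errors centered at 0:
# the l2 estimate (the mean) is inconsistent; the l1 estimate (the
# median) converges to the true location.
for n in (100, 10_000, 1_000_000):
    x = rng.standard_cauchy(n)
    print(n, np.mean(x), np.median(x))
```

The medians shrink toward 0 as *n* grows, while the means wander.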

This approach can be quite demanding computationally.

Pay more attention to systematic data errors--garbage in, garbage out.

Pay more attention to the distribution of stochastic data errors--GIGO. Characterize data error distributions by experiment.

Use statistically robust or nonparametric methods.

When possible, use discretizations that allow exact computations, exact application of constraints, and error bounds.

Use a variety of algorithms. Take many random starting points for iterative algorithms.
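A sketch of the multistart advice for a deliberately multimodal toy objective (the function, seed, and number of starts are invented for illustration):

```python
import numpy as np
from scipy.optimize import minimize

# A multimodal objective (invented for illustration): a local optimizer
# started from a single guess can stall in a shallow local minimum.
def f(x):
    return np.sin(5 * x[0]) + 0.1 * x[0] ** 2

rng = np.random.default_rng(3)
starts = rng.uniform(-5, 5, size=(20, 1))    # many random starting points
results = [minimize(f, x0) for x0 in starts]
best = min(results, key=lambda r: r.fun)
print(sorted(round(r.fun, 3) for r in results))  # typically several distinct minima
print(best.x, best.fun)
```

The spread of the converged values is itself diagnostic: if different starts give different answers, a single run was not telling the whole story.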

Use more than one approach, and more than one programmer.

Search for a variety of solutions/models.

Get a rabid skeptic on the team.

Distinguish between fitting models to data, and inference. Fitting models to data is exploratory data analysis, not inference.

Don't trust features of models fitted to data.

Formulate hypotheses and attempt to falsify them.

Think carefully about the right questions to ask. Make inferences about scalars--don't try to estimate functions.

P.B. Stark. 21 January 1999