Global Circulation Models
This is another topic from our summer workshop.
It comes from Claudia Tebaldi.
We are all familiar with the debate about Global Climate Change
and the general conclusions of the IPCC. This lab has the students
explore the global models used by the IPCC to simulate and predict
climate under different scenarios. Students explore "observed"
climate over a 128 by 64 grid of locations over the surface of the
earth for 6 decades, 1950-2010 (actually 2005). The data are
aggregated across the years within each decade for each month. So we
have a 4-dimensional array with dimensions 128 x 64 x 12 x 6
(longitude x latitude x month x decade). We have these data for both
surface temperature and precipitation.
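As a rough illustration of working with such an array, the sketch below
builds a small synthetic stand-in and collapses the month dimension to get
one value per decade for a grid cell. The names temp, annual_mean and
decade_series are hypothetical, the values are random numbers rather than
the workshop data, and the dimensions are shrunk so that plain Python
lists stay manageable.

```python
import random

# Small stand-in dimensions; the lab's real array is 128 x 64 x 12 x 6.
N_LON, N_LAT, N_MONTH, N_DECADE = 4, 3, 12, 6

random.seed(0)
# temp[lon][lat][month][decade]: synthetic values, NOT the workshop data.
temp = [[[[random.gauss(15.0, 5.0) for _ in range(N_DECADE)]
          for _ in range(N_MONTH)]
         for _ in range(N_LAT)]
        for _ in range(N_LON)]


def annual_mean(arr, lon, lat, decade):
    """Collapse the month dimension for one grid cell and one decade."""
    months = [arr[lon][lat][m][decade] for m in range(len(arr[lon][lat]))]
    return sum(months) / len(months)


def decade_series(arr, lon, lat):
    """Annual-mean value per decade for one grid cell (a short time series)."""
    n_decade = len(arr[lon][lat][0])
    return [annual_mean(arr, lon, lat, d) for d in range(n_decade)]


series = decade_series(temp, lon=0, lat=0)
```

The same pattern (fix some indices, average over the rest) extends to
collapsing over longitude or latitude instead of month.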
Next we have the predicted values from the Global Circulation
Models (GCMs) for a) these six decades, and b) from 2010 to 2099,
i.e. for both the period that overlaps the "observed" climate data
and the future.
We also have
 [ocean] a matrix that identifies which (longitude, latitude)
locations are land or sea;
 [regions] a matrix which identifies which (longitude, latitude)
locations are in each of 22 different geographical regions of
interest;
 a matrix of weights used to adjust the values for area
distortions caused by the curvature of the earth;
 the actual longitude and latitude values.
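
The weights and the land/sea matrix are typically used together when
averaging a field over a region. The sketch below is one way this might
look. It assumes weights proportional to the cosine of latitude, which is
a common area correction but not necessarily the one supplied with the
lab data. The function names and the toy field are hypothetical.

```python
import math


def cos_lat_weights(n_lon, latitudes_deg):
    """Build a weights[lon][lat] matrix proportional to cos(latitude).

    Assumption: the lab's own weight matrix may be defined differently.
    """
    row = [math.cos(math.radians(lat)) for lat in latitudes_deg]
    return [list(row) for _ in range(n_lon)]


def weighted_mean(field, weights, keep=None):
    """Weighted mean of field[lon][lat]; keep[lon][lat] == True retains a cell."""
    total = 0.0
    weight_sum = 0.0
    for i, column in enumerate(field):
        for j, value in enumerate(column):
            if keep is not None and not keep[i][j]:
                continue
            total += weights[i][j] * value
            weight_sum += weights[i][j]
    if weight_sum == 0.0:
        raise ValueError("mask excludes every grid cell")
    return total / weight_sum


# Toy 2 x 3 field (2 longitudes, latitudes -60, 0 and 60 degrees).
field = [[0.0, 10.0, 20.0], [0.0, 10.0, 20.0]]
weights = cos_lat_weights(2, [-60.0, 0.0, 60.0])
land = [[False, False, True], [False, False, True]]  # hypothetical land mask

global_mean = weighted_mean(field, weights)
land_mean = weighted_mean(field, weights, keep=land)
```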

 Observations
 The students explore the observed data around the world
for different months and familiarize themselves with the
range of values. Precipitation is often taken as a ratio.

 Comparing Models

The next step is to compare the model predictions to the
observed data. The students can visualize this in various
ways, e.g. as time series or as maps at different times.
They can also compute statistics about the differences and try
to see which models do well in which parts of the world and in which
seasons. The students try to decide which models do best and which
perform poorly.
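
One simple statistic for this comparison is the root mean squared error
(RMSE) between a model's predictions and the observed values over some
set of grid cells, months and decades. The sketch below is only one
possible choice of statistic. The model names and numbers are made up
for illustration.

```python
import math


def rmse(predicted, observed):
    """Root mean squared difference between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def rank_models(predictions, observed):
    """Return (model name, RMSE) pairs sorted from best to worst."""
    scores = [(name, rmse(values, observed))
              for name, values in predictions.items()]
    return sorted(scores, key=lambda pair: pair[1])


# Toy data: observed values and two hypothetical models.
observed = [1.0, 2.0, 3.0]
predictions = {"gcm_a": [1.0, 2.0, 3.0], "gcm_b": [2.0, 3.0, 4.0]}
ranking = rank_models(predictions, observed)
```

Restricting the inputs to one region or one season before calling
rank_models gives the region-by-season comparison described above.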

 Future
 The students also compare how the models predict the future
and understand the range of outcomes. They should also see how
the predictions for the entire 150 years evolve for each model,
i.e. characterize the change from the first 6 decades to the
last 9.
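
A minimal way to summarize this change for one model is the difference
between the mean of the last 9 decades and the mean of the first 6. The
sketch below assumes a 15-value series (one value per decade, e.g. an
area-weighted global mean). The function name and the numbers are
hypothetical.

```python
def period_change(series, n_baseline=6):
    """Mean of the later decades minus mean of the first n_baseline decades."""
    if not 0 < n_baseline < len(series):
        raise ValueError("need at least one baseline and one future decade")
    baseline = series[:n_baseline]
    future = series[n_baseline:]
    return sum(future) / len(future) - sum(baseline) / len(baseline)


# Toy series: 6 baseline decades at 14 degrees, 9 future decades at 16.
toy_series = [14.0] * 6 + [16.0] * 9
change = period_change(toy_series)
```

Computing this for every model and looking at the spread of the results
gives a first impression of the range of outcomes.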

 Averaging the models
 Next we move to combining the models. We explore how, if we take
an average of all 21 models, the prediction error does go down, but
not by the factor of 1/sqrt(21) that we would expect if the models'
errors were independent.
This is because the models are not independent.
We can do a simple experiment that has the students
average k models chosen at random and compute the
prediction error. We do this for k = 1, 2, 3, ..., 21
and plot the distributions within each k.
This shows that we need to average the models in a different way.
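
The experiment can be sketched with simulated data. Below, each of 21
synthetic "models" equals the truth plus an error component shared by all
models plus its own independent error. This error structure, the
parameter values and all names are assumptions made for illustration and
are not the lab's GCM output.

```python
import math
import random
import statistics


def simulate_models(n_models=21, n_points=200, shared_sd=1.0, own_sd=1.0, seed=1):
    """Synthetic truth and models whose errors share a common component."""
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
    shared = [rng.gauss(0.0, shared_sd) for _ in range(n_points)]
    models = [[t + s + rng.gauss(0.0, own_sd) for t, s in zip(truth, shared)]
              for _ in range(n_models)]
    return truth, models


def rmse_of_average(models, truth, chosen):
    """RMSE of the pointwise average of the chosen models against the truth."""
    squared = 0.0
    for p, t in enumerate(truth):
        avg = sum(models[m][p] for m in chosen) / len(chosen)
        squared += (avg - t) ** 2
    return math.sqrt(squared / len(truth))


def subset_errors(models, truth, k, n_draws=50, rng=None):
    """Errors for n_draws random subsets of k models (without replacement)."""
    rng = rng or random.Random(0)
    indices = range(len(models))
    return [rmse_of_average(models, truth, rng.sample(indices, k))
            for _ in range(n_draws)]


truth, models = simulate_models()
rng = random.Random(2)
median_error = {k: statistics.median(subset_errors(models, truth, k, rng=rng))
                for k in range(1, 22)}
```

In this simulation the shared error component does not average out, so
the median error levels off as k grows instead of shrinking like
1/sqrt(k). Plotting the full lists from subset_errors for each k (for
example as boxplots) shows the distributions within each k.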

 Two scientists (Giorgi & Mearns) propose a somewhat ad
hoc approach that uses weights when averaging the models.
The algorithm combines how well the model predicts the observed
data and how well it agrees with the other models for the future data.
The model weights (reliabilities) can be estimated iteratively using an
updating algorithm and a reasonable starting point.
We have the students explore these and try different exponents
that influence the weights.
The students can see which models get high weights and can
compare these to their characterizations from the exploratory phase.
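
The iterative weighting can be sketched as follows. This is a minimal
sketch that assumes the commonly cited form of the Giorgi & Mearns
reliability, in which each model's weight is the product of a bias factor
and a convergence factor, each equal to min(1, epsilon / |distance|) and
raised to the exponents m and n. The function name, the parameter names
and the toy numbers are hypothetical, and the details may differ from the
version used in the lab.

```python
def rea_weights(bias, change, epsilon, m=1.0, n=1.0, tol=1e-8, max_iter=100):
    """Iteratively estimate reliability weights and the weighted mean change.

    bias[i]   : model i's error against the observed data
    change[i] : model i's predicted future change
    epsilon   : scale of natural variability (assumed known, must be > 0)
    m, n      : exponents controlling the influence of the two criteria
    """
    def factor(distance):
        # Full reliability (1) inside epsilon, decreasing outside it.
        d = abs(distance)
        return 1.0 if d <= epsilon else epsilon / d

    r_bias = [factor(b) for b in bias]
    # Starting point: the unweighted mean of the model changes.
    center = sum(change) / len(change)
    weights = [1.0] * len(change)
    for _ in range(max_iter):
        r_conv = [factor(c - center) for c in change]
        weights = [(rb ** m * rc ** n) ** (1.0 / (m * n))
                   for rb, rc in zip(r_bias, r_conv)]
        new_center = sum(w * c for w, c in zip(weights, change)) / sum(weights)
        converged = abs(new_center - center) < tol
        center = new_center
        if converged:
            break
    return weights, center


# Toy example: the third model is biased and disagrees with the others.
toy_bias = [0.1, 0.2, 3.0]
toy_change = [2.0, 2.1, 5.0]
toy_weights, toy_center = rea_weights(toy_bias, toy_change, epsilon=0.5)
```

Varying m and n shows how strongly the outlying model is down-weighted
relative to the simple average of the changes.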


 Hierarchical Bayesian Model

The iterative approach just described is somewhat ad hoc
as there are free parameters that cannot be estimated but
which must be specified by the scientist.
We move to a Bayesian approach to characterize the
future predictions.
We start with the observed values (X_{o}) being Normal with an
unknown mean. The predictions of the observed decades for each
GCM (X_{j})
are modelled as Normal with the same mean but with a different
variance/reliability. Then the future predictions of each GCM
(Y_{j}) conditional on the predictions for the observed
decades (X_{j}) are Normal.
The mean of this Normal has a term which represents climate
change.
We can then generate a sample from an MCMC run for this climate
change parameter. Because of shared parameters in the
hierarchical model, these samples are effectively from a
combination of the different GCMs. We also get an estimate of
the distribution of this change and so can make statements
about the uncertainty.
The computational topics this lab explores include
 data manipulation (arrays, subsetting, collapsing dimensions)
 visualization (distributions, time series, maps)
 writing functions (for the iterative algorithm for estimating weights)
 Markov chain Monte Carlo (MCMC)
 efficiency of the MCMC.
Duncan Temple Lang
<duncan@wald.ucdavis.edu>
Last modified: Wed Jul 22 13:58:21 PDT 2009