Kriging With ArcGIS  | 
|
|   Adding 
      Layers of Data in ArcGIS : 
        2. After downloading and unzipping the files, add the layers to your new 
        map by opening ArcMap clicking File, Add Data, and selecting the layers 
        you want to add from the appropriate folders. 
        3. If you want labels displayed on your map, you can right-click on the 
        layer and select Label Features. Preparing your data and importing into ArcGIS 
        2. Note that ArcGIS assumes that input x and y coordinates correspond to 
        the x and y coordinates of the coordinate system 
        of the data frame (longitude and latitude if you are using a geographic 
        coordinate system). If this is not the case for your data, you may need 
        to further examine the documentation to see how this might be handled. 
         
        3. For response variables that are constrained to be positive, such as 
        concentrations, a common approach is to take the natural logarithm of 
        the values. This tends to produce data that more closely match the underlying 
        statistical assumptions involved in kriging. This and other transformations 
        could also be done within ArcGIS before doing kriging.  
        4. Your input file should be comma delimited, which means that there should 
        be commas, and only commas, between the values in the file. Each line 
        should pertain to a single location, with the x and y coordinates of the 
        location and the response value, all separated by commas. One way to create 
        this format is in Microsoft Excel. Open the file that you want to convert. 
        Go to File->Save as, and under 'Save as type', select 'CSV'. Hit Save, and Yes to the window 
        that pops up. When you quit Excel, you do not need to save any changes. 
        In S-plus or R, if you use the write.table() function to create the data 
        file, the argument ' sep="," '  will create a comma 
        delimited file. You can also use dbf files, which can also be created 
        with MS Excel.  
        5. Open your map in ArcMap, and go to Tools->Add XY data, click the folder 
        icon on the top right, and browse through the folders to find the .csv 
        file that you just created. Then hit OK. Your new layer should appear 
        in the Layers column on the left. It will also display the layer as individual 
        points on your map.  Overview of Kriging Methodology: Kriging is a statistical technique that posits a certain statistical model for the data, namely that the response at a given location is the sum of two components: an unknown underlying surface, which we are trying to estimate, plus some additional noise. Kriging produces an estimate of the underlying (usually assumed to be smooth) surface by a weighted average of the data, with weights declining with distance between the point at which the surface is being estimated and the locations of the data points. The exact nature of the decline is based on modelling the covariation between data at various spatial locations. Data points, and the associated surface, at nearby locations are assumed to be more similar to each other than data points at locations that are distant from each other. There are many ways to estimate the covariance structure in spatial data and to use this information to create a kriged surface. ArcGIS uses a somewhat ad hoc method, estimating something called the variogram using weighted least squares, and then relies on some further ad hoc approaches to reduce the computational burden of producing an estimate of the underlying surface and estimates of uncertainty in the surface estimate. For a surface estimate based on a default approach to estimating the variogram, one can use the Spatial Analyst. The Geostatistical Analyst is more extensive and customizable: one can either use the default approach or customize the variogram fit and computational details. I would characterize both the default and customizable approaches as being ad hoc and not particularly principled statistically, but they may still produce reasonable surface estimates, since the main factor is that nearby data points be weighted in a reasonable fashion when estimating the surface. One should note that the uncertainty estimates for the kriged surface do NOT account for the uncertainty involved in estimating the covariance structure. Next I describe how to perform kriging in both Spatial Analyst and Geostatistical Analyst. Among the advantages of the implementation of kriging in Geostatistical Analyst relative to that in Spatial Analyst are the ability to use the Matern covariance model favored by many statisticians, the ability to handle directionality in the data, and the ability to make plots of prediction errors as a way of assessing uncertainty.  Kriging using Geostatistical Analyst:
      Using the Geostatistical Analyst, one can create a surface estimate based 
        on default modelling of the covariance structure, or one can modify the 
        covariance structure in a number of ways.  
        2. Select Geostatistical Wizard on the Geostatistical Analyst toolbar. 
         
        3. Select the input data (layer) to krige and the attribute (response, 
        i.e., z) variable, as well as the ‘Kriging’ method in the 
        lower left. Then choose ‘Next’. 
        4. (Step 1) Select Ordinary Kriging->Prediction Map. You also presumably 
        have the option of transforming the data, such as using the natural logarithm, 
        if you haven't done so already, though in my setup, ArcGIS gives me no 
        options under ‘Transformation’. Click ‘Next’. 
        5. (Step 2) The next step involves estimating the covariance structure 
        using the empirical variogram or empiral covariogram. The default is the 
        variogram, and there is some evidence in the literature for preferring 
        this when using the weighted least squares approach, as in ArcGIS. I suggest 
        using the K-Bessel (known more commonly as the Matern) model; this has 
        become popular in the spatial statistics literature. In particular, the 
        Matern model tends to produce surfaces that are more smooth locally (on 
        a very fine scale) than some other models (such as the exponential or 
        spherical).  This is often desirable, since the unknown underlying surface 
        can often plausibly be considered to be locally smooth on a very fine 
        scale. I suggest using the values for the Matern model estimated by ArcGIS. 
        If you wish to change the values, you will want to read the documentation 
        for ArcGIS and probably some basic material on spatial statistics. Note 
        that if the estimated value for the ‘Parameter’ of the Matern 
        is less than one, then the estimated surface is not differentiable (0.5 
        is the exponential model and values approaching infinity tend toward the 
        Gaussian model). I also suggest clicking on 'Measurement Error' and sliding 
        the bar to '100% Measurement Error', which indicates that your data are 
        measured with uncertainty. The documentation claims that by doing so, 
        you avoid interpolating the data (having the estimated surface go exactly 
        through your data points), although I have not seen any differences in 
        the estimated surface whether the Measurement Error box is clicked or not clicked. Still, 
        to be safe, I suggest clicking the box. Then click ‘Next’. The one exception to using the default values is that you may want to investigate the possibility that your data have directionality to them, for example if they are influenced by prevailing winds or groundwater flow directions. An easy way to do this is to click on “anisotropy” toward the upper right of the Step 2 box and allow ArcGIS to estimate the necessary additional parameters automatically. To assess whether including directionality in your model is warranted you could compare the root mean square prediction error given in Step 4 for the models with and without anisotropy. 
 6. (Step 3) The next step determines the details of how ArcGIS approximates 
        the surface estimate so that the computations can be done quickly. Theoretically, 
        the estimated surface at any point should be based on all of the data 
        points, but ArcGIS uses only some of the points, indicated in the ‘Neighbors 
        to Include’ field. From a statistical viewpoint, there is more danger 
        in using too few than too many points (since the kriging method essentially 
        ignores data points far from the location at which predictions are being 
        made anyway). I suggest using a sizable fraction of the data, to the extent 
        that it does not take too long to estimate the surface. For example for 
        100 data points, I would try to use at least 25 neighbors, and more if 
        possible. For 1000, I would use at least 25-50 and ideally a few hundred, 
        but the computations may be too slow with this many. I would unclick the 
        ‘Include at Least’ box, which preferentially includes data 
        points based on direction, unless you have modeled directionality on the 
        previous screen. Note that the colored map of points indicates the relative weights 
        assigned to each data point for use in predicting the surface at the focal 
        location at the center of the circle/ellipse. Click ‘Next’. 
        7. (Step 4) The next screen assesses how good your predictions are based 
        on cross-validation (leaving points out of the fitting and then estimating 
        the value and comparing to the actual value). The root-mean-square prediction 
        error can be compared between different models as a way of choosing a 
        model (such as whether to include directionality). The ArcGIS documentation 
        gives more detail about this. Click ‘Finish’ and then ‘OK’ 
        to produce the surface map. 
        8. Your new layer should now appear on the map. If you would like to 
        change the colors, you can right-click on the layer from the Layers column 
        and go to 'Properties'. On the top menu, select 'Symbology'. You can then 
        change the colors in 'Color Ramp'. You can also change the number of colors 
        that appear on the map by selecting a different number from the drop-down 
        menu.  
        9. Estimating the uncertainty in your predicted surface is an important 
        aspect of spatial analysis. If the uncertainty is so great that peaks 
        and valleys in the surface may not truly exist, then it would be ill-advised 
        to try to interpret those peaks and valleys as representing real features 
        of your data. To estimate the standard errors at each location, as presented 
        in an additional map, use the exact same steps as above, but in Step 1, 
        select Ordinary Kriging->Prediction Standard Error Map. The interpretation 
        of the confidence level indicated by the standard error at each location 
        is that if you repeated the experiment many times, collecting data over 
        and over again, in approximately 67% of the experiments, the true surface 
        at a point would lie within one standard error of the estimated surface 
        produce. Note that this uncertainty calculation does NOT account for the 
        uncertainty in fitting the covariance structure, which might be substantial.  Quick and dirty default kriging using Spatial Analyst:
      Using the Spatial Analyst, one can create a kriged spatial surface using 
        default parameter estimates in ArcGIS. Note that this approach does not 
        handle directionality, for which one would need to use the more extensive 
        options in Geostatistical Analyst.  
        1. Make sure the Spatial Analyst is enabled (Tools->Extensions) and 
        the toolbar is visible (View->Toolbars). 
        2. On the 'Layers' column, select the layer you want to krige. This will 
        usually be the .csv layer you just created.  
        3. Go to the Spatial Analyst toolbar, Interpolate to raster->Kriging. 
        Make sure the correct layer appears in the 'Input Points'. 
        4. On the 'Z Value Field', select the Z variable from the drop-down menu. 
         
        5. Select the ‘Ordinary’ kriging method. 
        6. I would ordinarily suggest using the Matern model, but this is only 
        avaible in Geostatistical Analyst, so my next choice is the exponential 
        model. A theoretical drawback to this model is that the result surface 
        is not particularly smooth (continuous but not differentiable), but the 
        plots I have seen seem reasonably smooth in practice. The Gaussian model 
        (which has nothing to do with whether the z values have a Gaussian distribution) 
        is a possibility, but I have found that the resulting surface can be very 
        unsmooth, probably because of some numerical instability inherent in the 
        kriging calculations based on the Gaussian model. 
        7. In the 'Search Radius Settings', I suggest using at least 25 (more if 
        possible), unless the surface estimate takes too long to produce. This 
        determines how many data points are used in predicting surface values 
        at each location. See my comments about this issue in the discussion of Geostatistical Analyst. 
        Then hit OK. 
        8. Your new layer should now appear on the map. If you would like to change 
        the colors, you can right-click on the layer from the 'Layers' column and 
        go to 'Properties'. On the top menu, select 'Symbology'. You can then change 
        the colors in 'Color Ramp'. You can also change the number of colors that 
        appear on the map by selecting a different number from the drop-down menu.  Some comments on mapping in ArcGIS: ArcGIS provides various colors for mapping. In creating a map, one wants to make sure that a legend is included in any final output, so that viewers can interpret the levels and understand the range of the surface and differences in the response between different levels. This can be done using Insert->Legend. The color scheme should vary in a smooth fashion so that the user does not interpret sharp changes in color to be sharp changes in the surface unless this is truly the nature of the surface estimate. I believe there are principled ways of choosing a color scheme, but I haven't yet looked for further information on this. For surfaces that do not have negative values, one possibility is to use a color scheme such as used on topographic maps. For surfaces with both positive and negative values, I suggest using a blue-red color scheme, with white for values of zero and deepening shades of blue and red for increasingly negative and positive values, respectively. Interpolating methods in ArcGIS: Note that ArcGIS also provides 
        spline and radial basis function (RBF) interpolators that can create spatial 
        surfaces. These methods have the distinct drawback of producing surfaces 
        that go exactly through all the data points, which is generally a bad 
        idea with noisy data, from which we would like to create a smoothed surface. 
        Note that spline and RBF methods can defined in a statistical way to 
        produce smoothing methods that are not forced to interpolate the data 
        (e.g., the methods in the R mgcv library and in SemiPar), but this is 
        not done in ArcGIS.   | 
|