My goal for this class was to learn which tools in ArcGIS’s Spatial Statistics toolbox might help answer my research questions about spatio-temporal relationships between mountain pine beetle outbreak and spread and coinciding environmental characteristics such as topography, climate, and forest attributes.  I quickly learned that the spatial statistics tools in ArcMap are not suitable for my data.  All of the tools expect vector data as inputs, and I am working with raster image data.  I can convert from raster to vector, but doing so makes little sense computationally or theoretically.  I also found that my image data needed much more pre-processing than I originally thought before it could be analyzed.  I spent the majority of the class time preparing my data, which included georegistration, disturbance detection, and disturbance causal agent identification.  Once I had completed these steps for a pilot image scene, I was able to take a quick look at outbreak elevation histograms through time, and at annual climate prior to and during the outbreak.

The remainder of this post will walk through the steps I took to process the imagery and present some initial findings on mountain pine beetle outbreak and environment relationships.

The first step in image preprocessing was to properly georegister the misregistered images.  Please click on the images below to see an example of an image before and after registration correction.

fiximg1 fiximggood

Figure 1. Satellite image before and after georegistration correction (click images to see animation)

The registration process was coded and automated using the R “raster” package.  It begins by sampling the image with points and extracting small image chips, or windows, around each point from both the reference image and the misregistered image (see figure below).

 

sample

Figure 2. Sampling the reference and misregistered images and extracting image chips from around the sample points.

Within the registration program, for each sample point, the misregistered image chip is “slid” over the reference image chip at every defined x and y offset distance, and a cross correlation coefficient is calculated for each shift from the intersecting pixel values.  Please click on the figure below to see an example of the moving window cross correlation.

 

itpfind  Click image to see animation

Figure 3.  Animation example of moving window cross correlation analysis between reference and misregistered image chips for iterative xy offset shifts.
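The sliding-chip search described above can be sketched in a few lines.  This is a minimal Python illustration, not the R code used for the project; the function names (pearson, correlation_surface, best_offset), the use of a Pearson coefficient, and the max_shift parameter are my own assumptions for the example.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat chip carries no information about the offset
    return cov / math.sqrt(var_a * var_b)


def correlation_surface(ref_chip, mis_chip, max_shift):
    """Correlate the chips at every (dx, dy) shift up to max_shift pixels.

    Chips are equal-sized 2D lists. For a shift (dx, dy), reference pixel
    (row, col) is paired with misregistered pixel (row + dy, col + dx), and
    only the intersecting pixels enter the coefficient.
    """
    rows, cols = len(ref_chip), len(ref_chip[0])
    surface = {}
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            ref_vals, mis_vals = [], []
            for r in range(rows):
                for c in range(cols):
                    rr, cc = r + dy, c + dx
                    if 0 <= rr < rows and 0 <= cc < cols:
                        ref_vals.append(ref_chip[r][c])
                        mis_vals.append(mis_chip[rr][cc])
            if len(ref_vals) > 1:
                surface[(dx, dy)] = pearson(ref_vals, mis_vals)
            else:
                surface[(dx, dy)] = 0.0
    return surface


def best_offset(surface):
    """Return the (dx, dy) shift with the highest correlation (the peak)."""
    return max(surface, key=surface.get)
```

Calling best_offset(correlation_surface(ref_chip, mis_chip, 3)) returns the shift at the peak of the surface for one sample point.  In practice the chip size and maximum shift would need to be chosen to cover the largest expected misregistration.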

A cross correlation surface is produced for each sample point.  The apex of each conical peak represents the offset needed to correct the misregistered image.  A second-order polynomial equation is fit to all legitimate peaks and used to warp the misregistered image into its correct geographic position.

 

ccc_surfaces

Figure 4. Example cross correlation surfaces showing the best offset matches between the reference image and misregistered image chips
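To make the polynomial step concrete, here is a hedged Python sketch of fitting a second-order polynomial to the offsets found at the sample points.  I am assuming the full six-term form a0 + a1*x + a2*y + a3*x^2 + a4*x*y + a5*y^2, fit by ordinary least squares, with one fit for the x offsets and one for the y offsets; the exact form used in the registration program is not stated above, and all function names here are hypothetical.

```python
def poly_terms(x, y):
    """Terms of a full second-order polynomial in x and y (assumed form)."""
    return [1.0, x, y, x * x, x * y, y * y]


def solve_linear(a, b):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough distinct points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def fit_poly2(points, offsets):
    """Least-squares fit of offsets (one per point) via the normal equations.

    points  : list of (x, y) sample point coordinates
    offsets : list of peak offsets in one direction (all dx or all dy)
    """
    rows = [poly_terms(x, y) for x, y in points]
    k = 6
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, offsets)) for i in range(k)]
    return solve_linear(ata, atb)


def apply_poly2(coeffs, x, y):
    """Predict the offset at (x, y) from fitted coefficients."""
    return sum(c * t for c, t in zip(coeffs, poly_terms(x, y)))
```

At least six well-distributed points are needed for the fit, and many more are preferable so that a few bad peaks do not dominate the warp.  With real map coordinates it would be safer to center and scale x and y before fitting, since the normal equations can become poorly conditioned for large values.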

With all images in their correct geographic position, the “LandTrendr” change detection program was applied to the long time series of Landsat imagery to identify pixels that have been disturbed.  A discriminant analysis of empirical variables related to the spectral, shape, and topographic characteristics of the identified disturbances was then conducted to predict disturbance agent from a training set of labeled disturbances.  The figure below depicts a mountain pine beetle outbreak identified in central Colorado.  Click on the image to see the outbreak start and progress (colors represent the magnitude of change to the forest, from low in blue to high in red).

 

test click image to see animation

Figure 5: Mountain pine beetle outbreak spread as captured by annual Landsat satellite imagery

From the outbreak progression animation above, you can see that the outbreak appears to move upslope.  To find out whether this is truly the case, I extracted elevation values for all insect-affected pixels in each year and plotted histograms of both elevation density and frequency.  Please click on the images below to see the shift in elevation and area affected as the outbreak progresses.

histdense histcount  Click images to see animation

Figure 6. Animated histograms depicting the upslope progression of the mountain pine beetle outbreak.
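The per-year histograms can be sketched as below.  This is a minimal Python illustration under my own assumptions: the input is a list of (year, elevation) pairs for insect-affected pixels, and the bin width and the function name elevation_histograms are hypothetical choices for the example.

```python
from collections import Counter, defaultdict


def elevation_histograms(pixels, bin_width):
    """Bin affected-pixel elevations per year.

    pixels    : iterable of (year, elevation) pairs
    bin_width : width of each elevation bin, in the elevation units

    Returns {year: {bin_start: (count, density)}}, where density is scaled
    so that density * bin_width sums to 1 within each year.
    """
    counts = defaultdict(Counter)
    for year, elevation in pixels:
        bin_start = int(elevation // bin_width) * bin_width
        counts[year][bin_start] += 1

    histograms = {}
    for year, year_counts in counts.items():
        total = sum(year_counts.values())
        histograms[year] = {
            bin_start: (n, n / (total * bin_width))
            for bin_start, n in sorted(year_counts.items())
        }
    return histograms
```

The count histogram shows how much area is affected each year, while the density histogram removes that effect so a shift in the elevation distribution is easier to see.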

I was also curious about what the annual climate was doing before and during the outbreak.  I extracted PRISM climate data from 1985 to 2008 for the region of the outbreak and plotted it with the count of insect-disturbed pixels.  The figure shows that maximum and minimum annual temperature begin to rise 1 to 2 standard deviations above the mean about 5 years before the outbreak really takes off.  This corresponds with a 3-4 year drop in annual precipitation.  These conditions could have drought-stressed the forests and provided a highly productive climate in which the beetle could reproduce multiple times in a season and avoid freeze-kill.

 

climate

Figure 7.  Graph showing annual deviation from mean for PRISM climate variables and insect-disturbed pixels for 23 years.
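The deviations plotted in this kind of graph are standardized anomalies: each annual value minus the period mean, divided by the period standard deviation.  A small Python sketch follows; I am assuming the sample standard deviation over the whole record, which may differ from the baseline actually used for the figure, and the function name is hypothetical.

```python
import statistics


def standardized_anomalies(values):
    """Express each annual value as standard deviations from the mean.

    values: one value per year (for example, annual minimum temperature).
    Assumes the mean and sample standard deviation of the full record.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]
```

Putting temperature, precipitation, and disturbed-pixel counts on this common scale is what allows them to share one axis.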

In closing, I found the ArcMap spatial statistics tools unable to work with my raster format data, but I was able to make a lot of progress in data preparation and in exploring satellite image detected insect outbreaks and their corresponding environmental factors.

 

 

 

When analyzing data it is important to have a basic familiarity with the data structure.  With tabular data this often means creating histograms and scatter plots to visualize the structure of, and relationships between, point values.  Descriptive statistics such as minimum, maximum, mean, and standard deviation are also useful.  Familiarity with spatial data should include measures of their geographic dispersion, autocorrelation, and value aggregation.  Within ArcGIS these characteristics can be measured using the “Average Nearest Neighbor”, “Spatial Autocorrelation (Global Moran’s I)”, and “Hot Spot Analysis (Getis-Ord Gi*)” tools, respectively.  In this example I look at the spatial structure of a sample of satellite image-mapped forest disturbances in Oregon’s west Cascades.  The data are polygons representing unique disturbance events, with attributes including year of disturbance detection, magnitude of disturbance, and duration.

1.  Average nearest neighbor.

Magnitude of disturbance was divided into three classes (low, medium, and high).  Each class was run through the average nearest neighbor tool to determine whether the spatial pattern is clustered, random, or dispersed.  The pattern for low magnitude disturbance is random, whereas the patterns for medium and high are clustered.  This pattern of disturbance severity and its distribution is possibly a function of the disturbance agent.  Low magnitude disturbances are typically natural and may be more random than anthropogenic disturbances, like clearcuts, which dominate the medium and high magnitude classes.  Note that nearest neighbor analysis is highly sensitive to the data extent.  A larger or smaller extent would likely change the result; therefore the stated results are only meaningful for the area and extent used and are not an indication of a universal pattern.
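For readers who want to see what the statistic measures, here is a minimal Python sketch of the average nearest neighbor ratio using the standard formulas: the observed mean nearest neighbor distance divided by the distance expected under a random pattern, 0.5 / sqrt(n / A).  I am assuming each disturbance polygon is reduced to a single point such as its centroid and that the study area A is supplied by the user; the function name is hypothetical.

```python
import math


def average_nearest_neighbor(points, area):
    """Return (ratio, z_score) for a set of (x, y) points.

    ratio < 1 suggests clustering, ratio > 1 suggests dispersion.
    area is the size of the study area in squared coordinate units; the
    result depends strongly on this value.
    """
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)
    std_err = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_err
```

Because the expected distance is a function of the area, changing the extent changes the ratio even when the points stay the same, which is the sensitivity noted above.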

2.  Spatial autocorrelation (Global Moran’s I)

Global Moran’s I was applied to disturbance magnitude (without classification based on severity).  Global Moran’s I indicated that the disturbances are clustered by magnitude.  This means that there is autocorrelation within the data, where disturbances close to one another have similar magnitudes.  The results agree with those of the nearest neighbor analysis evaluated by severity class, except that magnitude was explicit in the Global Moran’s I analysis (no classification needed).  The interpretation is the same as that for nearest neighbor.
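The index itself is simple to compute once a spatial weights matrix is defined.  The Python sketch below uses the standard formula I = (n / S0) * sum(w_ij * z_i * z_j) / sum(z_i^2), where z_i is the deviation of each value from the mean and S0 is the sum of all weights.  How the weights are built (distance band, inverse distance, and so on) is left to the caller here and is an assumption of the example, as is the function name.

```python
def morans_i(values, weights):
    """Global Moran's I for values with an n-by-n spatial weights matrix.

    weights[i][j] is the spatial weight between features i and j, with
    zeros on the diagonal. Positive results indicate that similar values
    sit near each other; negative results indicate dissimilar neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n) for j in range(n)
    )
    return (n / s0) * (cross / sum(d * d for d in dev))
```

For example, four features in a row where only adjacent features are neighbors give a positive index when like values are grouped (1, 1, 5, 5) and a negative index when they alternate (1, 5, 1, 5).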

3.  Hot spot analysis tool (Getis-Ord Gi*)

Getis-Ord Gi* calculates a z-score that relates to the clustering of either high or low valued features.  The results, based on the entire range of magnitudes, show significant clustering of high values, but not of low values, which is consistent with the nearest neighbor analysis.  The areas showing the greatest significance of high magnitude clustering have relatively large gaps between neighbors, which could be a consequence of the “look-to-distance” of the analysis.
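As a rough illustration of what the z-score represents, the following Python sketch implements the standard Gi* formula: the weighted sum of values around a feature (including the feature itself) is compared with what the global mean would predict, then scaled by the global standard deviation.  The weights matrix is assumed to be supplied by the caller, and the function name is hypothetical.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return the Gi* z-score for each feature.

    weights[i][j] is the weight of feature j in the neighborhood of
    feature i; Gi* includes the feature itself, so weights[i][i] is
    normally nonzero. A neighborhood covering every feature gives a zero
    denominator and is not handled in this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores
```

Large positive scores mark features surrounded by high values (hot spots) and large negative scores mark features surrounded by low values (cold spots).  The size of the neighborhood encoded in the weights strongly affects which features stand out.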

Picture1

 

 

My spatial problem:

Under what forest and climatic conditions do endemic mountain pine beetle populations switch to epidemic populations?

The null hypothesis is that conditions are random and alternative hypotheses include: 1) population eruptions are simply cyclic or periodic; 2) there are specific environmental condition triggers; 3) some combination of alternative hypotheses 1 and 2.

Mountain pine beetle survival is dependent on availability of susceptible hosts and suitable temperature range, with the primary limitation being minimum temperature.  In an endemic state, mountain pine beetles may kill several trees in a dispersed pattern, while in an epidemic state, nearly continuous, widespread host tree mortality is observed.  Population eruptions exhibit both temporal and spatial patterns over the landscape making spatial statistics a useful analysis tool.

There are two parts to the study: 1) outbreak detection and monitoring using 40 years of Landsat satellite imagery and 2) analysis of relationships between outbreak initiation and spread and forest and climate conditions.

Independent variables include:

Host availability at time of outbreak:

Host density

Host age

Topography:

Elevation

Slope

Aspect

Climate at and prior to time of outbreak:

Min and max temperature for various time periods

Precipitation for various time periods

Forest structure:

Composition

Management

Disturbance history

Each of these variables could potentially be related to outbreak timing and position through geographically weighted regression.

As a general introduction to what I can expect from spatial statistics I searched for a webpage that would define what spatial statistics are, what kinds of questions they can answer, and how they are different from a-spatial statistics.  I found a document entitled “Understanding Spatial Statistics in ArcGIS 9” (http://www.utsa.edu/lrsg/Teaching/EES6513/ESRI_ws_SpatialStatsSlides.pdf) that answers these questions.

The document begins by answering the question “What are spatial statistics?”  The author defines them as “exploratory tools that help you measure spatial processes, spatial distributions, and spatial relationships.”

There are two categories of spatial measurements:

1)      Identifying characteristics of a distribution.  This first category of measurements is descriptive and answers questions like: where is the center, or how are the features distributed around the center?

2)      Quantifying geographic pattern, i.e., are the data random, clustered, or evenly dispersed?
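The first category can be illustrated with two common descriptive measures, the mean center and the standard distance.  This is a small Python sketch of the unweighted versions; the function names are my own and the example is not taken from the document being summarized.

```python
import math


def mean_center(points):
    """Average x and y of a set of (x, y) points: where is the center?"""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def standard_distance(points):
    """Spread of the points around the mean center, in coordinate units."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(
        sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    )
```

A small standard distance means the features are concentrated near the center, while a large one means they are spread widely around it.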

Spatial statistics are different from a-spatial or non-spatial statistics in that spatial statistics include some measure of space in their mathematics.  In most cases, neighboring observations are considered in the statistics regarding a focal observation or global measurement.

The document describes a few examples of problems or questions addressed using spatial statistics available in ArcGIS:

1) How does the distribution of Dengue Fever for a village in India change during the first three weeks after the outbreak?

2) Does bobcat movement between preferred habitat areas coincide with natural land features such as valleys, rivers, or ridgelines?

3) Are there persistent areas in the United States where people are either dying earlier, or living longer, than the average American?