After tinkering with several different tools and datasets, my end goal for this class became to digitally reproduce my manual methodology for mapping wetlands.

The reason my methodology is manual (rather than digital) is that temporal and spatial context are very important when mapping wetlands in the Willamette Valley using Landsat satellite imagery.

For example, riparian wetlands along the Willamette River have the same spectral signal as poplar farms and orchards in the same vicinity. A digital classification based on a single date may confuse these land use types. By looking backward and forward in time, however, I can tell which areas are stable wetlands and which are rotated crops. Additionally, I have other layers such as streams, soils, elevation, and floodplain to check whether it is logical for a wetland to exist in a certain area.

[Figure: BGW (tasseled cap) time series]

When I classify wetlands I have several different data layers available to aid in decision making:

Spectral

  • (40) annual Landsat tasseled cap images from 1972-2012

“Ancillary”

  • Lidar inundation raster
  • Streams
  • Soils
  • Others

My options for classifying wetlands using my satellite imagery include:

  • Spectral-only supervised classification
    • Use the time series as additional channels
  • Object-oriented classification
  • Incorporating ancillary data with the spectral data
  • A mix of the above

Arc seemed like the perfect application to attempt to incorporate my spectral Landsat data with my other layers.

At first I thought using one of the regression tools on a mapped sample of my study area could allow me to classify the rest of my study area using OLS predictions. However, when I looked at my data, the relationships between datasets did not appear robust enough.

[Figure: scatterplot matrix of the data layers]

Instead, I decided to do a binary raster overlay to find possible locations of wetlands. I selected a small study area around South Corvallis because it had a good mix of both riparian forest stands and hardwood orchards.

I included spectral data from a 2011 tasseled cap image using the indices brightness, greenness, and wetness (BGW). I calculated the spectral statistics for BGW using areas of known riparian wetlands. A new binary raster was created for each of B, G, and W; pixels with values within one standard deviation of the mean were given a “1” and all other pixels were given a “0”.
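A minimal numpy sketch of this thresholding step (the array values, the training patch, and the `binary_threshold` name are all invented for illustration):

```python
import numpy as np

# Sketch of the mean +/- one standard deviation thresholding step.
# 'band' is one tasseled cap index (e.g. wetness) as a 2-D array;
# 'training_mask' marks pixels of known riparian wetland.
def binary_threshold(band, training_mask):
    vals = band[training_mask]           # spectral stats from known wetlands
    mu, sd = vals.mean(), vals.std()
    # 1 where a pixel falls within one standard deviation of the mean
    return ((band >= mu - sd) & (band <= mu + sd)).astype(np.uint8)

# Toy data: a 4x4 "wetness" band with a known-wetland patch in one corner
band = np.array([[10., 11., 30., 31.],
                 [10., 12., 32., 33.],
                 [11., 10., 31., 30.],
                 [12., 11., 30., 32.]])
training = np.zeros(band.shape, dtype=bool)
training[:2, :2] = True                  # pretend these pixels are mapped wetland
wet_binary = binary_threshold(band, training)
```

In practice each of B, G, and W is thresholded separately, with the statistics coming from the mapped wetland polygons.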

 

I also included a 1-meter Lidar two-year inundation floodplain map; areas in the floodplain were reclassified as “1” and all other space as “0”. I did the same with a 100 m distance raster of the streams in the area.

[Figures: Lidar inundation raster and stream distance raster]

All layers were given equal weight in the raster overlay. The result was a raster with little to no error of omission but high error of commission.

(Areas of yellow indicate manually mapped and validated riparian wetlands; green is the result of the raster overlay.)

[Figure: raster overlay results]
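The equal-weight overlay amounts to summing the binary layers and keeping pixels that pass every criterion; a toy sketch with invented 2x2 layers:

```python
import numpy as np

# Five illustrative 0/1 criterion layers (values invented, not my real rasters)
b = np.array([[1, 0], [1, 1]], dtype=np.uint8)       # brightness within range
g = np.array([[1, 0], [0, 1]], dtype=np.uint8)       # greenness within range
w = np.array([[1, 1], [0, 1]], dtype=np.uint8)       # wetness within range
flood = np.array([[1, 1], [1, 1]], dtype=np.uint8)   # in 2-yr inundation zone
stream = np.array([[1, 0], [1, 1]], dtype=np.uint8)  # within 100 m of a stream

score = b + g + w + flood + stream          # 0..5; every layer weighted equally
candidate = (score == 5).astype(np.uint8)   # possible wetland: all criteria met
```

Relaxing the threshold (e.g. `score >= 4`) trades error of omission against error of commission.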

 

Just for comparison, I decided to run a spectral classification in ENVI using my manually mapped wetlands as regions of interest (i.e. training sites).

The result presented increased error of omission but decreased error of commission.

[Figure: ENVI spectral classification results]

 

You can run spectral classification in Arc, but the process is not streamlined and can become convoluted. Additionally, ENVI automatically extrapolates the classification of an image based on training data; Arc is clunky when it comes to prediction and extrapolation.

[Figure: prediction in Arc]

 

My final thoughts on the Arc spatial statistics toolbox are that:

  • Arc spatial stats are (mostly) useless when it comes to categorical/non-continuous data
    • Especially useless when it comes to mixing continuous with non-continuous
  • Raster tools are not multi-channel friendly (I had to process each band separately)
  • Classification and prediction are convoluted, multistep processes in Arc
    • Operations that ENVI, R, etc. handle flawlessly
  • I should probably expand to other software for serious analysis
    • eCognition, habitat suitability mappers, etc.

One of my spatial problems is examining the spatial distribution of mitigated wetlands in the Willamette Valley to evaluate the quality of the locations chosen for restoration. The data set I used to test the hot spot tool is a point file of wetland mitigation sites (i.e. sites that have been restored or created to compensate for intentional disturbance elsewhere).

The mitigation data look clustered when examined visually, and average nearest neighbor confirms this hypothesis.

It seems intuitive that wetlands would be clustered toward streams, so I ran average nearest neighbor on the valley’s streams to examine their spatial distribution. This showed that the streams are less clustered than the mitigated wetlands, indicating there are other factors that explain the locations of mitigated wetland sites.
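For reference, the average nearest neighbor statistic is the ratio of the observed mean nearest-neighbor distance to the mean distance expected under complete spatial randomness; a ratio below 1 indicates clustering. A small sketch with invented points:

```python
import math

# Average nearest neighbor (ANN) ratio: R = d_obs / d_exp, where
# d_exp = 0.5 / sqrt(n / A) under complete spatial randomness.
def ann_ratio(points, area):
    n = len(points)
    d_obs = sum(
        min(math.dist(p, q) for q in points if q is not p)
        for p in points
    ) / n
    d_exp = 0.5 / math.sqrt(n / area)
    return d_obs / d_exp

# Four tightly grouped points in a large extent read as clustered (R < 1)
pts = [(0, 0), (1, 0), (0, 1), (1, 1)]
r = ann_ratio(pts, area=100.0)
```

Arc additionally reports a z-score and p-value for the ratio; this sketch only computes the ratio itself.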

Categorical data is largely unusable in the spatial statistics toolbox. However, I wanted to examine the spatial distribution of mitigated wetlands compared to historic vegetation cover. To work around the categorical data, I first created a layer that contained only historic wetland vegetation; I then ran the “near” tool to calculate the distance between the mitigated wetlands and the historic wetland polygons. Lastly, I ran the hot spot analysis on these distances.
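A rough stand-in for the “near” step (polygons are simplified to their vertices here, whereas Arc measures true edge distance; all coordinates are invented):

```python
import math

# Distance from each mitigation site to the closest historic-wetland feature,
# approximating each polygon by its vertices.
def near_distance(point, polygons):
    return min(
        math.dist(point, vertex)
        for poly in polygons
        for vertex in poly
    )

historic = [[(0, 0), (0, 2), (2, 2), (2, 0)]]   # one illustrative polygon
sites = [(3, 0), (10, 10)]                      # hypothetical mitigation sites
dists = [near_distance(s, historic) for s in sites]
# these per-site distances are what the hot spot (Getis-Ord Gi*) analysis runs on
```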

Red indicates increased distance from a historic wetland. The results show that since most of the valley was once floodplain wetlands, most sites are situated on historic wetlands; an area near Portland, however, shows a hot spot of mitigated wetlands that are located further from historic wetland vegetation.

 


For my research, I use annual Landsat satellite images of the Willamette River to examine disturbances and loss in the valley’s wetlands. I also have several other critical data sets, including a LiDAR inundation raster based on a 2-year flood return interval and several shapefiles showing the locations of mitigation wetlands. One of the spatial problems I’d like to investigate in this class involves relating these data sets to each other; one of the ecological questions I’m asking through my research involves the spatial distribution of wetlands created and restored through mitigation versus those destroyed and disturbed. For example, do the two differ in their proximity to the river and its tributaries? Is one more clumped or dispersed than the other?

Utilizing the spatial statistics toolbox, specifically regression and mapping trends/clusters, may help me answer some of these questions.

Some of my annual satellite imagery viewed in Tasseled Cap Index:

Regression analysis can help you dive deeper into spatial relationships and the factors behind spatial patterns. At a slightly more advanced level, regression analysis can help you make predictions based on your data. The ArcGIS Resource Center has a very nice page called “Regression Analysis Basics” that gives users an introduction to both regression and the related tools available. It notes the different components of models, such as dependent and independent variables and regression coefficients. One of my favorite components of the page is the table “Common regression problems, consequences, and solutions”. This lists problems and links to solutions that could potentially help you make your regression model stronger. Even if your skill set is beyond the basics of regression analysis, this page is a good refresher and an introduction to how Arc can aid in telling a story.

Another helpful page is titled “What they don’t tell you about regression analysis”. Whatever you are trying to model is likely a complex phenomenon (especially in this class) and may not have a simple set of answers. Models often need revision and Arc has created a step-by-step protocol for increasing the validity of your analysis and model; this page guides you through six questions/check-marks that you’ll want to pass before you can have confidence in your model.

In my data, for example, I have several layers that could potentially help me identify where wetlands lie within the valley; examples include elevation, hydrology (stream and flood inundation), vegetation, and soils. Often, GIS users simply stack these layers together and create polygons based on areas that contain all, or a majority, of the layers. This technique may be based in ecologically sound logic, but it does not address the strength of the relationships between layers or the degree to which one or more layers may influence (positively or negatively) the others.
A regression analysis using known areas of wetland as the dependent variable and a variety of GIS layers as explanatory variables could help me predict places where wetlands are located but may not have been mapped. Or, even better, it could help me predict where wetlands were in the past. The two pages listed above are useful in guiding me through the individual decisions I need to make when building a model, such as choosing between Ordinary Least Squares and Geographically Weighted Regression.
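As a sketch of what such a model could look like (all sample values are invented; with a binary wetland/non-wetland response, logistic regression would usually be more appropriate than OLS):

```python
import numpy as np

# Hypothetical samples: wetland presence (0/1) as the dependent variable,
# elevation and 2-yr flood zone membership as explanatory variables.
elev    = np.array([60., 62., 75., 80., 58., 90.])
flood   = np.array([1.,  1.,  0.,  0.,  1.,  0.])
wetland = np.array([1.,  1.,  0.,  0.,  1.,  0.])

# Ordinary Least Squares fit: intercept column plus the two predictors
X = np.column_stack([np.ones_like(elev), elev, flood])
coef, *_ = np.linalg.lstsq(X, wetland, rcond=None)
pred = X @ coef      # fitted values; could be mapped back over the study area
```

Arc’s OLS tool adds the diagnostics (residual maps, coefficient significance) that the “Regression Analysis Basics” page walks through.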

Take a look at the two introduction pages and consider if your data could be used in a regression analysis and if the tools available in the Spatial Statistics toolbox could be useful. You could even just bring three different variables (ex: hydro, soils, and elevation) to try out.
There are three resources to explore further if you’re interested in using your data to perform regression analysis:

  1. Lauren Scott’s presentation on regression analysis
  2. The seminar on regression analysis titled “Beyond Where: Using Regression Analysis to Explore Why”
  3. The regression analysis tutorial (the same used in Scott et al.’s presentation) where you can “Learn how to build a properly specified OLS model and improve that model using GWR, interpret regression results and diagnostics, and potentially use the results of regression analysis to design targeted interventions”