For my research, I use annual Landsat satellite images of the Willamette River to examine disturbances and loss in the valley’s wetlands. I also have several other critical data sets, including a LiDAR inundation raster based on a 2-year flood return interval and several shapefiles showing the location of mitigation wetlands. One of the spatial problems I’d like to investigate in this class involves relating the data sets to each other; one of the ecological questions I’m asking through my research involves the spatial distribution of wetlands created and restored through mitigation versus those destroyed and disturbed. For example, do the two differ in their proximity to the river and its tributaries? Is one more clumped or more dispersed than the other?

Utilizing the spatial statistics toolbox, specifically regression and mapping trends/clusters, may help me answer some of these questions.

Some of my annual satellite imagery viewed in Tasseled Cap Index:

My spatial problem:

Under what forest and climatic conditions do endemic mountain pine beetle populations switch to epidemic populations?

The null hypothesis is that conditions are random and alternative hypotheses include: 1) population eruptions are simply cyclic or periodic; 2) there are specific environmental condition triggers; 3) some combination of alternative hypotheses 1 and 2.

Mountain pine beetle survival is dependent on availability of susceptible hosts and suitable temperature range, with the primary limitation being minimum temperature.  In an endemic state, mountain pine beetles may kill several trees in a dispersed pattern, while in an epidemic state, nearly continuous, widespread host tree mortality is observed.  Population eruptions exhibit both temporal and spatial patterns over the landscape making spatial statistics a useful analysis tool.

There are two parts to the study: 1) outbreak detection and monitoring using 40 years of Landsat satellite imagery and 2) analysis of relationships between outbreak initiation and spread and forest and climate conditions.

Independent variables include:

  1. Host availability at time of outbreak: host density, host age
  2. Topography: elevation, slope, aspect
  3. Climate at and prior to time of outbreak: min and max temperature for various time periods, precipitation for various time periods
  4. Forest structure: composition, management, disturbance history

Each of these variables could potentially be related to outbreak timing and position through geographically weighted regression.
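As a rough illustration of the idea behind geographically weighted regression, the sketch below fits a separate weighted least-squares line at each focal location, giving nearby observations more weight through a Gaussian kernel. It uses a single predictor only. The function name `gwr_local_fit`, the bandwidth, and all values are hypothetical and chosen for illustration; this is not the ArcGIS tool and not the study's data.

```python
import math

def gwr_local_fit(focal_xy, coords, x, y, bandwidth):
    """Fit y = intercept + slope * x around focal_xy using Gaussian distance weights."""
    weights = [
        math.exp(-0.5 * (math.hypot(cx - focal_xy[0], cy - focal_xy[1]) / bandwidth) ** 2)
        for cx, cy in coords
    ]
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    if sxx == 0:
        raise ValueError("predictor has no weighted variation near this location")
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: two distant stands where the predictor-response
# relationship has opposite sign (made-up numbers, not real observations).
coords = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
predictor = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
response = [2.0, 4.0, 6.0, 5.0, 4.0, 3.0]

west_intercept, west_slope = gwr_local_fit((1, 0), coords, predictor, response, bandwidth=5.0)
east_intercept, east_slope = gwr_local_fit((101, 0), coords, predictor, response, bandwidth=5.0)
```

A single global regression on these six points would average the two relationships away, whereas the local fits recover a different slope in each area. The choice of bandwidth controls how local each fit is.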

Coastal salt marshes are at great risk from a large number of factors, especially climate change and sea-level rise.  In an effort with the USGS, I’m working to determine how different salt marshes along the Pacific Coast will respond to changes in sea level.  Part of our approach is collecting fine-scale, baseline field data in the form of RTK GPS elevation points and vegetation surveys.  Through analysis of data from a range of sites (up to 15 along the coast), I hope to better characterize plant habitat requirements, with the ultimate goal of producing improved community response projections under sea-level rise scenarios.  In this class, and in Jim Graham’s Spatial Modeling/Big Data class, I will be working with the elevation and vegetation data to characterize spatial relationships of plant species against plot-level factors (inundation frequency, distance to channel, elevation) and site factors (temperature, salinity, tidal range).  I have hundreds of vegetation plots per site, with about 2000 survey plots completed across our PNW sites.

Currently, I’m still in data processing mode, combining databases and gathering environmental data; field data collection wrapped up in January. However, the inundation data still need to be developed, first by kriging the elevation data into DEMs and then by using site-specific water logger data to determine flooding frequency. The water logger data themselves need to be processed for barometric pressure and elevation. Marsh channels need to be digitized before a distance-to-channel raster can be created. There’s a lot of work still to be done to get the data in shape for analysis; however, by focusing on one or two sites, I’ll be able to explore the spatial statistics toolbox and push forward with this project.
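The flooding-frequency step could be sketched roughly as follows: for each DEM cell elevation, count the fraction of water logger readings that exceed it. This assumes the logger readings have already been corrected and share the DEM's vertical datum; the function name and all numbers are hypothetical.

```python
from bisect import bisect_right

def inundation_frequency(elevations_m, water_levels_m):
    """Return, for each elevation, the fraction of readings strictly above it."""
    levels = sorted(water_levels_m)
    n = len(levels)
    if n == 0:
        raise ValueError("need at least one water-level reading")
    # bisect_right counts readings at or below the elevation; the remainder flood it.
    return [(n - bisect_right(levels, z)) / n for z in elevations_m]

# Made-up values for illustration only (metres, same vertical datum).
cell_elevations = [1.0, 1.5, 2.0, 2.5]
logger_levels = [0.8, 1.2, 1.6, 2.1, 1.9, 1.1, 0.7, 2.3]
freq = inundation_frequency(cell_elevations, logger_levels)
```

With evenly spaced logger readings, the fraction of readings above a cell approximates the fraction of time that cell is flooded.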

I am working with hummingbird location points obtained through radiotelemetry, and want to figure out their patterns of space use and how they are affected by forest fragmentation.

I need to find ways of assessing which areas are preferred by the birds as well as the movement patterns they follow.

The points were recorded within a short time period, so they are not independent. Autocorrelation functions will help me evaluate the degree of this dependence in space and time. Rather than a problem, the autocorrelated nature of my data presents an opportunity to study activity patterns of the birds.

My research seeks to quantify and explain patterns of variability as they relate to specific soil properties (such as nutrients, physical structure, etc.). There are patterns in the data themselves (distribution shapes such as normal or bimodal, skewness, and variance) and in the spatial distribution of those data values.

I wish to learn more about tools that can characterize these data distributions and spatial patterns (with an emphasis on spatial patterns for this class). This is especially challenging because soil variables often don’t follow well-known distributions such as the Gaussian or exponential. This leaves me wary of mathematical tools that require assumptions such as a normal distribution. The central limit theorem does not apply when we move beyond questions about the mean.

At this point I do not have a specific question in mind, and I should also mention that I haven’t collected any data for this project yet (I’m just starting lab analysis this spring). Rather than pursuing a specific spatial question, I’d like to learn about various classification and interpolation methods.

Some background info on soil for the curious

A blurb from the Soil Science Society of America on the importance of soil:

Soil provides ecosystem services critical for life; soil acts as a water filter and a growing medium; provides habitat for billions of organisms, contributing to biodiversity; and supplies most of the antibiotics used to fight diseases. Humans use soil as a holding facility for solid waste, filter for wastewater, and foundation for our cities and towns. Finally, soil is the basis of our nation’s agroecosystems which provide us with feed, fiber, food and fuel.

On the source of soil variability:

Soil is HIGHLY heterogeneous. It is a mix of weathered rock minerals, plant organic matter, liquid, and gas. It’s been forming for thousands of years. A multitude of environmental variables affect that formation at spatial scales from nanometer bacterial interaction to varying climate across landscapes. The real challenge is that variability increases as a function of spatial area under consideration. The variability of a 0.5m X 0.5m plot is different than that of a 5m x 5m and is different than a 50m x 50m plot and so on.

A graphical look at shifting soil scale and methods of characterizing variability

http://ars.els-cdn.com/content/image/1-s2.0-S0065211304850016-gr8.jpg

 

 

My data consist of points derived from a GPS track log, which contains spatial information for GPS points taken at 30-second intervals along with a time stamp for each point. I also have a spreadsheet of field data containing location information for the start and end of each encounter with a cetacean species, the times the encounter started and ended, and other important information such as the species, the number of animals, etc. To pair species encounters with the GPS track log, I use the time information of the encounter and select those points in the track log that fall between the beginning and ending times.
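A minimal sketch of that time-based pairing, assuming the track log is already sorted by time and each point is a `(timestamp, lat, lon)` tuple. The function name, coordinates, and times are hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

def points_during(track, start, end):
    """Return the track points whose timestamp falls within [start, end]."""
    times = [t for t, _, _ in track]
    lo = bisect_left(times, start)   # first point at or after the start time
    hi = bisect_right(times, end)    # one past the last point at or before the end time
    return track[lo:hi]

# Made-up track log: one fix every 30 seconds.
t0 = datetime(2013, 1, 1, 9, 0, 0)
track = [(t0 + timedelta(seconds=30 * i), 21.0 + 0.001 * i, -158.0) for i in range(10)]

# Made-up encounter window taken from the field spreadsheet.
encounter_start = t0 + timedelta(seconds=65)
encounter_end = t0 + timedelta(seconds=190)
matched = points_during(track, encounter_start, encounter_end)
```

Because the fixes are 30 seconds apart, the matched points can begin up to 30 seconds after the recorded start of an encounter.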

 

I am interested in a couple of spatial aspects of this data that are pertinent to this class:

  1. Patterns in the environmental and oceanographic characteristics of the encounter locations that may explain melon-headed whale (Peponocephala electra) utilization of these locations.
  2. The spatial distribution of melon-headed whales and other small cetaceans and the patterns in the presence or absence of melon-headed whales and the presence or absence of other species.

 

These areas of interest bring up the following spatial statistics related questions:

  1. Do environmental and oceanographic characteristics differ significantly between locations?
  2. Which variables are significant predictors of melon-headed whale utilization of these areas?
  3. Do encounter locations differ significantly from locations where melon-headed whales were not seen?
  4. Is there a relationship between the presence (or absence) of melon-headed whales and the presence of other species of small cetaceans?

 

I am sure there will be more questions that present themselves once I begin delving into the data.

Today, I solved my first problem (thanks Jen!) and successfully projected my data so that I could begin running spatial statistics analyses. My data went from GCS_WGS_1984 (unprojected) to NAD_1983_StatePlane_Alaska_1_FIPS_5001, which will allow for improved accuracy in spatial calculations for whale sighting data in southeastern Alaska. I ran an Average Nearest Neighbor analysis on humpback whale sightings in southeastern Alaska and found that the observed nearest neighbor distance was significantly smaller than the expected value. This significant difference is most likely due to the complex geography of southeastern Alaska, which creates a clustering of individuals. I also learned that the results of my spatial statistics analyses will be presented in meters. I look forward to running additional analyses next week!
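For reference, the calculation behind an average nearest neighbor analysis can be sketched as below. It uses the standard Clark and Evans form, which compares the observed mean nearest neighbor distance with the value expected for a random pattern of the same density. The points and study area are made up, and the result depends heavily on the area supplied, which matters for a coastline as convoluted as southeastern Alaska's.

```python
import math

def average_nearest_neighbor(points, area):
    """Return (observed mean NN distance, expected distance, ratio, z-score)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)          # mean NN distance under randomness
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed, expected, observed / expected, (observed - expected) / std_error

# Made-up sightings (projected coordinates in metres) in a 1000 m x 1000 m study area.
sightings = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
observed, expected, ratio, z = average_nearest_neighbor(sightings, area=1000.0 * 1000.0)
```

A ratio below 1 with a strongly negative z-score indicates clustering; a ratio above 1 indicates dispersion.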

My spatial problem is that my data were not collected randomly; field efforts were influenced by predicted habitat use or confirmed sightings of whales. Thus, what appear to be hot spots or patterns of habitat use within southeastern Alaska might actually be areas of increased field effort. This will undoubtedly complicate my analyses, and I continue to turn to the Arc Blog (and Dori) for answers.

We have talked about creating a random sample of whales in southeastern Alaska and comparing their patterns of habitat “use” to what we actually have in our data. Stay tuned for more on that…

The data I will be using for this class are water quality data that I have been collecting from 21 locations in the upper Willamette River Basin. The water quality parameters that I will use for this class are dissolved organic carbon and nitrate.

I would like to 1) learn what the ArcGIS add-in toolboxes SSN & STARS and FLoWS can do, 2) learn the concepts behind each function offered by those toolboxes, and 3) run them on my data to answer one of my questions: do my water quality data vary spatially?

As I was exploring the Spatial Statistics Resources web-page, I quickly realized most of the spatial statistical tools offered by ESRI are not applicable to my project. My project explores spatial and temporal variations of water quality (dissolved organic carbon sources to be precise) in rivers of the Willamette River Basin. Those ESRI spatial statistical tools are not applicable to my project because 1) points are not representing actual observation points of organisms or diseases for my project but rather representing water quality sampling locations that were selected by me and 2) not only Euclidean distance but also in-stream distances, flow directions, and stream networks affect statistical significance.
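To illustrate why Euclidean distance can mislead on a stream network, here is a small sketch comparing straight-line distance with distance along the channel for two sites on neighboring tributaries. The network, site names, and lengths are hypothetical. The sketch ignores flow direction and is not how the SSN & STARS or FLoWS toolboxes work internally.

```python
import heapq
import math

def network_distance(edges, source, target):
    """Shortest along-channel distance between two nodes (Dijkstra, undirected)."""
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for nxt, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(nxt, math.inf):
                best[nxt] = candidate
                heapq.heappush(queue, (candidate, nxt))
    return math.inf

# Made-up example: sites A and B sit on two tributaries that meet at a confluence.
site_xy = {"A": (0.0, 0.0), "B": (0.0, 2.0), "confluence": (5.0, 1.0)}
reaches = [("A", "confluence", 5.5), ("B", "confluence", 5.5)]  # channel lengths in km

euclidean_km = math.hypot(site_xy["A"][0] - site_xy["B"][0], site_xy["A"][1] - site_xy["B"][1])
instream_km = network_distance(reaches, "A", "B")
```

The two sites are 2 km apart as the crow flies but 11 km apart along the channel, and water never flows from one to the other.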

I found add-in toolboxes for SSN & STARS and FLoWS that address those two issues mentioned above. These toolboxes were developed by the U.S. Forest Service (USFS). Unfortunately the currently available toolboxes are for ArcGIS 9.3, but the USFS states they are planning to publish new toolboxes for ArcGIS10 later this year.

 

http://webcache.googleusercontent.com/search?q=cache:5SIzWb38eREJ:blogs.esri.com/esri/arcgis/2013/01/29/ssn-stars-tools-for-spatial-statistical-modeling-on-stream-networks/+spatial+statistics+arcgis+water&cd=1&hl=en&ct=clnk&gl=us

 

Things I would like to accomplish by the next class period are to 1) download those two toolboxes and 2) see if they seem to work with ArcGIS10. Note that I am not planning on publishing data modified using the toolboxes developed for ArcGIS 9.3; however, these goals will help me explore what kinds of tools are available through these toolboxes and learn the concepts behind the tools that I am interested in using.

As a general introduction to what I can expect from spatial statistics I searched for a webpage that would define what spatial statistics are, what kinds of questions they can answer, and how they are different from a-spatial statistics.  I found a document entitled “Understanding Spatial Statistics in ArcGIS 9” (http://www.utsa.edu/lrsg/Teaching/EES6513/ESRI_ws_SpatialStatsSlides.pdf) that answers these questions.

The document begins by answering the question “What are spatial statistics?”  The author defines them as “exploratory tools that help you measure spatial processes, spatial distributions, and spatial relationships.”

There are two categories of spatial measurements:

1)      Identifying characteristics of a distribution.  This first category of measurements is descriptive and answers questions like: where is the center, or how are the features distributed around the center?

2)      Quantifying geographic pattern, i.e., are the data random, clustered, or evenly dispersed?
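As an example of the first category, the mean center and standard distance of a set of points can be sketched as follows. The function names and coordinates are made up for illustration.

```python
import math

def mean_center(points):
    """Average x and average y of the points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Root-mean-square distance of the points from their mean center."""
    cx, cy = mean_center(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points))

# Made-up feature locations at the corners of a 2 x 2 square.
features = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
center = mean_center(features)
spread = standard_distance(features)
```

The mean center answers "where is the center?", and the standard distance summarizes how tightly the features are distributed around it.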

Spatial statistics are different from a-spatial or non-spatial statistics in that spatial statistics include some measure of space in their mathematics.  In most cases, neighboring observations are considered in the statistics regarding a focal observation or global measurement.
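One common way neighboring observations enter the mathematics is through a spatial weights matrix, as in global Moran's I. The sketch below uses simple binary weights (1 if two points lie within a distance threshold, 0 otherwise). The threshold and values are hypothetical, and no significance test is included.

```python
import math

def morans_i(coords, values, threshold):
    """Global Moran's I with binary distance-band weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = 0.0
    cross = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            if d <= threshold:          # j counts as a neighbor of i
                weight_sum += 1.0
                cross += dev[i] * dev[j]
    if weight_sum == 0:
        raise ValueError("no neighbors within the threshold")
    return (n / weight_sum) * cross / sum(d * d for d in dev)

# Made-up transect: values increase steadily along a line, so neighbors are similar.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 3.0, 4.0]
i_stat = morans_i(coords, values, threshold=1.0)
```

Positive values indicate that similar values tend to be near each other, while values near zero suggest no spatial structure at the chosen neighborhood scale.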

The document describes a few examples of problems or questions addressed using spatial statistics available in ArcGIS:

1) How does the distribution of Dengue Fever for a village in India change during the first three weeks after the outbreak?

2) Does bobcat movement between preferred habitat areas coincide with natural land features such as valleys, rivers, or ridgelines?

3) Are there persistent areas in the United States where people are either dying earlier, or living longer, than the average American?