As we are discovering, there are often things we want to do that ArcGIS cannot do. Esri has created a Tool Gallery where people can share the tools they built to fill those gaps. If you are thinking about creating a tool to do something you need, it is worth checking here first so that you don’t have to reinvent the wheel.

http://resources.arcgis.com/en/communities/analysis/

http://resources.arcgis.com/gallery/file/geoprocessing

 

 

I have worked on plotting the observed values of speed and turning angle for each bird versus the time of day, to see if any of the patterns observed in the Incremental Autocorrelation plots can be traced back to relationships between the individual points. As far as I can see, there don’t seem to be any. I am attaching the output for four of my birds, along with an image of the area where they have been moving (green is forest and pink is agricultural land).

(Note: The point plots correspond to a single day of observations, while the autocorrelation plots were made using all the observation days. I couldn’t run the analysis with data from single days because there weren’t enough points to meet the minimum required by the tool.)

606 606minimap

509 509minimap

626 626minimap

531 531minimap

495 495minimap

I am thinking that I should do the same type of plot using distance on the X axis rather than time, because there is no strict relationship between the distance moved between two points and the time taken to move that distance. A 30-second interval between two points could correspond to either 10 meters or 100 meters.

My new dilemma is that I am not sure what that distance on the X axis should represent. The distance of all points to an arbitrary point (e.g.: site of capture)? The distance along a movement path defined by joining consecutive points? Suggestions are welcome!
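To make the two candidate definitions concrete, here is a minimal sketch that computes both for one track. It assumes the points are already in a projected coordinate system (meters) and ordered in time. The function names and the sample coordinates are made up for illustration and are not taken from the bird data.

```python
import math

def distance_from_reference(points, reference):
    """Straight-line distance of every point to one fixed location (e.g. site of capture)."""
    return [math.hypot(x - reference[0], y - reference[1]) for x, y in points]

def cumulative_path_distance(points):
    """Distance travelled along the path formed by joining consecutive points."""
    if not points:
        return []
    travelled = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        travelled.append(travelled[-1] + math.hypot(x1 - x0, y1 - y0))
    return travelled

# Hypothetical track in projected coordinates (meters), ordered by time.
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0), (0.0, 6.0)]
capture_site = track[0]  # assumption: the first fix is the capture site

print(distance_from_reference(track, capture_site))
print(cumulative_path_distance(track))
```

The two options behave differently: distance from a reference point can go down when the bird returns toward that point, while distance along the path never decreases, so it orders the points the same way time does.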

If you have a shapefile or geodatabase feature class that you want to separate into several shapefiles or feature classes based on a specific attribute, you can do so relatively painlessly with XTools Pro. XTools is an extension that should be installed on any OSU-owned computer that also has ArcGIS (at least this is the case for all computers in Digital Earth).

Once you have added the XTools toolbar to your map, you can find the ‘Split Layer by Attributes’ tool under ‘Feature Conversions’. Caution: the tool requires the input and output to be the same file type to work correctly (i.e., shapefile –> shapefiles or geodatabase feature class –> geodatabase feature classes).

There are many other useful tools worth exploring in XTools Pro (www.xtoolspro.com).

I am working with a humpback whale dataset collected across the North Pacific from 2004-2006. Given the large spatial extent, I have selected a subset of data from the Gulf of Alaska (GOA) and would like to look for spatial patterns in the genetic diversity of the whales sighted in the GOA in relation to their environment. Complicating this problem is the fact that most of the data were collected opportunistically, so the spatial distribution of whale sightings better reflects where researchers collected data than whether environmental variables influence humpback whale habitat use.

Splash_All

Figure 1. North Pacific humpback whale sightings from SPLASH.  The data include > 18,000 photo-identification records and 2,700 DNA profiles for 8,000+ unique individuals.

SPLASH_GOA

Figure 2. A subset of the SPLASH data for the Northern and Western Gulf of Alaska. The data subset includes 2,622 records (both photo-identification and DNA profiles) for 1,448 unique individuals.

Ultimately, I need a method that will let me get beyond the uneven (non-systematic) sampling effort to determine whether there is any spatial pattern in the data based on genetics and environmental features (e.g., depth, slope). Two (among many) working hypotheses:

  1. Humpback whales are found in clusters at a particular depth  or slope range.
  2. Humpback whales that share the same haplotype (maternally inherited mitochondrial DNA) cluster together.

When analyzing data it is important to have a basic familiarity with the data structure. With tabular data this often means creating histograms and scatter plots to visualize the structure and relationship between point values. Descriptive statistics such as minimum, maximum, mean, and standard deviation are also useful. Familiarity with spatial data should include measures of their geographic dispersion, autocorrelation, and value aggregation. Within ArcGIS these characteristics can be measured using the “Average Nearest Neighbor”, “Spatial Autocorrelation (Global Moran’s I)”, and “Hot Spot Analysis (Getis-Ord Gi*)” tools, respectively. In this example I look at the spatial structure of a sample of satellite image-mapped forest disturbances in Oregon’s west Cascades. The data are polygons representing unique disturbance events, with attributes including year of disturbance detection, magnitude of disturbance, and duration.

1.  Average nearest neighbor.

Magnitude of disturbance was divided into three classes (low, medium, and high). Each class was run through the average nearest neighbor tool to determine if the spatial pattern is clustered, random, or dispersed. The pattern for low magnitude disturbance is random, whereas medium and high are clustered. This pattern of disturbance severity and its distribution is possibly a function of the disturbance agent. Low magnitude disturbances are typically natural, which may be more random than anthropogenic disturbances, like clearcuts, which dominate the medium and high magnitude classes. Note that nearest neighbor analysis is highly sensitive to the data extent. A larger or smaller extent would likely change the result, so the stated results are only meaningful for the area and extent used and are not an indication of a universal pattern.
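The sensitivity to extent is easier to see with the statistic written out. The sketch below is a simplified illustration, not the ArcGIS tool itself: it assumes each disturbance is reduced to one point (for example a polygon centroid) in projected coordinates, and that the study area is supplied by the user. The ratio compares the observed mean nearest neighbor distance with the distance expected under a random pattern, 0.5 / sqrt(n / A), and the z-score uses the standard error 0.26136 / sqrt(n² / A).

```python
import math

def average_nearest_neighbor(points, area):
    """Return (ratio, z_score); ratio < 1 suggests clustering, > 1 dispersion."""
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)          # mean distance under randomness
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_error

# Hypothetical points: the same four locations judged against two extents.
corners = [(0.0, 0.0), (0.0, 10.0), (10.0, 0.0), (10.0, 10.0)]
print(average_nearest_neighbor(corners, area=100.0))      # small extent
print(average_nearest_neighbor(corners, area=1000000.0))  # much larger extent
```

The same four points look dispersed against the small extent and clustered against the large one, because the area enters the expected distance directly.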

2.  Spatial autocorrelation (Global Moran’s I)

Global Moran’s I was applied to disturbance magnitude directly, without classifying by severity. It indicated that the disturbances are clustered by magnitude. This means that there is autocorrelation within the data, where disturbances close to one another tend to have similar magnitudes. The result agrees with the nearest neighbor analysis by severity class, except that Global Moran’s I uses magnitude explicitly, so no classification is needed. The interpretation is the same as that for nearest neighbor.
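As a rough illustration of what the index measures, the sketch below computes Global Moran’s I with binary fixed-distance weights (two features are neighbors if they lie within a threshold distance). The weighting scheme, the function name, and the toy values are assumptions for the example; the ArcGIS tool offers several conceptualizations of spatial relationships and also reports a z-score and p-value, which are left out here.

```python
import math

def global_morans_i(points, values, threshold):
    """Global Moran's I with binary weights: w_ij = 1 if 0 < distance <= threshold."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross_products = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if d <= threshold:
                weight_sum += 1.0
                cross_products += dev[i] * dev[j]
    variance_sum = sum(d * d for d in dev)
    if weight_sum == 0 or variance_sum == 0:
        raise ValueError("no neighbors within threshold, or all values identical")
    return (n / weight_sum) * (cross_products / variance_sum)

# Hypothetical features along a line, one unit apart.
locations = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(global_morans_i(locations, [1.0, 1.0, 5.0, 5.0], threshold=1.0))  # similar values adjacent
print(global_morans_i(locations, [1.0, 5.0, 1.0, 5.0], threshold=1.0))  # alternating values
```

Positive values mean that neighbors tend to have similar magnitudes, and negative values mean that neighbors tend to differ.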

3.  Hot spot analysis tool (Getis-Ord Gi*)

Getis-Ord Gi* calculates a z-score for each feature that relates to the clustering of either high or low values. The results, based on the entire range of magnitudes, show significant clustering of high values but not of low values, which is consistent with the nearest neighbor analysis. The areas showing the greatest significance of high-magnitude clustering have relatively large gaps between neighbors, which could be a consequence of the “look-to-distance” of the analysis.
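To show how the distance band feeds into each feature’s score, here is a minimal sketch of the Gi* z-score using binary fixed-distance weights, where every feature counts as its own neighbor. The threshold, function name, and toy values are illustrative assumptions, and the sketch does not reproduce the tool’s other options.

```python
import math

def getis_ord_gi_star(points, values, threshold):
    """Gi* z-score per feature; binary weights within threshold, including the feature itself."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for xi, yi in points:
        weights = [
            1.0 if math.hypot(xi - xj, yi - yj) <= threshold else 0.0
            for xj, yj in points
        ]
        w_sum = sum(weights)
        w_sq_sum = sum(w * w for w in weights)
        local_sum = sum(w * v for w, v in zip(weights, values))
        spread = (n * w_sq_sum - w_sum * w_sum) / (n - 1)
        if s == 0 or spread <= 0:
            scores.append(0.0)  # statistic undefined here; reported as 0.0 in this sketch
        else:
            scores.append((local_sum - mean * w_sum) / (s * math.sqrt(spread)))
    return scores

# Hypothetical features along a line, with high values at one end.
sites = [(float(i), 0.0) for i in range(6)]
magnitudes = [1.0, 1.0, 1.0, 1.0, 10.0, 10.0]
z_scores = getis_ord_gi_star(sites, magnitudes, threshold=1.0)
print([round(z, 3) for z in z_scores])
```

Changing `threshold` changes which features count as neighbors, so the same data can give different hot spots at different distance bands.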

Picture1

 

 

An issue that most researchers face is getting the data. At times our data seems so close, yet it is so far away. As researchers we often know what type of data we want, and we may also know that it already exists, but we do not always know how to get it. Even more frustrating is finding the data that you need and realizing that it is not in a usable form. Finding the correct data in a usable form has been my number one problem. Thankfully, a past student has come to my rescue. She suggested using the National Historical Geographic Information System (NHGIS) to access census data. The NHGIS site provides, free of charge, aggregate census data and GIS-compatible boundary files for the United States between 1790 and 2011.

I intend to use a geographical approach to understand and predict how the local spatial structure of new environmental amenities will influence and shape the way in which environmental justice communities evolve. This research aims to develop a novel framework for understanding the evolution of environmental justice communities in relation to the incorporation and management of natural amenities. To achieve this objective I will complete several benchmark activities, including:

Observe spatial and temporal variation and patterns of neighborhood characteristics (educational attainment, income, racial composition, household tenure, renters) over a 70-year period

  • There are many issues that will arise as I attempt to accomplish this task. For instance, the temporal resolution of my data will be in 10-year increments, which may not entirely capture the patterns that I will be looking for.
  •  Assessing variables over time will be difficult. For example, educational attainment is not available in all years of the census data.
  •  I will also consider how the census tracts and census blocks change over time which could

Quantitatively assess the spatial and temporal variation and patterns of natural amenities over a 70-year period, using satellite imagery and aerial photography.

  •  There is a lot of uncertainty that is associated with using aerial photography and satellite imagery.
  •  One method I am considering for looking at green space in an area is to calculate NDVI, the Normalized Difference Vegetation Index. In short, it is a remote sensing technique for assessing whether the target being observed contains live green vegetation.
  • Another technique I am considering is to use an unsupervised k-means classification to explore and assess the change from open/greenspace to impervious surface.
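For reference, NDVI is computed per pixel as (NIR − Red) / (NIR + Red), where NIR and Red are the reflectances in the near-infrared and red bands. The sketch below applies that formula to small hypothetical band grids stored as nested lists. In practice the bands would come from the imagery, and returning `None` for pixels where both bands are zero is an assumption of this example.

```python
def ndvi(nir, red):
    """NDVI for one pixel; None where both bands are zero (treated as no data)."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total

def ndvi_grid(nir_band, red_band):
    """Apply NDVI cell by cell to two equally sized 2-D grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]

# Hypothetical reflectance values (0 to 1) for a 2 x 2 patch.
nir_band = [[0.50, 0.30], [0.20, 0.00]]
red_band = [[0.10, 0.30], [0.40, 0.00]]
print(ndvi_grid(nir_band, red_band))
```

Values fall between −1 and 1. Dense green vegetation reflects strongly in the near-infrared and weakly in the red, so it tends toward the high end.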

There are a number of things that I still need to consider when trying to carry out this project, but this is a start. My plan for the next week is to continue to explore my data and run some tools that will help to better describe the distribution of certain neighborhood characteristics.

 

 

 

 

The following screenshots are the results that I have generated using Hot Spot Analysis, Anselin Local Moran’s I, and Global Moran’s I to investigate the clustering of soils with high clay content in the six sub-AVAs (Chehalem Mountains, Ribbon Ridge, Dundee Hills, Yamhill-Carlton, McMinnville, and Eola-Amity Hills) of the northern Willamette Valley. I have created quite a few data sets, and am in the process of identifying useful methods for further interrogation of my data. Along those lines, I need some feedback regarding the interpretation of these results. Any comments would be greatly appreciated.

Percent_clay_Location_Map_of_the_entire_Willamette_Valley_AVA

Percent clay Location Map of the entire Willamette Valley AVA

Percent_clay_of_the_entire_Willamette_Valley_AVA

Percent clay of the entire Willamette Valley AVA (including the six sub-AVAs in the northern portion of the Willamette Valley)

Percent_clay_DETAIL

Percent Clay detail of the northern Willamette Valley

Hot_Spot_clay_ZScore

Hot Spot Analysis (GiZScore) of Percent Clay; detailed

Hot_Spot_clay_PValue

Hot Spot Analysis (GiPValue) of Percent Clay; detailed

Anselin_Morans_clay_cluster_outlier_type

Anselin Moran’s (Cluster/Outlier Type) of Percent Clay; detailed

Anselin_Morans_clay_ZScore

Anselin Moran’s (LMiZScore) of Percent Clay; detailed

anselin_morans_PValue

Anselin Moran’s (LMiPValue) of Percent Clay; detailed

global_morans_I_clay_1000   global_morans_I_clay_5000

global_morans_I_clay_10000   global_morans_I_clay_15000

Global Moran’s I using a fixed distance of 1,000 meters, 5,000 meters, 10,000 meters, and 15,000 meters

In the geological sciences, spatial statistical analysis of gas distribution and migration through subsurface systems has been applied in only a limited number of studies, spread across a variety of systems: CO2 storage, hydrocarbon exploration, landfills, and natural gas storage facilities. My study seeks to evaluate the source and mechanism of potential contaminants in groundwater systems associated with engineered subsurface resource activities (e.g., hydrocarbon development, CO2 storage, EGS stimulation) using currently available datasets.

Specifically, this project seeks to apply geostatistical techniques in combination with spatial analysis of key datasets from a single geologic basin to evaluate the source and mechanism of gas and other potential contaminants in groundwater systems.  This project hypothesizes that larger-scale patterns in shallow methane concentrations in groundwater aquifers can be correlated to both primary migration pathways (such as wellbores or fracture networks) and the underlying volume of in situ hydrocarbon.  The general approach to this study is to identify, standardize, and integrate preexisting data from the study basin for use in geostatistical, relational, and probabilistic evaluation and interpretation.

The box model diagram below conceptually simplifies the primary systems interacting in the subsurface. The datasets key to characterizing the flux of gas in and out of these systems, namely (i) sources, (ii) pathways, and (iii) receptors, will require spatial characterization and statistical analysis in order to support predictions of areas of likely high flux versus low flux to receptors in relation to both natural and anthropogenic processes.

simplified box model 4 2013

My objective was to see if the displacement of the birds showed particular patterns. For this, I decided to analyze the distribution of speed and rotation angles in space. Speed at a particular point is calculated as distance to previous point over time taken to move between points. Rotation angle refers to the angle between two consecutive movement lines (i.e., lines joining point A to B and B to C).
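The two definitions above can be sketched as follows. This assumes projected coordinates in meters and timestamps in seconds; the function names and the sample track are hypothetical. Rotation angle is computed here as the signed change in heading between consecutive movement lines, wrapped to the range −180° to 180°, which is one common convention rather than necessarily the one used for the plots.

```python
import math

def step_speeds(points, times):
    """Speed at each point after the first: distance to previous point / time elapsed."""
    speeds = []
    for i in range(1, len(points)):
        elapsed = times[i] - times[i - 1]
        if elapsed <= 0:
            raise ValueError("timestamps must be strictly increasing")
        step = math.hypot(points[i][0] - points[i - 1][0],
                          points[i][1] - points[i - 1][1])
        speeds.append(step / elapsed)
    return speeds

def rotation_angles(points):
    """Signed angle in degrees at B between lines A-B and B-C (positive = counterclockwise)."""
    angles = []
    for a, b, c in zip(points, points[1:], points[2:]):
        heading_in = math.atan2(b[1] - a[1], b[0] - a[0])
        heading_out = math.atan2(c[1] - b[1], c[0] - b[0])
        turn = math.degrees(heading_out - heading_in)
        # Wrap to [-180, 180). Zero-length steps have no defined heading.
        angles.append((turn + 180.0) % 360.0 - 180.0)
    return angles

# Hypothetical track: fixes every 30 seconds, coordinates in meters.
fixes = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (20.0, 10.0)]
stamps = [0.0, 30.0, 60.0, 90.0]
print(step_speeds(fixes, stamps))
print(rotation_angles(fixes))
```

A track of n points gives n − 1 speeds and n − 2 rotation angles, so the first point has neither value and the last point has no angle.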
I first tried the Spatial Autocorrelation function, which indicated a clustered distribution of the values.

Example of output of the Spatial Autocorrelation tool applied to rotation angles.

These results weren’t meaningful for me though, as I was interested in the variability within the observations. Studies on different animal species have shown that the analysis of variability within movement patterns can be used to infer behavioral patterns. I expected the birds would show varying speeds and rotation angles in response to the habitat where they were living (e.g., move slower inside the forest and quicker between forest patches; straighter movement lines in non-forest habitat). Thus, I decided to apply the Incremental Spatial Autocorrelation function, as this tool would indicate whether the spatial clustering of values varied across distances in the study area.

The results show mixed responses from each bird, with no clear interpretation for the observed patterns.

Example of output of the Incremental Spatial Autocorrelation tool applied to speed.
Example of output of the Incremental Spatial Autocorrelation tool applied to rotation angles.

Most of them have non-significant z-scores, and those that are significant have no clear relationship to any environmental factor. The hot spot analyses don’t show a particular concentration of values at any point either.

Example of output of the Hot Spots tool as applied to rotation angles.

 

In conclusion, speed and rotation angles are either A) not affected by the distribution of forest or B) poor indicators of behavioral changes associated with space use.