For my “first take” on the Spatial Statistics Resources blog, I learned more about the mathematical statistics behind the tools in the Spatial Statistics toolbox. I quickly realized that the tools can be grouped by their common mathematical principles. For example, hot spot identification relies on the Getis-Ord Gi* statistic. In the list of sample applications on the Desktop 10 Help website, most tools are listed with an associated mathematical statistic (usually in parentheses). For example:

Question: Is the data spatially correlated?

Tool: Spatial Autocorrelation (Global Moran’s I)
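To give a sense of what sits behind a tool name like this, here is a minimal sketch of Global Moran's I in plain Python. It is only the textbook index, not ESRI's implementation: the function name, the toy values, and the simple neighbor matrix are assumptions made for illustration, and the z-score and p-value the tool also reports are left out.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]          # deviations from the mean
    s0 = sum(sum(row) for row in weights)     # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical example: four sites along a line, each a neighbor of the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = morans_i([1, 2, 3, 4], chain)        # similar values sit side by side
alternating = morans_i([1, 4, 1, 4], chain)  # unlike values sit side by side
print(trend, alternating)
```

The index comes out positive when neighboring values are alike and negative when they alternate, which is the sense in which it answers the question above.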

Some of the mathematical concepts, like ordinary least squares, I am fairly well acquainted with. Others, such as the Getis-Ord statistic, I had never encountered before. I used one of my primary research tools, the internet, and found that the statistic was developed in the mid-nineties by its namesake statisticians.

Link to the 1995 paper on the Getis-Ord statistic

But one need not always consult the internet at large. ESRI explains each tool in articles scattered around the Spatial Statistics folder of the Desktop 10.0 Help. I’ve begun assembling a list, below, with a link to each mathematical principle, tool, or statistic. I would like to learn what the strengths and weaknesses of these statistics are and, especially, when it is not appropriate to use them (what are the assumptions?).

List of Mathematical Principles/Statistics Underlying the Suite of Available Spatial Statistics

Analyzing Patterns:

How Multi-Distance Spatial Cluster Analysis (Ripley’s K-function) works

How Spatial Autocorrelation (Global Moran’s I) works

How High/Low Clustering (Getis-Ord General G) works

Mapping Clusters:

How Hot Spot Analysis (Getis-Ord Gi*) works

How Cluster and Outlier Analysis (Anselin Local Moran’s I) works

Measuring Geographic Distributions:

How Directional Distribution (Standard Deviational Ellipse) works

Modeling Spatial Relationships:

Geographically Weighted Regression (GWR) (Spatial Statistics)

Ordinary Least Squares (OLS) (Spatial Statistics)

 

The class today discussed topics of interest within the ArcGIS Spatial Statistics toolbox, using the Spatial Statistics Blog as a starting point (http://blogs.esri.com/esri/arcgis/2010/07/13/spatial-statistics-resources/). Most students looked for concepts or tools that would be useful for their specific research needs. I was interested in the discussion of modeling spatial relationships and analyzing patterns, and how these might apply to the humpback whale data I am using for my own project.

Of particular interest was the “Conceptualization of Spatial Relationships” (http://help.arcgis.com/en/arcgisdesktop/10.0/help/#/Modeling_spatial_relationships/005p00000005000000/) webpage.  This concept is important for most of the tools used in the Spatial Stats toolbox and is critical for data in which there is some degree of locational uncertainty – what is the best spatial conceptualization for your data so that the tool output makes sense with your data?
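To make the idea concrete, here is a small sketch of two common conceptualizations, a fixed distance band and inverse distance, written in plain Python. The function names and the three sample points are assumptions for illustration only. It also assumes all points are distinct, since inverse distance is undefined for coincident points.

```python
import math


def distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def fixed_distance_band(points, threshold):
    """Weight 1 for every other feature within the threshold, 0 otherwise."""
    n = len(points)
    return [[1.0 if i != j and distance(points[i], points[j]) <= threshold
             else 0.0
             for j in range(n)] for i in range(n)]


def inverse_distance(points, power=1.0):
    """Weight falls off as 1 / distance**power; a feature gets 0 for itself."""
    n = len(points)
    return [[0.0 if i == j
             else 1.0 / distance(points[i], points[j]) ** power
             for j in range(n)] for i in range(n)]


# Hypothetical locations: two close together, one farther away.
sample_points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
band = fixed_distance_band(sample_points, threshold=2.0)
inverse = inverse_distance(sample_points)
print(band)
print(inverse)
```

With the band, the distant point has no neighbors at all; with inverse distance, it still influences the others, only weakly. When locations are uncertain, the choice between the two (and of the threshold) decides which features are treated as related.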

Other interesting points made in class today include:

The discussion on regression and measuring geographic distributions.

TOOL: Generate Network Spatial Weights

URL: http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//005p0000001z000000

Toolset: Modeling Spatial Relationships

Summary: This tool allows the analysis of the spatial relationship between features whose connections are restricted to a network. This means that movement between two points can only take place along specific routes. Consequently, if one wants to analyze the shortest distance between two points, the Euclidean (straight-line) distance might not be the appropriate measurement.

The Generate Network Spatial Weights tool generates a spatial weights matrix, which quantifies the relationships between features based on their neighboring relationships and under the restriction of a network dataset.
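The difference between the two distance measures can be sketched in a few lines of Python. This is not how the ArcGIS tool is implemented; it is a generic shortest-path (Dijkstra) search over a made-up network, where the node names, coordinates, and the single "bridge" route are all hypothetical.

```python
import heapq
import math


def network_distance(edges, source, target):
    """Shortest travel distance along undirected edges (u, v, length)."""
    graph = {}
    for u, v, length in edges:
        graph.setdefault(u, []).append((v, length))
        graph.setdefault(v, []).append((u, length))
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf  # target not reachable on this network


# Hypothetical layout: two sites on opposite banks, joined only via a bridge.
coords = {"site_a": (0.0, 0.0), "bridge": (3.0, 4.0), "site_b": (6.0, 0.0)}


def straight_line(u, v):
    return math.hypot(coords[u][0] - coords[v][0], coords[u][1] - coords[v][1])


edges = [
    ("site_a", "bridge", straight_line("site_a", "bridge")),
    ("bridge", "site_b", straight_line("bridge", "site_b")),
]
euclidean = straight_line("site_a", "site_b")
along_network = network_distance(edges, "site_a", "site_b")
print(euclidean, along_network)
```

The network distance is longer than the straight-line distance here, so any weight based on distance (for example, inverse distance) would rate the two sites as less closely related once the network restriction is taken into account.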

 

TOOL: Linear Directional Mean

URL: http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//005p00000017000000

Toolset: Measuring Geographic Distributions

Summary: This tool measures the trend of a set of lines to identify their mean direction, length, and geographic center. It can calculate mean direction and/or mean orientation. In the first case, the start and end points of the lines matter; in the second, they don’t.

The output of the Linear Directional Mean tool is a single line centered on the calculated mean center, with length equal to the mean length and direction (or orientation) equal to the mean direction (or mean orientation) of all input vectors.
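The direction versus orientation distinction can be illustrated with a short sketch using standard circular statistics. The function name and sample lines are assumptions, angles are measured counterclockwise from east in degrees, each line is reduced to its start and end points, and the mean center is taken as the unweighted average of line midpoints; the ArcGIS tool may differ in any of these details.

```python
import math


def linear_directional_mean(lines, orientation_only=False):
    """Return (mean angle in degrees, mean length, mean center) for lines.

    Each line is ((x1, y1), (x2, y2)). With orientation_only=True the
    angles are doubled before averaging, so a line and its reverse count
    as the same orientation.
    """
    k = 2.0 if orientation_only else 1.0
    angles = [math.atan2(y2 - y1, x2 - x1) for (x1, y1), (x2, y2) in lines]
    sin_sum = sum(math.sin(k * a) for a in angles)
    cos_sum = sum(math.cos(k * a) for a in angles)
    mean_angle = math.degrees(math.atan2(sin_sum, cos_sum) / k) % (360.0 / k)

    n = len(lines)
    mean_length = sum(math.hypot(x2 - x1, y2 - y1)
                      for (x1, y1), (x2, y2) in lines) / n
    center_x = sum((x1 + x2) / 2.0 for (x1, _), (x2, _) in lines) / n
    center_y = sum((y1 + y2) / 2.0 for (_, y1), (_, y2) in lines) / n
    return mean_angle, mean_length, (center_x, center_y)


# Hypothetical lines: one heading east, one heading north.
east_and_north = [((0, 0), (2, 0)), ((0, 0), (0, 2))]
direction, length, center = linear_directional_mean(east_and_north)

# A line and its exact reverse: opposite directions, same orientation.
back_and_forth = [((0, 0), (1, 1)), ((1, 1), (0, 0))]
orientation, _, _ = linear_directional_mean(back_and_forth,
                                            orientation_only=True)
print(direction, length, center, orientation)
```

For the second pair, the mean direction is not meaningful because the two vectors cancel, while the mean orientation is still well defined. That is the practical reason to choose orientation when the start and end points of the lines carry no meaning.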

 

VIDEO: Performing Proper Density Analysis

http://video.arcgis.com/watch/401/performing-proper-density-analysis

Duration: 12:11 min

Summary: The purpose of this video is to explain the importance of user decisions (such as input parameters) when performing a density analysis and to raise awareness of the subjective aspects of the results.
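As a rough illustration of how much one input parameter can matter, the sketch below estimates density at a location using a quartic kernel with two different search radii. The kernel choice, function name, and sample point are assumptions for illustration and are not taken from the video.

```python
import math


def kernel_density(points, location, radius):
    """Quartic-kernel density at a location for a given search radius.

    Each point contributes (3 / (pi * r^2)) * (1 - (d / r)^2)^2 if it lies
    within the radius and nothing otherwise, so every point spreads a
    total "mass" of 1 over its circle.
    """
    total = 0.0
    for x, y in points:
        d = math.hypot(x - location[0], y - location[1])
        if d < radius:
            total += (3.0 / (math.pi * radius ** 2)) * (1.0 - (d / radius) ** 2) ** 2
    return total


# Hypothetical data: one observation, evaluated 1.5 units away.
observations = [(0.0, 0.0)]
query = (1.5, 0.0)
narrow = kernel_density(observations, query, radius=1.0)
wide = kernel_density(observations, query, radius=2.0)
print(narrow, wide)
```

With the narrow radius the location shows no density at all, while the wide radius assigns it a positive value from the same data. Neither answer is wrong; the result reflects the analyst's choice of search radius.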

Since one of my interests in this class is understanding the underlying principles behind ArcGIS’s spatial statistics tools, I was interested in this link (http://blogs.esri.com/esri/arcgis/2010/04/07/check-out-our-chapter-on-spatial-statistics-in-arcgis-in-the-handbook-of-applied-spatial-analysis/) to additional materials related to spatial statistics. The chapter by Lauren Scott and Mark Janikas in the Handbook of Applied Spatial Analysis, edited by Manfred M. Fischer and Arthur Getis, relates directly to the tools in the Spatial Statistics toolbox provided by Esri within ArcGIS. I’d like to track down a copy of this handbook to see what topics are discussed and how others are using spatial statistics, and to learn more about the principles and ideas behind the spatial statistics used. I will send Julia Jones the information about the book so she can request it through the Valley Library.

In my initial exploration of the ESRI spatial statistics website, I focused on tools that might be useful in my proposed research on the population structure and behavioral ecology of humpback whales (Megaptera novaeangliae) in Glacier Bay/Icy Strait, Alaska. One objective of my master’s thesis is to investigate the mechanisms of population increase within Glacier Bay/Icy Strait, Alaska since the early 1970s/1980s. I was initially struck by the hot spot analysis, thinking it might be informative to visualize habitat use of humpback whales within Glacier Bay/Icy Strait. This region has undergone massive geological change in the past decades and has become deglaciated relatively recently, i.e. over the past 200 years. Visualizing the habitat use (depth, slope, distance from shore, etc.) of the contemporary population of humpback whales in Glacier Bay/Icy Strait might help explain why abundance has increased in this region. This would be done by overlaying humpback whale encounters on layers of oceanographic features to detect patterns of habitat use.

Links:

How to do it:

http://resources.arcgis.com/gallery/file/geoprocessing/details?entryID=604B4BD9-1422-2418-A0F3-77076337D488

http://www.arcgis.com/home/item.html?id=dea008bcc77d4fd485abdf8121190b82

How it works:

http://help.arcgis.com/en/arcgisdesktop/10.0/help/#/How_Hot_Spot_Analysis_Getis_Ord_Gi_works/005p00000011000000/
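For readers who prefer code to formulas, here is a minimal sketch of the Gi* z-score in plain Python, following the standard form of the statistic. It is not ESRI's code: the function name, the toy values, and the binary neighbor matrix are assumptions for illustration. Each feature counts as its own neighbor, which is what the "star" indicates.

```python
import math


def getis_ord_gi_star(values, weights):
    """Gi* z-score for every feature.

    weights[i][j] is the weight between features i and j, with
    weights[i][i] > 0 so that each feature is included in its own
    neighborhood. Assumes the values are not all identical and that no
    feature has every feature as an equally weighted neighbor (either
    case makes the denominator zero).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for row in weights:
        w_sum = sum(row)
        w_sq_sum = sum(w * w for w in row)
        numerator = sum(w * v for w, v in zip(row, values)) - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical counts at five sites along a line; high values at one end.
counts = [10, 10, 1, 1, 1]
neighbors = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
z_scores = getis_ord_gi_star(counts, neighbors)
print(z_scores)
```

Sites surrounded by high values get large positive scores (hot spots) and sites surrounded by low values get negative scores (cold spots). A high value on its own is not enough; its neighbors must be high as well.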

TO DO: After visualizing my humpback whale encounters in ArcGIS, it occurred to me that what appear to be hot spots within Glacier Bay/Icy Strait might actually be areas of increased field effort. My data was not collected using random transect lines, and this is going to complicate any potential hot spot analysis.
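One possible way to approach this, sketched below, is to divide encounters by survey effort in each grid cell before looking for hot spots. This is only an illustration of the idea: the cell names, counts, and effort hours are invented, and it assumes effort can be tallied per cell in the first place.

```python
def encounters_per_unit_effort(encounters, effort_hours):
    """Encounter rate per grid cell; None where there was no effort."""
    rates = {}
    for cell, hours in effort_hours.items():
        count = encounters.get(cell, 0)
        rates[cell] = count / hours if hours > 0 else None
    return rates


# Hypothetical grid cells: raw counts versus hours of survey effort.
encounters = {"cell_1": 20, "cell_2": 5}
effort_hours = {"cell_1": 10.0, "cell_2": 1.0, "cell_3": 0.0}
rates = encounters_per_unit_effort(encounters, effort_hours)
print(rates)
```

In this made-up case the cell with the most raw encounters has the lower rate once effort is accounted for, and the unsurveyed cell is reported as unknown rather than as zero whales. Running a hot spot analysis on rates instead of raw counts would be one way to reduce the influence of uneven effort.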

The class reviewed the content of ESRI’s ArcGIS spatial statistics blog and reported on areas of interest and potential future use from each student’s perspective. With regard to statistical predictions involving three-dimensional problems spanning the subsurface, groundwater, surface and atmospheric systems, one challenge is how to use ArcGIS statistics to evaluate the connectivity and interaction of these systems in order to predict or estimate relationships between them.

ArcMarine has a 3-D component but is still under development and does not directly address the issues of spatial statistical analyses of 3D systems.

Jen’s identification of the spatial statistics in ArcGIS handbook was interesting and may be useful for identifying tools and analyses that are appropriate.

Peggy’s identification of the externally developed tool for statistically evaluating flow through networks (rivers and streams) may also be useful.  See her post for link to this tool.

Dori’s discussion of generating network spatial weights is also relevant to our approach and something Jen and I have utilized previously for our research.

Finally, the tools in the “Assess Overall Spatial Patterns” and “Model Relationships” portions of the ArcGIS blog also look promising and worthy of further investigation.

The two posts I may benefit from the most are:

  1. Supplemental Spatial Statistics Toolbox: http://www.arcgis.com/home/item.html?id=694e0f97355740d7bba6b8b356c0b925

The tools for integrated spatial autocorrelation and exploratory regression analysis seem like they would be useful for investigating spatial relationships and identifying important response variables for spatial models.

  2. Integrating R and ArcGIS:

http://www.arcgis.com/home/item.html?id=a5736544d97a4544aa47d06baf910f6d

I’ve spent much more time in R running spatial models than in Arc, having to bring model outputs into Arc for mapping after the analysis is complete.  For more complex models, this is probably still the most efficient method, but for simpler analysis it may be easier to run the analysis and produce maps in Arc.

I also found the regression analysis pages very useful as a reference, in addition to the page on ‘Finding a Meaningful Model’ (http://www.esri.com/news/arcuser/0111/findmodel.html). The tutorials for hot spot, regression analysis, and model builder seem like they would be worthwhile to run through and of general benefit to others in the class.

Kevin Buffington