Tag Archives: spatial autocorrelation

Final

Question

How are the spatial patterns of individuals’ perception of natural resource governance related to the spatial patterns of environmental restoration sites via distance and abundance of improved sites?

Datasets

Puget Sound Partnership Social Data

My data came from a survey on subjective human wellbeing related to natural environments. The sample was a stratified random sample (28% response, n = 2,323) of the general public in the Puget Sound region of Washington, collected during a single time period. I am specifically examining questions related to perceptions of natural resource governance. Around 1,730 individuals gave location data (cross street and zip code), which have been converted to GPS points. I also have demographic information for individuals, their overall life satisfaction, and how attached and connected they feel to the environment.

Environmental Data

I have three different forms of environmental data: (1) point locations of restoration sites in the Puget Sound, (2) a shapefile of census tract level average environmental effects (the environmental condition ranked from 1, good, to 10, very poor), and (3) land cover raster files from 1992 and 2016.

Individual restoration sites numbered over 12,000. They spanned many years, and point locations overlapped significantly through time.
The environmental effects data come from an online mapping tool that was created in a collaboration between the University of Washington, the Washington Department of Ecology, and the Washington Department of Health. Environmental effects are based on lead risk from housing, proximity to hazardous waste treatment, storage, and disposal facilities (TSDFs), proximity to National Priorities List facilities (Superfund sites), proximity to Risk Management Plan (RMP) facilities, and wastewater discharge. The combination of these effects was aggregated within census tracts to produce an environmental effects ‘Rank’ from low effects (Rank of 1) to high effects (Rank of 10).
The land cover files contained 25 different types of land cover. I reclassified these land cover types by combining similar types of land cover (e.g. deciduous forest and conifer forest to ‘forest’).

Hypotheses

1) Shorter distances between individuals and restoration sites, and greater number of restoration sites near individuals, would correlate positively with governance perceptions

2) Positive environmental conditions will correlate positively with governance perceptions

Richard Petty and John Cacioppo developed the elaboration likelihood model (1980), which describes how individuals change their attitudes in response to persuasive stimuli. Petty and Cacioppo build on this with two routes of persuasion (the central and peripheral paths), where the route taken is determined by motivation, ability, and the nature of mental processing. The purpose of this theory is to help explain how individuals elaborate on ideas to form attitudes and how strong, or long lasting, those attitudes are. This theory may suggest that individuals will form stronger attitudes when reasons to form attitudes are stronger and more immediate. Stimuli of this kind tend to be nearer to individuals, as processing of issues far away is cognitively difficult. Research has shown that individuals have a difficult time thinking about problems on larger spatial scales, and there is an inverse relationship between feelings of individual responsibility for environmental problems and spatial scale: as scale gets larger, feelings of responsibility go down (Uzzell, 2000).

Cognitive biases also suggest that people use heuristics to answer questions when they are unsure. One common heuristic is the saliency bias, where individuals overemphasize things that are emotionally more important and ignore less interesting things (Kahneman, 2011). This may suggest that spatial patterns are more important than temporal patterns, because what is happening now is more salient to individuals than what happened in the past. This may imply that even if environmental outcomes have improved over time, the immediacy of the current environment, or of things influencing the environment, may have a greater effect on individual perceptions.

Approaches

Statistical Analyses

Spatial Autocorrelation Analyses

Moran’s I: To compute Moran’s I, I used the “ape” library in R. I also subset my data to examine spatial autocorrelation by demographics including area (urban, suburban, rural), political ideology, life satisfaction, income, and cluster (created by running a cluster analysis on the seven variables which comprise the governance index).

Correlograms: I created correlograms for the variables that were significant (urban, conservative, and liberal) using the “ncf” library and the correlog() function. These figures give a better picture of spatial autocorrelation at various distances.

Semivariograms: To create semivariograms, I used the “gstat” and “lattice” libraries which contain a function called variogram. This function takes the variable of interest along with latitude and longitude locations. The object created can then be plotted. For this analysis I used the same subsets as in the Moran’s I analysis. These figures are excluded from the results as they do not provide information beyond what the correlograms show.
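As a rough illustration of what a variogram function estimates, the empirical semivariance at a given lag is half the mean squared difference between pairs of observations that are roughly that far apart. The sketch below computes it in base R on simulated points. The data, the bin width, and the helper name `empirical_semivariogram` are assumptions made for illustration; this is not the survey data and not gstat's implementation.

```r
# Empirical semivariogram in base R (illustrative sketch; simulated data).
set.seed(42)
n <- 200
x <- runif(n, 0, 10)                 # hypothetical easting
y <- runif(n, 0, 10)                 # hypothetical northing
z <- sin(x) + rnorm(n, sd = 0.3)     # hypothetical variable with spatial structure

empirical_semivariogram <- function(x, y, z, breaks) {
  d    <- as.matrix(dist(cbind(x, y)))              # pairwise distances
  sq   <- outer(z, z, function(a, b) (a - b)^2)     # squared differences
  keep <- upper.tri(d)                              # count each pair once
  bin  <- cut(d[keep], breaks = breaks, include.lowest = TRUE)
  data.frame(
    lag   = head(breaks, -1) + diff(breaks) / 2,        # bin midpoints
    n     = as.vector(table(bin)),                      # pairs per bin
    gamma = as.vector(tapply(sq[keep], bin, mean)) / 2  # semivariance
  )
}

sv <- empirical_semivariogram(x, y, z, breaks = seq(0, 5, by = 0.5))
print(sv)
```

Plotting `sv$gamma` against `sv$lag` gives the usual semivariogram picture; pairs farther apart than the last break are simply left out.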

Nearest Neighbor analysis: A file of restoration sites was loaded into R. The sites were points. This analysis required four libraries, as the points came from different files: nlme, rpart, spatstat, and sf. I first had to turn my points into point pattern objects in R (ppp objects). I used the convexhull.xy() function to create the ‘window’ for my point object, then used that window in the ppp() function. I then used the nncross() function to find the distance from each individual to the nearest restoration site, and summarized these distances (minimum, maximum, mean, and quartiles). I added the nearest-neighbor distance as a variable in a linear regression to test for an association with perceptions. I used the 1st quartile, median, and 3rd quartile distances to produce buffers, then ran spatial joins between the buffers and the restoration sites. This produced an attribute table with a join count (the number of restoration sites within each buffer). In R, I ran a linear regression with join count as an independent variable.
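The nearest-neighbor and buffer-count steps can be sketched in base R without any spatial packages. This is a minimal illustration on simulated points, assuming projected coordinates in metres; the object names and the simulated perception index are hypothetical, and it stands in for, rather than reproduces, the spatstat and spatial-join workflow described above.

```r
# Nearest-site distance and buffer counts in base R (illustrative sketch).
# Assumes projected coordinates in metres; all data here are simulated.
set.seed(1)
people <- cbind(x = runif(50, 0, 10000),  y = runif(50, 0, 10000))
sites  <- cbind(x = runif(300, 0, 10000), y = runif(300, 0, 10000))
perception <- rnorm(50, mean = 4, sd = 1)   # hypothetical 1-7 index

# Distance from every person (rows) to every site (columns)
d <- sqrt(outer(people[, "x"], sites[, "x"], "-")^2 +
          outer(people[, "y"], sites[, "y"], "-")^2)

nn_dist <- apply(d, 1, min)        # distance to the nearest site
print(summary(nn_dist))            # min, quartiles, mean, max

# Buffer radii taken from the quartiles of the nearest-site distances
radii  <- quantile(nn_dist, c(0.25, 0.5, 0.75))
counts <- sapply(radii, function(r) rowSums(d <= r))  # sites within each buffer
colnames(counts) <- c("q1", "median", "q3")

# Regressions analogous to those described above
fit_dist  <- lm(perception ~ nn_dist)
fit_count <- lm(perception ~ counts[, "median"])
print(coef(summary(fit_count)))
```

Working in a projected coordinate system matters here: distances computed directly on latitude and longitude are in degrees, which do not correspond to a constant ground distance.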

Geographically weighted regression: I joined individual points to my rank data, a shapefile at the census tract level. This gave individuals a rank for environmental effects at their location. Initially I used the GWR tool in Arc to run the regression, but I wanted the flexibility of changing parameters and the easily readable output that R provides. In R, I used two different functions to run GWR: gwr() and gwr.basic(). gwr() required selecting a bandwidth using gwr.sel(), and gwr.basic() required selecting a bandwidth using bw.gwr(). The difference between these functions is that gwr.basic() produces p-values for the betas. I ran GWR on both my entire data set and a subset based on perceptions. The subset was the ‘most agreeable’ and ‘least agreeable’ individuals, whom I defined as those at least one standard deviation above or below the mean perception.
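To show the idea behind GWR, the sketch below fits a separate weighted regression at every location, with nearby observations weighted more heavily through a Gaussian kernel. It is a hand-rolled illustration in base R on simulated data. The helper `gwr_sketch`, the fixed bandwidth of 0.2, and the variable names are assumptions, and the code is not a substitute for gwr() or gwr.basic().

```r
# Geographically weighted regression by hand (illustrative sketch, base R).
# Simulated data; the fixed bandwidth is an arbitrary assumption, whereas
# gwr.sel() or bw.gwr() would select one from the data.
set.seed(7)
n  <- 150
xy <- cbind(runif(n), runif(n))               # hypothetical coordinates
env_rank <- sample(1:10, n, replace = TRUE)   # hypothetical environmental rank
beta <- 0.05 + 0.2 * xy[, 1]                  # true slope varies west to east
perception <- 3 + beta * env_rank + rnorm(n, sd = 0.3)

gwr_sketch <- function(y, x, coords, bw) {
  d <- as.matrix(dist(coords))
  t(sapply(seq_len(nrow(coords)), function(i) {
    w <- exp(-0.5 * (d[i, ] / bw)^2)          # Gaussian kernel weights
    coef(lm(y ~ x, weights = w))              # local intercept and slope
  }))
}

local_coefs <- gwr_sketch(perception, env_rank, xy, bw = 0.2)
colnames(local_coefs) <- c("intercept", "rank")
print(summary(local_coefs[, "rank"]))         # spread of the local slopes
```

Mapping the local slopes (or the residuals) by location is what reveals where the relationship is stronger or weaker across the study area.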

Neighborhood analysis: I created a dataset of the upper and lower values of my governance perceptions (one standard deviation above and below the mean). I then added buffers to these points at 1 and 5 miles. I joined the buffers to the rank values to get an average rank within those buffers. I exported the files for each buffer to R. I created a density plot of average rank for the low governance values at each buffer, and for the high governance values at each buffer.

T-test: To test whether land cover change differed between those with high and low perceptions, I exported the tables and calculated the change in land cover for the two samples. I then ran two-sample t-tests for each land cover type change between the two groups.
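A minimal sketch of this comparison for one land cover type is shown below, together with the point-biserial effect size reported later in the results. The numbers are simulated, and the pooled-variance form of the test is an assumption; with that form, the point-biserial correlation follows exactly from t and the degrees of freedom.

```r
# Two-sample t-test with a point-biserial effect size (illustrative sketch).
# The land cover changes below are simulated, not the study's values.
set.seed(3)
high <- rnorm(120, mean = -13, sd = 12)   # % change, high-perception group
low  <- rnorm(130, mean = -18, sd = 12)   # % change, low-perception group

tt    <- t.test(high, low, var.equal = TRUE)  # pooled-variance two-sample t-test
t_val <- unname(tt$statistic)
df    <- unname(tt$parameter)

# Point-biserial correlation recovered from t: r_pb = sqrt(t^2 / (t^2 + df))
r_pb <- sqrt(t_val^2 / (t_val^2 + df))

# Equivalent check: correlate the outcome with a 0/1 group indicator
group <- rep(c(1, 0), times = c(length(high), length(low)))
r_chk <- abs(cor(c(high, low), group))
print(c(t = t_val, df = df, p = tt$p.value, r_pb = r_pb, r_check = r_chk))
```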

Mapping

Kriging and IDW: I used the Spatial Analysis toolbox to perform Kriging and IDW to compare the outputs of the two techniques. I used my indexed variable of governance perceptions, whose values range from 1 to 7. I also uploaded a shapefile bounding the sample area.

Hotspot Analysis: I used the Spatial Analysis toolbox to perform a hotspot analysis. I used my indexed variable of governance perceptions with values from 1 to 7.

Tabulate area: I used two land cover maps from 1992 and 2016 to approximate environmental condition. I loaded the raster layers into ArcGIS Pro, and reclassified the land types (of which there were 255, but only 25 present in the region) down to 14. I then created buffers of 5km for my points of individuals with governance scores one standard deviation above and below the average. I then tabulated the land cover area within the buffers for each time period.

Results

What are the spatial patterns of perceptions of natural resource governance?

Moran’s I for overall governance perceptions was not significant, indicating a lack of spatial autocorrelation at the regional scale (I value; p = 0.51). Significant autocorrelation was detected when the data were subset by demographic variables. Individuals residing in urban areas exhibited spatial autocorrelation of perceptions, while individuals in suburban and rural areas did not. Lastly, individuals classified as ‘conservative’ (political ideology < 0.5) exhibited significant spatial autocorrelation for governance perceptions (I statistic: -0.0062, p = 0.0018). The spatial autocorrelation for governance perceptions was mainly restricted to short distances, as shown in the correlograms below. Significant correlations for the entire study population and for individuals in urban areas were detected at near distances. Individuals classified as conservative exhibited significant correlation across multiple distances. No significant correlations were found among liberal individuals, suggesting that a non-spatial driver may be at play.

Local patches of significantly lower perceptions (cold spots) and significantly higher perceptions (hot spots) were identified as shown in the map below. Three ‘cold spots’ were located in the areas around Port Angeles, Shelton, and the greater Everett area. Two hot spots were identified surrounding Bainbridge Island, and the city of Tacoma, which is where the Puget Sound Partnership is located.

Blue points are “cold spots” at 99%, 95%, and 90% confidence that the points reside in a cold spot. Cold spots are areas where perception is significantly lower than the average compared to those around them. Red points are “hot spots” at 99%, 95%, and 90% confidence that the points reside in a hot spot. Hot spots are areas where perception is significantly higher than the average compared to those around them. Small grey points are not significantly different. Bounding lines are counties in Northwest Washington.

What are the spatial patterns of natural resource governance perceptions in relation to environmental condition?

The median distance from individuals to the nearest restoration site was 0.037 degrees; the 1st quartile was 0.020 degrees and the 3rd quartile was 0.057 degrees.

The regression of individual perception on distance was not significant (p = 0.198), so there is no evidence of a linear association between distance to the nearest restoration site and perceptions. For each regression on the number of sites near an individual, all coefficients were negative, meaning the more sites near an individual, the more disagreeable their perception was. All three were statistically significant, but the effect size of the number of sites near individuals was very small (Table 1).

Table 1. Regression results for ‘nearest neighbor’ of individuals to restoration sites.

Buffer Size	b	p-value	Effect Size (r_pb)
Buffer 1 (1st quartile)	-0.003	0.0181	.003
Buffer 2 (median)	-0.002	0.045	.002
Buffer 3 (3rd quartile)	-0.001	0.002	.003

In a geographically weighted regression (GWR) model, rank, life satisfaction, years lived in the Puget Sound, and race were significant (Table 2). All other variables were not significant. For rank (the combined score of environmental effects), the coefficient was positive. Higher rank values indicate worse environmental effects, so more agreeable perceptions were associated with poorer environmental condition. Overall, the effect size of this model was small, explaining about 9% of the variance in governance perceptions (R² = 0.094; Table 2).

A map of residuals from the GWR confirms the results from the hotspot analysis about where high and low responses congregate.

Table 2. Regression results for environmental effects at individuals’ locations.

Variable¹	b	p-value²

Rank	0.077	0.007**
Life Satisfaction	0.478	<0.001**
Years Lived in the Puget Sound	-0.010	0.002**
Sex	-0.156	0.289
Area	-0.150	0.179
Education	-0.002	0.922
Income	-0.022	0.622
Race	0.006	0.034*
Ideology	0.103	0.533

¹R² = 0.094

²** = significant at the 0.01 level, * = significant at the 0.05 level

The plot of high and low governance values at the two buffers is presented below. The black and grey curves represent respondents from the survey that were at least one standard deviation lower than the mean (the mean was neutral). The green curves represent respondents from the survey that were at least one standard deviation higher than the mean. The solid curves are the average rank at the one mile buffer, and the dotted curves are the average rank at the five mile buffer.

This figure indicates there are two peaks in average rank, at low environmental effects (~1) and at mid-environmental effects (~4.75). Those with lower perceptions of environmental governance had higher peaks at low environmental effects for each buffer size. Those with higher perceptions of environmental governance had a bimodal distribution with peaks at low and mid-environmental effects.

What are the spatial patterns of natural resource governance perceptions in relation to environmental condition over time?

I used land cover change as a proxy for environmental condition over time. Land cover change differed significantly between the two groups for four land cover types: grassland, forest, shrub/scrub, and bare land. Grassland cover decreased for both groups, but decreased by 5% more in areas with individuals with below average governance perceptions. Forest cover also decreased for both groups, but decreased by 1% less in areas with individuals with below average governance perceptions (the amount of forest cover in the region is very large, which accounts for why a 1% difference is significant). Shrub and scrub cover increased for both groups, but increased by 3% more in areas with individuals with below average governance perceptions. Lastly, bare land decreased for both groups, but decreased by about 4% more in areas with individuals with below average governance perceptions (Table 3). The effect size of land cover change on perception was small for each of the significant variables, with point-biserial correlations between .07 and .09 (Table 3).

Table 3. Differences in land cover change between individuals with above average perceptions of natural resource governance and individuals with below average perceptions

	Governance Perception¹
	Above Average	Below Average	t-value	p-value	Effect size (r_pb)
High Development	17	16	0.80	.426	.02
Medium Development	19	19	0.67	.500	.02
Low Development	11	11	0.22	.829	.00
Developed Open Space	12	13	0.21	.837	.00
Agriculture	-1	0	0.74	.461	.02
Grassland	-13	-18	3.03	.002	.09
Forest	-10	-9	2.61	.009	.08
Shrub/Scrub	40	43	3.04	.002	.09
Wetlands	0	0	1.67	.095	.05
Unconsolidated Shore	2	2	0.10	.918	.00
Bare Land	-21	-25	2.43	.015	.07
Water	0	0	0.90	.368	.03
Aquatic Beds	-1	-5	1.19	.233	.03
Snow/Tundra	0	0	0.00	.997	.00

¹Cell entries are percent changes of land cover from 1992 to 2016

As one land cover type goes up, another land cover type must go down. While forest overall decreased in the region for both groups over time, areas with lower perceptions saw significantly less decrease. The percentage is small, but a lot of the area in the region is forested, so the actual amount of land is large. The two maps below show land cover in 2016 (top) and 1992 (bottom) around Port Angeles, an area that had a significant cold spot (low perceptions). From these maps, you can see a spread of the dark green color (forest) and a decrease of the yellow (grassland). Port Angeles was historically a timber community (the same could be said for the areas around the other cold spots in the region). This change in forest and grassland could be the source of negative perceptions if these communities believe the governance structures are responsible for the change in forest cover.

Significance

Overall, in this analysis, poor environmental condition relates to positive perceptions, and land cover change may provide insight into reasons for poor perceptions. Areas with negative perceptions may not directly indicate poor natural resource governance. Elsewhere, trust in natural resource governance was primarily driven by individual value orientations (Manfredo et al., 2017). The three areas identified as cold spots in this study could be areas where individuals do not feel their values are represented by the natural resource governing systems. Conversely, hot spot Area A is located near the headquarters of the Puget Sound Partnership, potentially facilitating trust, representation, and access to information that may influence the perceptions of individuals located nearby. This urban, liberal, developed area contrasts with the area to the west, which has a highly significant cold spot (Area 2). Individuals from Area 2, a more rural, conservative, historic logging site, likely have different value orientations than those from Area A. Similar comparisons could be made between the other cold spots and areas near them. More research should be conducted to determine whether value orientations have a significant influence on perceptions of natural resource governance.

Your learning

During this course I significantly advanced my knowledge of ArcGIS Pro and RStudio. I learned how to run many new tools in Arc, including hotspot analysis, geographically weighted regression, and tabulate area. I also learned new skills in troubleshooting problems. For example, I was having difficulty getting the rank data to display on my map. Another student taught me that because I had imported the data from a .csv and joined it to a shapefile, I would need to create a new feature class from it before it would display. I also learned how to run more spatial analyses in R. These included many spatial autocorrelation analyses and geographically weighted regression (which involved creating a point object!).

What did you learn about statistics?

I first and foremost learned of the existence of many types of spatial statistics. These included hotspot analysis (Getis-Ord Gi*), spatial autocorrelation (correlogram, semivariogram, Moran’s I), and geographically weighted regression (GWR). For hotspot analysis, I learned what the hotspot clusters mean, and the best ways to pick the fixed-distance band of neighbors the tool looks at (either by selecting a minimum number of points near each point, or by looking at the z-score at the distance that shows the highest amount of spatial autocorrelation). Speaking of spatial autocorrelation, I learned what that meant in terms of the correlation of perceptions of governance between individuals depending on their distance from each other. Finally, I learned a little about GWR, including that it is necessary to plot the output to examine patterns over the space in question. I also learned that the reason Arc does not display p-values for independent variables has nothing to do with the equation, but is actually due to the computing power of Arc.

Exercise 1: Preparing for Point Pattern Analysis

Exercise 1

The Question in Context

My question is: are the dolphin sighting data points clustered along the transect surveys, or do they have an equal distribution pattern? To answer it, I need to use point pattern analysis. I am trying to visualize where in space dolphins were sighted along the coast of California, specifically in my San Diego sighting area. In this exercise, the variable of interest is dolphin sightings. These are x,y coordinates (point data) indicating the presence of common bottlenose dolphins along a transect. However, the transect lines themselves were not recorded, so I needed to recreate them to the best of my ability. This process is more challenging than anticipated, but will prove useful in the short term for this class and project, and in the long term for management.

The Tools

As part of this exercise, I used ArcMap 10.6, Google Earth, QGIS, and Excel. I was only intending to import my Excel data, saved as a .csv file, into ArcMap, but that was not working, so other tools were necessary. The final goal of this exercise was to complete point pattern analyses comparing distance along recreated transects to sightings. From there, the sightings would be broken down by year, season, or environmental factor (El Niño versus La Niña years) to look for distribution patterns, specifically whether the points were ever clustered or equally distributed at different points in time.

Steps/Outputs/Review of Methods and Analysis

My first step was to clean up my sightings data enough that it could be exported as a .csv and imported as x-y data into ArcMap. However, ArcMap, no matter the transformation equation, did not seem to understand the projected or geographic coordinate systems. After many attempts, in which my data ended up along the east coast of Africa or in the Gulf of Mexico, I tried a workaround: I imported the .csv file into QGIS with the help of a classmate, and then exported that file as a shapefile. I was then able to import that shapefile into ArcMap and select the correct geographic and projected coordinate systems. The points finally appeared off the coast of California.

I then found a shape file of North America with a more accurate coastline, to add to the base map. This step will be important later when I add in track lines, and how the distributions of points along these track lines are related to bathymetry. The bathymetric lines will need to be rasterized and later interpolated.

The next step was the track line recreation. I chose to focus on the San Diego study site. This site has the most data, collected in the most consistent and standardized way. The surveys always left the same port of Mission Bay, San Diego, CA, traveled north at 5-10 km/hr to a specific beach (landmark), then turned around. The sighting data note whether the track line was surveyed in both directions (South to North and North to South) or in one direction (South to North). Because some data were collected before GPS was commercially available, I have to recreate these track lines. I started trying to use ArcMap to draw the lines but had difficulty. Luckily, after many attempts, it was suggested that I use Google Earth. Here I found a tool to create a survey line where I can mark the edges along the coastline at an approximate distance from shore, and then export that file. It took a while to realize that the file needed to be exported as a .kml and not a .kmz.

Once exported as a .kml, I was able to convert the .kml file to a layer file and then to a shape file in ArcMap. The next step in this is somehow getting all points within one kilometer of the track line (my spatial scale for this part of the project) to associate with that track line. One idea was snapping the points to the line. However, this did not work. I am still stuck here: the major step before I can have my point data with an association to the line and then begin a point pattern analysis in ArcMap and/or R Studio.
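One possible way to make this association, sketched here in base R rather than ArcMap, is to project each sighting onto the nearest segment of the track line, keep the sightings within 1 km, and record the distance along the line for later point pattern analysis. The coordinates below are made up and assumed to be in a projected system with units of metres; `project_to_line` is a hypothetical helper, not a function from any package.

```r
# Associate sightings with a track line (illustrative sketch, base R).
# Assumes projected coordinates in metres; the track and sightings are made up.
track <- cbind(x = c(0, 3000, 6000, 6000), y = c(0, 0, 4000, 9000))
sightings <- cbind(x = c(1500, 4500, 6800, 9000, 6200),
                   y = c( 400, 2500, 6000, 9000, 8800))

# Project one point onto a polyline: returns the distance to the line and
# the distance along the line to the projected position.
project_to_line <- function(p, line) {
  best  <- c(dist_to_line = Inf, dist_along = NA)
  start <- 0                                   # cumulative length so far
  for (i in seq_len(nrow(line) - 1)) {
    a  <- line[i, ]
    ab <- line[i + 1, ] - a
    seg_len <- sqrt(sum(ab^2))
    u <- sum((p - a) * ab) / seg_len^2         # relative position on the segment
    u <- min(max(u, 0), 1)                     # clamp to the segment ends
    q <- a + u * ab                            # closest point on the segment
    d <- sqrt(sum((p - q)^2))
    if (d < best[["dist_to_line"]]) {
      best <- c(dist_to_line = d, dist_along = start + u * seg_len)
    }
    start <- start + seg_len
  }
  best
}

res <- t(apply(sightings, 1, project_to_line, line = track))
on_track <- res[, "dist_to_line"] <= 1000      # within 1 km of the track
print(cbind(sightings, res, on_track))
```

The `dist_along` column turns the two-dimensional sightings into positions on a one-dimensional transect, which is the form a clustering test along the line would need.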

Results

Although I do not yet have full results for this exercise, it has not been for lack of trying, and I am not stopping. I have been brainstorming and drawing on classmates and teaching assistants for ideas about how to associate the sighting points with the track line so that I can run the cluster analysis. Hopefully, the associated data can then be exported to RStudio, where I can see distributions along the transect. I may be able to do a density-based analysis, which would show whether different sections along the transect (which I would need to designate and potentially rasterize first) have different densities of points. I would expect the sections to differ seasonally.

Critiques

Although I noted my opinions on usefulness and ease of use above, I do believe this approach will be very helpful in analyzing distribution patterns. Right now, it is largely unknown whether there are differences in distribution patterns for this population, because the animals move rapidly and over great distances. But by investigating data from only the San Diego site, I can determine whether distributions along the transects differ temporally and spatially. In addition, the total counts of sightings in each location per unit effort will be useful for seeing the influx to that entire survey area over time.

Contact information: this post was written by Alexa Kownacki, Wildlife Science Ph.D. Student at Oregon State University. Twitter: @lexaKownacki

Ex 1: Mapping the stain: Using spatial autocorrelation to look at clustering of infection probabilities for black stain root disease

My questions:

I am using a simulation model to analyze spatial patterns of black stain root disease of Douglas-fir at the individual tree, stand, and landscape scales. For exercise 1, I focused on the spatial pattern of probability of infection, asking:

What is the spatial pattern of probability of infection for black stain root disease in the forest landscape?
How does this spatial pattern differ between landscapes where stands are clustered by management class and landscapes where management classes are randomly distributed?
Fig 1. Left: Raster of the clustered landscape, where stands are spatially grouped by each of the three forest management classes. Each management class has a different tree density, making the different classes clearly visible as three wedges in the landscape. Right: Raster of the landscape where management classes are randomly assigned to stands with no predetermined spatial clustering. The color of each cell represents the value for infection probability of that cell. White cells in both landscapes are non-tree areas with NA values.

Tool or approach that you used: Spatial autocorrelation analysis, Moran’s I, correlogram (R)

My model calculates probability of infection for each tree based on a variety of tree characteristics, including proximity to infected trees, so I expected to see spatial autocorrelation (when a variable is related to itself in space) with the clustering of high and low values of probability of infection. Because some management practices (i.e., high planting density, clear-cut harvest, thinning, shorter rotation length) have been shown to promote the spread of infection, there is reason to hypothesize that more intensive management strategies – and their spatial patterns in the landscape – may affect the spread of black stain at multiple scales.

I am interested in hotspot analysis to later analyze how the spatial pattern of infection hotspots map against different forest management approaches and forest ownerships. However, as a first step, I needed to show that there is some clustering in infection probabilities (spatial autocorrelation) in my data. I used the “Moran” function in the “raster” package (Hijmans 2019) in R to calculate the global Moran’s I statistic. The Moran’s I statistic ranges from -1 (perfect dispersion, e.g., a checkerboard) to +1 (perfect clustering), with a value of 0 indicating perfect randomness.

Example grids for Moran’s I = -1 (dispersed), Moran’s I = 0 (random), and Moran’s I = 1 (clustered).

I calculated this statistic at multiple lag distances, h, to generate a graph of the values of the Moran’s I statistic across various values of h. You can think of the lag distance as the size of the window of neighbors being considered for each cell in a raster grid. The graph produced by plotting the calculated value of Moran’s I across various lag values is called a “correlogram.”
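For reference, the global statistic at a single lag can be written out as follows. The notation is assumed here rather than taken from the raster package documentation: N is the number of cells with values, x_i is the infection probability of cell i, and w_ij(h) is the weight given to neighbor j of cell i at lag h.

```latex
I(h) = \frac{N}{W(h)} \,
       \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}(h)\,(x_i - \bar{x})(x_j - \bar{x})}
            {\sum_{i=1}^{N} (x_i - \bar{x})^2},
\qquad
W(h) = \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}(h)
```

Under the null hypothesis of no spatial autocorrelation, the expected value is -1/(N - 1), which is close to 0 for a large raster. A correlogram is the plot of I(h) against h.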

What did I actually do? A brief description of steps I followed to complete the analysis

1. Imported my raster files, corrected the spatial scale, and re-projected the rasters to fall somewhere over western Oregon.

I am playing with hypothetical landscapes (with the characteristics of real-world landscapes), so the spatial scale (extent, resolution) is relevant but the geographic placement is somewhat arbitrary. I looked at two landscapes: one where management classes are clustered (“clustered” landscape), and one where management classes are randomly distributed (“random”). For each landscape, I used two rasters: probability of infection (continuous values from 0 to 1) and non-tree/tree (binary, 0s and 1s).

2. Masked non-tree cells

Since not all cells in my raster grid contain trees, I set all non-tree cells to NA for my analysis in order to avoid comparing the probability of infection between trees and non-trees. I used the tree rasters to create a mask.
library(raster)                   # provides mask()
# Clustered management landscape
c.tree[c.tree < 1] <- NA          # Set all non-tree cells in the tree raster to NA
c.pi.tree <- mask(c.pi, c.tree)   # Keep prob inf for tree cells, leaving all others NA
# Repeat with randomly distributed management landscape
r.tree[r.tree < 1] <- NA          # Set all non-tree cells in the tree raster to NA
r.pi.tree <- mask(r.pi, r.tree)   # Keep prob inf for tree cells, leaving all others NA

Fig 2. Filled and hollow weights matrices.

3. Calculated Global Moran’s I for multiple values of lag distance.

For each lag distance, I created a weights matrix so the Moran function in the raster package would know how to weight each neighbor pixel at a given distance. Then, I let it run, calculating Moran’s I for each lag to create the data points for a correlogram.

I produced two correlograms: one using a “filled” weights matrix, where all cells within a given distance (lag) were given a weight of 1, and another using a “hollow” weights matrix, where only cells at exactly the given distance were given a weight of 1 (see Fig 2).
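A minimal sketch of how the two kinds of weights matrix could be built is shown below. It uses a square window measured in cell units (Chebyshev distance), which is an assumption on my part; a circular window would be an equally reasonable choice, and `make_weights` is a hypothetical helper name.

```r
# Filled and hollow weights matrices for a given lag (illustrative sketch).
# Uses a square window in cell units (Chebyshev distance); this distance
# rule is an assumption, not necessarily the one used for the figures.
make_weights <- function(lag, hollow = FALSE) {
  offs <- seq(-lag, lag)                       # offsets from the focal cell
  cheb <- outer(abs(offs), abs(offs), pmax)    # distance of each cell from the centre
  if (hollow) (cheb == lag) * 1 else (cheb >= 1) * 1
}

filled <- make_weights(2)                 # all neighbours within 2 cells
hollow <- make_weights(2, hollow = TRUE)  # only the ring exactly 2 cells away
print(filled)
print(hollow)
```

In both cases the focal cell itself gets a weight of 0. A matrix like this could then be supplied as the `w` argument of the Moran function in the raster package, once per lag, to build up the points of a correlogram.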

4. Plotted the global Moran’s I for each landscape and compared.

What did I find? Brief description of results I obtained.

The correlograms show that similar values become less clustered at greater distances, approaching a random distribution by about 50 cell distances. In other words, cells are more similar to the cells around them than they are to more-distant cells. The many peaks and troughs in the correlogram are present because there are gaps between trees because of their regular spacing in plantation management.

In general, the highest values of Moran’s I were similar between the landscape with clustered management classes and the landscape with randomly distributed management classes. However, the rate of decrease in the value of Moran’s I with increasing lag distance was higher for the random landscape than for the clustered landscape. In other words, similar infection probabilities had larger clusters when forest management classes were clustered. For the clustered landscape, there was actually spatial autocorrelation at lag distances of 100 to 150, likely because of the clusters of higher infection probability in the “old growth” management cluster.

Correlogram for the clustered and random landscape showing Moran’s I as a function of lag distance. “Filled” weights matrix.

Correlogram for the clustered and random landscape showing Moran’s I as a function of lag distance. “Hollow” weights matrix.

Critique of the method – what was useful, what was not?

My biggest issue initially was finding a package to perform a hotspot analysis on raster data in R. I found some packages with detailed tutorials (e.g., hotspotr), but some had not been updated recently enough to work in the latest version of R. I could have done this analysis in ArcMap, but I am trying to use open-source software and free applications and improve my programming abilities in R.

The Moran function I eventually used in the raster package worked quickly and effectively, but it does not provide statistics (e.g., p-values) to interpret the significance of the Moran’s I values produced. I also had to make the correlogram by hand with the raster package. Other packages do include additional statistics but are either more complex to use or designed for point data. There are also built-in correlogram functions in packages like spdep or ncf, but they were very slow, potentially taking hours on a 300 x 300 cell raster. That said, it may just be my inexperience that made a clear path difficult to find.
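One common way to attach a p-value to a raster Moran’s I when the software returns only the statistic is a permutation test: shuffle the cell values many times and see how often the shuffled Moran’s I is at least as large as the observed one. The Python sketch below illustrates the idea under stated assumptions (rook neighbors with weight 1, a one-sided test for positive autocorrelation, and a hypothetical 6 x 6 grid). It is not part of the raster package and not the analysis performed here.

```python
import random


def morans_i(grid):
    """Global Moran's I with rook (4-neighbour) weights of 1."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(map(sum, grid)) / n
    denom = sum((v - mean) ** 2 for row in grid for v in row)
    num, w_sum = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    num += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    w_sum += 1
    return (n / w_sum) * (num / denom)


def permutation_p_value(grid, n_perm=199, seed=0):
    """Return (observed I, one-sided pseudo p-value) from random reshuffling."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rows, cols = len(grid), len(grid[0])
    observed = morans_i(grid)
    values = [v for row in grid for v in row]
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        shuffled = [values[r * cols:(r + 1) * cols] for r in range(rows)]
        if morans_i(shuffled) >= observed:
            at_least += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical raster with a strong gradient.
grid = [[r + c for c in range(6)] for r in range(6)]
observed, p_value = permutation_p_value(grid)
print(observed, p_value)
```

The smallest p-value this approach can report is 1 / (n_perm + 1), so the number of permutations sets the resolution of the test. On a 300 x 300 raster, each permutation repeats the full Moran’s I calculation, so run time grows quickly.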

References

Glen, S. 2016. Moran’s I: Definition, Examples. https://www.statisticshowto.datasciencecentral.com/morans-i/.

Hijmans, R. J. 2019. raster: Geographic Data Analysis and Modeling. R package version 2.8-19. https://CRAN.R-project.org/package=raster

Exercise 1: The Spatial Patterns of Natural Resource Governance Perceptions

Question

What are the spatial patterns of natural resource governance perceptions in the Puget Sound?

Tools and Approaches

Moran’s I (with correlograms) and semivariograms in RStudio
Kriging and IDW in ArcGIS Pro
Hotspot Analysis in ArcGIS Pro

Analysis Steps

To compute Moran’s I, I used the “ape” library in R, which has a function called Moran.I(). This function takes the variable in question (governance perceptions) and a distance matrix, and computes the observed and expected values of Moran’s I, as well as the standard deviation and a p-value. For this analysis, I also subset my data to examine spatial autocorrelation by demographics, including area (urban, suburban, rural), political ideology, life satisfaction, income, and cluster (created by running a cluster analysis on the seven variables that comprise the governance index). I created correlograms for the subsets that were significant (urban, conservative, and liberal) using the “ncf” library and the correlog() function. These figures give a better picture of spatial autocorrelation at various distances. To create semivariograms, I used the “gstat” and “lattice” libraries; gstat provides a function called variogram(). This function takes the variable of interest along with latitude and longitude locations. The object created can then be plotted. For this analysis I used the same subsets as in the Moran’s I analysis.
To perform interpolation on my data, I loaded my point data into ArcGIS Pro. I then used the Spatial Analysis toolbox to perform Kriging and IDW and compared the outputs of the two techniques. I used my indexed variable of governance perceptions, whose values range from 1 to 7. I also uploaded a shapefile bounding the sample area, as well as a shapefile of the shoreline, to better delineate my study area.
To run a hotspot analysis, I used my previously loaded point data in ArcGIS Pro. I then used the Spatial Analysis toolbox to perform a “hotspot analysis.” I used my indexed variable of governance perceptions, with values from 1 to 7, and the shapefile of the shoreline to better delineate my study area.
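For readers unfamiliar with the statistic, the core of a Moran’s I calculation on point data can be sketched as follows. This Python sketch is only an illustration and is not a reimplementation of Moran.I(). It assumes inverse-distance weights, planar coordinates, and distinct point locations, and it omits the standard deviation and p-value. The coordinates and scores are hypothetical.

```python
import math


def inverse_distance_weights(coords):
    """Weight matrix with w[i][j] = 1 / distance and zeros on the diagonal.

    Assumes all locations are distinct; coincident points would divide by zero.
    """
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                weights[i][j] = 1.0 / math.dist(coords[i], coords[j])
    return weights


def morans_i(values, weights):
    """Return (observed, expected) global Moran's I for the given weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(map(sum, weights))
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    observed = (n / w_sum) * cross / sum(d * d for d in dev)
    expected = -1.0 / (n - 1)  # expectation under no spatial autocorrelation
    return observed, expected


# Hypothetical respondents: two tight groups with low and high perception scores.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
scores = [1, 1, 1, 7, 7, 7]
print(morans_i(scores, inverse_distance_weights(coords)))
```

An observed value well above the expected value indicates that nearby respondents give similar answers. An observed value below it indicates that neighbors tend to differ.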

Results

The Moran’s I calculation was not significant for the rural, suburban, cluster group, life satisfaction, and income subsets, suggesting no spatial autocorrelation of governance perceptions within these subsets.

The Moran’s I calculation was significant for urban:

Observed value: -0.014

P-value: 0.0002

The Moran’s I calculation was also significant for ideology:

Conservative

Observed value: -0.006

P-value: 0.002

Liberal

Observed value: -0.002

P-value: 0.05

This suggests that governance perceptions within these subsets are spatially autocorrelated. However, the observed values are all slightly negative and very close to zero, so the autocorrelation is weak.

The semivariograms for the subsets that are significantly spatially autocorrelated are presented below.

None of these plots suggests a high degree of spatial autocorrelation. The urban plot does so more than the ideology plots, but the y-axis scale is still very small.
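As a reminder of what a semivariogram plots, the empirical semivariance for a distance bin is half the average squared difference between all pairs of points separated by roughly that distance. The Python sketch below illustrates this with hypothetical data. It is not the gstat implementation, and the bin width, number of bins, and function name are assumptions.

```python
import math


def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Return a list of (bin centre, semivariance, pair count) tuples."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):  # each pair of points is counted once
            b = int(math.dist(coords[i], coords[j]) // bin_width)
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    return [((b + 0.5) * bin_width, sums[b] / (2 * counts[b]), counts[b])
            for b in range(n_bins) if counts[b] > 0]


# Hypothetical transect where the value rises steadily with position.
coords = [(float(x), 0.0) for x in range(10)]
values = [float(x) for x in range(10)]
for centre, gamma, pairs in empirical_semivariogram(coords, values, 1.0, 4):
    print(centre, gamma, pairs)
```

A curve that rises with distance and then levels off indicates spatial structure. A flat curve with a small vertical range, as in the plots above, indicates little of it.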

The correlograms (top: Urban; bottom left: Liberal; bottom right: Conservative) help confirm the findings above. Moran’s I fluctuates around zero without much variation. The large spikes that the graphs do show occur only at non-significant points. Significant points are shown as filled circles, and non-significant points as open circles.

2. Interpolation

The kriging with individual points (bottom left) and the IDW (bottom right) do not look very different in terms of general trends. The kriging with shoreline (top) gives possibly the most interesting visual of spatial patterns. In general, perceptions are better (more green) in the center, where there is more shoreline. There are also two sections that appear much more negative. To examine these locations further, I performed a hotspot analysis.
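Of the two interpolators, IDW is the simpler to write down: the estimate at an unsampled location is a weighted average of the observed values, with weights that shrink as distance increases. The Python sketch below shows the idea with hypothetical points and a power of 2, which is a common choice. It is not the ArcGIS Pro tool and ignores options such as search radius.

```python
import math


def idw(coords, values, target, power=2.0):
    """Inverse distance weighted estimate at `target` from all sample points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(coords, values):
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return float(value)  # target coincides with a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical respondents with perception scores on the 1 to 7 scale.
coords = [(0.0, 0.0), (2.0, 0.0)]
scores = [1, 7]
print(idw(coords, scores, (1.0, 0.0)))  # midway between the two points
print(idw(coords, scores, (0.5, 0.0)))  # closer to the low-scoring point
```

Because every estimate is a weighted average, IDW never predicts values outside the range of the observed data. Kriging instead derives its weights from a fitted semivariogram model.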

3. Hotspot Analysis

This image confirms that the two bright red spots from the interpolation are “cold spots,” or spots where the value of perception is significantly lower than the average perception (neutral) at the 99% confidence level. The orange dots are significant at the 95% confidence level. The green corridor appears to hold in the southern part of the Sound and is confirmed as a “hotspot,” or a spot where the value of perception is significantly higher than in the surrounding areas at the 99% confidence level.

The three main areas of red or orange correspond to the cities of Shelton (bottom), Port Angeles (west), and Everett with a little of south Whidbey Island (east).
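Hot spot maps of this kind are typically based on the Getis-Ord Gi* statistic, which compares the sum of values in a neighborhood around each point with what would be expected from the overall mean. The Python sketch below illustrates the calculation, assuming that statistic was the one used, with binary weights inside a fixed distance band. The band, data, and function name are hypothetical, and this is not the ArcGIS Pro implementation.

```python
import math


def getis_ord_gi_star(coords, values, band):
    """Gi* z-score for each point, using weight 1 for points within `band`.

    Each point counts as its own neighbour. The band must not contain every
    point, and the values must not all be equal, or the denominator is zero.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        w = [1.0 if math.dist(coords[i], coords[j]) <= band else 0.0
             for j in range(n)]
        w_sum = sum(w)
        w_sq_sum = sum(x * x for x in w)
        numerator = sum(wj * v for wj, v in zip(w, values)) - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical transect: low scores at one end, high scores at the other.
coords = [(float(x), 0.0) for x in range(10)]
values = [1, 1, 1, 4, 4, 4, 4, 7, 7, 7]
for z in getis_ord_gi_star(coords, values, band=1.5):
    print(round(z, 2))
```

Z-scores beyond about ±1.96 correspond to the 95% confidence level and beyond about ±2.58 to the 99% level, before any correction for multiple testing. Large negative scores mark cold spots and large positive scores mark hot spots.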

Critique

I believe all of the methods are useful, but some are redundant. It would probably be sufficient to use only one method of each type (spatial autocorrelation and interpolation), but it is interesting and more convincing to see the same type of analysis done in different ways. The p-values from the Moran’s I appear to agree with the shapes of the curves in the semivariograms: subsets with smaller p-values have more defined shapes. The same goes for the interpolation methods: while they are interesting to see side by side, they essentially tell the same story. In this case, I think the hotspot analysis offers the most interesting interpretation of the data because it indicates areas of potential concern.

GEOG 566

Advanced spatial statistics and GIScience