The research question and Background

Soil microbes are critically important for soil to function, because they provide the functional backbone for many soil ecosystem services such as the carbon and nitrogen cycles. The structure and composition of microbial communities can have a large influence on the rates at which these ecosystem services can be performed; therefore, determining the spatial distribution of these microbial communities is necessary to establish accurate functionality of soils across a landscape. However, this can be very difficult due to the fact that microbial communities are extremely diverse. In one gram of soil there can be over a billion microbes with anywhere from 4,000 to 50,000 different species. There are two different potential forces governing the spatial distribution of microbial communities. The first theory, first popularized by Baas and Becking (1934), stated that “Everything is everywhere but, the environment selects”. This idea is in line with the classical ecology niche theory, in which species distribution is dictated by their niche. The opposite of this paradigm is the idea that microbes are more limited by dispersal potential then environmental selection factors. This idea reflects the ideas of neutral theory, which states that species distributions are completely stochastic, and at any given location, the species composition is determined by the species composition of its neighboring areas.
Until recently, these theories have not really been tested in microbial communities. However, with the advent of high throughput sequencing in the 1990s, an in situ view of microbial communities became possible. In a recent study, O’Brien et al. (2016) determined that at a fine scale (1cm2) community composition was extremely heterogeneous and correlated weakly with environmental factors; however, as the scale and scope increased, patterns in microbial community distribution began to emerge and were closely linked to environmental parameters. This study illustrates that when examining microbial communities, the spatial scale of the study will influence your results.
In my study, we want to know which principles govern microbial distributions across a regional landscape. Are microbial communities across a regional landscape influenced by environmental parameters, or is their distribution dictated more by dispersal limitation? There has been some work on a similar sized scale that has determined that bacterial diversity is more closely linked to environmental parameters then dispersal limitation. Specifically studies from all over the world have found the most influential environmental factor to be soil pH (Fierer and Jackson 2006, Griffiths et al. 2011).
The Dataset
During the spring of 2015, 113 soil samples were collected across the state of Oregon to capture the spatial pattern of microbial community structure across a regional landscape. Sampling locations were stratified by Common Resource Areas (CRA), which are geographically associated land resource regions based on geology, climate, water, soils, biological resources, and land use. This stratification was used to help capture all of the spatial heterogeneity of the soil across a regional landscape. A subset of this data (90 out of 113) was previously sampled by the NRCS in soil series survey, where an extensive lab work up was conducted to catalog each of these soils’ physical and chemical properties.
Hypothesis
The physical and chemical properties in soil will have a greater influence on microbial communities than spatial variability. In essence, on a regional landscape scale, the factors dictating microbial community composition will be linked more closely to environmental factors in the soil rather than how geographically distant two samples are.
If this is true then we need to be able to quantify the heterogeneity of a landscape of these environmental parameters. In this blog post I will use the 90 sample points from the NRCS to see how well interpolation methods preform when examining the spatial heterogeneity of edaphic properties, using soil pH as a proxy. To examine the accuracy of these interpolation methods we use the SSURGO database of soil pH as our truth. This will allow me to compare the interpolation techniques and determine their strength and weaknesses.
Approaches
The first method used was spatially extrapolating pH point data over their relative CRA polygon. In theory since these CRA polygons were delineated by similar environmental factors, the edaphic properties such as soil pH should be conserved within each polygon.
Inverse Distance Weighted (IDW) Interpolation method was also attempted. This method essentially determines the value of any unknown point by taking weighted averages of all known points, where the weights are determined by their distances to the unknown point.
The last interpolation method used was ordinary kriging. Just as IDW only considers distances between points for interpolation, so does kriging; however, unlike IDW kriging incorporates the fact that after a certain distance there is no more useful information to explain the variation at a given point, determined through the variogram. Therefore, kriging uses a variogram model to determine the maximum distance from any given point that information should come from. The figure below shows the relationship between the variance explained and distance from the given point. The fitted line is the model used in the ordinary kriging interpolation and how the weights for the average are determined.

Variogram

After interpolation, these maps were then compared to the SSURGO map to test their performance. This comparison was done by finding the absolute difference at each pixel and summing the variance observed. Since the SSURGO map was not completely finished for all of Oregon, each of the interpolated maps were clipped to same spatial extent as the SSURGO map.
Results

The map shown below is the soil pH values extrapolated over its respective CRA. You can see the Willamette Valley is quite homogeneous in its soil pH values while Oregon’s Eastern side is quite a bit more heterogeneous. The problem with this map is it draws sharp lines between CRA polygons where in theory the real pH values would have a continuum as you moved from one CRA to another.

CRA

The IDW map seemed to show the same pattern in the Willamette valley; however, the heterogeneity of eastern side of Oregon seems to be blended together. The IDW interpolation also seemed to develop pockets of like pH values which look more like artifacts of the IDW algorithm than real characteristics of the landscape.

IDW_plot
Like IDW, ordinary kriging also blended the heterogeneity of the eastern side of Oregon together; however, it did so without creating many of these artificial pockets that the IDW method seemed to create.

kriging
The main quantitative difference between the SSURGO map and the interpolated maps can be seen in the western part of Oregon. The SSURGO map shows a much higher level of heterogeneity compared to the interpolated maps as seen in the Willamette Valley. It also shows a large amount of heterogeneity in the southeastern side of Oregon that was only captured in the CRA map.

SSURGO_map

As expected, the IDW map performed the least favorably (a variance score of 2676) while both the kriging and CRA map had very similar results (variance scores of 2440 and 2414, respectively). This is quite surprising since the kriging map seemed to have a homogenizing effect on soil pH in eastern Oregon which was present in both the CRA map and the SSURGO map. Below is a map of the difference in pH between the CRA map and the SSURGO values of pH. You can see the key areas in which the CRA failed were areas of high heterogeneity of the SSURGO map.

DIfference map

Significance
Through this analysis we have determined that across this landscape there is a large amount of heterogeneity in edaphic properties. There may be significantly correlated parameters that have not been captured in the SSURGO database. If so, it may be inappropriate to interpolate these values using just our sampling points.
Finally, if it is determined that geographic distance is the best proxy for determining microbial community distribution at this scale, an interpolated map of the community needs to be presented in a way in which the audience understands where the map has strong areas of success and areas in which we are quite sure the map is wrong.
What I Learned: Software
The majority of the project was conducted in R’s statistical environment, which is due to several reasons. First, I was already familiar with the programming language. Moreover, R is quite powerful in its data manipulation capability and its ability to execute complex statistics in its base functions. Because R is also completely free and availability to anyone, it allows me to conduct my analyses without relying on a paid programs. The main drawback to this particular statistical software is it has a rather steep learning curve which made AcrGIS more preferable in some aspects. I found ArcGIS rather useful in its ability to quickly display results and data with minimal effort. It is able to quickly display a large amount of information very quickly in a user-friendly way. However, when attempting statistical interpolation methods in ArcGIS, I felt rather limited in its customizability and was completely ignorant of the underlying processes it was doing during interpolation.
R on the other hand requires more work upfront to create and display raster and vector files; however, once the data is in, manipulating it and doing analytical analysis on it was quite straightforward. When conducting spatial analysis, R is transparent; each function comes with its own documentation about what exactly is happening during the execution of the function. It also requires the users to have a working understanding of the basic methodology when doing spatial analysis, which is always a good thing.
All in all, both programs have their niches and best uses; however, one should not feel limited to just one. Exploiting the strengths of each program and bypassing their relative weakness by shifting to different platforms should be everyone’s goal when conducting any sort of analysis including spatial analysis.

What I Learned: Spatial Statistics
Before joining this class I had a very limited understanding of spatial statistics. Through the class exercises and a lot of reading I developed a comfortable understanding of the basic idea behind the process used in several different spatial statistic methods. Some of these models include IDW, kriging, geographically weighted regression, PCA ordination, and boosted regression tree. However, I feel the most important lessons I learned how to critically interpret the results of these analysis such as geographically weighted regression in ways that make intuitive sense.
Work Cited
Fierer, Noah, and Robert B. Jackson. “The Diversity and Biogeography of Soil Bacterial Communities.” Proceedings of the National Academy of Sciences of the United States of America 103, no. 3 (January 17, 2006): 626–31. doi:10.1073/pnas.0507535103.

Griffiths, Robert I., Bruce C. Thomson, PHillip James, Thomas Bell, Mark Bailey, and Andrew S. Whiteley. “The Bacterial Biogeography of British Soils.” Environmental Microbiology 13, no. 6 (June 1, 2011): 1642–54. doi:10.1111/j.1462-2920.2011.02480.x.

O’Brien, Sarah L., Sean M. Gibbons, Sarah M. Owens, Jarrad Hampton-Marcell, Eric R. Johnston, Julie D. Jastrow, Jack A. Gilbert, Folker Meyer, and Dionysios A. Antonopoulos. “Spatial Scale Drives Patterns in Soil Bacterial Diversity.” Environmental Microbiology, March 1, 2016, n/a-n/a. doi:10.1111/1462-2920.13231.

The goal for this tutorial is to expose users to interpolation methods using packages in R. I found doing these in R you get very informative insight into how these interpolation methods operate since you have to specify specific options instead of clicking the krigging function in ArcGIS.

Just a quick refresher: 1) First I need to load the necessary libraries. The main libraries I need for interpolation are gstat and raster 2) Import all the data that we need The first data I will be entering is a polygon shape file which I will use for just projection and aesthetic for displaying the maps. There is also a .csv file that I will import which has all the points I need. I will need to convert this to a point vector datatype. The last time, I was able to extrapolation the point layer pH values over each of their respective polygons. If you more detail on how exactly this was accomplished please read my previous blog post labeled: Spatial Data Management and Simple Plotting with R.

x lapply(x, library, character.only=T)

## This is a polygon shapefile
CRA

## OGR data source with driver: ESRI Shapefile
## Source: "Data/NRCS_Data", layer: "Final_CRA"
## with 60 features
## It has 4 fields

## Reading in a .csv file with all the attributes and removing all the missing pH values.
NRCS_data c

NRCS_data

## Creates a matrix of lat long coord
NRCS_mat row.names(NRCS_mat) ## Takes the projection from the CRA shapfile
CRA.projection

## [1] "+proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0"

## Creates a spatail dataframe with the projection as our CRA shapefile
NRCS NRCS@data

## aggergating my atributes by pH finding the mean pH for each CRA
agg_NRCS_data agg_NRCS_data

## merging the 2 datasets together by CRA
CRA@data

## Joining by: "Final_CRA"

## Warning in left_join_impl(x, y, by$x, by$y): joining character vector and
## factor, coercing into character vector

CRA@data

## ploting the extraploated CRA polygon layer. An indepth expliation of this code is in with the IDW code.
tm_shape(CRA)+ tm_fill("pH_water",breaks=c(-Inf, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, Inf), auto.palette.mapping=F, legend.show = F)+tm_borders()+tm_shape(NRCS)+ tm_bubbles("red", size = .1, border.col="black")+tm_layout(title="Extrapolation of pH Across CRAs", title.position = c("center", "top"))

CRA

 

In order to do any form of interpolation you need to create a form of resolution that you want the interpolation to happen. In ARCGIS this is done automatically; however, in R you need to create a raster layer that has your specified resolution that you want. Here I create a blank raster layer that has 100 rows and 100 columns. You can change our resolution to what ever you want by editing ncol and nrow variables.


## creating a raster grid usign the raster package
grd ## "Spreading" grid to ecompass the entire CRA polygon layer
extent(grd)

## giving the grid a projection
crs(grd)

## changing the variable class from a raster to a SpatialPixel. They are both rasters; however, this is the format that the package gstat preferrs.
NRCS.raster crs(NRCS.raster) gridded(NRCS.raster)

## This is jusst a quick plot ot see that the raster grid has the correct projection and matches up correctly.
plot(NRCS.raster)
lines(CRA, col="blue")

grid_plot

This is a plot of the polygon layer over the empty gridded raster layer.

Now we have all of our layers, its time to create do the inverse distance weighting (IDW) and Ordinary Kriging. IDW and krigging are both only use geographic distance from know points to unknown points. IDW is simpler since it is a weighted average between unknown points strictly based on distance. Kriging is a little more sophisticated since a model of the semivariogram is created to calculate the weights.

OK, first we are going to do the IDW, to do this we need both the points and the z-value we want to interpolate and the resolution that we want that to happen on (our blank raster layer).

## In gstat the IDW function is called idw. We want to interpolate the pH_water variable in the NRCS point shape file over the resoltion of the NRCS.raster layer.
idw.out <- gstat::idw(pH_water~1, NRCS, newdata=NRCS.raster)

## [inverse distance weighted interpolation]

## Here we are backing up the IDW raster layer and trimming it to the same width as our CRA polygon layer.
idw.out.BU <- idw.out
idw.out <- idw.out.BU[CRA,]

## Now to plot i, I know this piece of code loos indtimidating; however, that is only becuase you have create every little aspect about your plot layer by layer. First comes the ph IDW raster layer. I want the colors of the raster layer colors to be partitioned by the bins of pH that I speficfy after break. I also want to turn off the legend since it is a little cluttered. Next layer is the polygon layer and I and jsut displaying the outlines of each polygon. The last layer is the Point layer inwhich I am displaying each point as blue. tm_layout is jsut for formating the title posistioning it in the top center.

tm_shape(idw.out)+tm_raster(breaks=c(-Inf, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, Inf), auto.palette.mapping=F, legend.show = F)+tm_shape(CRA)+tm_borders()+tm_shape(NRCS)+ tm_bubbles("blue", size = .1, border.col="black")+tm_layout(title="Interpolation using IDW", title.position = c("center", "top"))

IDW_plot

Now for Kriging. To do krigging in R first you need to examine the semivariagram and then model the semi variance and use that model for your interpolation.

## Creating the semivariaogram and plotting it.
cvnrcs<-variogram(pH_water~1, locations=NRCS, data=NRCS)
plot(cvnrcs)

VAR_1_plot

## Though this variogram does not look great for the purposes of this exerice it will work. However, in reality I would be quite hesitant to do any more of interpolation over this data.
##Looking at the variogram I choose to go withe a Spherical model that has a sill of .06, a nugget of .01 and a range of 100.
nrcs.fit.sph <- fit.variogram(cvnrcs, model = vgm(psill=.06, model="Sph", nugget=.01, range=100))
plot(cvnrcs, nrcs.fit.sph)

Var_2_plot

## The fit isn't completely bad; however, it is not great.
## Now to create the Ordinary Kriged raster layer and then back it up and trim it to the right size as the polygon layer.
pH.ok <- krige(pH_water~1,NRCS, NRCS.raster, model=nrcs.fit.sph)

## [using ordinary kriging]

pH.ok.BU <- pH.ok
pH.ok <- pH.ok.BU[CRA,]

## This is essintially the same plotting code as the IDW plot; however, the raster laster I am using is the one created by the ordinary kriged layer.
tm_shape(pH.ok)+tm_raster(breaks=c(-Inf, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, Inf), title="Soil pH", auto.palette.mapping=F, legend.show = F)+tm_shape(CRA)+tm_borders()+tm_shape(NRCS)+ tm_bubbles("blue", size = .1, border.col="black")+tm_layout(title="Interpolation using Ordinary Kriging", title.position = c("center", "top"))

krig_plot

In examining the plots you can see that the pH values on the western side of Oregon pretty well conserved since there is not much heterogeneity. However, as you move east you see that each map displays something different. This is because there are more parameters dictating pH other than just distance. Also the sampling density of eastern Oregon is not dense enough to capture the heterogeneity of the landscape as you can see how the IDW and ordinary krigging blend a lot of it together.

However, now you know how to use this amazing functionality in R and this is only scratching the surface. Though R may not be as user friendly and its graphic capabilities may not be as polished, its ability to truly customize each step of the analysis makes it well worth the effort in learning how to conduct these analysis.

Though diving into R can be a bit intimidating, only a basic understanding of R is required to beginning analysis your spatial data. First you need to install the right packages, there are a lot of spatial analysis packages in R which can be found here: https://cran.r-project.org/web/views/Spatial.html . However, the ones I’ll be using are, “rgdal”, “rgeos”, “maptools”, “dplyr”, “tidyr”, “tmap”.
You can install these by (without the # sign):
#install.packages(c("rgdal", "rgeos", "maptools", "dplyr", "tidyr", "tmap"))

In this example we’ll be playing with my data which you will not have access too so here is a tutorial that will definitely help you get started if you are really interested in it. https://cran.r-project.org/doc/contrib/intro-spatial-rl.pdf
First we need to make sure the right libraries(packages) are loaded. If you Google each one of these with followed by “in R” it will come up with the help documentation behind the package and what it does. However, I will mention that the package ‘sp’ is quite important, it is the bases behind all other spatial analysis packages here since it defines the spatial data class in R. It allows R to recognize and deal with spatial vector data and raster data.
x <- c("rgdal", "rgeos", "maptools", "dplyr", "tidyr", "tmap")
lapply(x, library, character.only=T)

Now we have the packages lets go ahead and read in my data. The first file I have is just a simple .csv file on different soil properties. read.csv is one way you import data into R that is a .csv.
##reading in a .csv file with all the attributes
##reading in a .csv file with all the attributes
NRCS_data <- read.csv ("Data/NRCS_Data/Agg_NRCS_Data.csv", header=T, stringsAsFactors = F)

Next we are going to use readOGR function in the package rgdal.
##This is a polygon shapefile
CRA <- readOGR(dsn="Data/NRCS_Data", layer="Final_CRA")
## OGR data source with driver: ESRI Shapefile
## Source: "Data/NRCS_Data", layer: "Final_CRA"
## with 60 features
## It has 4 fields
CRA
## class : SpatialPolygonsDataFrame
## features : 60
## extent : -124.566, -116.463, 41.99208, 46.28576 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0
## variables : 4
## names : OBJECTID, Final_CRA, Shape_Leng, Shape_Area
## min values : 1, 1.1, 2.338222, 0.03185210
## max values : 60, 9.9, 53.866012, 3.83380590

It is important to note that it takes 2 parameters first dsn= is just where the folder with the shapefile is saved. layer= is the name of the shapefiles that you want to read in. We are reading in the shapefile “Final_CRA”, which is a polygon shapefile and storing it in R under the name CRA. You can see it is a spatialpolygonDataframe. There are 2 real important aspects to this data format first CRA@polygons and CRA@data. @polygons refers the the information on the polygons and their coordinates. What @data refers to their attribute table.
head(CRA@data, n=3)
## OBJECTID Final_CRA Shape_Leng Shape_Area
## 0 1 1.1 8.832671 0.458048
## 1 2 1.6 17.469240 1.406368
## 2 3 10.1 24.579325 1.115243

you can see there isn’t any useful information associated with each polygon about soil parameters; however, the NRCS_data has interesting information that we want to link to these polygons.
##aggergating my atributes by pH finding the mean pH for each CRA
agg_NRCS_data <- aggregate(ph_1_to_1_H20~CRA, FUN=mean, data=NRCS_data)
agg_NRCS_data <- rename(agg_NRCS_data, Final_CRA = CRA)
head(agg_NRCS_data, n=3)
## Final_CRA ph_1_to_1_H20
## 1 1.1 5.150000
## 2 1.6 5.153333
## 3 10.1 7.153333

This data has pH values and their corresponding CRA.
Using a package called dplyr, I am able to attach the pH data to the CRA polygon data.
##merging the 2 datasets together by CRA
CRA@data <- left_join(CRA@data, agg_NRCS_data)
## Joining by: "Final_CRA"
## Warning in left_join_impl(x, y, by$x, by$y): joining character vector and
## factor, coercing into character vector
CRA@data <- rename(CRA@data, pH_water = ph_1_to_1_H20)
head(CRA@data, n=3)
## OBJECTID Final_CRA Shape_Leng Shape_Area pH_water
## 1 1 1.1 8.832671 0.458048 5.150000
## 2 2 1.6 17.469240 1.406368 5.153333
## 3 3 10.1 24.579325 1.115243 7.153333

Simple, now the polygon dataframe has some interesting metadata associated with each polygon. However, suppose you want to just creat a point layer spatial dataframe instead of merging it to your polygon dataframe, which you can do in the sp package. The function SpatialPointDataframe is quite particular in its required parameters. It requires the coordinates to be in matrix format I am pulling the corresponding columns from the NRCS_data dataframe and converting them into a matrix.
##Creates a matrix of lat long coord
NRCS_mat <- with(NRCS_data, cbind(longitude, latitude))
row.names(NRCS_mat)<- 1:nrow(NRCS_mat)
##Takes the projection from the CRA shapfile
CRA.projection <- CRS(print(proj4string(CRA)))
## [1] "+proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0"
##Creates a spatail dataframe with the projection as our CRA shapefile
NRCS <- SpatialPointsDataFrame(coords= NRCS_mat, data = NRCS_data, proj4string = CRA.projection)

These last 2 lines of code take the projection my CRA polygon data has and uses it form my newly formed point data dataframe.
Since both spatial dataframes are not projected I use a code from epsg (googled it) that corresponds with Oregon to project them I am creating new spatial dataframes with the projection with the function spTransform()
NRCS.projected <- spTransform(NRCS, CRS("+init=epsg:2992"))
CRA.projected <- spTransform(CRA, CRS("+init=epsg:2992"))

Lastly, we want to see what these layers look like so I use the package tmap for plotting which works a lot like ggplot2. qtm is just like qplot in ggplot2. I tell it the layer I want to plot and I want to fill the polygons with the different pH values in each
qtm(shp=CRA, fill="pH_water")

qtm_pH

tm_shape(CRA.projected)+ tm_fill("pH_water")+tm_borders()+tm_shape(NRCS.projected)+ tm_bubbles("red", size = .1, border.col="black")

pH_map_2

The second line of code actually plots 2 layers the first layer is the polygon dataframe and the second one is the point dataframe. You do this by playing with the customization of tmap typing ?tmap will help show just how customization tmap package is.
Hopefully, this tutorial helped you get started with spatial data in R.

My Spatial Problem

  1. Research Question

Soil microbial communities are extremely complex, because in each gram of soil there are about 109 microorganisms.  Due to this complexity, studying these communities at a species level and understanding any meaningful relationships is difficult.  Therefore, there is a lot of current research looking at how the composition of the community is dictated by environmental parameters in the soil and understanding how the community shifts as these parameters change.  However, before I can begin to examine these microbial communities, I need to explore the spatial distribution of these environmental factors.  If I do find a neat relationship between significant environmental parameters and microbial composition, I want to explore different methods of how to interpolate this sampled relationship over larger, unsampled areas.

  1. The Dataset

The dataset is a set of soil parameters from samples throughout the state of Oregon. These parameters only include edaphic factors such as pH and texture metrics; however, I also hope to include climate factors such as mean annual temperature and precipitation when exploring this relationship between environmental factors and microbial community distribution.   Samples were collected across Oregon to try and encompass all the Common Resource Areas (CRA).  According to the NRCS (Natural Resource Conservation Service), a CRA is “a geographic area where resource concerns, problems, or treatment needs are similar.”  The CRA  geographic partitioning map has a scale of 1:250,000 and considers landscape factors, human use, climate and other natural resource information.

  1. Hypothesis

I hypothesize that these environmental parameters are indeed influenced by spatial autocorrelation; however, due to the low sampling point density it will be difficult to confidently model the distribution of these soil parameters.  For this reason, it may be beneficial to see how conserved these environmental factors are within their respective CRA. If the values of a given environmental variable are similar within a CRA, the CRA polygon may be a more robust unit of analysis than individual sampling points.

  1. Approaches

Firstly, I need to examine the spatial relationships across these environmental factors and see how strong the relationship are.  I also would like to examine how well these environmental factors are conserved within/between CRAs.

  1. Outcome

I will produce a map of environmental factors interpolated across Oregon using either point data or relationships between CRAs and their respective points.

  1. Significance

Soil microbes provide several different ecosystem services such as nutrient cycling; however, understanding the fundamental parameters dictating microbial community distribution through a landscape is not well understood.  Providing a map of the distribution of these microbial communities can help increase accuracy for regional nutrient cycling models and help quantify soil health.

  1. Expertise

I have very little experience with GIS and Python; however, I have done a fair bit of work with R in my undergraduate degree and graduate work.  I hope to gain some proficiency in ArcGIS and learn how to incorporate R scripts into ArcGIS.  There are also several R packages built around spatial analysis in which I hope to become familiar with.