The goal for this tutorial is to expose users to interpolation methods using packages in R. I found doing these in R you get very informative insight into how these interpolation methods operate since you have to specify specific options instead of clicking the krigging function in ArcGIS.
Just a quick refresher: 1) First I need to load the necessary libraries. The main libraries I need for interpolation are gstat and raster 2) Import all the data that we need The first data I will be entering is a polygon shape file which I will use for just projection and aesthetic for displaying the maps. There is also a .csv file that I will import which has all the points I need. I will need to convert this to a point vector datatype. The last time, I was able to extrapolation the point layer pH values over each of their respective polygons. If you more detail on how exactly this was accomplished please read my previous blog post labeled: Spatial Data Management and Simple Plotting with R.
x lapply(x, library, character.only=T)
## This is a polygon shapefile
CRA
## OGR data source with driver: ESRI Shapefile
## Source: "Data/NRCS_Data", layer: "Final_CRA"
## with 60 features
## It has 4 fields
## Reading in a .csv file with all the attributes and removing all the missing pH values.
NRCS_data c
NRCS_data
## Creates a matrix of lat long coord
NRCS_mat row.names(NRCS_mat) ## Takes the projection from the CRA shapfile
CRA.projection
## [1] "+proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0"
## Creates a spatail dataframe with the projection as our CRA shapefile
NRCS NRCS@data
## aggergating my atributes by pH finding the mean pH for each CRA
agg_NRCS_data agg_NRCS_data
## merging the 2 datasets together by CRA
CRA@data
## Joining by: "Final_CRA"
## Warning in left_join_impl(x, y, by$x, by$y): joining character vector and
## factor, coercing into character vector
CRA@data
## ploting the extraploated CRA polygon layer. An indepth expliation of this code is in with the IDW code.
tm_shape(CRA)+ tm_fill("pH_water",breaks=c(-Inf, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, Inf), auto.palette.mapping=F, legend.show = F)+tm_borders()+tm_shape(NRCS)+ tm_bubbles("red", size = .1, border.col="black")+tm_layout(title="Extrapolation of pH Across CRAs", title.position = c("center", "top"))
In order to do any form of interpolation you need to create a form of resolution that you want the interpolation to happen. In ARCGIS this is done automatically; however, in R you need to create a raster layer that has your specified resolution that you want. Here I create a blank raster layer that has 100 rows and 100 columns. You can change our resolution to what ever you want by editing ncol
and nrow
variables.
## creating a raster grid usign the raster package
grd ## "Spreading" grid to ecompass the entire CRA polygon layer
extent(grd)
## giving the grid a projection
crs(grd)
## changing the variable class from a raster to a SpatialPixel. They are both rasters; however, this is the format that the package gstat preferrs.
NRCS.raster crs(NRCS.raster) gridded(NRCS.raster)
## This is jusst a quick plot ot see that the raster grid has the correct projection and matches up correctly.
plot(NRCS.raster)
lines(CRA, col="blue")
This is a plot of the polygon layer over the empty gridded raster layer.
Now we have all of our layers, its time to create do the inverse distance weighting (IDW) and Ordinary Kriging. IDW and krigging are both only use geographic distance from know points to unknown points. IDW is simpler since it is a weighted average between unknown points strictly based on distance. Kriging is a little more sophisticated since a model of the semivariogram is created to calculate the weights.
OK, first we are going to do the IDW, to do this we need both the points and the z-value we want to interpolate and the resolution that we want that to happen on (our blank raster layer).
## In gstat the IDW function is called idw. We want to interpolate the pH_water variable in the NRCS point shape file over the resoltion of the NRCS.raster layer.
idw.out <- gstat::idw(pH_water~1, NRCS, newdata=NRCS.raster)
## [inverse distance weighted interpolation]
## Here we are backing up the IDW raster layer and trimming it to the same width as our CRA polygon layer.
idw.out.BU <- idw.out
idw.out <- idw.out.BU[CRA,]
## Now to plot i, I know this piece of code loos indtimidating; however, that is only becuase you have create every little aspect about your plot layer by layer. First comes the ph IDW raster layer. I want the colors of the raster layer colors to be partitioned by the bins of pH that I speficfy after break. I also want to turn off the legend since it is a little cluttered. Next layer is the polygon layer and I and jsut displaying the outlines of each polygon. The last layer is the Point layer inwhich I am displaying each point as blue. tm_layout is jsut for formating the title posistioning it in the top center.
tm_shape(idw.out)+tm_raster(breaks=c(-Inf, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, Inf), auto.palette.mapping=F, legend.show = F)+tm_shape(CRA)+tm_borders()+tm_shape(NRCS)+ tm_bubbles("blue", size = .1, border.col="black")+tm_layout(title="Interpolation using IDW", title.position = c("center", "top"))
Now for Kriging. To do krigging in R first you need to examine the semivariagram and then model the semi variance and use that model for your interpolation.
## Creating the semivariaogram and plotting it.
cvnrcs<-variogram(pH_water~1, locations=NRCS, data=NRCS)
plot(cvnrcs)
## Though this variogram does not look great for the purposes of this exerice it will work. However, in reality I would be quite hesitant to do any more of interpolation over this data.
##Looking at the variogram I choose to go withe a Spherical model that has a sill of .06, a nugget of .01 and a range of 100.
nrcs.fit.sph <- fit.variogram(cvnrcs, model = vgm(psill=.06, model="Sph", nugget=.01, range=100))
plot(cvnrcs, nrcs.fit.sph)
## The fit isn't completely bad; however, it is not great.
## Now to create the Ordinary Kriged raster layer and then back it up and trim it to the right size as the polygon layer.
pH.ok <- krige(pH_water~1,NRCS, NRCS.raster, model=nrcs.fit.sph)
## [using ordinary kriging]
pH.ok.BU <- pH.ok
pH.ok <- pH.ok.BU[CRA,]
## This is essintially the same plotting code as the IDW plot; however, the raster laster I am using is the one created by the ordinary kriged layer.
tm_shape(pH.ok)+tm_raster(breaks=c(-Inf, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, Inf), title="Soil pH", auto.palette.mapping=F, legend.show = F)+tm_shape(CRA)+tm_borders()+tm_shape(NRCS)+ tm_bubbles("blue", size = .1, border.col="black")+tm_layout(title="Interpolation using Ordinary Kriging", title.position = c("center", "top"))
In examining the plots you can see that the pH values on the western side of Oregon pretty well conserved since there is not much heterogeneity. However, as you move east you see that each map displays something different. This is because there are more parameters dictating pH other than just distance. Also the sampling density of eastern Oregon is not dense enough to capture the heterogeneity of the landscape as you can see how the IDW and ordinary krigging blend a lot of it together.
However, now you know how to use this amazing functionality in R and this is only scratching the surface. Though R may not be as user friendly and its graphic capabilities may not be as polished, its ability to truly customize each step of the analysis makes it well worth the effort in learning how to conduct these analysis.