Though diving into R can be a bit intimidating, only a basic understanding of R is required to beginning analysis your spatial data. First you need to install the right packages, there are a lot of spatial analysis packages in R which can be found here: https://cran.r-project.org/web/views/Spatial.html . However, the ones I’ll be using are, “rgdal”, “rgeos”, “maptools”, “dplyr”, “tidyr”, “tmap”.
You can install these by (without the # sign):
#install.packages(c("rgdal", "rgeos", "maptools", "dplyr", "tidyr", "tmap"))
In this example we’ll be playing with my data which you will not have access too so here is a tutorial that will definitely help you get started if you are really interested in it. https://cran.r-project.org/doc/contrib/intro-spatial-rl.pdf
First we need to make sure the right libraries(packages) are loaded. If you Google each one of these with followed by “in R” it will come up with the help documentation behind the package and what it does. However, I will mention that the package ‘sp’ is quite important, it is the bases behind all other spatial analysis packages here since it defines the spatial data class in R. It allows R to recognize and deal with spatial vector data and raster data.
x <- c("rgdal", "rgeos", "maptools", "dplyr", "tidyr", "tmap")
lapply(x, library, character.only=T)
Now we have the packages lets go ahead and read in my data. The first file I have is just a simple .csv file on different soil properties. read.csv is one way you import data into R that is a .csv.
##reading in a .csv file with all the attributes
##reading in a .csv file with all the attributes
NRCS_data <- read.csv ("Data/NRCS_Data/Agg_NRCS_Data.csv", header=T, stringsAsFactors = F)
Next we are going to use readOGR function in the package rgdal.
##This is a polygon shapefile
CRA <- readOGR(dsn="Data/NRCS_Data", layer="Final_CRA")
## OGR data source with driver: ESRI Shapefile
## Source: "Data/NRCS_Data", layer: "Final_CRA"
## with 60 features
## It has 4 fields
CRA
## class : SpatialPolygonsDataFrame
## features : 60
## extent : -124.566, -116.463, 41.99208, 46.28576 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0
## variables : 4
## names : OBJECTID, Final_CRA, Shape_Leng, Shape_Area
## min values : 1, 1.1, 2.338222, 0.03185210
## max values : 60, 9.9, 53.866012, 3.83380590
It is important to note that it takes 2 parameters first dsn=
is just where the folder with the shapefile is saved. layer=
is the name of the shapefiles that you want to read in. We are reading in the shapefile “Final_CRA”, which is a polygon shapefile and storing it in R under the name CRA. You can see it is a spatialpolygonDataframe
. There are 2 real important aspects to this data format first CRA@polygons
and CRA@data
. @polygons refers the the information on the polygons and their coordinates. What @data refers to their attribute table.
head(CRA@data, n=3)
## OBJECTID Final_CRA Shape_Leng Shape_Area
## 0 1 1.1 8.832671 0.458048
## 1 2 1.6 17.469240 1.406368
## 2 3 10.1 24.579325 1.115243
you can see there isn’t any useful information associated with each polygon about soil parameters; however, the NRCS_data has interesting information that we want to link to these polygons.
##aggergating my atributes by pH finding the mean pH for each CRA
agg_NRCS_data <- aggregate(ph_1_to_1_H20~CRA, FUN=mean, data=NRCS_data)
agg_NRCS_data <- rename(agg_NRCS_data, Final_CRA = CRA)
head(agg_NRCS_data, n=3)
## Final_CRA ph_1_to_1_H20
## 1 1.1 5.150000
## 2 1.6 5.153333
## 3 10.1 7.153333
This data has pH values and their corresponding CRA.
Using a package called dplyr
, I am able to attach the pH data to the CRA polygon data.
##merging the 2 datasets together by CRA
CRA@data <- left_join(CRA@data, agg_NRCS_data)
## Joining by: "Final_CRA"
## Warning in left_join_impl(x, y, by$x, by$y): joining character vector and
## factor, coercing into character vector
CRA@data <- rename(CRA@data, pH_water = ph_1_to_1_H20)
head(CRA@data, n=3)
## OBJECTID Final_CRA Shape_Leng Shape_Area pH_water
## 1 1 1.1 8.832671 0.458048 5.150000
## 2 2 1.6 17.469240 1.406368 5.153333
## 3 3 10.1 24.579325 1.115243 7.153333
Simple, now the polygon dataframe has some interesting metadata associated with each polygon. However, suppose you want to just creat a point layer spatial dataframe instead of merging it to your polygon dataframe, which you can do in the sp
package. The function SpatialPointDataframe
is quite particular in its required parameters. It requires the coordinates to be in matrix format I am pulling the corresponding columns from the NRCS_data dataframe and converting them into a matrix.
##Creates a matrix of lat long coord
NRCS_mat <- with(NRCS_data, cbind(longitude, latitude))
row.names(NRCS_mat)<- 1:nrow(NRCS_mat)
##Takes the projection from the CRA shapfile
CRA.projection <- CRS(print(proj4string(CRA)))
## [1] "+proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0"
##Creates a spatail dataframe with the projection as our CRA shapefile
NRCS <- SpatialPointsDataFrame(coords= NRCS_mat, data = NRCS_data, proj4string = CRA.projection)
These last 2 lines of code take the projection my CRA polygon data has and uses it form my newly formed point data dataframe.
Since both spatial dataframes are not projected I use a code from epsg (googled it) that corresponds with Oregon to project them I am creating new spatial dataframes with the projection with the function spTransform()
NRCS.projected <- spTransform(NRCS, CRS("+init=epsg:2992"))
CRA.projected <- spTransform(CRA, CRS("+init=epsg:2992"))
Lastly, we want to see what these layers look like so I use the package tmap
for plotting which works a lot like ggplot2
. qtm
is just like qplot
in ggplot2
. I tell it the layer I want to plot and I want to fill the polygons with the different pH values in each
qtm(shp=CRA, fill="pH_water")
tm_shape(CRA.projected)+ tm_fill("pH_water")+tm_borders()+tm_shape(NRCS.projected)+ tm_bubbles("red", size = .1, border.col="black")
The second line of code actually plots 2 layers the first layer is the polygon dataframe and the second one is the point dataframe. You do this by playing with the customization of tmap
typing ?tmap
will help show just how customization tmap
package is.
Hopefully, this tutorial helped you get started with spatial data in R.