Tag Archives: point pattern analysis

Ripley’s K analysis on two forested stands

Bryan Begay

  1. Question asked

Can a point pattern analysis help describe the relationship between the spatial pattern of trees in my area of interest with forest aesthetics? More specifically, how does Ripley’s K function describe forest aesthetics in different parts of the forest on the McDonald-Dunn Forest.

  1. Tool

The main tool that I used for the point pattern analysis was the Ripley’s K function.

  1. Steps taken for analysis

My workflow involved doing preprocessing of the LiDAR data, then creating a canopy height model to obtain an individual tree segmentation. The individual tree segmentation would then allow me to extract tree points with coordinates that could be usable points for the Ripley’s K Function.

 

LiDAR preprocessing

I started off with finding my harvest unit area of interest (Saddleback stand) and finding a nearby riparian stand that would be used to compare the Ripley’s K function outputs.   I create polygons to clip the LiDAR point clouds onto. I found the LiDAR files that were over the AOIs and used the ArcMap Create LAS Dataset (Data Management) to make workable files, then clipped the data sets to the polygons using the Extract LAS (3D analyst) tool. Fusion was used to merge the clipped LiDAR tiles to make one continuous data set for both AOIs. Then I normalized the point cloud with FUSION by using a 2008 DEM raster from DOGAMI, and the FUSION tools ASCII2DTM and Clipdata.

 

CHM, tree segmentation, and Ripley’s K

With the normalized point cloud, a canopy height model (CHM) was created in R-studio, and then an individual tree segmentation was made with an R package called lidR by using a watershed algorithm. The segmented trees were exported as a polygon shapefile that could be used in ArcMap. The Feature to Point tool (Data Management) was used to calculate the centroid of the polygons to identify individual trees as points. The points could then be used in RStudio spatstat package to be used in a Ripley’s K Function. The function was calculated for both Saddleback stand and a nearby riparian area.

  1. Results

The results show that the pattern for both the Saddleback stand and the riparian area were clustered. Both stands observed lines were plotted above the expected line for a random spatial pattern.  The lines were significantly different, being above the higher confidence envelope. The riparian stand has higher levels of clustering compared to Saddleback stand. The Saddleback stand showed a plotted clustering pattern as well, but not to the degree of the riparian stand.

 

  1. Critiques

Some critiques for my analysis would be to use a more robust individual tree segmentation algorithm analysis. For the sake of processing speed and creating delineated polygons with reduced noise, I used a resolution of 1 meter for my CHM. The 1 meter resolution for my CHM smoothed over the tree segmentation, possibly removing potential tree polygons but creating more defined segmented trees. The CHM lower resolution was used with a relatively simple watershed algorithm. Past algorithms I’ve used showed better results than watershed but required more detailed inputs. Another criticism I have is that using the feature to point does not necessarily give me the tree tops, but finds the centers of polygons that the tree segmentation identified as individual trees. Finding a more robust method for determining tree points would be more preferable.

Exercise 1: Preparing for Point Pattern Analysis

Exercise 1

The Question in Context

In order to answer my question: are the dolphin sighting data points clustered along the transect surveys or do they have an equal distribution pattern? I need to use point pattern analysis. I am trying visualize where in space dolphins were sighted along the coast of California, specifically from my San Diego sighting area. In this exercise, the variable of interest is dolphin sightings. These are x,y coordinates (point data) indicating the presence of common bottlenose dolphins along a transect. However, these transect data were not recorded and I needed to recreate these lines to my best abilities. This process is more challenging than anticipated, but will prove useful in the short-term view of this class and project and long-term in management ramifications.

The Tools

As part of this exercise, I used ArcMap 10.6, GoogleEarth, qGIS, and Excel. Although I was only intending on importing my Excel data, saved as a .csv file into ArcMap, that was not working, so other tools were necessary. The final goal of this exercise was to complete point-pattern analyses comparing distance along recreated transects to sightings. From there, the sightings would be broken down by year, season, or environmental factor (El Niño versus La Niña years) to look for distributing patterns, specifically if the points were ever clustered or equally distributed at different points in time.

Steps/Outputs/Review of Methods and Analysis

My first step was to clean up my sightings data enough that it could be exported as a .csv and imported as x-y data into ArcMap. However, ArcMap, no matter the transformation equation, seemed to understand the projected or geographic coordinate systems. After many attempts, where my data ended up along the east coast of Africa or in the Gulf of Mexico, I tried a work around; I imported the .csv file into qGIS with the help of a classmate, and then exported that file as a shape file. Then, I was able to import that shape file into ArcMap and select the correct geographic and projected coordinate systems. The points finally appeared off the coast of California.

I then found a shape file of North America with a more accurate coastline, to add to the base map. This step will be important later when I add in track lines, and how the distributions of points along these track lines are related to bathymetry. The bathymetric lines will need to be rasterized and later interpolated.

The next step was the track line recreation. I chose to focus on the San Diego study site. This site has the most data and the most consistently and standardly collected data. The surveys always left the same port of Mission Bay, San Diego, CA traveled north at 5-10km/hr to a specific beach (landmark), then turned around. It is noted on sighting data whether the track line was surveyed on both directions (South to North and North to South), or unidirectional (South to North). Because some data were collected prior to the invention of a GPS and the commercial availability, I have to recreate these track lines. I started trying to use ArcMap to draw the lines but had difficulty. Luckily, after many attempts, it was suggested that I use Google Earth. Here I found a tool to create a survey line where I can mark the edges along the coastline at an approximate distance from shore, and then export that file. It took a while to realize that the file needed to be exported as a .kml and not a .kmz.

Once exported as a .kml, I was able to convert the .kml file to a layer file and then to a shape file in ArcMap. The next step in this is somehow getting all points within one kilometer of the track line (my spatial scale for this part of the project) to associate with that track line. One idea was snapping the points to the line. However, this did not work. I am still stuck here: the major step before I can have my point data with an association to the line and then begin a point pattern analysis in ArcMap and/or R Studio.

Results

Although I do not currently have results of this exercise, fully. I can say for certain, that it has not been without trying, nor am I stopping. I have been brainstorming and milking resources from classmates and teaching assistants about how to associate the sighting data points with the track line to then do this cluster analysis. Hopefully, based on this can be exported to R studio where I can see distributions along the transect. I may be able to do a density-based analysis which would show if different sections along the transect, which I would need to designate and potentially rasterize first, have different densities of points. I would expect the sections to differ seasonally.

Critiques

Although I add in my opinions on usefulness and ease above, I do believe this will be very helpful in analyzing distribution patterns. Right now, it is largely unknown if there are differences in distribution patterns for this population because they move rapidly and at great distances. But, by investigating data from only the San Diego site, I can determine if there are differences in distributions along the transects temporally and spatially. In addition, the total counts of sightings in each location per unit effort will be useful to see the influx to that entire survey area over time.


Contact information: this post was written by Alexa Kownacki, Wildlife Science Ph.D. Student at Oregon State University. Twitter: @lexaKownacki