Author Archives: begayb

Using LiDAR data to assess forest aesthetics in McDonald-Dunn Forest

Bryan Begay

GEOG566

05/31/2019

Question asked?

How do forest aesthetics vary depending on forest structure as a result of active management that retains vegetation and landforms?

To answer my question, I looked at two stands in the McDonald-Dunn Forest and analyzed how their forest structure relates to forest aesthetics. The first stand is Saddleback, which was logged in 1999 and shows signs of management. The second stand is near Baker Creek, 1 mile west of Saddleback. Baker Creek was chosen for its riparian characteristics and because it shows no signs of management activities.

A description of the data set.

The LiDAR data I used in my initial analysis was 2008 DOGAMI LiDAR flown over the McDonald-Dunn Forest. The RMSE for this data set was 1.5 cm. The DEM data set I used was from 2009 and has an RMSE of 0.3 m. A Canopy Height Model (CHM) was made with the lidR package in RStudio from a digital surface model with a 1 meter resolution. The CHM was used to create an individual tree segmentation, and the segmented trees were then converted to point data.
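As a rough illustration of the canopy height model step, the sketch below grids height-normalized points by keeping the highest return in each cell. It is plain Python rather than lidR, and the function name `canopy_height_model`, the sample points, and the default 1 m cell size are assumptions for illustration, not the code used in this analysis.

```python
import math

def canopy_height_model(points, cell_size=1.0):
    """Return {(col, row): max height} for (x, y, z) points.

    Assumes z is already height above ground (a normalized point cloud).
    """
    chm = {}
    for x, y, z in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        # Keep only the highest return that falls in each grid cell.
        if z > chm.get(cell, float("-inf")):
            chm[cell] = z
    return chm

# Hypothetical normalized returns: x, y in meters, z = height above ground in meters.
sample_points = [(0.2, 0.3, 12.0), (0.8, 0.1, 30.5), (1.4, 0.6, 28.0), (1.9, 1.2, 3.1)]
chm = canopy_height_model(sample_points)
```

A coarser `cell_size` smooths the surface and tends to merge neighboring crowns, which is the trade-off behind choosing a 1 meter resolution.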

https://gimbalmonkey.github.io/SaddlebackPC/index.html

A link to a visualization of the raw point cloud that is georeferenced to its terrain.

Hypotheses: predictions of patterns and processes you looked for.

I initially suspected that the Baker Creek stand would have higher forest aesthetics, and that this would be reflected in the stand’s unmanaged vegetation structure.

Rationale

Since Saddleback had been managed and cut, I figured the more natural structure of the riparian stand would have generally higher forest aesthetics than a stand that has been altered by anthropogenic factors. One process I hypothesized could relate to forest aesthetics in these stands was the spatial point pattern of the trees. Insert forest aesthetic link:

Approaches: analysis approaches you used.

Exercise 1: Ripley’s K point pattern analysis

The steps taken to create a point pattern analysis were to identify individual trees and convert the trees into point data. The lidR package in RStudio was used to create a Canopy Height Model and then an individual tree segmentation. Rasters and shapefiles were created to export the data so I could then use the tree polygons to identify tree points. The spatstat package was also used in RStudio to perform a Ripley’s K analysis on the point data.

Figure 1. Individual Tree Segmentation using watershed algorithm on Saddleback stand.

Exercise 2: Geographically Weighted Regression

The steps taken to do the geographically weighted regression (GWR) included using the polygons created from the individual tree segmentation to delineate tree centers. Tree points were created from the centroids of the polygons and served as inputs for the GWR in ArcMap. Values from a density raster and the CHM raster were extracted to the point data so that density and tree height could be the variables used in the regression. Tree height was the explanatory variable and density was the dependent variable.

Figure 2. Polygon output from individual tree segmentation using the lidR package. The watershed algorithm was the means of segmentation, and points were created from polygon centroids in ArcMap.

Exercise 3: Supervised Classification

This analysis involved creating a supervised classification using training data from NAIP imagery and a maximum likelihood classification algorithm. It involved using the NIR band to create a false color image that shows the difference in spectral reflectance values between conifers and deciduous trees. I used a histogram stretch to visualize the imagery better and spent time gathering quality training data. I then created a confusion matrix by using accuracy assessment points. Finally, I clipped the thematic map outputs with my individual tree segmentation polygons to show how the pixels in each tree were assigned.

Results

The Ripley’s K analysis in ArcMap showed me that the Saddleback stand’s trees are dispersed and the Baker Creek stand’s trees are spatially clustered. The GWR map output for the Saddleback stand showed that tree height and density were positively related. The adjusted R² was 0.69, and the output showed that the tallest and densest trees were on the edges of the Saddleback stand. The Baker Creek stand’s model performed poorly on the point data, with an adjusted R² of 0.5. Its outputs showed that relationships could only be modeled in the upper left of the stand. The classified image worked well on the Saddleback stand due to less distortion in the NAIP imagery on that stand, while the Baker Creek stand’s classification was not useful because of significant distortion in the NAIP imagery.

Exercise 1:

Figure 3. ArcMap Ripley’s K function output for Saddleback stand assessing tree points.

Exercise 2:

Figure 4. Geographically weighted regression of the Baker Creek and Saddleback stands. Hotter colors indicate positive relationships between tree density and tree height.

Exercise 3:

Figure 5. Supervised image classification using a maximum likelihood algorithm on Saddleback stand.

What did you learn from your results? How are these results important to science? to resource managers?

I learned that Ripley’s K outputs can differ depending on which software package is used. The RStudio Ripley’s K outputs told me that both my stands had clustered tree patterning. The ArcMap outputs, which made more sense, told me that my Saddleback stand was actually dispersed. Outputs can be variable if inputs are not explicitly understood or modeled with enough care. I also learned that modeling a very heterogeneous riparian stand is more difficult because of the variability. This is important for researchers who are interested in riparian areas like Baker Creek, since they might need more variables to adequately model those stands.

Your learning: what did you learn about software?

I became very familiar with processing and modelling LiDAR point clouds. I also became familiar with ModelBuilder and learned how to use R packages like spatstat. I also found a new method for making a confusion matrix in ArcMap.

What did you learn about statistics or other techniques?

I learned how to do point pattern analysis with Ripley’s K on tree points, both in R and in ArcMap. I also used the spatial statistics tools in ArcMap and plan to keep using them. With GWR, I came to understand what it does and how to properly interpret its outputs. I also became more concerned with issues of scale and networks that might affect my areas of interest.

Supervised Image classification on forested stands

Question that I asked?

Could I identify functional tree species with supervised image-classification in my stands?

I asked this question because, if I had to do a geographically weighted regression again, it would be valuable to have deciduous or coniferous tree type in my point data as an added variable.

Name of the tool or approach that you used.

The main tool that I used for image classification was the Maximum Likelihood Classification tool in ArcMap. I also used the Create Accuracy Assessment Points tool to help create a confusion matrix in Excel.

Brief description of steps you followed to complete the analysis.

I downloaded 2016 NAIP imagery for my area of interest and used the high resolution imagery to create a false color image with the bands arranged as NIR, Red, and Green. To help delineate broad-leaf vegetation from coniferous vegetation, I applied a histogram equalize stretch that enhanced my ability to identify conifers in the landscape. From there I created a maximum likelihood classification by drawing training data polygons on the false color imagery, using the Esri digital imagery base map as a reference image.

Once the image classification was complete, I used the Create Accuracy Assessment Points tool on my stand and then extracted the raster values from the thematic map output to those points to create a confusion matrix in Excel. I clipped the thematic map raster to the watershed polygons from my individual tree segmentation to show which pixel classifications were assigned in my tree tops.
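To show the idea behind a maximum likelihood classifier, here is a minimal sketch that fits a Gaussian to each class from training pixels and assigns a new pixel to the class with the highest likelihood. It is a simplification and not the ArcMap tool: it treats the bands as independent (per-band variances instead of a full covariance matrix), and the class names, pixel values, and function names are hypothetical.

```python
import math

def train_classes(training):
    """training: {class_name: [(nir, red, green), ...]} -> per-class band means and variances."""
    stats = {}
    for name, pixels in training.items():
        bands = list(zip(*pixels))  # one tuple of values per band
        means = [sum(b) / len(b) for b in bands]
        # Sample variance per band, floored to avoid division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in b) / (len(b) - 1), 1e-6)
            for b, m in zip(bands, means)
        ]
        stats[name] = (means, variances)
    return stats

def classify(pixel, stats):
    """Return the class whose Gaussian model gives the pixel the highest log-likelihood."""
    def log_likelihood(means, variances):
        return sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(pixel, means, variances)
        )
    return max(stats, key=lambda name: log_likelihood(*stats[name]))

# Hypothetical training pixels (NIR, Red, Green digital numbers), at least two per class.
training = {
    "conifer": [(90, 40, 50), (95, 42, 52), (100, 38, 48)],
    "deciduous": [(180, 45, 70), (175, 50, 72), (185, 48, 68)],
}
stats = train_classes(training)
label_a = classify((98, 41, 50), stats)
label_b = classify((178, 47, 71), stats)
```

In this made-up example the classes separate mainly on the NIR band, which is the reason a false color composite helps when drawing training polygons.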

Brief description of results you obtained.

The thematic map output was 83% accurate, with the conifer and developed land covers performing the worst in the model. Developed land cover is generally difficult to model in a landscape, and the variability in urban spectral reflectance leads to errors in modeling. The conifer land cover performed more poorly because I had trouble collecting accurate training data at the imagery resolution and because the model had trouble delineating conifers from grass and deciduous vegetation. Errors of commission on my part (65% accuracy) and errors of omission (75% accuracy) led to the lower accuracy of the conifer land cover (Table 1). Despite these errors, the thematic map output performed well, and the land cover pixels in my stands showed that conifer trees were accurately assigned in the Saddleback stand (Figure 2). For the Baker Creek stand, the large amount of shadow, sun glare on canopies, and the classification cutoff led to a poor classification of that stand.
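For readers unfamiliar with how these accuracy figures come out of a confusion matrix, the sketch below computes overall accuracy plus the per-class accuracies tied to commission errors (user's accuracy) and omission errors (producer's accuracy). The counts are invented for illustration and are not the values in Table 1; the function name and the row/column layout are assumptions.

```python
def accuracy_report(matrix, classes):
    """matrix[i][j] = points classified as classes[i] whose reference class is classes[j]."""
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    report = {"overall": correct / total, "users": {}, "producers": {}}
    for i, name in enumerate(classes):
        row_total = sum(matrix[i])                       # everything mapped as this class
        col_total = sum(matrix[r][i] for r in range(n))  # everything truly in this class
        report["users"][name] = matrix[i][i] / row_total      # 1 - commission error
        report["producers"][name] = matrix[i][i] / col_total  # 1 - omission error
    return report

# Hypothetical counts, rows = classified, columns = reference.
classes = ["conifer", "deciduous", "grass"]
matrix = [
    [8, 2, 0],
    [1, 9, 0],
    [1, 0, 9],
]
report = accuracy_report(matrix, classes)
```

A class can have a high overall contribution and still show a low user's or producer's accuracy, which is why the per-class numbers are worth reporting alongside the overall figure.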

Figure 1. The land cover thematic map for the entire NAIP image. The cyan blue color indicates the locations of Saddleback and Baker Creek stands.

Figure 2. The land cover classification output for the Saddleback stand.

Figure 3. The land cover classified output for Baker Creek Stand. Note that the NAIP imagery that was classified did not extend to cover the entire stand. The tree crown polygons were laid below the output to show where the land cover cuts off.

Table 1. Confusion matrix for the thematic map output.

Critique of the method – what was useful, what was not?

One critique of this process is that it was time consuming to create training data detailed enough to capture the variation in the scene for my desired accuracy. Sources of error in the thematic map include shadows, resolution, and variable spectral response signatures in the remotely sensed vegetation. Shadows occluded trees that would otherwise stand out, and distorted the classification enough that I had to add a land cover class for shadows to mask them out of the scene. The issue of resolution is that NAIP imagery was not detailed enough for the application I had in mind. Imagery taken from unmanned aerial systems may be a potential avenue for acquiring a higher resolution data set. The confusion matrix highlights this issue in the conifer class, where accuracy was 65% with respect to commission errors and 75% with respect to omission errors (Table 1). It was difficult to determine conifer trees accurately in the training data because of the variability of the spectral reflectance and the blurred crowns at the 1 meter resolution.

Since I only did a classification, I didn’t attempt to assign functional tree species to my tree polygons. The process that comes to mind for doing that is to visually determine which classification color is more pronounced in a tree top, and then place that species in the point data as an attribute. This process would be highly time consuming, and developing a methodology to streamline the assignment of functional tree species to my tree points would be potential future work.

Overall, the thematic map outputs are useful for areas like the Saddleback stand that have fewer shadows and less distortion. The map is less useful for areas with high distortion like my Baker Creek stand.

Exercise 2: Geographically weighted regression on two forested stands.

Bryan Begay

Initial Spatial Question: How does the spatial arrangement of trees relate to forest aesthetics in my areas of interest?

Context:

To understand forest aesthetics in my stand called Saddleback, I did a Ripley’s K analysis on Saddleback and on a riparian stand called Baker Creek to determine whether the stands are clustered or dispersed. The Baker Creek location is a mile west of the Saddleback stand.

Geographically Weighted Regression:

I performed a geographically weighted regression on both the Saddleback and the Baker Creek stands. The dependent variable was a density raster value and the explanatory variable was tree height.
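The core idea of GWR can be sketched as a separate weighted regression at every tree point, with nearby trees weighted more heavily than distant ones. The example below is a minimal illustration with one explanatory variable (height) and a fixed Gaussian kernel. The bandwidth, coordinates, and values are hypothetical, and ArcMap's GWR tool offers its own kernel and bandwidth options that this sketch does not reproduce.

```python
import math

def gwr_local_fits(coords, x, y, bandwidth):
    """Fit y = a + b * x at each location using Gaussian distance weights.

    Returns a list of (intercept, slope), one per location in coords.
    """
    fits = []
    for ci in coords:
        # Weight every observation by its distance to the regression point.
        w = [math.exp(-0.5 * (math.dist(ci, cj) / bandwidth) ** 2) for cj in coords]
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx
        fits.append((ybar - slope * xbar, slope))
    return fits

# Hypothetical tree points: a western group where density rises with height
# and an eastern group where density falls with height.
coords = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
height = [10.0, 20.0, 30.0, 10.0, 20.0, 30.0]   # explanatory variable
density = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]        # dependent variable
fits = gwr_local_fits(coords, height, density, bandwidth=5.0)
```

Mapping the local slope for each point is what produces a coefficient surface, with positive values in one part of a stand and negative values in another.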

Tools and Workflow

Figure 1. The workflow for creating the Geographically Weighted Regression for the Saddleback Stand. The Baker Creek stand followed the same workflow as well.

Results:

Figure 2. Geographically Weighted Regression showing the explanatory variable coefficients in the Saddleback and Baker Creek stands near Corvallis, Oregon. Yellow indicates negative relationships and hotter colors indicate positive relationships between tree height and density.

Figure 3. Geographically Weighted Regression showing the local R² values in the Saddleback and Baker Creek stands near Corvallis, Oregon. Yellow indicates that the local model is performing poorly, while hotter colors indicate better local performance.

Table 1. Summary table output for the Saddleback stand’s geographically weighted regression.

Table 2. Summary table output for the Baker Creek stand’s geographically weighted regression.

Interpretation/Discussion:

Having done the Ripley’s K analysis, I wanted this exercise to connect to it, so I created a point density raster for both my stands (Figure 1). The point density raster calculates a magnitude per unit area from my tree points and outputs a density for the neighborhood around each tree point. The raster values are then a descriptor of each tree’s neighborhood density. The neighborhood density values describe the spatial arrangement of trees in each stand and relate to the Ripley’s K outputs, which tell whether a stand is spatially clustered or dispersed.

Figure 2 shows that there is a spatial pattern in the Saddleback stand between density and height. There is a positive relationship between the two variables on the edges of the stand and a decreasing relationship in the middle of the stand. This makes sense when thinking about how the managed stand would have denser and taller trees on its edges to screen the forest operations. The coefficient values for Baker Creek showed a positive relationship in the northeastern portion of the stand, which would need further investigation to understand the relationship between density and height. Overall the relationship between density and height was negative in the Baker Creek stand, but this may be attributed to the low local R² values that indicate poor modeling (Figure 3). Table 2 also shows that the Baker Creek model only accounted for 50% of the variance based on the adjusted R² value, which would indicate that more variables would be needed for the riparian stand. Table 1 shows the summary table for GWR in the Saddleback stand.
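A neighborhood density of this kind can be sketched as a count of tree points within a fixed radius divided by the area of that circle. The code below is an illustration under assumed settings: the radius, the sample coordinates, and the choice to count the tree itself are hypothetical and are not the parameters used for the density raster in this exercise.

```python
import math

def neighborhood_density(points, radius):
    """Trees per unit area within `radius` of each tree (the tree itself is counted)."""
    area = math.pi * radius ** 2
    densities = []
    for p in points:
        count = sum(1 for q in points if math.dist(p, q) <= radius)
        densities.append(count / area)
    return densities

# Hypothetical tree coordinates in meters: three close together and one isolated.
trees = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (50.0, 50.0)]
densities = neighborhood_density(trees, radius=5.0)
```

Trees in a clump receive higher values than isolated trees, which is how the density attribute carries information about clustering into the regression.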

Critiques

One critique of this exercise is that I only looked at height and density. If I had more knowledge of working with LAS data sets, I would have liked to use the return values in the LiDAR data as an indicator of density. Another critique is that I used density as the dependent variable and height as the explanatory variable. Using density as the dependent variable allows me to see the spatial patterning of my trees when plotted in ArcMap so I can reference the Ripley’s K outputs for further analysis. Having height as the response variable with density as the explanatory variable would have been easier for me to understand and explain. Density can affect tree height in a stand, but understanding tree height as a factor that affects density is not as intuitive. Looking at how tree height responds to density in my stand would tell me something about tree height, but that relationship has already been explored in great depth.

Ripley’s K analysis on two forested stands

Bryan Begay

Question asked

Can a point pattern analysis help describe the relationship between the spatial pattern of trees in my area of interest and forest aesthetics? More specifically, how does Ripley’s K function describe forest aesthetics in different parts of the McDonald-Dunn Forest?

Tool

The main tool that I used for the point pattern analysis was the Ripley’s K function.

Steps taken for analysis

My workflow involved doing preprocessing of the LiDAR data, then creating a canopy height model to obtain an individual tree segmentation. The individual tree segmentation would then allow me to extract tree points with coordinates that could be usable points for the Ripley’s K Function.

LiDAR preprocessing

I started by finding my harvest unit area of interest (Saddleback stand) and a nearby riparian stand to compare the Ripley’s K function outputs. I created polygons to clip the LiDAR point clouds to. I found the LiDAR files that covered the AOIs and used the ArcMap Create LAS Dataset (Data Management) tool to make workable files, then clipped the data sets to the polygons using the Extract LAS (3D Analyst) tool. FUSION was used to merge the clipped LiDAR tiles to make one continuous data set for both AOIs. Then I normalized the point cloud with FUSION by using a 2008 DEM raster from DOGAMI and the FUSION tools ASCII2DTM and ClipData.
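Normalizing a point cloud means replacing each return's elevation with its height above the ground surface. The sketch below shows that subtraction against a gridded DEM in plain Python. It is an illustration only, not the FUSION workflow: the dictionary-based DEM, the cell size, and the sample values are assumptions.

```python
import math

def normalize_heights(points, dem, cell_size=1.0):
    """Subtract ground elevation from each (x, y, z) return.

    dem maps (col, row) grid cells to ground elevation. Returns that fall
    outside the DEM are skipped rather than guessed.
    """
    normalized = []
    for x, y, z in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        ground = dem.get(cell)
        if ground is None:
            continue
        normalized.append((x, y, z - ground))
    return normalized

# Hypothetical DEM cells (elevation in meters) and raw LiDAR returns.
dem = {(0, 0): 100.0, (1, 0): 101.0}
raw_points = [(0.5, 0.5, 125.0), (1.5, 0.5, 101.0), (5.0, 5.0, 130.0)]
normalized_points = normalize_heights(raw_points, dem)
```

After this step a ground return has a height near zero and a canopy return has a height close to tree height, which is what the canopy height model needs.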

CHM, tree segmentation, and Ripley’s K

With the normalized point cloud, a canopy height model (CHM) was created in RStudio, and then an individual tree segmentation was made with the R package lidR using a watershed algorithm. The segmented trees were exported as a polygon shapefile that could be used in ArcMap. The Feature to Point tool (Data Management) was used to calculate the centroid of the polygons to identify individual trees as points. The points could then be used with the spatstat package in RStudio to compute Ripley’s K function. The function was calculated for both the Saddleback stand and a nearby riparian area.
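The polygon-to-point step can be illustrated with the standard area-weighted centroid formula for a simple polygon. This is a sketch of the geometry only, assuming a crown polygon given as a list of vertices; it does not claim to reproduce the exact behavior or options of the Feature to Point tool, and the sample crown is hypothetical.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...] (ring not closed)."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)

# Hypothetical crown polygon: a 4 m by 2 m rectangle.
crown = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
center = polygon_centroid(crown)
```

The centroid marks the middle of the segmented crown, which is not necessarily the highest point of the tree.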

Results

The results show that the patterns for both the Saddleback stand and the riparian area were clustered. Both stands’ observed lines plotted above the expected line for a random spatial pattern. The observed lines were significantly different from random, lying above the upper confidence envelope. The riparian stand has higher levels of clustering than the Saddleback stand, which showed a clustered pattern as well, but not to the same degree.
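To make the comparison between observed and expected lines concrete, here is a naive Ripley's K estimator in plain Python. Under complete spatial randomness the expected value is pi times r squared, so an observed K above that suggests clustering and a value below it suggests dispersion. This sketch applies no edge correction and builds no confidence envelope, so it is not equivalent to the spatstat or ArcMap outputs; the point sets and window area are hypothetical.

```python
import math

def ripley_k(points, radius, area):
    """Naive Ripley's K at one radius, with no edge correction."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                pairs += 1
    return area * pairs / (n * (n - 1))

def csr_expected_k(radius):
    """Expected K under complete spatial randomness."""
    return math.pi * radius ** 2

# Hypothetical 100 m by 100 m window.
window_area = 100.0 * 100.0
# Two tight clumps of four trees each.
clustered = [(10, 10), (11, 10), (10, 11), (11, 11),
             (80, 80), (81, 80), (80, 81), (81, 81)]
# A regular 4 by 4 grid with 25 m spacing.
regular = [(12.5 + 25 * i, 12.5 + 25 * j) for i in range(4) for j in range(4)]

k_clustered = ripley_k(clustered, 2.0, window_area)
k_regular = ripley_k(regular, 2.0, window_area)
k_random = csr_expected_k(2.0)
```

Different edge corrections and study-area definitions change the estimate, which may be one reason two software packages can give different answers for the same points.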

Critiques

One critique of my analysis is that I could have used a more robust individual tree segmentation algorithm. For the sake of processing speed and creating delineated polygons with reduced noise, I used a resolution of 1 meter for my CHM. The 1 meter resolution smoothed over the tree segmentation, possibly removing potential tree polygons but creating more defined segmented trees. The lower resolution CHM was used with a relatively simple watershed algorithm. Past algorithms I’ve used showed better results than watershed but required more detailed inputs. Another criticism is that the Feature to Point tool does not necessarily give me the tree tops; it finds the centers of the polygons that the tree segmentation identified as individual trees. Finding a more robust method for determining tree points would be preferable.

A UAS and LiDAR based approach to maximizing forest aesthetics in a timber harvest

Bryan Begay

Research Question:

Can LiDAR derived from an Unmanned Aerial System (UAS) create a point cloud driven visualization model for maximizing forest aesthetics in a highly visible timber harvest?

Context

A variable retention thinning is planned for a harvest unit on the McDonald-Dunn Forest in a visible area near Corvallis. Unmanned aerial systems offer an efficient way to collect LiDAR data over large areas and create high quality data sets that capture the structure of a forest stand. There is a need for a model or methodology that uses UAS LiDAR point clouds to generate a visualization model for designing a timber harvest that maximizes forest aesthetics in an area with high visibility. Inputs for the model include DTMs, the Google Earth Pro view shed tool, and point clouds. The point clouds can be manipulated to visualize an optimal silvicultural prescription that maximizes forest landscape aesthetics. Ancillary view shed data and terrain from DTMs are expected to help create the visualization model.

A description of the data set you will be analyzing, including the spatial and temporal resolution and extent:

The data set I will be using includes high resolution LiDAR point clouds of a stand, Digital Terrain Models (DTMs) derived from LiDAR point clouds previously flown by the USFS, and ancillary data from Google Earth Pro. The Google Earth Pro view shed tool will be used to assess the visual impact of regions in the harvesting unit. The spatial extent is an area of a few square kilometers covered by high resolution LiDAR point clouds. The temporal extent spans data acquisition before the harvest and an assessment of the computer based prescription after harvest. The point cloud from the UAV will be collected in a discrete time frame of one day. The dates of the DTM and Google Earth Pro data sets will vary, but I anticipate using newer high resolution Google Earth imagery and high resolution LiDAR data sets.

Hypotheses:

I hypothesize that LiDAR point clouds can be used in a visualization model to create a silvicultural prescription for a timber harvest that maximizes forest aesthetics in a logged area. The Google Earth Pro view shed tool, high quality LiDAR point clouds, and a large body of literature on forest aesthetics provide a rich set of inputs for creating a visualization model for timber harvests that maximizes forest aesthetics.

Approaches:

I would like to do some sort of analysis looking at the spatial relationship between forest aesthetics and timber harvests. A part of this analysis would look at the relationship of the spatial pattern of residual structure left from the thinning and the landscape aesthetics.

Expected outcome:

My expected outcome is a visualization model of the harvest unit that uses view shed and point clouds to maximize forest aesthetics in a high viewership area.

Significance:

This spatial problem is important to the profession of forestry as well as other land managers, since it helps maintain the social license for foresters to practice forestry in areas that are highly visible. Public acceptance of harvesting practices is increased when forest aesthetics is taken into account, so creating a methodology and model to assist in creating silvicultural prescriptions that increase forest aesthetics is critical for public acceptance of forestry.

Level of preparation:

A. I have experience in ArcGIS.

B. No experience in ModelBuilder or Python programming in GIS.

C. Some experience in R.

D. Experience in Digital Image Processing.

E. I’ve used Google Earth Engine and have very little experience with MATLAB.

GEOG 566

Advanced spatial statistics and GIScience
