Tag Archives: forest pathology

Final project: Washing out the (black) stain and ringing out the details

BACKGROUND

In order to explain my project, especially my hypotheses, some background information about this disease is necessary. Black stain root disease of Douglas-fir is caused by the fungus Leptographium wageneri. It infects the roots of its Douglas-fir host, growing in the xylem and cutting the tree off from water. It spreads between adjacent trees via growth through root contacts and grafts and long-distance via insects (vectors) that feed and breed in roots and stumps and carry fungal spores to new hosts.

Forest management practices influence the spread of disease because of the influence on (i) the distance between trees (determined by natural or planted tree densities); (ii) adjacency of susceptible species (as in single-species Douglas-fir plantations); (iii) road, thinning and harvest disturbance, which create suitable habitat for insect vectors (stumps, dead trees) and stress remaining live trees, attracting insect vectors; and (iv) forest age distributions, because rotation lengths determine the age structure in managed forest landscapes and younger trees (<30-40 years old) are thought to be more susceptible to infection and mortality by the disease.

RESEARCH QUESTION

How do (B) spatial patterns of forest management practices relate to (A) spatial patterns of black stain root disease (BSRD) infection probabilities at the stand and landscape scale via (C) the spatial configuration and connectivity of susceptible stands to infection?

In order to address my research questions, I built a spatial model to simulate BSRD spread in forest landscapes using the agent-based modeling software NetLogo (Wilensky 1999). I used Exercises 1-3 to focus on the spatial patterns of forest management classes. Landscapes were equivalent in terms of the proportion of each management class and number of stands, varying only in spatial pattern of management classes. In the exercises, I evaluated the relationship between management and disease by simulating disease spread in landscapes with two distinct spatial patterns of management:

  • Clustered landscape: The landscape was evenly divided into three blocks, one for each management class. Each block was evenly divided into stands.
  • Random landscape: The landscape was evenly divided into stands, and forest management classes were randomly assigned to each stand.

MY DATA

I analyzed outputs of my spatial model. The raster files contain the states of cells in forest landscapes at a given time step during one model run. States tracked include management class, stand ID number, presence/absence of trees, tree age, probability of infection, and infection status (infected/not infected). Management class and stand ID did not change during the model run. I analyzed tree states from the last step of the model run and did not analyze change over time.

Extent: ~20 hectares (much smaller than my full models runs will be)

Spatial resolution: ~1.524 x 1.524 m cells (maximum 1 tree per cell)

Three contrasting and realistic forest management classes for the Pacific Northwest were present in the landscapes analyzed:

  • Intensive – Active management: 15-foot spacing, no thinning, harvest at 37 years.
  • Extensive – Active management: 10-foot spacing, one pre-commercial and two commercial thinnings, harvest at 80 years.
  • Set-aside/old-growth (OG) – No active management: Forest with Douglas-fir in Pacific Northwest old-growth densities and age distributions and uneven spacing with no thinning or harvest.

HYPOTHESES: PREDICTIONS OF PATTERNS AND PROCESSES I LOOKED FOR

Because forest management practices influence the spread of disease as described in the “Background” section above, I hypothesized that the spatial patterns of forest management practices would influence the spatial pattern of disease. Having changed my experimental design and learned about spatial statistics and analysis methods throughout the course, I hypothesize that…

  • The “clustered” landscape will have (i) higher absolute values of infection probabilities, (ii) higher spatial autocorrelation in infection probabilities, and (iii) larger infection centers (“hotspots” of infection probabilities) than the “random” landscape because clustering of similarly managed forest stands creates continuous, connected areas of forest managed in a manner that creates suitable vector and pathogen habitat and facilitates the spread of disease (higher planting densities, lower age, frequent thinning and harvest disturbance in the intensive and extensive management). I therefore predict that:
    • Intensive and extensive stands will have the highest infection probabilities with large infection centers (“hotspots”) that extend beyond stand boundaries.
      • Spatial autocorrelation will therefore be higher and exhibit a lower rate of decrease with increasing distance because there will be larger clusters of high and low infection probabilities when stands with similar management are clustered.
    • Set-aside (old-growth, OG) stands will have the lowest infection probabilities, with small infection centers that may or may not extend beyond stand boundaries.
      • Where old-growth stands are in contact with intensive or extensive stands, neighborhood effects (and edge effects) will increase infection probabilities in those OG stands.
    • In contrast, the “random” landscape will have (i) lower absolute values of infection probabilities, (ii) less spatial autocorrelation in infection probabilities, and (iii) smaller infection centers than the “clustered” landscape. This is because the random landscape will have less continuity and connectivity between similarly managed forest stands; stands with management that facilitates disease spread will be less connected and stands with management that does not facilitate the spread of disease will also be less connected. I would predict that:
      • Intensive and extensive stands will still have the highest infection probabilities, but that the spread of infection will be limited at the boundaries with low-susceptibility old-growth stands.
        • Because of the boundaries created by the spatial arrangement of low-susceptibility old-growth stands, clusters of similar infection probabilities will be smaller and values of spatial autocorrelation will be lower and decrease more rapidly with increasing lag distance. At the same time, old-growth stands may have higher infection probabilities in the random landscape than in the clustered landscape because they would be more likely to be in contact with high-susceptibility intensive and extensive stands.
      • I also hypothesize that each stand’s neighborhood and spatial position relative to stands of similar or different management will influence that stand’s infection probabilities because of the difference in spread rates between management classes and the level of connectivity to high- and low-susceptibility stands based on the spatial distribution of management classes.
        • Stands with a large proportion of high-susceptibility neighboring stands (e.g., extensive or intensive management) will have higher infection probabilities than similarly managed stands with a small proportion of high-susceptibility neighboring stands.
        • High infection probabilities will be concentrated in intensive and extensive stands that have high levels of connectivity within their management class networks because high connectivity will allow for the rapid spread of the disease to those stands. In other words, the more connected you are to high-susceptibility stands, the higher your probability of infection.

APPROACHES: ANALYSIS APPROACHES I USED

Ex. 1: Correlogram, Global Moran’s I statistic

In order to test whether similar infection probability values were spatially clustered, I used the raster package in R (Hijmans 2019) to calculate the global Moran’s I statistic at multiple lag distances for both of the landscape patterns. I then plotted global Moran’s I vs. distance to create a correlogram and compared my results between landscapes.

Ex. 2: Hotspot analyses (ArcMap), Neighborhood analyses (ArcMap)

First, I performed a non-spatial analysis comparing infection probabilities between (i) landscape patterns (ii) management classes, and (iii) management classes in each of the landscapes. Then, I used the Hotspot Analysis (Getis-Ord Gi*) tool in ArcMap to identify statistically significant hot- and cold-spots of high and low infection probabilities, respectively. I selected points within hot and cold spots and used the Multiple Ring Buffer tool in ArcMap to create distance rings, which I intersected with the management classes to perform a neighborhood analysis. This neighborhood analysis revealed how the proportion of each management class changed with increasing distance from hotspots in order to test whether the management “neighborhood” of trees influenced their probability of infection.

Ex. 3: Network and landscape connectivity analyses (Conefor)

I divided my landscape into three separate stand networks based on their management class. Then, I used the free landscape connectivity software Conefor (Saura and Torné 2009) to analyze the connectivity of each stand based on its position within and role in connecting the network using the Integrative Index of Connectivity (Saura and Rubio 2010). I then assessed the relationship between the connectivity of each stand and infection probabilities of trees within that stand using various summary statistics (e.g., mean, median) to test whether connectivity was related to infection probability.

RESULTS: WHAT DID I PRODUCE?

As my model had not been parameterized by the beginning of this term, I analyzed “dummy” data, where infection spread probabilities were calculated as a decreasing linear function of distance from infected trees. However, the results I produced still provided insights as to the general functioning of the model and factors that will likely influence my results in the full, parameterized model.

I produced both maps and numerical/statistical relationships that describe the patterns of “A” (infection probabilities), the relationship between “A” and “B” (forest management classes), and how/whether “A” and “B” are related via “C” (landscape connectivity and stand networks).

In Exercise 1, I found evidence to support my hypothesis of spatial autocorrelation at small scales in both landscapes and higher autocorrelation and slower decay with distance in the clustered landscape than the random landscape. This was expected because the design of the model calculated probability of infection for each tree as a function of distance from infected trees.

In Exercises 2 and 3, I found little to no evidence to support the hypothesis that either connectivity or neighboring stand management had significant influence on infection probabilities. Because the model that produced the “dummy” data limited infection to ~35 meters from infected trees and harvest and thinning attraction had not been integrated into infection calculations, this result was not surprising. In my full model where spread via insect vectors could span >1,000 m, I expect to see a larger influence of connectivity and neighborhood on infection probabilities.

A critical component of model testing is exploring the “parameter space”, including a range of possible values for each parameter. This is especially for agent-based models where there are complex interactions between many individuals that result in larger-scale patterns that may be emergent and not fully predictable by the simple sum of the parts. In my model, the disease parameters of interest are the factors influencing probability of infection (Fig. 1). In order to understand how the model reacts to changes in those parameters, I will perform a sensitivity analysis, systematically adjusting parameter values one-by-one and comparing the results of each series of model runs under each set of parameter values.

Fig.1. Two of the model parameters that will be systematically adjusted during sensitivity analysis. Tree susceptibility to infection as a function of age (left) and probability of root contact as a function of distance (right) will both likely influence model behavior and the relative levels of infection probability between the three management classes.

This is especially relevant given that in Exercises 1 through 3, I found that the extensively managed plantations had the highest values of infection probability and most of the infection hotspots, likely due to the fact that this management class has the highest [initial] density of trees. For the complete model, I am hypothesizing that the intensive plantations will have the highest infection probabilities because of high frequency of insect-attracting harvest and short rotations that maintain the trees in an age class highly susceptible to infection. In the full model, the extensive plantations will have higher initial density than the intensive plantations but will undergo multiple thinnings, decreasing tree density but attracting vectors, and will be harvested at age 80, thus allowing trees to grow into a less susceptible age class. In this final model, thinning, harvest length, and vector attraction will factor in to the calculation of infection probabilities. My analysis made it clear that even a 1.5 meter difference in spacing resulted in a statistically significant difference for disease transmission, with much higher disease spread in the denser forest. Because the model is highly sensitive to tree spacing, likely because the parameters of my model that relate to distance drop off in sigmoidal or exponential decay patterns, I would hypothesize that changes in the values of parameters that influence the spatial spread of disease (i.e., insect dispersal distance, probability of root contact with distance) and the magnitude of vector attraction after harvest and thinning will determine whether the “extensive” or “intensive” forest management class will ultimately the highest levels of infection probabilities. In addition, the rate of decay of root contact and insect dispersal probabilities will determine whether management and infection within stands influence infection in neighboring stands and the distance and strength of those neighborhood effects. I would like to test this my performing such analyses on the outputs from my sensitivity analyses.

SIGNIFICANCE: WHAT DID I LEARN FROM MY RESULTS? HOW ARE THESE RESULTS IMPORTANT TO SCIENCE? TO RESOURCE MANAGERS?

Ultimately, the significance of this research is to understand the potential threat of black stain root disease in the Pacific Northwest and inform management practices by identifying evidence-based, landscape-scale management strategies that could mitigate BSRD disease issues. While the results of Exercises 1-3 were interesting, they were produced using a model that had not been fully parameterized and thus are not representative of the likely actual model outcomes. Therefore, I was not able to test my hypotheses. That said, this course allowed me to design and develop an analysis to test my hypotheses. The exercises I completed have also provided a deeper understanding of how my model works. Through this process, I have begun to generate additional testable hypotheses regarding model sensitivity to parameters and the relative spread rates of infection in each of the forest management classes. Another key takeaway is the importance of producing many runs with the same landscape configuration and parameter settings to account for stochastic processes in the model. By only analyzing one run for each scenario, there is a chance that the results are not representative of the average behavior of the system or the full range of behaviors possible for those scenarios. For example, with the random landscape configuration, one generated landscape can be highly connected and the next highly fragmented with respect to intensive plantations, and only a series of runs under the same conditions would provide reliable results for interpretation.

WHAT I LEARNED ABOUT… SOFTWARE

(a, b) Arc-Info, Modelbuilder and/or GIS programming in Python

This was my first opportunity to perform statistical analysis in ArcGIS, and I used multiple new tools, including hotspot analysis, multiple ring buffers, and using extensions. Though I did not use Python or Modelbuilder, I realized that doing so will be critical for automating my analyses given the large number of model runs I will be analyzing. While I learned how to program in Python using arcpy in GEOG 562, I used this course to choose the appropriate tools and analyses for my questions and hypotheses rather than automating procedures I may not use again. I would now like to implement my procedures for neighborhood analysis in Python in order to automate and increase the efficiency of my workflow.

(c) Spatial analysis in R

During this course, I learned most about spatial data manipulation in R, since I had limited experience using R with spatial data beforehand. I used R for spatial statistics, data cleaning and management, and conversion between vector and raster data. I also learned about the limitations of R (and my personal limitations) in terms of the challenge of learning how to use packages and their functions when documentation is variable in quality and a wide variety of user-generated packages are available with little reference as to their quality and reliability. For example, for Exercise 2, I had trouble finding an up-to-date and high-quality package for hotspot analysis in R, with raster data or otherwise. However, this method was straightforward in ArcMap once the data were converted from raster to points. For Exercise 1, the only Moran’s I calculation that I was able to perform with my raster data was the “moran” function in the raster package, which does not provide z- or p-values to evaluate the statistical significance of the calculated Moran’s I and requires you to generate your own weights matrices, which is a pain. Using the spdep or ncf packages for this analysis was incredibly slow (though I am not sure why), and the learning curve for spatstat was too steep for me to overcome by the Exercise 1 deadline (but I hope to return to this package in the future).

Reading, manipulating, and converting data: With some trial and error and research into the packages available for working with spatial data in R (especially raster, sp/spdep, and sf), I learned how to quickly and easily convert data between raster and shapefile formats, which was very useful in automating the cleaning and preparation for multiple datasets and creating the inputs for the analyses I want to perform.

(d) Landscape connectivity analyses: I learned that there are a wide variety of metrics available through Fragstats (and landscapemetrics and landscapetools packages in R), however, I was not able to perform my desired stand-scale analysis of connectivity because I could not determine whether it is possible to analyze contiguous stands with the same management class as separate patches (Fragstats considered all contiguous cells in the raster with the same class to be part of the same patch). Instead, I used Conefor, which has an ArcMap extension that allows you to generate a node and connection file from a polygon shapefile, to calculate relatively few but robust and ecologically meaningful connectivity metrics for the stands in my landscape.

WHAT I LEARNED ABOUT… SPATIAL STATISTICS

Correlograms and Moran’s I: For this statistical method, I learned the importance of choosing meaningful lag distances based on the data being analyzed and the process being examined. For example, my correlogram consists of a lot of “noise” with many peaks and troughs due to the empty cells between trees, but I also captured data at the relevant distances. Failure to choose appropriate lag distances means that some autocorrelation could be missed, but analyses of large raster images at a high resolution of lag distances results in very slow processing. In addition, I wanted to compare local vs. global Moran’s I to determine whether infections were sequestered to certain portions of the landscape or spread throughout the entire landscape, but the function for local Moran’s I returned values far outside the -1 to 1 range of the global Moran’s I. As a result, I did not understand how to interpret or compare these values. In addition, global Moran’s I did not tell me where spatial autocorrelation was happening, but the fact that there was spatial autocorrelation led me to perform a…

Hotspot analysis (Getis-Ord Gi*): It became clear that deep conceptual understanding of hypothesized spatial relationships and processes in the data and a clear hypothesis are critical for hotspot analysis. I performed multiple analyses with difference distance weighting to compare the results, and there was a large variation in both the number of points included in hot and cold spots and the landscape area covered by those spots between the different weighting and distance methods. I ended up choosing the inverse squared distance weighting based on my understanding of root transmission and vector dispersal probabilities and because this weighting method was the most conservative (produced the smallest hotspots). The confidence level chosen also resulted in large variation in the size of hotspots. After confirming that there was spatial autocorrelation in infection probabilities, using this method helped me to understand where in the landscape these patterns were occurring and thus how they related to management practices.

Neighborhood analysis: I did not find this method provided much insight in my case, not because of the method itself but because of my data (it just confirmed the landscape pattern that I had designed, clustered vs. random) and my approach (one hotspot and one coldspot point non-randomly selected in each landscape. I also found this method to be tedious in ArcMap, though I would like to automate it, and I later learned about the zonal statistics tool, which can help make this analysis more efficient. In general, it is not clear what statistics I could have used to confirm whether results were significantly different between landscapes, but perhaps this is an issue caused by my own ignorance.

Network/landscape connectivity analyses: I learned that there are a wide variety of tools, programs, and metrics available for these types of analyses. I found the Integrative Index of Connectivity (implemented in Conefor) particularly interesting because of the way it categorizes habitat patches based on multiple attributes in addition to their spatial and topological positions in the landscape. The documentation for this metric is thorough, its ecological significance has been supported in peer-reviewed publications (Saura and Rubio 2010), and it is relatively easy to interpret. In contrast, I found the number of metrics available in Fragstats to be overwhelming especially during the data exploration phase.

REFERENCES

Robert J. Hijmans (2019). raster: Geographic Data Analysis and Modeling. R package version 2.8-19. https://CRAN.R-project.org/package=raster

Saura, S. & J. Torné. 2009. Conefor Sensinode 2.2: a software package for quantifying the importance of habitat patches for landscape connectivity. Environmental Modelling & Software 24: 135-139.

Saura, S. & L. Rubio. 2010. A common currency for the different ways in which patches and links can contribute to habitat availability and connectivity in the landscape. Ecography 33: 523-537.

Wilensky, U. (1999). NetLogo. http://ccl.northwestern.edu/netlogo/. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.

Ex. 3: Does black stain spread through landscape networks?

BACKGROUND

For those who have not seen my previous posts, my research involves building a model to simulate the spread of black stain root disease (a disease affecting Douglas-fir trees) in different landscape management scenarios. Each of my landscapes are made up of stands assigned one of three forest management classes. These classes determine the age structure, density, thinning, and harvest of the stands, factors that influence probability of infection.

 QUESTION

Are spatial patterns of infection probabilities for black stain root disease related to spatial patterns of forest management practices via the connectivity structure of the network of stands in my landscape?

TOOLS AND APPROACH

 I decided to look at how landscape connectivity influenced the spatial relationship between forest management practices and infection probabilities. This approach builds off of a network approach based in graph theory (where each component of the landscape is a “node” with “edges” connecting them) and incorporates concepts from landscape ecology regarding distance-dependent ecological processes and the importance of patch characteristics (e.g., area, habitat quality) in the contribution of patches to the connectivity of the landscape. I used ArcMap, R, and a free software called Conefor (Saura and Torné 2009) to perform my analysis.

 DESCRIPTION OF STEPS I FOLLOWED TO COMPLETE THE ANALYSIS

 1. Create a mosaic of the landscape

The landscape in my disease spread model is a torus (left and right sides connected, top and bottom are connected). The raster outputs from my model with stand ID numbers and management classes do not account for this and are represented as a square. Thus, in order to fully consider the connectivity of each stand in the landscape, I needed to tile the landscape in a 3 x 3 grid so that each stand at the edge of the stand map would have the correct spatial position relative to its neighbors beyond the original raster boundary. I did this in R by making copies of the stand ID raster and adjusting their extent. In ArcMap, I then assigned the management classes to each of those stands, converting to polygon, using the “Identity” tool with the polygon for management class, and then using the “Join Field” tool so that every stand with the same unique ID number would have the relevant management class assigned. If I had not done this step, then the position of stands at the edge of the raster in the network would have been misrepresented.

2. Calculate infection probability statistics for each stand

I then needed to relate each stand to the probability of infection of trees in that stand (generated by my model simulation and converted to point data in a previous exercise). In ArcMap, I used the “Spatial Join” tool to calculate statistics for infection probabilities in that stand, because each stand contains many trees. Statistics included the point count, median, mean, standard deviation, minimum, maximum, range, and sum.

3. Calculate the connectivity of each stand in the network of similarly managed stands in the landscape

3a. For this step, I used the free software Conefor, which calculates a variety of connectivity indices at the individual patch and overall landscape level. First, I used the Conefor extension for ArcMap to generate the input files for the Conefor analysis. The extension generates a “nodes” file for each feature and a “connection” file, which contains the distances between features a binary description of whether or not a link (“edge”) exists between two features. One can set the maximum distance for two features to be linked or generate a probability of connection based on an exponential decay function (built-in feature of Conefor, which is an incredible application). For my analysis, I performed connectivity analyses that only considered features to be linked if (i) they had the same management class and (ii) there were no more than 10 meters of distance between the stand boundaries. Ten meters is about the upper limit for the maximum likely root contact distance between two Douglas-fir trees.

3b. For each management class, I ran the Conefor analysis to calculate multiple metrics. I focused primarily on:

  • Number of links in the network
  • Network components – Each component is a set of connected patches (stands) that is internally connected but has no connection to any other set of patches.
  • Integral Index of Connectivity (IIC) – Essentially, this index gives each patch (stand) a value in terms of its importance for connectivity in the network based on its habitat attributes (e.g., area, habitat quality) and its topological position within the network. For this index, higher values indicate higher importance for connectivity. This is broken into three non-redundant components that sum to the total IIC:
    • IIC intra – connectivity within a patch
    • IIC flux – area-weighted dispersal flux
    • IIC connector – importance of a patch for connecting other patches in the network) (Saura and Rubino 2010)
  1. Analyze the relationship between connectivity metrics and infection probabilities

I reduced the mosaic to include only feature for each stand, eliminating those at the periphery and keeping those in the core. I confirmed that the values were similar for all of the copies of each stand near the center of the mosaic. I then mapped and plotted different combinations of connectivity and infection probability metrics to analyze the relationship for each management class (Fig. 1, Fig. 2).

Fig. 1. Map of IIC connectivity index and mean infection probability for the extensively managed stands.

RESULTS

I generally found no relationship between infection probability and the various metrics of connectivity. As connectivity increased, infection probabilities did not change for any of the metrics I examined (Fig. 2). I would like to analyze this for a series of landscape simulations in the future to see whether patterns emerge. I could also refine the distance used to generate links between patches to reflect the dispersal distance for the insects that vector the disease.

Fig. 2. Plots of infection probability statistics and connectivity metrics for each of the stands in the landscape. Each point represents one stands in the randomly distributed landscape, with extensively managed stands in red, intensively managed stands in blue, and old-growth stands in green.

CRITIQUE OF THE METHOD – What was useful, what was not?

I had originally planned to use the popular landscape ecology application Fragstats (or the R equivalent “landscapemetrics” package), but I ran into issues. As far as I could tell (though I may be incorrect), these options only use raster data and consider one value at a time. What I needed was for the analysis to consider groups of pixels by both their stand ID and their management class, because stands with the same management class are still managed independently. However, landscapemetrics would consider adjacent stands with the same management class to be all one patch. This meant that I could only calculate metrics for the entire landscape or the entire management class, which did not allow me to look at how each patch’s position relative to similarly or differently managed patches related to its probability of infection. In contrast, Conefor is a great application that allows for calculation of a large number of connectivity metrics at both the patch and landscape level.

References

Saura, S. & J. Torné. 2009. Conefor Sensinode 2.2: a software package for quantifying the importance of habitat patches for landscape connectivity. Environmental Modelling & Software 24: 135-139.

Saura, S. & L. Rubio. 2010. A common currency for the different ways in which patches and links can contribute to habitat availability and connectivity in the landscape. Ecography 33: 523-537.

Comparing the Spatial Patterns of Remnant Infected Hemlocks, Regeneration Infected Hemlocks, and Non-Hosts of Western Hemlock Dwarf Mistletoe

Overview

For Exercise 1, I wanted to know about the spatial pattern of western hemlock trees infected with western hemlock dwarf mistletoe. I used a hotspot analysis to determine where clusters of infected and uninfected trees were in my 2.2 ha study area (Map 1). I discovered a hot spot and a cold spot, indicating two clusters, one of high values (infected) and one of low values (uninfected).

For Exercise 2, I wanted to know how the spatial pattern of these clustered infected and uninfected trees were related to the spatial pattern of fire refugia outlined in my study site. I used Geographically weighted Regression to determine the significance of this relationship, however I did not find a significant relationship between a western hemlock, its intensity of clustering and infection status, and it’s distance to its nearest fire refugia polygon.

This result led to the realization that the polygons as they were drawn on the map were not as relevant as the actual “functional refugia”. I hypothesized that, after the 1892 fire, the only way for western hemlock dwarf mistletoe to spread back into the stand would be from the trees that survived that fire, or “remnant” trees. These would then infect the “regeneration” tree that came after the 1892 fire. The functional refugia I am interested in are defined by the location of the remnant western hemlocks. I also hypothesized that the spatial pattern of non-susceptible host tree (trees that were not western hemlocks) would play a role in the distribution of the mistletoe.

Question Asked

How are the spatial patterns of remnant western hemlocks related to the spatial patterns of regeneration western hemlocks, uninfected western hemlocks, and non-western hemlock tree species, and how are these relationships related to the spread of western hemlock dwarf mistletoe in the stand?

The Kcross and Jcross Functions

The cross-type functions (also referred to as multi-type functions) are tools capable of comparing the spatial patterns of two different type events (type i and j) in a similar spatial window, of some point process, X. It does this by assigning labels to the events differentiating the type and summarizing the number or distance, of and between events, at differing spatial scales, or radius circles (r).

The statistic Kij (r) , summarizes the number of type j events, around a type i event at a distance of r, or a point process X. Deviations of the observed Kij(r) curve from the the Poisson curve, or if type j events are truly randomly distributed, indicates dependence of type j events on type i events. Similar results can be obtained from the regular Ripley’s K: deviations above the curve indicate clustering and deviations below indicate dispersal.

(Incredibly helpful and interactive explanation link: https://blog.jlevente.com/understanding-the-cross-k-function/)

The statistic Jij(r) = (1 – Gij(r))/(1-Fj(r)) summarizes the shortest distance between a type i and j event and compares it to the empty space function of type j event. This is another test for inferring independence or dependence of type j events to type i. Deviations of the Jij(r) curve the value of 1, indicate levels of dependence of the events to each-other. Specific deviations from 1 can be hard to interpret without an understanding of the Fj(r) function so imagining it stationary in the ratio makes it easier. As Gij(r) increases, the numerator shrinks, creating a smaller Jij(r) statistic. Deviations below 1 indicate that type i and j events are dependent and that as r increases, the shortest distance between points of type i and j increases. As Gij(r) decreases, the numerator grows, creating a larger Jij(r) statistic. Deviations above 1 indicate that type i and j events are dependent and that as r increases, the shortest distance between points of type i and j decreases.

Methods Overview

In R, the package “spatstat” provides a suite of spatial statistic functions including the cross-type functions. In order to use these you need to create a “point pattern process” object. These objects incorporate X and Y coordinates, and a frame of reference, or a “window,” and give spatial context top a list of values. Then marks are applied to these points that create the necessary multi-type point pattern process object. These marks serve to distinguish the type i and type j events described earlier in your analysis. Then running the “Kcross()” or “Jcross()” functions with the specified type events produces a graph that you can interpret, very similar to producing the normal Ripley’s K plot.

  • I took my X – Y coordinates of all trees on the stand and added a column called “Status” to serve as my mark for the point pattern analysis.
    1. The four statuses were “Remnant,” “Regen,” “Uninfected,” and “NonHost” to identify my populations of interest.
      1. I had access to tree cores, so I identified trees that were older than 170 yrs old and these trees’ diameters served as my cutoff for the “Remnant” diameter class.
      2. All trees DBH > 39.8 cm.
    2. Doing this in ArcMap removed steps I would have to have taken when I migrated the dataset to R.
    3. I removed all the dead trees because I wasn’t concerned with them for my analysis.
  • I exported this attribute table to a csv and loaded it into R Studio.
  • I created the boundary window of my study site using the “owin()” function, and the corner points from my study site polygon.
  • The function “ppp()” creates the point pattern object and I assigned the marks to the data set using the “Status” column I created in ArcMap
    1. It’s important your marks are factors otherwise it is not converted into a multi-type point pattern object.
  • The last step is running the “Kcross()” and “Jcross” to compare the “Remnant” population to the “Regen,” “Uninfected,” and “NonHost” populations.
    1. This produced 6 plots, 3 of each type of cross-type analysis.
    2. Compare these easily using the “par()” function, for example:

par(mfrow = c(1,3))

plot(Ex3.Kcross1)

plot(Ex3.Kcross2)

plot(Ex3.Kcross3)

This produces the three plots in a single row and three columns.

Results

Because I am assuming the remnant, infected western hemlock trees are one of the main factors for the spread of western hemlock dwarf mistletoe and that they are the center of  new infection centers on the study site, I did all my analysis centered on the remnant trees (points with status = “Remnant” treated as event type i).

1)   i = Remnant, j = Regen

The first analysis between remnant and regeneration trees demonstrate that there is dependence on the two events to each other. At fairly small distances, or values of r, infected western hemlocks that have regenerated after the 1892 fire cluster around infected remnant western hemlocks that survived the 1892 fire. This stands to reason because we assume that infected trees will be near other infected trees, and that infection centers start usually with a “mother tree.” In this case the remnant trees serve as the start of the new infection centers. The Jcross output also shows me that the two types of trees are clustered using the frequency of the shortest distances. After ~8 meters the two tree types exhibit definite clustering. In terms of the function, the Gij(r) in the numerator of the Jij(r) function is approaching 1, or the highest frequency of very short distances.

2)   i = Remnant, j = Uninfected

The Kcross plot from the second set of analyses between remnant and uninfected trees demonstrates that there is independence between the two events up to ~15 meters. After that, the trees exhibit slight clustering effects. The lack of dispersal tendencies is strange for these two types of trees because we expect uninfected trees to be furthest away from the center of infection centers. The presence of clustering may be indicative of the small spatial scale of my study site. It may also be that the size of the infection centers are only about 15 meters (if we assume that remnant trees are the center). The Jcross plot shows something similar: at small distances the types of trees seem independent and then around 8 meters they exhibit clustering.

3)   i = Remnant, j = NonHost

The Kcross from the last set of analyses between the remnant trees and the non-hosts demonstrates a similar pattern exhibited by the regeneration trees. After about 4 meters, the trees tend to be clustered. This is an interesting find because if the non-hosts cluster to remnant trees but uninfected trees are independent, then the non-hosts may be playing a role in this. The Jcross plot shows the same: the two types of trees exhibit clustering.

4)   Comparing Kcross Functions with eval.fv()

A useful way to compare patterns of Kcross functions is using the eval.fv function. The titles of each plot tell which Kcross was subtracted from which; note the difference in scales. The first plot shows that the regenerating trees’ spatial pattern as related to remnant trees is very different from the uninfected trees’ pattern. The regenerating trees’ spatial pattern is much more similar to the non-hosts’ spatial pattern at short distances, until about 15 meters. Then the patterns differ with the regenerating trees exhibiting more of a clustering tendency. However the scale is much smaller than the other two graphs. Lastly, the third plot shows the difference between the non-host trees’ spatial pattern and the uninfected trees’ spatial pattern. There appears to be a stepwise relationship where, at very near and very far distances the non-host trees are much more clustered, but at moderate distances the differences may be less dramatic.

Critique of Cross-Type Functions

The amount of easily interpretable literature on the spatstat package as a whole is sparse, although a wealth of very technical information exists. The function was easy to use and execute though and so was the process of creating the point pattern object. These two functions can clearly show how the spatial patterns of the two types of events change with scale. It would be helpful if there was a way to compare three or more types of events. The last drawback is that there is a lack of specific information for each point on your map or study site. This pattern that is generalizable to a whole set of points may not be as useful when trying to put together a story, such as the story of a stand’s development through time.

Additional Raster Analysis

The last critique of the cross-type functions led me to attempt a visualization of these patterns on my stand. Very briefly, I determined densities of the infected, uninfected, and the non-host trees using the Kernel Density function in ArcMap. Then I classified these densities using natural breaks and coded these for raster addition. After adding all three density rasters together, I coded each unique density classification combination to tell me how the densities of the populations appeared in the study site.

It appears that there are distinct patches of high density separated by areas of low density. On the eastern side of my study site, it appears that the high density areas of infected trees cluster with the remnant infected trees. An interesting interaction is occurring between the high density patches of uninfected trees and infected trees in the western portion of my study site. The mechanism for the seemingly clear divide may be the non-TSHE trees.

Epidemiology of Western Hemlock Dwarf Mistletoe Post – Mixed Severity Fire

Description of the Research Question

My study site is located in the HJ Andrews Experimental Forest, where the two most recent mixed severity fires burned leaving a sizable fire refugia in the middle of the stand. Western hemlock dwarf mistletoe (WHDM) survived the fire in this refugia. WHDM spreads via explosively discharged seed and rarely by animals. This means that for the pathogen to spread, the seed must reach susceptible hosts. WHDM is a obligatory parasite of western hemlock. The regeneration post fire may pose a barrier to the pathogens spread because of the structure and composition. Lastly, the structure of the fire refugia may determine the rate of spread. This is because the structure largely determines where infections exist in the vertical profile of the canopy.

My research question is: How is the spatial pattern of WHDM spread extent and intensification from fire refugia related to the spatial pattern of the structure and composition of the fire refugia and the surrounding regeneration via barriers to viable seed reaching susceptible hosts?

I have three objectives related to the spatial organization or the regenerating forest surrounding the fire refugia and one related to the fire refugia itself. My objectives are to determine how fire refugia affects dwarf mistletoe’s spread and intensification through:

  1. The stand density of regenerating trees and of surviving trees, post fire.
  2. The stand age and structure of regenerating trees and surviving trees in fire refugia.
  3. The tree species composition of the regenerating trees adjacent to fire refugia.

AND…

  1. Whether intensification dynamics of WHDM inside a fire refugia resemble those of WHDM in other infection centers.

Dataset Description

Spatial locations of the extent of spread and intensity (also referred to as severity) of WHDM infection include presence/absence of infection in susceptible hosts and will have a severity rating for each infected tree. The presence/absence data will be two measurements, one from 1992 and one from 2019. The intensity rating will be from 2019 only.

Spatial locations/descriptions of the structure and composition of the fire refugia and regeneration surrounding refugia include X,Y of each tree, a variety of forest inventory attributes such as diameters, heights, and species for each tree, and delineation of the fire refugia boundaries. This data has been measured several times: 1992, 1997, 2013, and 2019. The GPS coordinates were recorded with handheld GPs units most likely under canopy cover so resolution may vary.

These data sets are bounded by a 2.2 ha rectangle that confines the study area.

Hypotheses

The spatial pattern of western hemlock dwarf mistletoe in the study site will be several discrete clusters. This arrangement forms because WHDM spreads from an initial infection point outward. The initial infection serves as an approximate center point and the cluster, or infection center, grows outward from there. Separations between clusters are maintained by forest structure and composition or disturbances. New clusters form from remnant trees surviving disturbances, or random dispersal of seed long distances by animals. In this case, the fire refugia protect the remnant trees from disturbance and these trees are the new focal points for infection centers

The spatial pattern of the forest structure and composition will drive the direction and rate of spread of WHDM because variances in these two attributes will cause varying amounts of barriers to WHDM seed dispersal. Because seeds are shot from an infected tree, that seed needs to reach a new uninfected branch. This means physical barriers to spread can affect seed dispersal and non susceptible species will stop spread.

Approaches

I would like to utilize spatial analyses that allow me to understand how the clustering of WHDM is related to the boundaries of the fire refugia. Also, how the infection centers have changed over time utilizing forest structure and composition metrics of the regeneration surrounding the refugia. Lastly, something that can incorporate a severity rating on a scale instead of simply presence/absence data and describe the distribution of severely infected trees vs lightly infected trees and how that relates to the fire refugia boundary and forest metrics of the regeneration surrounding the refugia.

Expected Outcome

I would like to produce statistical relationships that can determine the significance of forest density, species composition, age, and structure on the ability of WHDM to spread. Also, I would like to produce statistical relationships that can describe whether or not a fire refugia alters the way WHDM spreads and intensifies when compared to commonly observed models. Maps as visuals for describing the change over time would be a useful end product as well.

Significance

Understanding the spread patterns of WHDM is important for resource managers seeking to increase biodiversity and produce forest products. Focus has shifted to creating silvivultural prescriptions that emulate natural disturbances that are still economically viable and that maintain ecosystem functions. Disturbance events can control WHDM but also create opportunities to increase its spread and intensification so managers need to have an understanding of how a particular forest structure will affect WHDM. Also, if we want to maintain biodiversity, understanding how WHDM infection centers created by fire develop is important. Fire frequency and severity may be increasing in the future and the loss of mixed severity fire would mean a significant loss of WHDM. Land managers seeking to emulate burns can use this information to plan burns that preserve patches of WHDM if desired and understand how the pathogen will progress 25 years later. This is not usually the case for forest pathogens.

Level of Preparation

I have worked in ArcMap quite a bit, but I haven’t much experience with the wide range of functionality of ArcInfo. I used ModelBuilder somewhat to keep my queries and basic analyses organized. No experience programming in Python. I have taken two stats classes before this using R and feel I have a working knowledge and have no problem learning new tools in it. However, I have very little work with image processing such as working with rasters and some small exposure to LiDAR processing.