Tag Archives: spatial analysis

Final project: Washing out the (black) stain and ringing out the details

BACKGROUND

In order to explain my project, especially my hypotheses, some background information about this disease is necessary. Black stain root disease of Douglas-fir is caused by the fungus Leptographium wageneri. It infects the roots of its Douglas-fir host, growing in the xylem and cutting the tree off from water. It spreads between adjacent trees via growth through root contacts and grafts and long-distance via insects (vectors) that feed and breed in roots and stumps and carry fungal spores to new hosts.

Forest management practices influence the spread of disease because of the influence on (i) the distance between trees (determined by natural or planted tree densities); (ii) adjacency of susceptible species (as in single-species Douglas-fir plantations); (iii) road, thinning and harvest disturbance, which create suitable habitat for insect vectors (stumps, dead trees) and stress remaining live trees, attracting insect vectors; and (iv) forest age distributions, because rotation lengths determine the age structure in managed forest landscapes and younger trees (<30-40 years old) are thought to be more susceptible to infection and mortality by the disease.

RESEARCH QUESTION

How do (B) spatial patterns of forest management practices relate to (A) spatial patterns of black stain root disease (BSRD) infection probabilities at the stand and landscape scale via (C) the spatial configuration and connectivity of susceptible stands to infection?

In order to address my research questions, I built a spatial model to simulate BSRD spread in forest landscapes using the agent-based modeling software NetLogo (Wilensky 1999). I used Exercises 1-3 to focus on the spatial patterns of forest management classes. Landscapes were equivalent in terms of the proportion of each management class and number of stands, varying only in spatial pattern of management classes. In the exercises, I evaluated the relationship between management and disease by simulating disease spread in landscapes with two distinct spatial patterns of management:

  • Clustered landscape: The landscape was evenly divided into three blocks, one for each management class. Each block was evenly divided into stands.
  • Random landscape: The landscape was evenly divided into stands, and forest management classes were randomly assigned to each stand.

MY DATA

I analyzed outputs of my spatial model. The raster files contain the states of cells in forest landscapes at a given time step during one model run. States tracked include management class, stand ID number, presence/absence of trees, tree age, probability of infection, and infection status (infected/not infected). Management class and stand ID did not change during the model run. I analyzed tree states from the last step of the model run and did not analyze change over time.

Extent: ~20 hectares (much smaller than my full models runs will be)

Spatial resolution: ~1.524 x 1.524 m cells (maximum 1 tree per cell)

Three contrasting and realistic forest management classes for the Pacific Northwest were present in the landscapes analyzed:

  • Intensive – Active management: 15-foot spacing, no thinning, harvest at 37 years.
  • Extensive – Active management: 10-foot spacing, one pre-commercial and two commercial thinnings, harvest at 80 years.
  • Set-aside/old-growth (OG) – No active management: Forest with Douglas-fir in Pacific Northwest old-growth densities and age distributions and uneven spacing with no thinning or harvest.

HYPOTHESES: PREDICTIONS OF PATTERNS AND PROCESSES I LOOKED FOR

Because forest management practices influence the spread of disease as described in the “Background” section above, I hypothesized that the spatial patterns of forest management practices would influence the spatial pattern of disease. Having changed my experimental design and learned about spatial statistics and analysis methods throughout the course, I hypothesize that…

  • The “clustered” landscape will have (i) higher absolute values of infection probabilities, (ii) higher spatial autocorrelation in infection probabilities, and (iii) larger infection centers (“hotspots” of infection probabilities) than the “random” landscape because clustering of similarly managed forest stands creates continuous, connected areas of forest managed in a manner that creates suitable vector and pathogen habitat and facilitates the spread of disease (higher planting densities, lower age, frequent thinning and harvest disturbance in the intensive and extensive management). I therefore predict that:
    • Intensive and extensive stands will have the highest infection probabilities with large infection centers (“hotspots”) that extend beyond stand boundaries.
      • Spatial autocorrelation will therefore be higher and exhibit a lower rate of decrease with increasing distance because there will be larger clusters of high and low infection probabilities when stands with similar management are clustered.
    • Set-aside (old-growth, OG) stands will have the lowest infection probabilities, with small infection centers that may or may not extend beyond stand boundaries.
      • Where old-growth stands are in contact with intensive or extensive stands, neighborhood effects (and edge effects) will increase infection probabilities in those OG stands.
    • In contrast, the “random” landscape will have (i) lower absolute values of infection probabilities, (ii) less spatial autocorrelation in infection probabilities, and (iii) smaller infection centers than the “clustered” landscape. This is because the random landscape will have less continuity and connectivity between similarly managed forest stands; stands with management that facilitates disease spread will be less connected and stands with management that does not facilitate the spread of disease will also be less connected. I would predict that:
      • Intensive and extensive stands will still have the highest infection probabilities, but that the spread of infection will be limited at the boundaries with low-susceptibility old-growth stands.
        • Because of the boundaries created by the spatial arrangement of low-susceptibility old-growth stands, clusters of similar infection probabilities will be smaller and values of spatial autocorrelation will be lower and decrease more rapidly with increasing lag distance. At the same time, old-growth stands may have higher infection probabilities in the random landscape than in the clustered landscape because they would be more likely to be in contact with high-susceptibility intensive and extensive stands.
      • I also hypothesize that each stand’s neighborhood and spatial position relative to stands of similar or different management will influence that stand’s infection probabilities because of the difference in spread rates between management classes and the level of connectivity to high- and low-susceptibility stands based on the spatial distribution of management classes.
        • Stands with a large proportion of high-susceptibility neighboring stands (e.g., extensive or intensive management) will have higher infection probabilities than similarly managed stands with a small proportion of high-susceptibility neighboring stands.
        • High infection probabilities will be concentrated in intensive and extensive stands that have high levels of connectivity within their management class networks because high connectivity will allow for the rapid spread of the disease to those stands. In other words, the more connected you are to high-susceptibility stands, the higher your probability of infection.

APPROACHES: ANALYSIS APPROACHES I USED

Ex. 1: Correlogram, Global Moran’s I statistic

In order to test whether similar infection probability values were spatially clustered, I used the raster package in R (Hijmans 2019) to calculate the global Moran’s I statistic at multiple lag distances for both of the landscape patterns. I then plotted global Moran’s I vs. distance to create a correlogram and compared my results between landscapes.

Ex. 2: Hotspot analyses (ArcMap), Neighborhood analyses (ArcMap)

First, I performed a non-spatial analysis comparing infection probabilities between (i) landscape patterns (ii) management classes, and (iii) management classes in each of the landscapes. Then, I used the Hotspot Analysis (Getis-Ord Gi*) tool in ArcMap to identify statistically significant hot- and cold-spots of high and low infection probabilities, respectively. I selected points within hot and cold spots and used the Multiple Ring Buffer tool in ArcMap to create distance rings, which I intersected with the management classes to perform a neighborhood analysis. This neighborhood analysis revealed how the proportion of each management class changed with increasing distance from hotspots in order to test whether the management “neighborhood” of trees influenced their probability of infection.

Ex. 3: Network and landscape connectivity analyses (Conefor)

I divided my landscape into three separate stand networks based on their management class. Then, I used the free landscape connectivity software Conefor (Saura and Torné 2009) to analyze the connectivity of each stand based on its position within and role in connecting the network using the Integrative Index of Connectivity (Saura and Rubio 2010). I then assessed the relationship between the connectivity of each stand and infection probabilities of trees within that stand using various summary statistics (e.g., mean, median) to test whether connectivity was related to infection probability.

RESULTS: WHAT DID I PRODUCE?

As my model had not been parameterized by the beginning of this term, I analyzed “dummy” data, where infection spread probabilities were calculated as a decreasing linear function of distance from infected trees. However, the results I produced still provided insights as to the general functioning of the model and factors that will likely influence my results in the full, parameterized model.

I produced both maps and numerical/statistical relationships that describe the patterns of “A” (infection probabilities), the relationship between “A” and “B” (forest management classes), and how/whether “A” and “B” are related via “C” (landscape connectivity and stand networks).

In Exercise 1, I found evidence to support my hypothesis of spatial autocorrelation at small scales in both landscapes and higher autocorrelation and slower decay with distance in the clustered landscape than the random landscape. This was expected because the design of the model calculated probability of infection for each tree as a function of distance from infected trees.

In Exercises 2 and 3, I found little to no evidence to support the hypothesis that either connectivity or neighboring stand management had significant influence on infection probabilities. Because the model that produced the “dummy” data limited infection to ~35 meters from infected trees and harvest and thinning attraction had not been integrated into infection calculations, this result was not surprising. In my full model where spread via insect vectors could span >1,000 m, I expect to see a larger influence of connectivity and neighborhood on infection probabilities.

A critical component of model testing is exploring the “parameter space”, including a range of possible values for each parameter. This is especially for agent-based models where there are complex interactions between many individuals that result in larger-scale patterns that may be emergent and not fully predictable by the simple sum of the parts. In my model, the disease parameters of interest are the factors influencing probability of infection (Fig. 1). In order to understand how the model reacts to changes in those parameters, I will perform a sensitivity analysis, systematically adjusting parameter values one-by-one and comparing the results of each series of model runs under each set of parameter values.

Fig.1. Two of the model parameters that will be systematically adjusted during sensitivity analysis. Tree susceptibility to infection as a function of age (left) and probability of root contact as a function of distance (right) will both likely influence model behavior and the relative levels of infection probability between the three management classes.

This is especially relevant given that in Exercises 1 through 3, I found that the extensively managed plantations had the highest values of infection probability and most of the infection hotspots, likely due to the fact that this management class has the highest [initial] density of trees. For the complete model, I am hypothesizing that the intensive plantations will have the highest infection probabilities because of high frequency of insect-attracting harvest and short rotations that maintain the trees in an age class highly susceptible to infection. In the full model, the extensive plantations will have higher initial density than the intensive plantations but will undergo multiple thinnings, decreasing tree density but attracting vectors, and will be harvested at age 80, thus allowing trees to grow into a less susceptible age class. In this final model, thinning, harvest length, and vector attraction will factor in to the calculation of infection probabilities. My analysis made it clear that even a 1.5 meter difference in spacing resulted in a statistically significant difference for disease transmission, with much higher disease spread in the denser forest. Because the model is highly sensitive to tree spacing, likely because the parameters of my model that relate to distance drop off in sigmoidal or exponential decay patterns, I would hypothesize that changes in the values of parameters that influence the spatial spread of disease (i.e., insect dispersal distance, probability of root contact with distance) and the magnitude of vector attraction after harvest and thinning will determine whether the “extensive” or “intensive” forest management class will ultimately the highest levels of infection probabilities. In addition, the rate of decay of root contact and insect dispersal probabilities will determine whether management and infection within stands influence infection in neighboring stands and the distance and strength of those neighborhood effects. I would like to test this my performing such analyses on the outputs from my sensitivity analyses.

SIGNIFICANCE: WHAT DID I LEARN FROM MY RESULTS? HOW ARE THESE RESULTS IMPORTANT TO SCIENCE? TO RESOURCE MANAGERS?

Ultimately, the significance of this research is to understand the potential threat of black stain root disease in the Pacific Northwest and inform management practices by identifying evidence-based, landscape-scale management strategies that could mitigate BSRD disease issues. While the results of Exercises 1-3 were interesting, they were produced using a model that had not been fully parameterized and thus are not representative of the likely actual model outcomes. Therefore, I was not able to test my hypotheses. That said, this course allowed me to design and develop an analysis to test my hypotheses. The exercises I completed have also provided a deeper understanding of how my model works. Through this process, I have begun to generate additional testable hypotheses regarding model sensitivity to parameters and the relative spread rates of infection in each of the forest management classes. Another key takeaway is the importance of producing many runs with the same landscape configuration and parameter settings to account for stochastic processes in the model. By only analyzing one run for each scenario, there is a chance that the results are not representative of the average behavior of the system or the full range of behaviors possible for those scenarios. For example, with the random landscape configuration, one generated landscape can be highly connected and the next highly fragmented with respect to intensive plantations, and only a series of runs under the same conditions would provide reliable results for interpretation.

WHAT I LEARNED ABOUT… SOFTWARE

(a, b) Arc-Info, Modelbuilder and/or GIS programming in Python

This was my first opportunity to perform statistical analysis in ArcGIS, and I used multiple new tools, including hotspot analysis, multiple ring buffers, and using extensions. Though I did not use Python or Modelbuilder, I realized that doing so will be critical for automating my analyses given the large number of model runs I will be analyzing. While I learned how to program in Python using arcpy in GEOG 562, I used this course to choose the appropriate tools and analyses for my questions and hypotheses rather than automating procedures I may not use again. I would now like to implement my procedures for neighborhood analysis in Python in order to automate and increase the efficiency of my workflow.

(c) Spatial analysis in R

During this course, I learned most about spatial data manipulation in R, since I had limited experience using R with spatial data beforehand. I used R for spatial statistics, data cleaning and management, and conversion between vector and raster data. I also learned about the limitations of R (and my personal limitations) in terms of the challenge of learning how to use packages and their functions when documentation is variable in quality and a wide variety of user-generated packages are available with little reference as to their quality and reliability. For example, for Exercise 2, I had trouble finding an up-to-date and high-quality package for hotspot analysis in R, with raster data or otherwise. However, this method was straightforward in ArcMap once the data were converted from raster to points. For Exercise 1, the only Moran’s I calculation that I was able to perform with my raster data was the “moran” function in the raster package, which does not provide z- or p-values to evaluate the statistical significance of the calculated Moran’s I and requires you to generate your own weights matrices, which is a pain. Using the spdep or ncf packages for this analysis was incredibly slow (though I am not sure why), and the learning curve for spatstat was too steep for me to overcome by the Exercise 1 deadline (but I hope to return to this package in the future).

Reading, manipulating, and converting data: With some trial and error and research into the packages available for working with spatial data in R (especially raster, sp/spdep, and sf), I learned how to quickly and easily convert data between raster and shapefile formats, which was very useful in automating the cleaning and preparation for multiple datasets and creating the inputs for the analyses I want to perform.

(d) Landscape connectivity analyses: I learned that there are a wide variety of metrics available through Fragstats (and landscapemetrics and landscapetools packages in R), however, I was not able to perform my desired stand-scale analysis of connectivity because I could not determine whether it is possible to analyze contiguous stands with the same management class as separate patches (Fragstats considered all contiguous cells in the raster with the same class to be part of the same patch). Instead, I used Conefor, which has an ArcMap extension that allows you to generate a node and connection file from a polygon shapefile, to calculate relatively few but robust and ecologically meaningful connectivity metrics for the stands in my landscape.

WHAT I LEARNED ABOUT… SPATIAL STATISTICS

Correlograms and Moran’s I: For this statistical method, I learned the importance of choosing meaningful lag distances based on the data being analyzed and the process being examined. For example, my correlogram consists of a lot of “noise” with many peaks and troughs due to the empty cells between trees, but I also captured data at the relevant distances. Failure to choose appropriate lag distances means that some autocorrelation could be missed, but analyses of large raster images at a high resolution of lag distances results in very slow processing. In addition, I wanted to compare local vs. global Moran’s I to determine whether infections were sequestered to certain portions of the landscape or spread throughout the entire landscape, but the function for local Moran’s I returned values far outside the -1 to 1 range of the global Moran’s I. As a result, I did not understand how to interpret or compare these values. In addition, global Moran’s I did not tell me where spatial autocorrelation was happening, but the fact that there was spatial autocorrelation led me to perform a…

Hotspot analysis (Getis-Ord Gi*): It became clear that deep conceptual understanding of hypothesized spatial relationships and processes in the data and a clear hypothesis are critical for hotspot analysis. I performed multiple analyses with difference distance weighting to compare the results, and there was a large variation in both the number of points included in hot and cold spots and the landscape area covered by those spots between the different weighting and distance methods. I ended up choosing the inverse squared distance weighting based on my understanding of root transmission and vector dispersal probabilities and because this weighting method was the most conservative (produced the smallest hotspots). The confidence level chosen also resulted in large variation in the size of hotspots. After confirming that there was spatial autocorrelation in infection probabilities, using this method helped me to understand where in the landscape these patterns were occurring and thus how they related to management practices.

Neighborhood analysis: I did not find this method provided much insight in my case, not because of the method itself but because of my data (it just confirmed the landscape pattern that I had designed, clustered vs. random) and my approach (one hotspot and one coldspot point non-randomly selected in each landscape. I also found this method to be tedious in ArcMap, though I would like to automate it, and I later learned about the zonal statistics tool, which can help make this analysis more efficient. In general, it is not clear what statistics I could have used to confirm whether results were significantly different between landscapes, but perhaps this is an issue caused by my own ignorance.

Network/landscape connectivity analyses: I learned that there are a wide variety of tools, programs, and metrics available for these types of analyses. I found the Integrative Index of Connectivity (implemented in Conefor) particularly interesting because of the way it categorizes habitat patches based on multiple attributes in addition to their spatial and topological positions in the landscape. The documentation for this metric is thorough, its ecological significance has been supported in peer-reviewed publications (Saura and Rubio 2010), and it is relatively easy to interpret. In contrast, I found the number of metrics available in Fragstats to be overwhelming especially during the data exploration phase.

REFERENCES

Robert J. Hijmans (2019). raster: Geographic Data Analysis and Modeling. R package version 2.8-19. https://CRAN.R-project.org/package=raster

Saura, S. & J. Torné. 2009. Conefor Sensinode 2.2: a software package for quantifying the importance of habitat patches for landscape connectivity. Environmental Modelling & Software 24: 135-139.

Saura, S. & L. Rubio. 2010. A common currency for the different ways in which patches and links can contribute to habitat availability and connectivity in the landscape. Ecography 33: 523-537.

Wilensky, U. (1999). NetLogo. http://ccl.northwestern.edu/netlogo/. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.

Exercise 3: Lagoons, ENSO Indices, and Dolphin Sightings

Exercise 3: Are bottlenose dolphin sightings distances to nearest lagoon related to ENSO indices in the San Diego, CA survey site?

1. Question that you asked

I was looking to see a pattern at more than one scale, specifically the relationship with ENSO and sighting distributions off of San Diego. I asked the question: do bottlenose dolphin sighting distributions change latitudinally with ENSO related to distance from the nearest lagoon. The greater San Diego area has six major lagoons that contribute the major estuarine habitat to the San Diego coastline and are all recognized as separate estuaries. All of these lagoons/estuaries sit at the mouths of broad river valleys along the 18 miles of coastline between Torrey Pines to the south and Oceanside to the north. The small boat survey transects cover this entire stretch with near-exact overlap from start to finish. These habitats are known to be highly dynamic, experience variable environmental conditions, and support a wide range of native vegetation and wildlife species.

Distribution of common bottlenose dolphin sightings in the San Diego study area along boat-based transects with the six major lagoons.

 

FID NAME
0 Buena Vista Lagoon
1 Agua Hedionda Lagoon
2 Batiquitos Lagoon
3 San Elijo Lagoon
4 Tijuana Estuary
5 Los Penasquitos Lagoon
6 San Dieguito Lagoon

2. Name of the tool or approach that you used.

I utilized the “Near” tool in ArcMap 10.6 that calculated the distance from points to polygons and associated the point with FID of that nearest polygon. I also used R Studio for basic analysis, graphic displays, and ANOVA with Tukey HSD.

3. Brief description of steps you followed to complete the analysis.

  1. I researched the San Diego GIS database for the layer that would be most helpful and found the lagoon shapefile.
  2. Imported the shapefile into ArcMap where I already had my sightings, transect line, and 1000m buffered transect polygon.
  3. I used the “Near” tool in the Analysis toolbox, part of the of the “proximity toolset”. I chose the point to polygon option with my dolphin sightings as the point layer and the lagoon polygons as the polygon layer.
  4. I opened the attribute table for my dolphin sightings and there was now a NEAR_FID and NEAR_DIST which represented the identification (number) related to the nearest lagoon and the distance in kilometers to the nearest lagoon, respectively.
  5. I exported using the “conversion” tool to Excel and then imported into R studio for further analyses (ANOVA between the differences in sighting distances to lagoons and ENSO indices).

4. Brief description of results you obtained

After a quick histogram in ArcMap, it was visually clear that the distribution of points with nearest lagoons appeared clustered, skewed, or to have a binomial distribution, without considering ENSO. Then, after importing into R studio, I created a box plot of the distance to nearest lagoon compared to the ENSO index (-1, 0, or 1). I ran an ANOVA which returned a very small p-value of 2.55 e-9. Further analysis using a Tukey HSD found that the differences between ENSO states of neutral (0) and -1 and neutral and 1 were significant, but not between 1 and -1. These results are interesting because this means the sightings of dolphins differ most during neutral ENSO years. This could be that certain lagoons are preferred during extremes compared to the neutral years. Therefore, yes, there is a difference in dolphin sightings distances to lagoons during different ENSO phases, specifically the neutral years.

Histogram comparing the distance from the dolphin sighting to nearest lagoon in San Diego during the three major indices of El Niño Southern Oscillation (ENSO): -1, 0, and 1.

 

Violin plot showing the breakdown of distributions of dolphin sighting distances to lagoons (numbered 0-6) during the three different ENSO indices.

5. Critique of the method – what was useful, what was not?

This method was incredibly helpful and also was the easiest to apply once I got started, in comparison to my previous steps. It allowed to both visualize and quantify interesting results. I also learned some tricks for how to better graph my data and to symbolize my data in ArcMap.


Contact information: this post was written by Alexa Kownacki, Wildlife Science Ph.D. Student at Oregon State University. Twitter: @lexaKownacki

Exercise 2: Possible Influence of ENSO Index on Dolphin Sighting Latitudes

Exercise 2

Question Asked: Are latitudinal differences in dolphin sightings in the San Diego, CA survey area related to El Niño Southern Oscillation (ENSO) index values on a monthly temporal scale?

  1. My previous question for Exercise 1 was: do the number of dolphin sightings in the San Diego, CA survey region differ latitudinally? I was finally able to answer this question with a histogram of sighting count by latitudinal difference. I defined latitudinal difference as the difference from the highest latitude of dolphin sightings (the Northernmost sighting point along the San Diego transect line) to the other sighting points, in decimal degrees. Therefore it becomes a simple mathematical subtraction in ArcMap. Smaller differences would be the result of a small difference and therefore mean more Northerly sighting, with large differences being from more Southerly areas. I used all sightings in the San Diego region (from 1981 through 2015). As you can see from below, there is an unequal distribution of sightings at different latitudes. Because I had visual confirmation of differences at least when all sightings are binned (in terms of all years from 1981-2015 treated the same), I looked for what process could be affecting these differences in latitude.

    Comparing the Latitudes with the frequency of dolphin sightings in San Diego, CA

ENSO is a large-scale climate phenomena where the climate modes periodically fluctuate (Sprogis et al. 2018). The climate variability produced by ENSO affects physical oceanic and coastal conditions that can both directly and indirectly influence ecological and biological processes. ENSO can alter food webs because climate changes may impact animal physiology, specifically metabolism. This creates further trophic impacts on predator-prey dynamics, often because of prey availability (Barber and Chavez 1983). During the surveys of bottlenose dolphins in California, multiple ENSO cycles have caused widespread changes in the California Current Ecosystem (CCE), such as the squid fishery collapse (Nezlin, Hamner, and Zeidberg 2002). With this knowledge, I wanted to see if the frequency of dolphin sightings in different latitudes of the most-consistently studied area was driven by ENSO.

Tool/Approach:

Primarily R Studio, some ArcMap 10.6 and Excel

Step by Step:

  1. 1.For this portion of the analysis, I exported my table of latitudinal differences within my attribute table for dolphin sightings from ArcMap 10.6. I saved this as a .csv and imported it into R Studio.
  2. Some of the sighting data needed to be changed because R didn’t recognize the dates as dates, rather as factors. This is important in order to join ENSO data by month and year.
  3. Meanwhile, I found NOAA data on a publicly-sourced website that had months as the columns and years as the rows for a matching ENSO index value of either: 1, 0, or -1 for each month/year combination. A value of 1 is a positive (warm) year, a value of 0 is a neutral year, and a value of -1 is a negative (cold) year. This is a broad-value, because indices range from 1 to -1. But, to simplify my question this was the most logical first step.
  4. I had to convert the NOAA data into two-column data with the date in one column by MM/YYYY and then the Index value in the other column. After multiple attempts in R studio, I hand-corrected them in Excel. Then, imported this data into R studio.
  5. I was then able to tell R to match the sighting date’s month and year to the ENSO data’s month and year, and assign the respective ENSO value. Then I assigned the ENSO values as factors.
  6. I created a boxplot to visualize if there were differences in distributions of latitudinal differences and ENSO index. (See figure)Illustrating the number of sightings grouped by ENSO index values (1, 0, and -1).
  7. Then I ran an ANOVA to see if there was a reportable, strong difference in sighting latitudinal difference and ENSO index value.

    Results:

     

    From the boxplot, it appears that in warm years (ENSO index level of “1”), the dolphins are sighted more frequently in lower latitudes, closer to Mexican waters when compared to the neutral (“0”) and cold years (“-1”). This result is intriguing because I would have expected dolphins to move northerly during warm months to maintain similar body temperatures in the same water temperatures. However, warm ENSO years could shift prey availability or nutrients southerly, which is why there are more sightings further south.  The result of the ANOVA, was a p-value of <2e-16, providing very strong evidence to reject the null of hypothesis of no difference. I followed up with a Tukey HSD and found that there is strong evidence for differences between both the 0 and -1, -1 and 1, and 1 and 0 values. Therefore, the different ENSO indices on a monthly scale are significantly contributing to the differences in sighting latitudes in the San Diego study area.

Tukey HSD output:

diff               lwr                        upr           p adj

0–1 0.01161047 0.004250827 0.01897011 0.0006422

1–1 0.04101170 0.030844193 0.05117920 0.0000000

1-0 02940123 0.020689737 0.03811272 0.0000000

 Critique of the Method(s):

These methods worked very well for visualization and finally solidifying that there was a difference on sighting latitude related to ENSO index value on a broad level. Data transformation and clean-up was challenging in R, and took much longer than I’d expected.

 

References:

Barber, Richard T., and Francisco P. Chavez. 1983. “Biological Consequences of El Niño.” Science 222 (4629): 1203–10.

Sprogis, Kate R., Fredrik Christiansen, Moritz Wandres, and Lars Bejder. 2018. “El Niño Southern Oscillation Influences the Abundance and Movements of a Marine Top Predator in Coastal Waters.” Global Change Biology 24 (3): 1085–96. https://doi.org/10.1111/gcb.13892.


Contact information: this post was written by Alexa Kownacki, Wildlife Science Ph.D. Student at Oregon State University. Twitter: @lexaKownacki

The Biogeography of Coastal Bottlenose Dolphins off of California, USA between 1981-2016

Background/Description:

Common bottlenose dolphins (Tursiops truncatus), hereafter referred to as bottlenose dolphins, are long-lived, marine mammals that inhabit the coastal and offshore waters of the California Current Ecosystem. Because of their geographical diversity, bottlenose dolphins are divided into many different species and subspecies (Hoelzel, Potter, and Best 1998). Bottlenose dolphins exist in two distinct ecotypes off the west coast of the United States: a coastal (inshore) ecotype and an offshore (island) ecotype. The coastal ecotype inhabits nearshore waters, generally less than 1 km from shore, between Ensenada, Baja California, Mexico and San Francisco, California, USA (Bearzi 2005; Defran and Weller 1999). Less is known about the range of the offshore ecotype , which is broadly defined as more than 2 km offshore off the entire west coast of the USA (Carretta et al. 2016). Current population abundance estimates are 453 coastal individuals and 1,924 offshore individuals (Carretta et al. 2017). The offshore and coastal bottlenose dolphins off of California are genetically distinct (Wells and Scott 1990).

Both ecotypes breed in summer and calve the following summer, which may be thermoregulatory adaptation (Hanson and Defran 1993). These dolphins are crepuscular feeders that predominantly hunt prey in the early morning and late afternoon (Hanson and Defran 1993), which correlates to the movement patterns of their fish prey. Out of 25 prey fish species, surf perches and croakers make up nearly 25% of coastal T. truncatus diet (Hanson and Defran 1993). These fish, unlike T. truncatus, are not federally protected, and neither are their habitats. Therefore, major threats to dolphins and their prey species include habitat degradation, overfishing, and harmful algal blooms (McCabe et al. 2010).

This project aims to better understand that distribution of coastal bottlenose dolphins in the waters off of California, specifically in relation to distance from shore, and how that distance has changed over time.

Data:

This part of the overarching project focuses on understanding the biogeography of coastal bottlenose dolphins. Later stages in the project will require the addition of offshore bottlenose sightings to compare population habitats.

Beginning in 1981, georeferenced sighting data of coastal bottlenose dolphin off the California, USA coast were collected by R.H. Defran and team. The data were provided in the datum, NAD 1983. Small boats less than 10 meters in length were used to collect the majority of the field data, including GPS points, photographs, and biopsy samples. These surveys followed similar tracklines with a specific start and end location, which will be used to calculate the sighting per unit effort. Over the next four decades, varying amounts of data were collected in six different regions (Fig. 1). Coastal T. truncatus sightings from 1981-2015 parallel much of the California land mass, concentrating in specific areas (Fig. 2). Many of the sightings are clustered nearby larger cities due to logistics of port locations. The greater number of coastal dolphin sightings is due to the bias in effort toward proximity to shore and longer study period. All samples were collected under a NOAA-NMFS permit.Additional data required will likely be sourced from publicly-available, long-term data collections, such as ERDDAP or MODIS.

Distance from shore will be calculated in a program such as ArcGIS or R package. These data will be used later in the project to compare to additional static, dynamic, and long-term environmental drivers. These factors will be tested as possible layers to add in mapping and finally estimating population distribution patterns of the dolphins.

Figure 1. Breakdown of coastal bottlenose dolphin sightings by decade. Image source: Alexa Kownacki.

 

 

 

 

 

 

 

 

 

 

 

Hypotheses:

I predict that the coastal bottlenose dolphins will be associated with different bathymetry patterns and appear clustered based on a depth profile via mechanisms such as prey distribution and abundance, nutrient plumes, and predator avoidance.

Approaches:

My objective is to first find a bathymetric layer that covers the coast of the entirety of California, USA to import into ArcMap 10.6. Then I need to interpolate the data to create a smooth surface. Then, I can add my dolphin sighting points and create a way to associate each point with a depth. These depth and point data would be exported to R for further analysis. Once I have extracted these data, I can run a KS-test to compare the shape of distribution based on two different factors, such as points from El Niño years versus La Niña years to see if there is a difference in average sighting depth or more common sighting depths based on the climatic patterns. I am also interested in using the spatial statistic analysis tool, Moran’s I, to see if the sightings are clustered. If so, I would run a cluster analysis to see if the sightings are clustered by depth. If not, then maybe there are other drivers that I can test, such as distance from shore, upwelling index values, or sea surface temperature. Additionally, these patterns would be analyzed over different time scales, such as monthly, seasonally, or decadally.

Expected Outcome:

Ideally, I would produce multiple maps from ArcGIS representing different spatial scales at defined increments, such as by month (all Januaries, all Februaries, etc.), by year or binned time increment (i.e. 1981-1989, 1990-1999), and also potentially grouping based on El Niño or La Niña year. Different symbologies would represent coastal dolphin sightings distances from shore. The maps would visually display seafloor depths in a color spectrum by 10 meter difference. Because the coastlines of California vary in terms of depth profiles, I would expect there to be clusters of sightings at different distances from shore, but similar depth profiles if my hypothesis is true. Also, data with the quantified values of seafloor depth would be associated with each data point (dolphin sighting) for further analysis in R.

Significance:

This project draws upon decades of rich spatiotemporal and biological information of two neighboring long-lived cetacean populations that inhabit contrasting coastal and offshore waters of the California Bight. The coastal ecotype has a strong, positive relationship with distance to shore, in that it is usually sighted within five kilometers, and therefore is in frequent contact with human-related activities. However, patterns of distances to shore over decades, related to habitat type and possibly linked to prey species distribution, or long-term environmental drivers, is largely unknown. By better understanding the distribution and biogeography of these marine mammals, managers can better mitigate the potential effects of humans on the dolphins and see where and when animals may be at higher risk of disturbance.

Preparation:

I have a moderate amount of experience in ArcMap from past coursework (GEOG 560 and 561), as well as practical applications and map-making. I have very little experience in Modelbuilder and Python-based GIS programming. I am becoming more familiar with the R program after two statistics courses and analyzing some of my own preliminary data. I am experienced in image processing in ACDSee, PhotoShop, ImageJ, and other analyses mainly from marine vertebrate data through NOAA Fisheries.

Literature Cited:

Bearzi, Maddalena. 2005. “Aspects of the Ecology and Behaviour of Bottlenose Dolphins (Tursiops Truncatus) in Santa Monica Bay, California.” Journal of Cetacean Research Managemente 7 (1): 75–83. https://doi.org/10.1118/1.4820976.

Carretta, James V., Kerri Danil, Susan J. Chivers, David W. Weller, David S. Janiger, Michelle Berman-Kowalewski, Keith M. Hernandez, et al. 2016. “Recovery Rates of Bottlenose Dolphin (Tursiops Truncatus) Carcasses Estimated from Stranding and Survival Rate Data.” Marine Mammal Science 32 (1): 349–62. https://doi.org/10.1111/mms.12264.

Carretta, James V, Karin A Forney, Erin M Oleson, David W Weller, Aimee R Lang, Jason Baker, Marcia M Muto, et al. 2017. “U.S. Pacific Marine Mammal Stock Assessments: 2016.” NOAA Technical Memorandum NMFS, no. June. https://doi.org/10.7289/V5/TM-SWFSC-5.

Defran, R. H., and David W Weller. 1999. “Occurrence , Distribution , Site Fidelity , and School Size of Bottlenose Dolphins ( Tursiops T R U N C a T U S ) Off San Diego , California.” Marine Mammal Science 15 (April): 366–80.

Hanson, Mark T, and R.H. Defran. 1993. “The Behavior and Feeding Ecology of the Pacific Coast Bottlenose Dolphin, Tursiops Truncatus.” Aquatic Mammals 19 (3): 127–42.

Hoelzel, A. R., C. W. Potter, and P. B. Best. 1998. “Genetic Differentiation between Parapatric ‘nearshore’ and ‘Offshore’ Populations of the Bottlenose Dolphin.” Proceedings of the Royal Society B: Biological Sciences 265 (1402): 1177–83. https://doi.org/10.1098/rspb.1998.0416.

McCabe, Elizabeth J.Berens, Damon P. Gannon, Nélio B. Barros, and Randall S. Wells. 2010. “Prey Selection by Resident Common Bottlenose Dolphins (Tursiops Truncatus) in Sarasota Bay, Florida.” Marine Biology 157 (5): 931–42. https://doi.org/10.1007/s00227-009-1371-2.

Wells, Randall S., and Michael D. Scott. 1990. “Estimating Bottlenose Dolphin Population Parameters From Individual Identification and Capture-Release Techniques.” Report International Whaling Commission, no. 12.

——-

Contact information: this post was written by Alexa Kownacki, Wildlife Science Ph.D. Student at Oregon State University. Twitter: @lexaKownacki