Humpback whales feed in the temperate high latitudes along the North Pacific Rim from California to Japan during the spring, summer, and fall and migrate in the winter to the near-tropical waters of Mexico, Hawaii, Japan, and the Philippines to give birth and mate (Calambokidis et al. 2001; Calambokidis et al. 2008). Although whales show strong site fidelity to feeding and breeding grounds, genetic analysis of maternally inherited DNA (mitochondrial DNA or mtDNA) reveals greater mixing of individuals on the feeding grounds (Baker et al. 2008). This mixing makes it difficult to determine regional population structure and may complicate management decisions. For example, should the feeding grounds be managed as one population unit, or is there evidence that more than one management unit is present? If more than one, are they affected differently by coastal anthropogenic activities, and do they therefore require population-specific management strategies?
With this in mind, I decided to explore the spatial pattern of humpback whales from the Western and Northern Gulf of Alaska, a subset of data collected during the SPLASH Project (Structure of Populations, Levels of Abundance, and Status of Humpbacks; http://www.cascadiaresearch.org/splash/splash.htm). Specifically, I am interested in the following questions:
- Do whales form clusters? Do whales that are more closely related (have the same mtDNA haplotype) cluster together?
- Are there spatial patterns in whale distribution based on depth? Do more closely related whales cluster together based on depth?
- Are there spatial patterns in whale distribution based on slope? Do more closely related whales cluster together based on slope?
The bathymetry layer, GEBCO_08 Grid, version 20091120 (http://www.gebco.net), was used for the depth and slope analyses in the second and third questions. Depth data were extracted using the ArcGIS 10.1 Extract Values to Points tool within the Spatial Analyst Toolbox. Slope values were derived from the bathymetry data using the ArcGIS 10.1 Slope tool and were then extracted with the same Extract Values to Points tool.
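As a rough illustration of what a slope raster holds for each cell, the sketch below applies the 3x3 finite-difference calculation that the ArcGIS documentation describes for the Slope tool. The function name and the sample windows are hypothetical, and the sketch assumes depths and cell size share the same linear units; a grid stored in latitude/longitude would first need projecting or a z-factor.

```python
import math

def slope_degrees(window, cell_size):
    """Slope (degrees) at the center of a 3x3 window of elevations/depths.

    window is [[a, b, c], [d, e, f], [g, h, i]], with rows running north to south.
    Assumes cell_size is in the same linear units as the values in window.
    """
    (a, b, c), (d, _e, f), (g, h, i) = window
    # Weighted rate of change in the x (east-west) and y (north-south) directions
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

# Made-up windows: a flat seafloor and a plane rising 1 unit per cell to the east
flat = [[-100, -100, -100]] * 3
ramp = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
print(slope_degrees(flat, 1.0), slope_degrees(ramp, 1.0))
```

The center cell itself does not enter the calculation; only its eight neighbors do, which is why isolated spikes in the bathymetry show up in the slope of adjacent cells.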
**Results presented here are strictly for the purposes of exploring the functionality of the ArcGIS tools found in the Spatial Statistics Toolbox. They should be considered preliminary and should not be reproduced elsewhere.**
Part 1: Average Nearest-Neighbor Analysis
This tool is based on the null hypothesis of complete spatial randomness and calculates a nearest neighbor index based on the average distance from each feature to its nearest neighboring feature. The nearest-neighbor ratio is calculated as the Observed Mean Distance divided by the Expected Mean Distance and has a value of 1 under complete spatial randomness. Values greater than 1 indicate a dispersed pattern, while values less than 1 indicate a clustered pattern.
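The calculation can be sketched in a few lines of Python. This is a minimal sketch, not the tool's implementation: the function names are hypothetical, the constants 0.5 and 0.26136 are the ones given in the ArcGIS documentation for this tool, and the study area in the usage example is backed out from a reported expected mean distance rather than taken from the original analysis.

```python
import math

def mean_nearest_neighbor_distance(points):
    """Observed mean distance from each point to its nearest neighbor (brute force)."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)

def ann_statistics(observed_mean, n, area):
    """Return (expected mean distance, nearest-neighbor ratio, z-score)."""
    expected_mean = 0.5 / math.sqrt(n / area)      # expectation under spatial randomness
    std_error = 0.26136 / math.sqrt(n * n / area)  # standard error of the mean distance
    ratio = observed_mean / expected_mean
    z_score = (observed_mean - expected_mean) / std_error
    return expected_mean, ratio, z_score

# Usage with summary values of the kind the tool reports (n, observed mean distance).
# ASSUMPTION: area is recovered from a reported expected mean distance of 19537.25 m.
n = 788
area = n * (2 * 19537.25) ** 2
expected, ratio, z = ann_statistics(1913.54, n, area)
print(round(expected, 2), round(ratio, 4), round(z, 2))
```

Because both the expected distance and the standard error depend on the area, the same set of points can yield a different ratio and z-score when the study area changes.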
Clustering in whales
| Haplotype | n | Obs Mean Dist (m) | Exp Mean Dist (m) | N-N Ratio | Z-score | p-value | Pattern |
|---|---|---|---|---|---|---|---|
| All | 788 | 1913.54 | 19537.25 | 0.0979 | -48.443 | 0.00000 | Clustered |
| A- | 202 | 5962.24 | 33738.56 | 0.1767 | -22.385 | 0.00000 | Clustered |
| A+ | 220 | 8753.78 | 34046.85 | 0.2571 | -21.080 | 0.00000 | Clustered |
| A3 | 91 | 7404.13 | 34929.24 | 0.2120 | -14.381 | 0.00000 | Clustered |
| E1 | 73 | 16789.95 | 50701.04 | 0.3312 | -10.932 | 0.00000 | Clustered |
| E3 | 46 | 12843.30 | 34918.80 | 0.3679 | -8.203 | 0.00000 | Clustered |
| F2 | 83 | 14402.10 | 51259.82 | 0.2810 | -12.532 | 0.00000 | Clustered |
The output of this tool indicates that whales, regardless of mtDNA haplotype, are significantly clustered in the Western and Northern Gulf of Alaska. This result is not entirely surprising, given that humpback whales tend to form small groups on the feeding grounds. However, the results of this tool are very sensitive to changes in the study area, and therefore it is best to use this tool with a fixed study area. A fixed study area was not used for the current analysis. Instead, the tool used the area of the minimum enclosing rectangle around the input features, and this area differed for each haplotype, so the ratios and z-scores above are not directly comparable across haplotypes.
Based on the results it seems the average nearest neighbor tool may not be the most appropriate tool for discovering spatial patterns in humpback whales. However, it would be worth running the tool again using a fixed study area before discarding its utility for this data set completely.
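The sensitivity to study area is easy to see numerically. The sketch below is illustrative only: the function name is hypothetical, and the baseline area is an assumption recovered from the expected mean distance reported for the A- haplotype, not the rectangle actually used by the tool.

```python
import math

def nn_ratio(observed_mean, n, area):
    """Nearest-neighbor ratio: observed mean distance over the expected mean distance."""
    expected_mean = 0.5 / math.sqrt(n / area)
    return observed_mean / expected_mean

observed_mean, n = 5962.24, 202          # A- haplotype values from the table
base_area = n * (2 * 33738.56) ** 2      # ASSUMPTION: area implied by the reported expected distance

# Same whales, same observed distances; only the study area changes.
for factor in (1.0, 2.0, 4.0):
    print(factor, round(nn_ratio(observed_mean, n, base_area * factor), 4))
```

Since the expected distance grows with the square root of the area, quadrupling the study area halves the ratio, which makes the same points look twice as clustered.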
Alternatively, it would be worth conducting a refined nearest-neighbor analysis for each mtDNA haplotype, in which the statistic of interest is the complete distribution function of all observed nearest-neighbor distances (not just the mean nearest-neighbor distance) and complete spatial randomness is tested at a specified distance. This method is not currently available within the ArcGIS Spatial Statistics Toolbox and would need to be conducted in another software package such as R.
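To make the idea concrete, here is a minimal sketch in plain Python rather than R. It compares the empirical distribution of nearest-neighbor distances (often called the G function) with its theoretical form under complete spatial randomness. The function names, the toy coordinates, and the study area are all hypothetical, and no edge correction is applied, which a real analysis would need.

```python
import math

def nearest_neighbor_distances(points):
    """Distance from each point to its nearest neighbor (brute force)."""
    return [
        min(math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i)
        for i, (xi, yi) in enumerate(points)
    ]

def empirical_g(nn_distances, d):
    """Fraction of points whose nearest neighbor lies within distance d."""
    return sum(1 for dist in nn_distances if dist <= d) / len(nn_distances)

def csr_g(d, n, area):
    """Theoretical G(d) under complete spatial randomness with intensity n / area."""
    intensity = n / area
    return 1.0 - math.exp(-intensity * math.pi * d * d)

# Toy example: three tightly grouped points and one outlier in a 20 x 20 area
points = [(0, 0), (1, 0), (0, 1), (10, 10)]
nn = nearest_neighbor_distances(points)
for d in (0.5, 1.0, 2.0):
    print(d, empirical_g(nn, d), round(csr_g(d, len(points), 400.0), 4))
```

An empirical curve that rises above the theoretical one at short distances suggests clustering at that scale; judging significance would additionally require simulation envelopes or a formal test.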
Part 2: Hot Spot Analysis
This tool uses the Getis-Ord Gi* statistic to identify statistically significant hot spots (clusters of high values) and cold spots (clusters of low values) given a set of weighted features. For each feature in the data set, a Gi* statistic is returned as a z-score. The larger the positive z-score, the more intense the clustering of high values (hot spot). The smaller the negative z-score, the more intense the clustering of the low values (cold spot).
Figure 1. The output scale for the hot spot analysis tool. When interpreting the results, it is useful to remember that a feature mapped as bright red may not be because its value is particularly large but because it is part of a spatial cluster of high values. Conversely, a feature mapped as bright blue may not be because its value is particularly small but because it is part of a spatial cluster of low values. Thus, the more positive a z-score is, the hotter the hot spot (darker red), while the more negative a z-score is, the colder a cold spot (darker blue).
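For readers curious about what sits behind the z-scores, the following is a minimal sketch of the Gi* calculation using the formula given in the ArcGIS documentation. It assumes a fixed distance band with binary weights (each feature counts itself as a neighbor); the function name, coordinates, and values are made up for illustration and are not the whale data.

```python
import math

def getis_ord_gi_star(points, values, distance_band):
    """Return one Gi* z-score per feature, using binary fixed-distance weights."""
    n = len(values)
    mean = sum(values) / n                                    # global mean
    variance = max(0.0, sum(v * v for v in values) / n - mean * mean)
    s = math.sqrt(variance)                                   # global standard deviation
    z_scores = []
    for xi, yi in points:
        # Weight 1 for every feature within the band (the feature itself included), else 0
        weights = [1.0 if math.hypot(xi - xj, yi - yj) <= distance_band else 0.0
                   for xj, yj in points]
        sum_w = sum(weights)
        sum_w2 = sum(w * w for w in weights)
        numerator = sum(w * v for w, v in zip(weights, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # Undefined when all values are equal or every feature is a neighbor
        z_scores.append(numerator / denominator if denominator > 0 else float("nan"))
    return z_scores

# Made-up transect: high values at one end, low values at the other
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
values = [10, 10, 10, 1, 1, 1]
for z in getis_ord_gi_star(points, values, distance_band=1.5):
    print(round(z, 3))
```

The numerator compares the local sum within the distance band to what the global mean would predict for that many neighbors, which is why a feature with an unremarkable value can still be flagged when its neighbors are consistently high or low.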
Spatial patterns in whale distribution based on depth
Figure 2. Results of hot spot analysis for all whales (n=799) based on depth (m), no mtDNA considered.
The output from this tool shows the presence of several hot and cold spots regardless of mtDNA haplotype (Figure 2). The hot spots (red) indicate that whales in these areas occur at shallower depths and the results are statistically significant. There are also several statistically significant cold spots (blue) where whales are found at deeper depths, often beyond the continental shelf.
Figure 3. Results of hot spot analysis by haplotype based on depth (m).
The output by haplotype also shows the presence of several hot and cold spots, although the location of each varies by haplotype (Figure 3). The A+ and A- haplotypes show statistically significant hot spots in the Northern Gulf of Alaska, while the E1 and F2 haplotypes show a less intense, although still significant, cluster in the same region. The E1 haplotype also shows a significant hot spot in the Western Gulf of Alaska. These hot spots reflect whales clustering by haplotype at shallower depths. The A3 and E3 haplotypes have relatively little clustering: no hot spots and a very small cold spot in the western region. In general, for all haplotypes, cold spots are located in the western region or beyond the continental shelf where whales cluster at deeper depths.
Spatial patterns in whale distribution based on slope
Figure 4. Results of hot spot analysis for all whales (n=799) based on slope (degrees), no mtDNA considered.
The output from this tool shows the presence of several hot and cold spots regardless of mtDNA haplotype (Figure 4). The hot spots (red) indicate that whales in these areas occur at steeper slopes and the result is statistically significant. There are also several statistically significant cold spots (blue) where whales are found at flatter slopes.
Figure 5. Results of hot spot analysis by haplotype based on slope (degrees).
The output by haplotype also shows the presence of several hot and cold spots, although the location of each varies by haplotype (Figure 5). The A+, A-, A3 and F2 haplotypes show statistically significant hot spots in the Northern Gulf of Alaska while the A3, F2, and E3 (to a lesser extent) haplotypes also show hot spots in the western region. These hot spots reflect whales clustering by haplotype at steeper slopes. The A+, A-, A3, and F2 haplotypes have statistically significant cold spots in the northern region, while a cold spot for the E1 haplotype occurs in the western region. These cold spots reflect whales clustering by haplotype at flatter slopes.
Reflecting on my results, I initially thought perhaps the hot/cold spot patterns found might be influenced by the uneven sampling effort and differences in sample size. However, on 23 May 2013 Lauren Scott from Esri commented on this very subject in response to a posting by Jen Bauer (http://blogs.oregonstate.edu/geo599spatialstatistics/2013/04/24/discerning-variables-spatial-patterns-within-a-clustered-dataset/#comment-1393). Lauren stated that even if sampling is uneven (e.g., many samples are taken from some areas, while fewer samples are taken at others), the impact to the results of a hot spot analysis will be minimal. She provided the following for further clarification. In areas with many samples, the tool will have more information to compute its result. The tool will “compare the local mean based on lots of samples to the global mean based on ALL samples for the entire study area and decide if the difference is statistically significant or not”. In areas with fewer samples, “the local mean will be computed from only a few observations/samples… the tool will compare the local mean (based on only a few pieces of information) to the global mean (based on ALL samples) and determine if the difference is significant”. Thus, my concern seems to be unwarranted.
In general, the hot spot tool seems to be more useful than the average nearest neighbor tool for the humpback whale data set used here. Statistically significant clustering of whales occurs with and without consideration of mtDNA for both depth and slope. Although preliminary, the results from this tool highlight areas for further investigation using additional spatial analysis techniques.
Challenges discovered with the ArcGIS Spatial Statistics Toolbox
My biggest challenges using the ArcGIS Spatial Statistics Toolbox are twofold. First, many of the tools require a numeric variable (either continuous or discrete) and do not support categorical variables, such as mtDNA haplotype, “out of the box”. Thus, in order to look for spatial patterns in haplotypes, I had to split the data up by haplotype, create a separate feature class for each haplotype, and then run the tool several times to get my results. Given that I was working with a small data set, the repetition was relatively painless, but I am certain it would be useful to have this process automated (perhaps using ModelBuilder or Python scripting). Not only would this speed up processing, but it would also reduce the chance of human-induced error. Second, the hot spot analysis only allows for the input of one variable at a time. What if one suspected that the spatial pattern of humpback whales (with or without mtDNA consideration) is related to depth and another environmental variable (e.g., sea surface temperature, productivity, or currents)? I believe this type of analysis would need to be conducted in another software package such as R.
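The split-and-repeat workflow described above lends itself to a simple loop. The sketch below shows the pattern in plain Python with made-up records; the function name, field names, and values are hypothetical. In ArcGIS the `analysis` callback would instead wrap the relevant geoprocessing calls, which are not shown here.

```python
from collections import defaultdict

def run_by_category(records, category_field, analysis):
    """Group records by a categorical field and run the same analysis on each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[category_field]].append(record)
    # One result per category, in a stable (sorted) order
    return {category: analysis(rows) for category, rows in sorted(groups.items())}

def mean_depth(rows):
    """Stand-in analysis: average depth of the sightings in one group."""
    return sum(row["depth_m"] for row in rows) / len(rows)

# Made-up sightings for illustration only
sightings = [
    {"haplotype": "A+", "depth_m": -80.0},
    {"haplotype": "A+", "depth_m": -120.0},
    {"haplotype": "E1", "depth_m": -300.0},
    {"haplotype": "A-", "depth_m": -95.0},
]
for haplotype, result in run_by_category(sightings, "haplotype", mean_depth).items():
    print(haplotype, result)
```

Keeping the grouping separate from the analysis means the same loop can be reused for depth, slope, or any other per-haplotype run without copying feature classes by hand.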
~~~~~~~~~~~~~~~~~~~~~
Baker, C. S., D. Steel, J. Calambokidis, J. Barlow, A. M. Burdin, P. J. Clapham, E. Falcone, J. K. B. Ford, C. M. Gabriele, U. González-Peral, R. LeDuc, D. Mattila, T. J. Quinn, L. Rojas-Bracho, J. M. Straley, B. L. Taylor, J. Urbán Ramírez, M. Vant, P. R. Wade, D. Weller, B. H. Witteveen, K. Wynne, and M. Yamaguchi. 2008. geneSPLASH: an initial, ocean-wide survey of mitochondrial (mt) DNA diversity and population structure among humpback whales in the North Pacific. Final report for Contract 2006-0093-008, submitted to National Fish and Wildlife Foundation.
Calambokidis, J., E. A. Falcone, T. J. Quinn, A. M. Burdin, P. J. Clapham, J. K. B. Ford, C. M. Gabriele, R. LeDuc, D. Mattila, L. Rojas-Bracho, J. M. Straley, B. L. Taylor, J. Urbán, D. Weller, B. H. Witteveen, M. Yamaguchi, A. Bendlin, D. Camacho, K. Flynn, A. Havron, J. Huggins, N. Maloney, J. Barlow, and P. R. Wade. 2008. SPLASH: Structure of Populations, Levels of Abundance and Status of Humpback Whales in the North Pacific. Final report for Contract AB133F-03-RP-00078 from U.S. Dept of Commerce.
Calambokidis, J., G. H. Steiger, J. M. Straley, L. M. Herman, S. Cerchio, D. R. Salden, J. Urbán R., J. K. Jacobsen, O. von Ziegesar, K. C. Balcomb, C. M. Gabriele, M. E. Dahlheim, S. Uchida, G. Ellis, Y. Miyamura, P. Ladrón de Guevara P., M. Yamaguchi, F. Sato, S. A. Mizroch, L. Schlender, K. Rasmussen, J. Barlow, and T. J. Quinn II. 2001. Movements and population structure of humpback whales in the North Pacific. Marine Mammal Science 17:769–794.
Great analysis, Dori!
It’s interesting that A- had the lower z-score for you too! It would be interesting to see what results you’d find with sex-biased clustering in this region of Alaska. In SEAK, males were more clustered than females and I’m curious to know if it would be the same in the Gulf. I really liked that you used slope and depth to tease out patterns with this data and I think you’re on to something with thinking about running a clustering analysis using two environmental variables. Very cool stuff!
Cheers,
Sophie
Thanks Sophie,
After seeing your analysis, I am definitely curious about whether or not there are differences in distribution by sex and will give that a whirl over the summer. I’ll let you know what happens.
– Dori
Nice analysis and use of the hotspot tool. It looks like from your analysis that this tool might have the same limitations that other spatial statistical tools have with regards to the need for adequate data density to ensure meaningful results. Good luck with your future analyses!