I am working with a humpback whale dataset collected across the North Pacific from 2004 to 2006. Because the spatial extent is so large, I have selected a subset of data from the Gulf of Alaska (GOA), and I would like to look for spatial patterns in the genetic diversity of the whales sighted there in relation to their environment. Complicating the problem, most of the data were collected opportunistically, so the spatial distribution of whale sightings reflects where researchers happened to collect data rather than whether environmental variables influence humpback whale habitat use.
Figure 1. North Pacific humpback whale sightings from SPLASH. The data include > 18,000 photo-identification records and 2,700 DNA profiles for 8,000+ unique individuals.
Figure 2. A subset of the SPLASH data for the Northern and Western Gulf of Alaska. The data subset includes 2,622 records (both photo-identification and DNA profiles) for 1,448 unique individuals.
Ultimately, I need a method that gets beyond the uneven (non-systematic) sampling effort so that I can determine whether the data show any spatial pattern related to genetics and environmental features (e.g., depth and slope). Two (among many) working hypotheses:
- Humpback whales are found in clusters at a particular depth or slope range.
- Humpback whales that share the same haplotype (maternally inherited mitochondrial DNA) cluster together.
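
One possible way to approach the haplotype hypothesis despite the uneven effort is a label-permutation test: keep every sighting location fixed (so the geography of sampling effort is identical in the observed data and in the null) and shuffle only the haplotype labels. The sketch below is an illustration, not a settled method. It assumes one location per individual in projected coordinates (for example kilometres), and the function names and the nearest-neighbour statistic are my own hypothetical choices.

```python
import numpy as np

def nearest_neighbour_index(coords):
    """Index of each sighting's nearest other sighting (Euclidean distance)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a sighting is not its own neighbour
    return dist.argmin(axis=1)

def haplotype_permutation_test(coords, haplotypes, n_perm=999, seed=0):
    """Test whether nearest neighbours share a haplotype more often than
    expected when labels are shuffled over the same fixed locations."""
    coords = np.asarray(coords, dtype=float)
    haplotypes = np.asarray(haplotypes)
    rng = np.random.default_rng(seed)
    nearest = nearest_neighbour_index(coords)

    def statistic(labels):
        # Fraction of sightings whose nearest neighbour has the same label.
        return float(np.mean(labels == labels[nearest]))

    observed = statistic(haplotypes)
    null = np.array([statistic(rng.permutation(haplotypes))
                     for _ in range(n_perm)])
    # One-sided p-value; the +1 terms count the observed arrangement itself.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value

# Synthetic example (made-up data, not SPLASH records):
# two distant groups of sightings, each with its own haplotype.
rng = np.random.default_rng(42)
demo_coords = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=1.0, size=(30, 2)),
    rng.normal(loc=(100.0, 100.0), scale=1.0, size=(30, 2)),
])
demo_haplotypes = np.array(["A"] * 30 + ["B"] * 30)
observed, p_value = haplotype_permutation_test(demo_coords, demo_haplotypes)
print(f"same-haplotype neighbour fraction = {observed:.2f}, p = {p_value:.3f}")
```

Because the locations never move, a small p-value here cannot be produced by survey effort alone. It could still reflect repeated sightings of the same animal or of related animals travelling together, so records would need to be collapsed to one location per individual first.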