Tag Archives: Courtney

Courtney’s Final Project Post

Research Question

  • “How is the spatial pattern of ion and isotope concentrations in wells tapping the basalt aquifer related to the spatial pattern of mapped faults via the mechanism of groundwater flow as determined by hydraulic transmissivity of the geologic setting?”

Description of dataset that I examined

  • A: In my research I have analytical data for 31 wells, whose XY locations were determined by field confirmation of the Oregon Water Resource Department (OWRD) well log database. As groundwater is a 3D system, I have to consider Z values as well. The well depths and lithology information are also from the OWRD database. My analytical data provides a snapshot of water chemistry during the summer of 2018. I have only one temporal data point per well. At all 31 wells, I collected samples to be analyzed for pH, temperature, specific conductivity, oxygen isotopes 16 and 18, and hydrogen isotopes 1 and 2. At a subset of 18 of those wells I collected additional samples for tritium, carbon 14, and major ion analysis.
  • B: The shapefile of faults mapped at the surface was created by Madin and Geitgey of the USGS in their 2007 publication on the geology of the Umatilla Basin. There is some uncertainty in my analysis as far as extending this surface information into the subsurface. USGS studies have constrained proposed ranges of dip angles for the families of faults that I am studying, but not exact angles for any single mapped fault.
  • C: results of pumping interference tests involving 29 wells, 12 of which I had chemical data for. The data was collected by the OWRD in 2018 and 2019.

Hypotheses

  • Faults can act as conduits or barriers to groundwater flow, depending on how their transmissivity compares to the transmissivity of the host rock.
  • I hypothesize that clusters of similar chemical and isotopic properties of groundwater can indicate a shared aquifer unit/compartment, and that if faults separate clusters then the fault is influencing that difference in chemical/isotopic signatures. If the fault is between two clusters, I hypothesize that it is acting as a barrier. If it crosses through a cluster, I hypothesize that it acts as a conduit.
  • Where faults act as barriers, I hypothesize that parameter values will differ in groups on either side of a fault. Specifically, a barrier fault might cause older, warmer water to rise into upper aquifer layers, and the downstream well might show a signature of more local recharge.
  • Where faults act as conduits, I hypothesize that water chemistry and isotopes of samples from wells on either side of the fault would indicate a relatively direct flowpath from the upstream well to a downstream well. Over a short distance, this means that ion and isotope concentrations would not differ significantly in wells across the fault.
  • My hypotheses depend on a “barrier” fault damming groundwater flow up-gradient of the fault, and compartmentalizing local recharge on the down-gradient side. This hypothesis is only relevant if the fault is roughly perpendicular to the flow direction, and so disrupting transmissivity between a recharge zone and the wells. If a fault that separates two wells is parallel to the flow direction and there is no obstacle between the wells and upstream recharge areas, then the fault might indeed limit communication between the wells but they will have similar chemical signatures. Wells separate by this second kind of fault barrier would be better evaluated by a physical test of communication, such as a pumping interference test.

Analysis Approaches

  • Principal component analysis: used to simplify the multivariate data set (19 variables!) into variable relationships that could represent aquifer processes
  • Analysis of PCA results compared to distance from a flow path
    • Interpolation of well water levels classified by well stratigraphy to estimate a potentiometric surface and evaluate groundwater flow directions.
    • Raster calculations to compare flow direction to fault incidence angle
    • Measuring distance from each well to the nearest fault along the flow path
    • simple linear regression, comparing Non-ion PC1 score of a well with its distance from a fault.
  • Two-sided T-tests comparing distance between wells, presence of a fault, and pumping interference test communication between wells
  • ANOVA F-tests comparing chemical and isotopic variance within groups of wells that communicate with each other and between those groups.

Results

  • Principal component analysis – Patterns of variation between wells are statistically explained primarily by total ion concentration, a relationship between chemical evolution from ion exchange and decreasing stable isotope ratios, and the combination of well depth and water temperature. Moran’s I indicates that only Non-ion PC2 is spatially clustered, while the other PC models have a distribution not significantly different than random. The other PC models are useful to understand the groundwater system, but not specifically to analyze clustering correlated to faults.
  • Interpolation of water level, and comparison of fault incidence angle with flow direction, indicates faults that are and are not able to be tested by my hypotheses.
  • Analysis of PCA results compared to distance from a fault along flow path – some wells that are “within” a fault zone have very old signatures and others have very young signatures. This could be related to the angle of the dip of the fault and the accuracy of mapping compared to the depth of the well. I hypothesize that the wells that are in the fault zone but have high PC1 scores are on the up-gradient side of the fault where older water is upwelling along a barrier. Wells in fault zones with low PC1 scores could indicate wells open to downgradient areas of the fault, where vertical recharge through the fault damage zone is able to reach the well.
  • Returning to the conclusions I wrote in that blog post after I found improved stratigraphic data, I’m not sure if I can make conclusions other than those about the wells are that mapped as “inside” a fault. Several wells that are closer to faults are also open to shallower aquifer units, and so the effect of lower PC1 scores closer downgradient to faults might be confounded by lower PC1 scores caused by vertical inflow from the sedimentary aquifer and upper Saddle Mountain aquifer.
  • Two-sided T-tests comparing distance between wells, presence of a fault, and communication between wells show that the presence of a fault has a greater effect on communication than the distance between the wells.
  • ANOVA F-tests comparing chemical and isotopic variance within groups of wells that communicate with each other and between those groups – stable isotopes and specific conductivity both show more variation between well groups than within well groups.
  • Not covering in these blog posts, I also ran Moran’s I on my inputs to see which ones are clustered and so might be more related to horizontal grouping factors (such as faults) than vertical grouping parameters (such as stratigraphic unit). Of the PCA and individual variables, only d18O, d2H, and Non-ion PC2(combination of well depth and water temperature) were clustered. The other PCA models, temperature, pH, and specific conductivity were not significantly spatially clustered.

Significance –  Groundwater signatures are related to faults agree/disagree with past understandings of differences between wells in the region, and can inform well management. If a senior permit holder puts a call on the water right and asks for junior users to be regulated off, it would not help that senior user if on of those junior permit holders’ wells is not hydraulically connected to the senior users.

  • More wells would need to be sampled to be better able to disentangle the effects of faults from the effects of well stratigraphy.

My learning – I learned significantly more about how to code and troubleshoot in R. Additionally, I learned about the process of performing spatial clustering analysis in ArcGIS.

What did I learn about statistics?

  • PCA was completely new to me, and it’s a cool method for dealing with multivariate data once I dealt with the steep learning curve involved in setting it up and interpreting the results. It was useful getting more practice performing and interpreting t-tests and Anova F-tests. I had not used spatial clustering before, and learning how to apply it was interesting. It gave me a much more concrete tool to try to disentangle the patterns in my effectively 3D data on X,Y plane, as opposed to the Z direction.

Ex 3: Relating distance between wells to communication type and fault presence

Question that I asked:

Is there a relationship between the distance between wells, their communication status based on a pumping interference test, and whether or not they are separated by a fault?

Name of the tool or approach used:

Polyline creation and classification in ArcGIS Pro, boxplot creation and two-sided t-tests in R.

Method:

29 wells in my study area were evaluated by the Oregon Water Resources Department during pumping interference tests in 2018 and 2019. This test involves pumping one well, and seeing whether the water levels in nearby wells drop in response. I received a verbal account of the wells that did and did not communicate, sketched it on a map, and then transferred that information to ArcIS Pro. I drew polylines using the well locations as snapped endpoints. Then, I created and populated fields containing the well communication type (“communicate” for wells that respond to pumping at a nearby well, and “does not communicate” for wells that do not) and whether or not the path between two wells crosses a fault. Shape_Length in feet was automatically calculated when I created the polylines, on account of the projection I used for the shapefile.

I exported that data table to a csv and imported it in R, where I subset it into three categories: all paths, paths between wells that communicate, and paths between wells that do not communicate. I then created box plots and ran t-tests to see differences between means and distributions of path length based on communication type or fault length.

Results:

Comparing the path length and the communication type of all 29 wells involved in the communication test, there is not significant evidence of a difference in mean path length between wells that do and do not communicate because the p-value of a two-sided t-test was 0.152. While the mean distance between wells that do not communicate is larger than the mean distance between wells that do communicate, the overlapping interquartile ranges in both categories make this difference less significant. There is not clear evidence that distance plays a role in well communication.

There is some evidence for a difference in mean path lengths between wells that do and do not cross faults, based on a p-value of 0.047 in a two-sided t-test. The mean path length that crosses a fault is 5,139 ft, while the mean path length that does not cross a fault is 3,608 ft. Wells that are closer together are less likely to be separated by a fault.

For wells that do communicate, there is evidence of a difference between the mean path lengths that cross faults and the mean path lengths that do not cross faults. The p-value for a two-sided t-test was 0.024. Wells that communicate but are not separated by a fault are more likely to be closer together than wells that are separated by a fault.

For wells that do not communicate, there is no evidence of a difference in mean path lengths between paths that do and do not cross faults, given a p-value of 0.98 in a two-sided t-test. Wells that do not communicate are likely to be separated by the same mean distance whether or not they are separated by faults, although there is a larger range of path length values for wells separated by a fault that do not communicate.

 

Summary of results:

Wells that communicate in pumping tests do not have a significantly different mean distance between them than wells that do not communicate (p = 0.152)

Wells that are closer together are less likely to be separated by a fault. (p = 0.047)

Wells that communicate but are not separated by a fault are more likely to be closer together than communicating wells that are separated by a fault. (p = 0.024)

Wells that do not communicate are likely to be separated by the same mean distance whether or not they are separated by faults, although there is a larger range of path length values for non-communicating wells separated by a fault. (p = 0.98)

Critique: I wish I had more sample points and paths to work with, so I could use a more interesting analysis such as ANOVA.

Courtney’s EX2: Comparing faults and principal components

Question asked in this exercise:

How does ion principal component 2 at a well vary with the well’s distance from faults along the groundwater flow path?

In EX1, I used principal component analysis to evaluate how parameters accounted for variance between the wells I studied. Based on my knowledge of how chemistry varies as water flows through the basalt, ion PC2 accounts for variance caused by ion exchange between the basalt and the groundwater, with increasing sodium ion concentration/pH and decreasing calcium/magnesium ion concentrations as the water spends more time underground.

In this exercise, I estimated the groundwater flow directions in my study area using interpolation, calculated fault incidence direction, calculated angular difference between flow direction and fault direction, and then manual measurement of the distance between each well and the distance to the nearest fault segment that had flow direction within 45 degrees of parallel to the fault along the estimated flow path.

Name of tool or approach used:

Interpolation, reclassification, raster math, distance measurement in ArcGIS Pro

Methods:

Input data:

  • 2018 static water levels in wells provided by Oregon Water Resource Department (OWRD)
  • Well lithology from OWRD groundwater information system (GWIS)
  • Well seal depth from well logs accessed through OWRD GWIS
  • Fault polyline shapefile from Madin and Geitgey 2017
  • Well locations from OWRD database, with ion concentration information based on my sampling in the summer of 2018

Steps:

  1. Classified wells in the static water level dataset by the basalt formation that they were open to, based on lithology and seal depth. Excluded wells that lacked this information.
    1. Output: wells classified as open to Saddle Mountain Basalt (Smb) Wanapum Basalt (Wb), Grande Ronde Basalt (Grb), or both Wanapum and Grande Ronde Basalt (WbGrb). These classifications were joined to the static water level information.
  2. Created an interpolation surface for static water levels of wells open to both the Wanapum and Grande Ronde. I ignored the Saddle Mountain formation, since the wells that I had sampled were not open to it. I used kriging with a cell size of 200 ft, and this created an estimated potentiometric surface for these wells. The interpolation was a bit ugly because my wells were not ideally distributed for this.
    1. I tried creating interpolations based on other combinations of formations to better approximate the potentiometric surfaces posited by past studies in the region. I ended up creating two potentiometric surfaces: one using wells that were only open to the Wanapum Basalt (Wb), and a second using wells that were open only the Grande Ronde Basalt as well as those that were open to both the Grande Ronde Basalt and Wanapum Basalt(WbGrb_Grbonly).
  3. Calculated flow direction in the two aquifer groups – Wb and WbGrb_Grbonly
    1. Used the Hydrology toolbox to fill sinks and then calculate flow direction.
    2. This creates out a raster with eight possible values between 1 and 256.
    3. I then reclassified it so the values corresponded to the eight primary cardinal directions (N, NE, E, SE, S, SW, W, NW), which range from 0 to 360 degrees
  4. Calculated fault incidence angle
    1. Added a cell to the polyline attribute table called “angle”
    2. Split the polyline at each vertex, which creates a new shapefile
    3. Used the field calculator and a python script to assign an angular value between 0 and 359 to each fault polyline segment
    4. Performed the polyline to raster function, using the angle as the cell value. I used a cell size of 200 ft.
    5. This created a raster where only pixels that include part of a fault polyline had values, and those values ranged between 0 and 359.
  5. Subtracted the flow direction raster from the fault incidence angle raster, in order to create fault line pixels that had values that reflected the difference between the fault incidence angle and flow direction. I did this twice, once each for the Wb and WbGrb_Grbonly rasters
    1. Reclassified this raster so that pixels with fault direction within 45 degrees of parallel to the flow direction were 0, and the pixels with fault direction with 45 degrees of perpendicular to the flow direction were 1.
    2. I then added the WB and WbGrb_Grbonly results together, so that pixels with fault direction within 45 degrees of parallel to the flow direction in both were 0, and the pixels with fault direction within 45 degrees of perpendicular in either raster to the flow direction were 1, and pixels with fault direction within 45 degrees of perpendicular in both rasters were 2. I named this allwbgrb_ff.
  6. On a map layout, I added my sampled sites with PCA data, the interpolated potentiometric surface for wells open to the Wanapum and Grande Ronde aquifers, and allwbgrb_ff.
    1. I added a field to the sampled sites with PCA data called “dist_from_fault”
    2. Using the measure tool, I measure the distance from each well along the path most perpendicular to potentiometric contours to the nearest allwbgrb_ff pixel with a value of 1 or 2.
    3. This was subjective because the potentiometric surface is imperfect because of the erratic spacing of wells. In areas where the potentiometric surface had noticeable glitches, I used my own judgement based on topography and literature about groundwater flow direction in the region.
  7. I then graphed the “dist_from_fault” against the ionsPC2 category.

Discussion and Results:

I hypothesized that ion PC2 would increase with decreasing potentiometric surface elevation. An increased score in ionPC2 indicates an elevation sodium concentrations and pH and a decrease in calcium and magnesium caused by a progressive ion exchange reaction between the groundwater and the basalt. Because water flows from higher potentiometric elevations to lower potentiometric elevations, I would expect water samples from lower potentiometric elevations to show chemical evidence of increased interaction with the basalt. If this process of down-gradient groundwater flow were the only process influencing the ion exchange reactions, the well symbols on the map below would become progressively darker as the interpolated well level surface elevation decreased and the wells were further from the up-gradient recharge zone.

However, upon examining a map of ion PC2 values this is not the case – there are anomalously low values of PC2 in the valley, where one would expect to see increased values if groundwater flowed uninterrupted from the up-gradient recharge zones. In this study I introduced the variable of distance from faults in order to test another hypothesis: that faults compartmentalized groundwater flow, blocking lateral flow through the aquifer while promoting vertical permeability and modern recharge into the down-gradient aquifer. I also hypotehsized that if a fault was a barrier, PC2 values up-gradient of the fault would be elevated as the fault trapped water behind it. This would result in more chemically evolved groundwater backed up behind the fault, and less chemically evolved groundwater down-gradient of the fault.

The results of this study tentatively support the conceptual model of fault compartmentalization. In particular, water samples from wells in the valley down-gradient of the fault zones have evidence of less exposure to the basalt than wells further towards the recharge zones. 15 of the 18 wells sampled show a positive correlation between distance from a fault along a flow path and ion PC2 score, especially when graphed points are compared to their up-gradient and down-gradient neighbors (i.e. 57946 being up-gradient from 57235 and down-gradient from 54277).

Because of the width of the raster cells indicating faults and their flow direction in this model, four wells ended up have a distance from faults of 0. This does not seem unreasonable, because examination of exposed fault zones in the area indicated that many are up to a couple hundred meters wide. Additionally, while geological studies of the area indicate that the faults are close to vertical, their exact dip angles are unknown and this introduces a certain amount of uncertainty about their location at depth. Ion PC2 values of wells with dist_from_fault = 0 show an interesting dichotomy of either very high (4167, 4179) or very low (3929, 3962) values. I believe this indicates that wells 4167 and 4179 are up-gradient of faults that are acting as barrier, while 3929 and 3962 are slightly down-gradient or within the fault damage zone itself.

map of fault locations, potentiometric surface, and wells, plus a chart of distance vs ion PC2

Critique of Method:

This is admittedly a crude method of estimating groundwater flow direction. However, a certain amount of estimation is necessary to model a system with relatively few and unevenly spaced measured water level points in a hydrogeologic regime with many individual water-bearing layers of lava interflow zones that are unpredictably connected. I wish I had been able to find an automated way to measure well distance from faults along the groundwater flow path, which would have taken away some of the subjectivity of measuring the flow paths by hand the old-fashioned way.

Because of the uneven spacing of wells used for interpolation, the flow direction rasters that I created have glitchy areas and some of these areas of physically improbable values influenced the Fault and GW Flow Direction raster. I reclassified that raster into broad categories to avoid creating a false sense of precision.

Courtney’s Ex 1: Spatial patterns of ion concentrations in groundwater

1: Question being asked

How do ion and isotope concentrations vary spatially? And what trends in concentrations account for variability between my samples?

For detailed background information, I point you towards my problem statement for this class. In short, I’m hoping to use geochemical and isotope values for wells 32 well I sampled last summer in order to better understand how groundwater flows across faults in the Walla Walla Subbasin in northeastern Oregon. In the analysis for this post, I only used a subset of 18 wells for which I had the largest suite of measured parameters.

2: Approach used

For this exercise, I’m investigating spatial patterns of my ion and isotope concentrations. I could do this by individually examining distributions of all 19 of the parameters that I sampled for… but that would be graphically overwhelming and not terribly useful. It’s too difficult to draw conclusions by comparing 19 different maps of distributions.  Instead, I’m using a method called principal component analysis (PCA for short) to statistically evaluate my samples by the groups of parameters that account for large amounts of the variation that differentiates one sample from another.

The nuts and bolts of PCA involve sophisticated analysis that calculates eigenvectors, linear algebra, and n-dimensional properties of a data set, and I will let people more knowledgeable than me explain this if you’re curious about how exactly PCA works.

https://blog.bioturing.com/2018/06/14/principal-component-analysis-explained-simply/

R spits out two kinds of useful graphic output for PCA based on the concept of a “biplot”. The first is called a loading plot, where two principal components are graphed against each other, for example PC1 and PC2. The value for each parameter in PC1 and PC2 is used to give the point a cartesian coordinate which defines a vector in comparison to the origin. The second kind of output is called a scree plot, which applies the same concept to the sample points (called “individuals” in the code) instead of to the  parameters (called “variables” in the code). I used an R package called ggbiplot to combined the scree and loading plots in one figure.

https://blog.bioturing.com/2018/06/18/how-to-read-pca-biplots-and-scree-plots/

https://learnche.org/pid/latent-variable-modelling/principal-component-analysis/interpreting-score-plots-and-loading-plots

3: Methods

I used R to calculate the principal component vectors for my data and to create charts for them, and then symbolized my data in ArcGIS Pro to see if there was a spatial pattern to the information in the PCA output. The code in R is fairly manageable. I’ve included it below with my comments.

# Source: https://www.datacamp.com/community/tutorials/pca-analysis-r

# Preliminary set up stuff – add libraries, set working directory

library(devtools)

library(ggbiplot)

setwd(“~/_Oregon State University/Thesis/Summer Fieldwork/PCA”)

# Before importing your data, make sure that every entry in its table has a numeric entry in it.

# This process will spit out garbage if you have any blank cells or text.

# The number of rows in the data set needs to exceed the number of columns. I used an argument like [,c(1:11)] to enforce this in my data

#Process 1 of 2: read data set 1 into R, view it, compute PCA, display summaries and biplots

# it’s important to set the row names so we can use them as labels in ggbiplot

PCAdata.ions2 <- read.csv(“ions_allchem20190401_usgsdwnld.csv”, row.names = 1)

View(PCAdata.ions2)

Ions2.pca <- prcomp(PCAdata.ions2[,c(1:11)], center = TRUE, scale. = TRUE)

summary(Ions2.pca)

ggbiplot(Ions2.pca, choices=c(1,2), labels=rownames(PCAdata.ions2))

ggbiplot(Ions2.pca, choices=c(1,3), labels=rownames(PCAdata.ions2))

#read data set 2 into R, view it, compute PCA, display summaries and biplots

PCAdata.nonions <-read.csv(“NonIons_allchem20190401_usgsdwnld.csv”, row.names = 1)

View(PCAdata.nonions)

NonIons.pca <- prcomp(PCAdata.nonions[,c(1:8)],center = TRUE, scale. = TRUE)

summary(NonIons.pca)

ggbiplot(NonIons.pca, labels=rownames(PCAdata.nonions))

ggbiplot(NonIons.pca, choices=c(1,3), labels=rownames(PCAdata.nonions))

# if you want plots other than PC1 vs PC2 or PC1 vs PC3 just change the argument “choices = (1,2)”

# Process 2 of 2 add tools to get individual coordinates

# Source http://www.sthda.com/english/wiki/get-pca-extract-the-results-for-individuals-variables-in-principal-component-analysis-r-software-and-data-mining

# only install the package if you haven’t before. You will need to have RTools 3.5 already installed

# install.packages(“devtools”)

# library(“devtools”)

# install_github(“kassambara/factoextra”)

# now run things, it will spit out tab-delimited data. The “ind” command outputs PC information for the individuals, and the “var” command outputs the PC information for the variables.

library(factoextra)

ind <- get_pca_ind(Ions2.pca)

ind$coord

ind2 <- get_pca_ind(NonIons.pca)

ind2$coord

# copy that info into text files, import them into csv format, join in Arc!

var1 <- get_pca_var(Ions2.pca)

var1$coord

var2< get_pca_var(NonIons.pca)

var2$coord

Next, I symbolized my data in ArcGIS Pro. Because I only had 18 samples, it wasn’t too onerous to use selection in the attribute table to export data to another layer to symbolize the groups I saw in the principle component analysis biplot.

To examine how the principle components affected individual points, I used the second process in the above code to generate a tab-delimited chart of the principal component loadings for each sample. I copied this into a .txt file, imported it into Excel to create a CSV, and then joined that CSV to my shapefile of well locations. For the variables’ coordinates, I likewise created .txt files and imported the into Excel, but I symbolized them in Excel.

4: Results

In this section, I’ll show the graphs and maps that I created for the ion section of my data, and give an explanation of what I learned from this analysis.

For my ion data, the first two out of the 11 principle components explain about 80% of the variation in my data.

The first one, PCI, explains 55% of the variation in my data set and is shown on the X axis of the graph below. It is strongly influenced by concentrations of calcium, magnesium, chloride, alkalinity, sulfate, bromide, and sodium, which all have eigenvalues on the X axis greater than |1|. Comparing the scree plot to the loading plot, wells 56287, 3074, 4179, and 4167 are the primary individual drivers of this PC.

PC2 explains another 25% of the variance in my data, and is shown on the Y axis. Fluoride, silicon, and manganese are the primary variable drivers of variation between the individuals in PC2.

CvS PC1 PC2 ions biplot with groups

Based on the coordinate information for the variables, the variation explained by PC1 is due to increased levels of calcium, magnesium, potassium, sodium, alkalinity, bromide, chloride, and sulfate at four wells. This combination of ions all being similar could be explained by recent recharge influenced by agricultural chemicals, such as fertilizers. In an undisturbed basalt setting though time, calcium and magnesium tend to decrease as sodium and potassium increase. The fact the these four parameters vary together in PCI indicates that a process other than ion replacement is driving ion concentration. The variation in PC2 is predominantly explained by fluoride, dissolved silica, and manganese. Elevated concentrations of these three variables in comparison to the other variables indicates that the groundwater has spent more time in the basalt.In PC2, calcium and magnesium are inversely related to sodium, indicating that the cation concentrations are driven by the ion replacement process.

While the figure above locates wells based on their grouping on a biplot comparing two principle components, the following figures show the wells color-coded by their loadings in PC1 and in PC2. I’m hypothesizing that wells with smaller PC1 values are influenced by agricultural recharge, and wells with more positive values on PC2 are more influenced by the processes of ion dissolution as water travels through the basalt.

color-coded map of PC1 values for the 18 wells

PC1 is heavily driven by four particular wells, which show up here as being in colors of blue and blue-grey.

The distribution of PC2 values shows a pattern that I find really interesting. More negative – bluer – values indicate samples which were less influenced by the ion replacement and dissolution processes of groundwater traveling through the basalt. The values at high elevation are negative, which makes sense because they’re closer to the upland recharge zone, but so do some samples in the lowlands to the north which are downgradient of certain faults. This could indicate that those PC2 negative downgradient wells are cut off from the longer flowpaths of groundwater by faults that act as barriers to groundwater flow. The pinker – more positive – points indicate samples that were more influenced by the along-flowpath ion reactions.

Next, I present the PCA results for the non-ion parameters. Unlike the “ions” set whose title was somewhat self-explanatory, this PCA brought together different types of chemical and physical properties of water. These include pH, temperature, and specific conductivity which were measured in the field,  the analyzed values of hydrogen and oxygen isotopes as well as tritium, the elevation of the bottom of the well, and the cation ratio. The cation ratio is a standard calculated parameter for groundwater chemistry which is equal to ([Na] + [K]) / ([Ca] + [Mg]). As water spends more time in contact with the rock, its ion exchange with the rock leads to a smaller cation ratio. The first biplot graphs PC1 (~40% of variation) on the X  axis against PC2 (22% of variation) on the Y axis, and the second biplot graphs PC1 on the X axis versus PC3 (~18% of variation) on the Y axis.

The PC1 grouping explains variance among the samples by inversely correlating tritium, deuterium (d2H) and oxygen-18 (d18O) with pH and the cation ratio.  More positive PC1 values show that the individual has high d18O and d2H values, and low cation ratio and pH values. All d18O and d2H values are negative; values that are closer to 0 (“larger: for the purposes of this plot) are more enriched in the heavier isotope. PC2 is predominantly driven by an inverse correlation between well bottom elevation and temperature. PC3 inversely correlates tritium, specific conductance, and well bottom elevation with temperature, deuterium, and 18-oxygen.

Yellow/orange points on this map have more positive PC1 loadings. They have higher d18O and d2H values, and lower cation ratio and pH values. The lower cation ratio and pH values indicate that the sample is less chemically evolved based on the expected ion exchange reactions in the basalt. I found it interested that samples on that end of the spectrum were also more enriched in the heavier isotopes of hydrogen and oxygen.

Unfortunately I can’t seem to upload the maps for PC2 and PC3 for non-ions, I’m getting a message that all of my allowance for image uploads has been used up…

5: Critique

I’m pretty happy with what I put together using principal component analysis. In a perfect world I would have learned to do the spatial visualization elegantly in R instead of messing around with exporting data to Arc, and made my variable charts in R instead of exporting them to excel. However for the purposes of this project the less elegant method was quicker for me.

Courtney’s Spatial Problem

In the formula “How is the spatial pattern of A related to the spatial pattern of B via mechanism C?”, my research question for this class is made of the following parts:

  • A: ion and isotope concentrations in wells tapping the basalt aquifer
  • B: mapped faults
  • C: groundwater flow as determined by hydraulic conductivity of the geologic setting

This all adds up into:

“How is the spatial pattern of ion and isotope concentrations in wells tapping the basalt aquifer related to the spatial pattern of mapped faults via the mechanism of groundwater flow as determined by hydraulic conductivity of the geologic setting?”

This question might have thrown you, the reader, into the figurative deep end! For context, I’m studying the basalt aquifer in the Oregon portion of the Walla Walla Basin. This is located in north eastern Oregon, about thirty minutes northeast of Pendleton.

map showing the state of Oregon, with an inset map showing the study area

Figure 1: Walla Walla Sub-Basin Location

The wells that I am studying are drilled into the three most extensive formations within the Columbia River Basalts that are present in my study area: the shallow Saddle Mountains layer, the intermediate Wanapum Basalt , and the deepest Grande Ronde Basalt. The wells and faults that I am studying are predominantly on the transition area between the “lowland” areas where the basalts are covered by layers of sediment deposited in the ~15 million years since they were deposited and the upland areas where the basalt is exposed at the surface.

geologic map showing formations, folds, and faults in the entire Walla Walla Basin, highlighting my study area

Figure 2: Geologic map of the study area

Wells in this area are between 300 and 1,200 feet in depth, and primarily serve irrigation and municipal uses. Over the past 50 years there has been a noticeable downward trend in groundwater elevations in many of the wells. My research is part of a larger Oregon Water Resource Department project that seeks to better understand this groundwater system, faults and all. Faults add an element of the unknown to models of groundwater flow unless they are specifically studied, as they can formed either barrier or pathways for groundwater flow depending on their structures and characteristics. Faults with low hydraulic conductivity can block or significantly slow groundwater flow, while faults with higher hydraulic conductivity allow water to flow through them more easily. My research uses ion chemistry and isotope concentrations to characterize the path that groundwater has taken through the subsurface into the well.

Datasets:

A: In my research I have analytical data for 32 wells, whose XY locations were determined by field confirmation of my collaborators’ well log database. As groundwater is a 3D system, I have to consider Z values as well. The well depths and lithology information is also from my collaborators’ database, and was based on information of varying quality recorded at the time that the well was drilled. My analytical data provides a snapshot of water chemistry during the summer of 2018. I have only one temporal data point per well. At all 32 wells, I collected samples to be analyzed for pH, temperature, specific conductivity, oxygen isotopes 16 and 18, and hydrogen isotopes 1 and 2. At a subset of 18 of those wells I collected additional samples for tritium, carbon 14, and major ion analysis.

B: The shapefile of faults mapped at the surface was created by Madin and Geitgey of the USGS in their 2007 publication on the geology of the Umatilla Basin. There is some uncertainty in my analysis as far as extending this surface information into the subsurface. USGS studies have constrained proposed ranges of dip angles for the families of faults that I am studying, but not exact angles for any single mapped fault.

Figure 3: locations of wells that were sampled mapped with mapped fault locations.

Hypotheses:

Where faults act as barriers, I hypothesize that parameter values will differ in groups on either side of a fault. Specifically, a barrier fault might cause older, warmer water to rise into upper aquifer layers, and the downstream well might show a signature of more local recharge.

Where faults act as conduits, I hypothesize that water chemistry and isotopes of samples from wells on either side of the fault would indicate a relatively direct flowpath from the upstream well to a downstream well. Over a short distance, this means that ion and isotope concentrations would not differ significantly in wells across the fault.

Approaches:

I would like to use principal component analysis to identify grouping trends of the samples, and then map the results. Additionally, a bivariate comparison of wells on either side of the fault could be interesting? I would like to find some way to bring in distance from a fault into the model too.

Expected outcome: My output would be a mixture of statistical relationships and maps of those relationships.

Significance.  How is your spatial problem important to science? to resource managers?

The Walla Walla Subbasin’s basalt aquifers have recently been deemed over-allocated by the Oregon Water Resource Department (OWRD), and water managers are looking for methods to better regulate the aquifer when wells run dry. However, the faults are a big unknown when considering how to enforce the prior appropriation doctrine where junior permit holders are regulated off in times of water shortage. If a junior and senior water permit holder have wells that are separated by a fault, is it likely that stopping the junior permittee’s water use would actually result in more water available to the senior permit holder?

My approach is not novel in my scientific field. Several studies have evaluated similar parameters elsewhere in the Columbia River Basalts and also used statistical methods, but have not focused specifically on faults.

Your level of preparation: how much experience do you have with (a) Arc-Info, (b) Modelbuilder and/or GIS programming in Python, (c) R, (d) image processing, (e) other relevant software

Arc-GIS: Highly proficient in Desktop and Pro

R – novice, can copy/paste/edit code to suit my basic needs

Python – novice, took a class two years ago but have forgotten much of it

Image processing – working knowledge of ENVI from GEOG 580, and of Gravit from GEOG 572

Other spatiotemporal analysis – I haven’t really worked with software besides ArcGIS or QGIS?