Research Question
- “How is the spatial pattern of ion and isotope concentrations in wells tapping the basalt aquifer related to the spatial pattern of mapped faults via the mechanism of groundwater flow as determined by hydraulic transmissivity of the geologic setting?”
Description of dataset that I examined
- A: In my research I have analytical data for 31 wells, whose XY locations were determined by field confirmation of the Oregon Water Resources Department (OWRD) well log database. As groundwater is a 3D system, I have to consider Z values as well. The well depths and lithology information are also from the OWRD database. My analytical data provide a snapshot of water chemistry during the summer of 2018, so I have only one temporal data point per well. At all 31 wells, I collected samples to be analyzed for pH, temperature, specific conductivity, oxygen isotopes 16 and 18, and hydrogen isotopes 1 and 2. At a subset of 18 of those wells I collected additional samples for tritium, carbon-14, and major ion analysis.
- B: The shapefile of faults mapped at the surface was created by Madin and Geitgey of the USGS in their 2007 publication on the geology of the Umatilla Basin. There is some uncertainty in my analysis as far as extending this surface information into the subsurface. USGS studies have constrained proposed ranges of dip angles for the families of faults that I am studying, but not exact angles for any single mapped fault.
- C: Results of pumping interference tests involving 29 wells, 12 of which I also had chemical data for. The OWRD collected these data in 2018 and 2019.
Hypotheses
- Faults can act as conduits or barriers to groundwater flow, depending on how their transmissivity compares to the transmissivity of the host rock.
- I hypothesize that clusters of similar chemical and isotopic properties of groundwater can indicate a shared aquifer unit/compartment, and that if faults separate clusters then the fault is influencing that difference in chemical/isotopic signatures. If the fault is between two clusters, I hypothesize that it is acting as a barrier. If it crosses through a cluster, I hypothesize that it acts as a conduit.
- Where faults act as barriers, I hypothesize that parameter values will differ in groups on either side of a fault. Specifically, a barrier fault might cause older, warmer water to rise into upper aquifer layers, and the downstream well might show a signature of more local recharge.
- Where faults act as conduits, I hypothesize that water chemistry and isotopes of samples from wells on either side of the fault would indicate a relatively direct flowpath from the upstream well to a downstream well. Over a short distance, this means that ion and isotope concentrations would not differ significantly in wells across the fault.
- My hypotheses depend on a “barrier” fault damming groundwater flow up-gradient of the fault, and compartmentalizing local recharge on the down-gradient side. This hypothesis is only relevant if the fault is roughly perpendicular to the flow direction, and so disrupts transmissivity between a recharge zone and the wells. If a fault that separates two wells is parallel to the flow direction and there is no obstacle between the wells and upstream recharge areas, then the fault might indeed limit communication between the wells, but they will still have similar chemical signatures. Wells separated by this second kind of fault barrier would be better evaluated by a physical test of communication, such as a pumping interference test.
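The perpendicular-versus-parallel distinction in the last hypothesis can be sketched as a small angle check. This is only an illustration, not the raster workflow I actually ran: the function names and the 45-degree cutoff are assumptions chosen for the example, and both inputs are assumed to be compass azimuths in degrees.

```python
def incidence_angle(flow_azimuth_deg, fault_strike_deg):
    """Acute angle (0 to 90 degrees) between a flow direction and a fault trace."""
    # A fault trace is a line with no direction, so a strike of 10 and a
    # strike of 190 describe the same fault: compare modulo 180 degrees.
    diff = abs(flow_azimuth_deg - fault_strike_deg) % 180.0
    return min(diff, 180.0 - diff)


def fault_is_testable(flow_azimuth_deg, fault_strike_deg, min_angle_deg=45.0):
    """True if flow crosses the fault steeply enough for the barrier/conduit
    hypotheses to apply. The 45-degree default is a hypothetical cutoff."""
    return incidence_angle(flow_azimuth_deg, fault_strike_deg) >= min_angle_deg


# Flow due east across a north-south fault: perpendicular, so testable.
print(incidence_angle(90.0, 0.0), fault_is_testable(90.0, 0.0))
# Flow running along the fault: better suited to a pumping interference test.
print(incidence_angle(10.0, 190.0), fault_is_testable(10.0, 190.0))
```

Faults that fail this kind of check are the ones where chemistry alone cannot show whether the fault is a barrier.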
Analysis Approaches
- Principal component analysis: used to simplify the multivariate data set (19 variables!) into variable relationships that could represent aquifer processes
- Analysis of PCA results compared to distance from a flow path
- Interpolation of well water levels classified by well stratigraphy to estimate a potentiometric surface and evaluate groundwater flow directions.
- Raster calculations to compare flow direction to fault incidence angle
- Measuring distance from each well to the nearest fault along the flow path
- Simple linear regression, comparing the Non-ion PC1 score of a well with its distance from a fault.
- Two-sided t-tests comparing distance between wells, presence of a fault, and pumping interference test communication between wells
- ANOVA F-tests comparing chemical and isotopic variance within groups of wells that communicate with each other and between those groups.
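To make the last approach concrete, here is a minimal sketch of the F statistic behind a one-way ANOVA: the ratio of variation between groups of wells to variation within those groups. I ran my tests in R, so this Python version is just an illustration. The function name is hypothetical and the numbers in the usage lines are made up, not my measurements.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA.

    `groups` is a list of lists, one list of measurements per group of
    communicating wells. Each group needs at least one value, and there
    must be more values than groups.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of measurements

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Made-up values for two groups of wells, for illustration only.
example_groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(one_way_anova_f(example_groups))
```

A large F means the groups differ from each other more than the wells within a group differ among themselves. Turning F into a p-value needs the F distribution, which this sketch leaves out.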
Results
- Principal component analysis – Patterns of variation between wells are statistically explained primarily by total ion concentration, a relationship between chemical evolution from ion exchange and decreasing stable isotope ratios, and the combination of well depth and water temperature. Moran’s I indicates that only Non-ion PC2 is spatially clustered, while the other PC models have a distribution not significantly different than random. The other PC models are useful to understand the groundwater system, but not specifically to analyze clustering correlated to faults.
- Interpolation of water level, and comparison of fault incidence angle with flow direction, indicates which faults can and cannot be tested under my hypotheses.
- Analysis of PCA results compared to distance from a fault along flow path – some wells that are “within” a fault zone have very old signatures and others have very young signatures. This could be related to the dip angle of the fault and the accuracy of mapping compared to the depth of the well. I hypothesize that the wells that are in the fault zone but have high PC1 scores are on the up-gradient side of the fault where older water is upwelling along a barrier. Wells in fault zones with low PC1 scores could be open to downgradient areas of the fault, where vertical recharge through the fault damage zone is able to reach the well.
- Returning to the conclusions I wrote in that blog post after I found improved stratigraphic data, I’m not sure if I can draw conclusions other than those about the wells that are mapped as “inside” a fault. Several wells that are closer to faults are also open to shallower aquifer units, and so the effect of lower PC1 scores closer downgradient to faults might be confounded by lower PC1 scores caused by vertical inflow from the sedimentary aquifer and upper Saddle Mountain aquifer.
- Two-sided t-tests comparing distance between wells, presence of a fault, and communication between wells show that the presence of a fault has a greater effect on communication than the distance between the wells.
- ANOVA F-tests comparing chemical and isotopic variance within groups of wells that communicate with each other and between those groups – stable isotopes and specific conductivity both show more variation between well groups than within well groups.
- Not covered in these blog posts, I also ran Moran’s I on my inputs to see which ones are clustered and so might be more related to horizontal grouping factors (such as faults) than vertical grouping parameters (such as stratigraphic unit). Of the PCA and individual variables, only d18O, d2H, and Non-ion PC2 (combination of well depth and water temperature) were clustered. The other PCA models, temperature, pH, and specific conductivity were not significantly spatially clustered.
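For readers who have not met Moran’s I, the statistic itself is short enough to sketch. I ran it in ArcGIS, so the version below is not that tool’s implementation. It is a bare-bones illustration that assumes inverse-distance weights between every pair of wells, with made-up coordinates and values and a hypothetical function name.

```python
import math


def morans_i(coords, values):
    """Global Moran's I using inverse-distance weights (w_ij = 1 / distance).

    `coords` is a list of (x, y) tuples and `values` the matching
    measurements. No two points may share a location, because the
    weight would require dividing by zero.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    weighted_cross = 0.0   # sum of w_ij * dev_i * dev_j over all pairs
    weight_total = 0.0     # sum of all w_ij
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 / math.dist(coords[i], coords[j])
            weighted_cross += w * dev[i] * dev[j]
            weight_total += w

    variance_sum = sum(d * d for d in dev)
    return (n / weight_total) * (weighted_cross / variance_sum)


# Four made-up wells along a line, for illustration only.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(morans_i(line, [1.0, 1.0, 5.0, 5.0]))   # similar values side by side
print(morans_i(line, [1.0, 5.0, 1.0, 5.0]))   # values alternate
```

The side-by-side arrangement gives the higher I of the two. Under spatial randomness the expected value is -1/(n-1) rather than zero, so with only a handful of points even a clustered pattern can produce a slightly negative I. Judging significance takes a z-score or permutation test, which this sketch does not include.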
Significance – Whether groundwater signatures related to faults agree or disagree with past understandings of differences between wells in the region can inform well management. If a senior permit holder puts a call on the water right and asks for junior users to be regulated off, it would not help that senior user if one of those junior permit holders’ wells is not hydraulically connected to the senior user’s well.
- More wells would need to be sampled to be better able to disentangle the effects of faults from the effects of well stratigraphy.
My learning – I learned significantly more about how to code and troubleshoot in R. Additionally, I learned about the process of performing spatial clustering analysis in ArcGIS.
What did I learn about statistics?
- PCA was completely new to me, and it’s a cool method for dealing with multivariate data once I got past the steep learning curve involved in setting it up and interpreting the results. It was useful getting more practice performing and interpreting t-tests and ANOVA F-tests. I had not used spatial clustering before, and learning how to apply it was interesting. It gave me a much more concrete tool to try to disentangle the patterns in my effectively 3D data in the X,Y plane, as opposed to the Z direction.