Background

The Willamette Valley of the Terminal Pleistocene (10,000+ years ago) was a very different place from the valley we see today. Massive animals roamed a valley floor that was much wetter, with more marshes, bogs, and lakes. Oak savanna and swaths of forest stretched toward the Willamette River. People likely inhabited this landscape as well, though that is a very difficult question to explore.

To find out where in the Pleistocene Willamette Valley people might have lived, we must first understand the sediments that lie under our feet here in the valley, and the best way to do that is by extracting them from the ground and analyzing them.

This project is part of a larger study that uses sediment cores extracted from a buried peat bog at Woodburn High School in Woodburn, Oregon, to identify buried sediments of the appropriate age and environment to contain both potential Pleistocene-aged archaeological sites and more evidence of Pleistocene megafauna.

 

Research Question

For this project, I am seeking a different method for identifying stratigraphic breaks in a sediment profile. The most common methods for identifying stratigraphy are visual examination, multivariate statistical analysis, and texturing the sediment.

Using x-ray fluorescence (XRF) geochemical data and wavelet analysis, a method typically used for studying time-series data, is it possible to determine the site stratigraphy using one or two variables?

 

The Data

The dataset consists of XRF data taken at 2 mm intervals from 65 core samples, each 1.5 meters long. These cores come from 14 different boreholes covering the majority of the defined study area, which measures approximately 200×50 meters. The cores were extracted using a Geoprobe direct-push coring rig.

 

The Site:

1

The Geoprobe in action at the site:

2

The core samples were halved and run through an iTrax core scanning unit. The iTrax scans the cores using an optical camera, a radiograph (similar to medical x-ray), and an x-ray fluorescence scanner, which collects geochemical data consisting of 35 different element counts at 2mm intervals. The data is organized into 14 CSV files containing the XRF results.
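As a rough illustration of how the CSV files could be pulled into R for analysis, here is a minimal sketch. The function name read_xrf_folder, the folder argument, and the added borehole column are hypothetical, and the sketch assumes every file shares the same column layout, which may not match the actual iTrax exports.

```r
# Read every CSV in a folder into one data frame, tagging rows by source file.
# Assumes all files have identical columns (an assumption, not a known fact
# about the iTrax output).
read_xrf_folder <- function(xrf_dir) {
  files <- list.files(xrf_dir, pattern = "\\.csv$", full.names = TRUE)
  tables <- lapply(files, function(f) {
    d <- read.csv(f, stringsAsFactors = FALSE)
    # Hypothetical label: the file name without its extension
    d$borehole <- tools::file_path_sans_ext(basename(f))
    d
  })
  do.call(rbind, tables)
}

# Number of readings expected from one full 1.5 m core scanned every 2 mm
readings_per_core <- 1500 / 2   # 750
```

Keeping the source file name as a column makes it easy to subset a single borehole later, for example xrf[xrf$borehole == "BH01", ], where "BH01" is only a placeholder name.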

 

The iTrax:

3

Hypothesis

Using wavelet analysis, significant increases and decreases of geochemical properties can indicate where stratigraphic breaks in the sediment occur. This pattern should be repeatable across all of the gathered cores.

 

Approach

The method I chose to analyze my core data and attempt to break apart the stratigraphy was Wavelet Analysis using an R package called “WaveletComp”.

“WaveletComp” takes any continuous, evenly sampled data series (typically time-series data; spatial data in this case) and passes a small waveform called a wavelet along it to run variance calculations at each position. The resulting output is a power diagram, which shows in red the locations along the dataset where there is a large change in variance. A cross-wavelet power diagram can also be generated, which indicates where two different variables rise and/or drop at the same time.

 

Example of a wavelet.

2

There are two equations used when generating a wavelet power diagram…

3

The above equation uses the dataset to calculate the appropriate size of the wavelet according to the number of points in the dataset.

4

The above equation uses the wavelet to run variance calculations across the dataset and output the power diagram.
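For reference, WaveletComp is built around the Morlet wavelet. The forms below follow the package documentation as I understand it; the notation is an assumption and may differ from the figures above. The first line is the mother wavelet, the second is the transform of a series x_t at position τ and scale s, and the third is the power plotted in the diagram.

```latex
\psi(t) = \pi^{-1/4}\, e^{i\omega t}\, e^{-t^{2}/2}, \qquad \omega = 6

\mathrm{Wave}(\tau, s) = \sum_{t} x_t \, \frac{1}{\sqrt{s}} \, \psi^{*}\!\left(\frac{t - \tau}{s}\right)

\mathrm{Power}(\tau, s) = \frac{1}{s} \left| \mathrm{Wave}(\tau, s) \right|^{2}
```

Here ψ* is the complex conjugate of ψ. For the core data, t and τ index position down the core in sampling steps rather than time.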

 

Using the “WaveletComp” package in R, I processed five different core scans. To run both univariate and bivariate analyses, I first had to select which elements to use. There were a variety of ways that I could have selected the data, but ultimately, in the test sample, I chose the elements that had the most obvious changes: aluminum and iron.

The details on how to actually run the “WaveletComp” package can be found in my wavelet analysis tutorial.

 

Results

 

After all of the tests were run, the resulting wavelet power diagrams were placed alongside line graphs of the element that was run, as well as a cross-wavelet power diagram, which indicates when the two selected elements change at the same time.

The wavelet power diagrams show significant changes in the waveform in red (high variance); blue shows low variance. In the cross-wavelet power diagram (the one on the far right), the arrows indicate whether the waveforms of the two elements are in phase (pointing right) or out of phase (pointing left) with each other.

To identify stratigraphic breaks, I looked at the “taller” red plumes in the wavelet power diagrams, which indicate a significant change in the amount of an element present, either a large increase or a large decrease. This result was compared to the graph, as well as to the image of the core (for visual identification of changes in color or texture). The spots that have tall plumes and also correspond with a significant color or texture change in the image are presumably the stratigraphic breaks.

Each of the five cores showed promise that we can identify stratigraphy using wavelet analysis. The two most significant cores are discussed below:

4

The results for the first core segment show several distinct changes, as indicated by the tall red plumes in the power diagrams, but there is one major problem. The graphed data contain points with zero values, visible as drops to the very bottom of the graph, and for the most part these spots also produce the most distinct plumes. About two-thirds of the way down, however, there is a distinct plume that, while it contains zero values, also coincides with a distinct texture change in the sediment. At this point the sediment changes texture and feel entirely.

5

The results of the second core sample were more interesting, as there were no zero values. The Al+Fe cross-wavelet diagram shows significant plumes at almost all of the significant-looking color changes in the profile.

6

7

8

The second core was the most interesting of the analysis, due to the lack of zero/null data. With a little data management, the zero values can be reduced by interpolating across them and re-running the analysis on the interpolated data. That should help reduce the effects of the zero data seen in the above results.
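One possible way to do that interpolation in base R is sketched below. The function name fill_zero_counts is hypothetical, and the sketch assumes that a zero count means a missing reading rather than a true absence of the element, which would need to be checked against the core images first.

```r
# Replace zero counts (treated here as missing readings, an assumption) with
# values linearly interpolated from the nearest non-zero neighbours.
fill_zero_counts <- function(counts, depth = seq_along(counts)) {
  ok <- !is.na(counts) & counts > 0
  if (sum(ok) < 2) return(counts)   # not enough valid points to interpolate
  # rule = 2 carries the nearest valid value outward at either end of the core
  approx(x = depth[ok], y = counts[ok], xout = depth, rule = 2)$y
}

fe <- c(120, 130, 0, 0, 160, 150, 0, 140)
fill_zero_counts(fe)
# 120 130 140 150 160 150 145 140
```

Linear interpolation is only reasonable for short gaps. A long run of zeros, such as a crack or void in the core, would be filled with a straight line that hides whatever actually happened there, so those stretches are probably better left out of the analysis.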

 

Significance

Wavelet analysis is an excellent tool for observing patterns in spatio-temporal data that is sequential. As for the significance of this project…

Once all of the observed kinks are fixed in the data, this could serve as a new method for identifying changes in stratigraphy. This could be a good method to identify stratigraphy in sediment cores that are extremely similar in color or texture.

 

What I learned about the software

R is an excellent tool for conducting many different kinds of spatio-temporal analysis, from wavelet analysis to regression, along with many of the other tools that ArcGIS has to offer.

ArcGIS is an excellent tool for data visualization, but it is a very finicky program that has to

 

What I learned about statistics.

Statistical applications in geospatial analysis are very important to understanding even the smallest of changes, such as an increase or decrease in iron in a sediment core.

 

The analysis portion of this project did not prove to be effective and ultimately produced no usable results. Let me use this tutorial to walk you through a cautionary tale about being just a little too determined to make an idea work with the dataset that you have.

 

The Question:

After doing a lot of research on both early archaeological sites and sites that contain Pleistocene megafauna in the Willamette Valley, a few patterns seemed to emerge. Megafauna sites seemed to occur in peat bogs, and all of the earliest archaeological sites occurred on the margins of wetlands. This led me to wonder whether there is a connection between the soil type, pH, and other factors across all of the known sites. What variables might be useful for predicting the locations of other yet-to-be-discovered archaeological and megafauna sites throughout the valley?

 

The Tools:

To conduct this exploration, I decided to use the tools that are built into ArcGIS. Hotspot analysis and regression were going to be the two main tools.

For the data, I found a SSURGO dataset that was in vector format. It contained polygons of all of the mapped soil units in the valley, as well as a variety of factors related to slope, parent material, order, etc. Eventually I switched gears and found another SSURGO dataset that was in raster format and contained a whole lot more data, hoping that this change in dataset would make the analysis much easier.

 

The Steps:

The first step that I took when conducting this analysis was mapping different variables and comparing them, to see if any obvious patterns emerged. The mapping revealed three soil types that appeared at most of the known sites and looked important when considering the late Pleistocene in the Willamette Valley.

 

Below is a map of all of the known Pleistocene megafauna and early Holocene archaeological sites in the valley.

 

11

Three major soil types emerged as associated with local sites: the Labish, Bashaw, and McBee soil types.

12

The Labish soil was especially interesting, as it only seemed to occur at the major peat-bearing sites in the valley, most of which were drained lakes that are currently used for crops.

 

After reading about the nature of soil pH in wetland deposits, I began to hypothesize that pH would have been an important variable in the soils that I had identified, and wanted to use this knowledge to find more sites through hotspot analysis and/or regression analysis.

 

 

The Results:

 

The results are the toughest part to discuss, as there is not much to show.

Many attempts at running regression analysis were made, using a wide variety of different combinations of data, but all of them returned an error of perfect multicollinearity, resulting in failures across the board. The analysis was attempted using both the vector and raster forms of the data, using built-in pH data, pH data that was acquired elsewhere and added to the dataset, as well as combinations of variables.
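Perfect multicollinearity means that one explanatory variable is an exact linear combination of the others (or is constant across all records), so a regression cannot separate their effects. The sketch below is not the ArcGIS procedure; it is a hedged R illustration, with the hypothetical helper check_collinearity and made-up data, of how that condition can be checked before running a regression.

```r
# Report whether a set of numeric predictors is perfectly collinear, i.e. the
# design matrix (predictors plus an intercept column) has fewer independent
# columns than it has columns.
check_collinearity <- function(predictors) {
  X <- cbind(Intercept = 1, as.matrix(predictors))
  r <- qr(X)$rank
  list(rank = r, columns = ncol(X), perfect = r < ncol(X))
}

# Made-up values for illustration only
set.seed(1)
soils <- data.frame(ph = runif(20, 5.7, 6.6), slope = runif(20, 0, 5))
soils$ph_copy <- soils$ph * 2 + 1   # an exact linear function of ph

check_collinearity(soils)$perfect                      # TRUE
check_collinearity(soils[, c("ph", "slope")])$perfect  # FALSE
```

A variable that takes the same value at every site counts as perfectly collinear too, because it duplicates the intercept. That is one way a dataset with very little variability can trigger this error.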

As I began to explore the dataset further, I realized that the data, while initially appearing to be incredibly varied, was in fact quite uniform. I mapped out the soil orders and Great Groups in the valley and realized that the maps looked strikingly similar, which suggested that (as was mentioned in class) all of the data was likely extrapolated from a few key points.

 

Soil Order:

13

Soil Great Group:

14

Aside from a few differences, both of the maps are extremely similar, which is telling me that this data is more than likely, as mentioned above, extrapolated across a large landscape.

 

This realization made me doubt the pH data as well, so I mapped that out as well.

 

Soil pH Map:

15

The valley soils fall in a narrow, slightly acidic to near-neutral range, varying only between 5.7 and 6.6.

This would make it very difficult to use some sort of exploratory statistical analysis on this dataset, as there wasn’t much variability.

To look at how pH was distributed throughout the valley, I ran a hotspot analysis as well as a Moran’s I analysis.

pH Hotspot Analysis Map:

16

pH Moran’s I Analysis Results:

17

As you can see, the data is extremely clustered, especially in my particular area of interest, which is the valley floor.
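To give a sense of what Moran’s I measures, here is a minimal base R sketch for a small raster grid. It is not the ArcGIS implementation: it assumes simple binary weights between cells that share an edge (rook neighbours), reports no z-score or p-value, and the function name morans_i_grid and the example grids are made up for illustration.

```r
# Global Moran's I for a numeric matrix (raster grid), using binary weights
# between cells that share an edge (rook neighbours).
morans_i_grid <- function(z) {
  n <- length(z)
  dev <- z - mean(z)
  nr <- nrow(z)
  nc <- ncol(z)
  # Cross-products of deviations over horizontally and vertically adjacent pairs
  horiz <- sum(dev[, -nc] * dev[, -1])
  vert <- sum(dev[-nr, ] * dev[-1, ])
  pairs <- nr * (nc - 1) + (nr - 1) * nc
  # Each adjacent pair appears twice in the symmetric weight matrix
  (n / (2 * pairs)) * (2 * (horiz + vert)) / sum(dev^2)
}

# Two blocks of uniform pH: strongly clustered, so I is positive
clustered <- matrix(rep(c(5.7, 5.7, 6.6, 6.6), each = 4), nrow = 4)
morans_i_grid(clustered)

# Checkerboard: every neighbour differs, so I is at its negative extreme
checker <- outer(1:4, 1:4, function(r, c) (r + c) %% 2)
morans_i_grid(checker)
```

Values near +1 mean that similar values sit next to each other, values near 0 suggest a random arrangement, and negative values mean that neighbours tend to differ. A soil map extrapolated from a few key points would be expected to score high, which is consistent with the clustering seen above.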

 

 

Was this useful?

This analysis was useful, but for a different reason than was expected. The SSURGO dataset is not the best tool for soil landscape analysis at a smaller scale. Throughout the class, I have seen other statewide projects that were a lot more successful due to higher variability in soils between the east and west sides of the state.

I became a tad too determined to run this kind of analysis, and the results were completely inconclusive in that respect, but in the end, the most beneficial part of the analysis was figuring out that there are likely connections between my sites of interest. In order to investigate these connections, physical testing is likely the most reliable source, since the SSURGO data is not reliable for this purpose.

Also, don’t rely on your data too much. It might mess with your head a bit!

Wavelet Analysis: A Tutorial

 

Woodburn High School in the northern Willamette Valley, Oregon, contains evidence of an extensive peat bog as well as evidence of extinct Pleistocene megafauna. In October of 2015, sediment cores were extracted from the site in order to better understand the underlying sediment at the site, and find the sediment that is of the right age and type to possibly contain evidence of human occupation.

1

Aerial photo of the study area, with sample locations marked with orange dots.

 

To further explore the project, a better understanding of wavelet analysis must first be established by testing a sample of geochemical data extracted from a core.

 

The Question:

 

Is wavelet analysis an appropriate tool for geochemically identifying stratigraphic breaks in sediment cores?

 

The Tools:

 

To conduct this analysis, I used R as well as the R ‘WaveletComp’ package for the wavelet analysis, and ‘ggplot2’ in order to graph the geochemical data.

 

‘WaveletComp’ takes any continuous, evenly sampled data series (typically time-series data; spatial data in this case) and passes a small waveform called a wavelet along it to run variance calculations at each position. The resulting output is a power diagram, which shows in red the locations along the dataset where there is a large change in variance. A cross-wavelet power diagram can also be generated, which indicates where two different variables rise and/or drop at the same time.

 

2

Example of a wavelet.

 

There are two equations used when generating a wavelet power diagram…

3

The above equation uses the dataset to calculate the appropriate size of the wavelet according to the number of points in the dataset.

4

The above equation uses the wavelet to run variance calculations across the dataset and output the power diagram.

 

 

The Steps:

 

After the appropriate package is loaded into R and the dataset is uploaded, generating the power diagram and cross-power diagram takes just a small block of code…

 

Code snippet for the power diagram:

library(WaveletComp)

Fewhs010101 = analyze.wavelet(WHS010101, "Fe",
  loess.span = 0,      # Detrending (0 turns the loess detrend off)
  dt = 1,              # Sampling step: one unit per row of the dataset
  dj = 1/250,          # Resolution along the period axis
  make.pval = TRUE,    # Compute p-values, used for the white significance lines
  n.sim = 10)          # Number of simulations

wt.image(Fewhs010101, color.key = "quantile", n.levels = 250,
  legend.params = list(lab = "WHS 010101 Wavelet Power Fe", mar = 4.7))

 

Code Snippet for a cross-power diagram:

AlFewhs010101 = analyze.coherency(WHS010101, c("Al", "Fe"),
  loess.span = 0,
  dt = 1, dj = 1/100,
  make.pval = TRUE, n.sim = 10)

wc.image(AlFewhs010101, n.levels = 250,
  legend.params = list(lab = "WHS 010101 cross-wavelet power levels - Al/Fe"))

 

 

The output of the power diagram was then compared to line graphs generated using the ggplot2 package.
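A line graph of this kind can be produced with just a few lines of ggplot2. The sketch below uses a synthetic stand-in for a core, since the column names in the real CSV files (depth_mm and Fe here) are assumptions.

```r
library(ggplot2)

# Synthetic stand-in for one core: depth in mm and Fe counts with a step
# change at 300 mm (not real data)
core <- data.frame(depth_mm = seq(0, 598, by = 2))
core$Fe <- ifelse(core$depth_mm < 300, 1000, 1800) + 50 * sin(core$depth_mm / 20)

fe_plot <- ggplot(core, aes(x = depth_mm, y = Fe)) +
  geom_line() +
  labs(x = "Depth along core (mm)", y = "Fe (counts)",
       title = "Synthetic Fe profile")

# print(fe_plot) draws the graph
```

Plotting depth on the same horizontal axis as the power diagram makes it easier to line up a red plume with the rise or drop that produced it.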

 

 

The Results:

 

The results were quite interesting, as you can clearly see three different geochemical breaks, as illustrated with the three red plumes that are rising from the bottom of the diagram. The color red indicates that at that particular “time,” there is a significant edge or change in the waveform. This is illustrated by comparing to the line graph that is below. There are “red plumes” present at all of the significant changes in the waveforms on the graph. This tells us that these locations along the transect should be considered “hotspots” for stratigraphic change.

 

5

Wavelet power diagram of Aluminum.

6

Example of graphed aluminum and sulfur data for the core.

 

Was this useful?

For my project, the analysis seemed to show that it is very possible to spot major changes in geochemistry across a transect. This will be further explored in my forthcoming final analysis post.

As for other projects, this method could be used to spot unseen patterns in the changes of sea surface temperature over time, or changes in frequencies of oxygen isotopes, or any other data that is presented in a time-series or in equidistant measures across a landscape.

 

Project Abstract (Taken from a recent conference poster):

The Willamette Valley during the Terminal Pleistocene was an environment in constant flux, creating a changing world for the early inhabitants of the Pacific Northwest. The valley floor contains an extensive record of Pleistocene ecology and archaeology; however, the information is locked within a complex stratigraphic sequence. Using a Geoprobe direct push coring rig, 13 sediment cores were extracted from surficial deposits in the Mill Creek watershed at Woodburn High School. The core samples were analyzed on Oregon State University’s Itrax core scanner, returning high-resolution optical imagery, radiograph images, and x-ray fluorescence (XRF) data. The XRF data is used to construct a chemostratigraphic profile of the study area in order to define and model the distribution of sediments potentially related to late Pleistocene-aged archaeological sites.

Research Question, etc:

I am seeking to explore methods of constructing chemostratigraphic frameworks of sediments at both archaeological and non-archaeological sites. The method that is most typically used to define chemostratigraphy at archaeological sites is portable x-ray fluorescence of previously described stratigraphy, and using multivariate statistics to separate the strata by chemistry. Using an Itrax Core Scanning machine, sediment cores extracted from a drainage at Woodburn High School were scanned and continuous high-resolution x-ray fluorescence (XRF) data was acquired. Using wavelet analysis, I hope to be able to define the site stratigraphy and use it to construct a 2D and 3D representation of the subsurface landscape.

Map
Woodburn High School study area.

Project Dataset:

The dataset consists of XRF data taken at 2 mm intervals from 65 core samples, each 1.5 meters long. These cores come from 14 different boreholes covering the majority of the defined study area, which measures approximately 200×50 meters. The data is organized into 14 CSV files containing the XRF results.

 

Hypothesis:

Through preliminary testing, I have seen potential in using this method to successfully identify stratigraphy. If the result of the preliminary test holds across all 14 boreholes, constructing landscape-wide stratigraphic profiles from the borehole samples and wavelet analysis should be feasible.

 

Approaches:

Throughout the term, and through the process of conducting analysis of the Woodburn sediments, I hope to learn how to better utilize and interpret wavelet analysis data, as well as digitally construct 2D and 3D stratigraphic profiles using interpolation methods.

Breaks in stratigraphy can be shown clearly through changes in color or texture, and multivariate techniques have been very useful for identifying them. Those techniques have proven useful for confirming the chemostratigraphy of a site when each XRF measurement already has an attributed stratum. Wavelet analysis allows the user to see possible changes in geochemistry, which opens the way to identifying geochemical breaks in strata from borehole data that does not contain established stratigraphic names and boundaries.

In order to conduct the analysis, elements had to be selected in order to do both univariate and bivariate analysis. There were a variety of ways that I could have selected the data, but ultimately, in the test sample, I chose to look at the elements that had the most obvious changes. This allowed me to really understand how wavelet analysis works versus a regular line graph. For the final analysis, I will look at similar completed work, and select the best elements to conduct bivariate analysis with for XRF based mineral studies.

 

Expected Outcome:

Visually, I would like to create a stratigraphic profile for each of the four transects at the site, as well as a 3D representation of the site using ArcScene. As for the data, I would like to create a type stratigraphy that archaeologists can reference in order to help find early archaeological sites in the Willamette Valley.

 

Significance:

The results of this project will hopefully help archaeologists understand the stratigraphy and possibly the environmental conditions in the Woodburn, Oregon area, and possibly the Willamette Valley. The sediments buried at this site could contain clues about which sedimentary deposits archaeological sites could be hidden in, or at least hidden near.

My Level of Preparation:

I am pretty knowledgeable in ArcGIS and the rest of the Arc/ESRI suite of programs. My Python skills are average, with better skill in ArcPy. As for R, I have taken courses that deal with it, and am steadily improving my skills.