Category Archives: 2019

Spring 2019 blog posts

My Spatial Problem – Elevated Blood Lead Levels in Bangladeshi Children: What is to blame?

Background:

No level of lead exposure is considered safe, and lead is toxic to nearly every organ system, especially the central nervous system (1,2). Exposure is particularly hazardous for children, whose neurological systems are still developing (2).

Environmental contamination is a widespread public health issue throughout Bangladesh. In the 1990s, widespread arsenic poisoning was described in populations that drew drinking water from groundwater wells contaminated with naturally occurring arsenic (3). In addition, Bangladesh is a developing country with growing industry and continued burning of gasoline, both of which point to high lead exposure through air pollution (4). Children also make up about 13% of the labor force in Bangladesh, working in jobs such as pottery glazing, lead melting, welding, car repair, and ship breaking, all of which are known sources of lead exposure (5). Given the historical air pollution and industry within urban areas of Bangladesh, known lead exposures and elevated childhood blood lead levels alarm health authorities, and more information is needed to determine the exact sources of exposure within communities.

I will use a maternal cohort study that has been running for 16 years (R01ES015533 PI:Christiani/CoI Kile; P42ES16454: Bellinger/CoI: Kile; R01ES023441,PI: Kile) in Pabna and Sirajdikhan Upazilas in Bangladesh to examine the possible spatial relationships of lead exposures. Children born to the participating mothers were followed from early childhood and were tested for blood lead levels at 4 to 5 years old. Many children within this cohort had blood lead levels above the U.S. recommended level of 5 ug/dL (Table 1).

Table 1: Summary of blood lead levels (ug/dL) in children aged 4-5 years in Pabna and Sirajdikhan (n=348)

Location              Minimum   Mean   Maximum   Above 5 ug/dL, n (%)
Combined (n=348)      0.00      3.86   19.6      165 (47%)
Pabna (n=183)         0.00      1.32   10.8      51 (28%)
Sirajdikhan (n=165)   0.00      6.67   19.6      114 (69%)

Research question:

What are the spatial relationships of blood lead levels in children aged 4-5 years within two areas of Bangladesh, and are those levels associated with possible sources of exposure from high-traffic or urban areas?

Dataset:

The dataset consists of blood lead concentrations (ug/dL) of children aged 4-5 years, measured at a single timepoint, together with the coordinates of their homes (n=348). One additional categorical variable records whether each child is above or below the 5 ug/dL US recommendation for lead exposure. The spatial resolution is fine: homes are within meters of one another, across an extent measured in square kilometers. I will also use base maps of the two districts where the children live, which will allow me to review the road network for major traffic and highway areas and to measure distance to urban centers. I am somewhat concerned about how much street network information will be available for Bangladesh, because these data were difficult to gather in my pilot analysis in R. I plan to import the data into Arc for more extensive analysis.

Hypotheses:

I hypothesize that the children's blood lead levels are spatially correlated, and that elevated blood lead levels in specific hot spots and/or in locations closer to roads and/or urban centers are due to inhalation and dermal exposure from air pollution.

Approaches:

I would first like to test for spatial autocorrelation to determine whether the children's blood lead levels are spatially correlated. Second, I would like to use the hot-spot analysis tool in Arc to determine whether the sample contains specific hot spots of higher blood lead levels. We already know that, of the two districts sampled, the one closer to the urban landscape has higher and more prevalent lead levels in children (Table 1). To pinpoint more specific exposures, I would like to intersect the locations of the children's homes with a 100 m buffer around major highway and traffic areas. I would then aggregate the intersected homes and determine whether their blood lead levels are higher than those of children who live more than 100 m away. The first attempt would be a crude analysis of inside versus outside the buffer, but if children within the buffer do have significantly higher blood lead levels, I would like to see whether levels drop off across distance bands of 100-250 m, 250-500 m, and 500 m or more from high-traffic roadways. Lastly, if we find no trend between distance to roadways and blood lead levels, I will overlay base maps of industry-related areas and check whether they overlap with hot spots in the blood lead data.
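As a rough illustration of the first step, global Moran's I can be computed directly from point data. The sketch below is not the Arc tool or any planned workflow; it assumes projected coordinates in meters and a simple binary distance-band weighting, and the function name and bandwidth parameter are my own illustrative choices.

```python
import math


def morans_i(points, values, bandwidth_m):
    """Global Moran's I with binary distance-band weights.

    points      -- list of (x, y) tuples in meters (assumes projected coordinates)
    values      -- one measurement per point (e.g. blood lead level in ug/dL)
    bandwidth_m -- two points are neighbours if they lie within this distance
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    cross_products = 0.0  # sum of w_ij * dev_i * dev_j over neighbour pairs
    weight_sum = 0.0      # sum of all w_ij
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= bandwidth_m:
                cross_products += dev[i] * dev[j]
                weight_sum += 1.0

    variance_term = sum(d * d for d in dev)
    if weight_sum == 0 or variance_term == 0:
        raise ValueError("no neighbour pairs, or all values are identical")
    return (n / weight_sum) * (cross_products / variance_term)
```

Values near +1 indicate that similar levels cluster together, values near -1 indicate that neighbours tend to differ, and values near 0 indicate no spatial pattern. A significance test (for example by permuting the values across locations) would still be needed before drawing conclusions.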

Expected outcomes:

  1. Spatial autocorrelation: I hope to show that the blood lead level data are spatially autocorrelated, and to produce plots as a visual representation of that autocorrelation.
  2. Hot spot analysis: In Arc, I will produce maps with the hot-spot analysis tool to home in on areas of interest that may have the highest levels. These maps would be useful for understanding blood lead level relationships within the two districts and for identifying areas for further analysis.
  3. Distance to roadways: In Arc, I want to produce maps that show a 100 m buffer around major roadway systems and its intersection with the locations of the children's homes. I would then like to produce a simple boxplot of blood lead levels inside and outside the buffer zone to compare mean blood lead levels. If I do find differences, I would like to produce another map showing how blood lead levels change with increasing distance from high-traffic roadways (further buffer/intersection analysis).
  4. Intersection of hot spots and industry: I would like to produce a map, similar to the hot-spot analysis, that overlays areas of known industry from a base map on the potential hot spots among children's homes.
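The buffer comparison in item 3 can be sketched outside of Arc as a distance-to-road calculation followed by grouping into bands. This is only a minimal sketch: it assumes projected coordinates in meters and a road represented as a list of vertices, and the function names and band labels are hypothetical.

```python
import math
from collections import defaultdict

# Upper bound of each distance band in meters, matching the bands in the text.
BANDS = [(100, "0-100 m"), (250, "100-250 m"), (500, "250-500 m"), (math.inf, "500 m +")]


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.dist(p, a)
    # Project p onto the segment and clamp to its end points.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))


def distance_to_road(p, road):
    """Distance from p to the nearest segment of a road polyline."""
    return min(point_segment_distance(p, a, b) for a, b in zip(road, road[1:]))


def band_label(distance_m):
    """Return the label of the first band whose upper bound covers the distance."""
    for upper, label in BANDS:
        if distance_m <= upper:
            return label


def mean_by_band(homes, levels, road):
    """Mean blood lead level of the homes falling in each distance band."""
    groups = defaultdict(list)
    for home, level in zip(homes, levels):
        groups[band_label(distance_to_road(home, road))].append(level)
    return {label: sum(v) / len(v) for label, v in groups.items()}
```

With real data the same grouping would feed the boxplot described above; the per-band means alone say nothing about statistical significance.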

Significance:

Within a 16+ year cohort built to understand the consequences of chronic arsenic exposure, my work can help communities better protect children from the undue burden of environmental pollution. I hope to give the communities a better understanding of why their children have high blood lead levels compared to US averages. My research will help explain whether the areas where the children live are correlated with increased exposure, and it can inform public health actions to limit lead exposure.

Level of preparation:

a) Intermediate knowledge of ArcGIS Pro

b) I will concurrently be taking a GIS programming course in Python, and I have taken intro Python coding coursework.

c) I have completed coursework (GEOG 561) programming in R with these data to understand spatial relationships. I also have extensive statistical modeling experience in R.

Works Cited:

  1. Tong S, Schirnding YE von, Prapamontol T. Environmental lead exposure: a public health problem of global dimensions. Bull World Health Organ. 2000;78:1068–77.
  2. WHO | Lead [Internet]. WHO. [cited 2019 Mar 17]. Available from: http://www.who.int/ipcs/assessment/public_health/lead/en/
  3. Smith AH, Lingas EO, Rahman M. Contamination of drinking-water by arsenic in Bangladesh: a public health emergency. Bull World Health Organ. 2000;78(9):1093–103.
  4. Kaiser R, Henderson A K, Daley W R, Naughton M, Khan M H, Rahman M, et al. Blood lead levels of primary school children in Dhaka, Bangladesh. Environ Health Perspect. 2001 Jun 1;109(6):563–6.
  5. Mitra AK, Haque A, Islam M, Bashar SAMK. Lead Poisoning: An Alarming Public Health Problem in Bangladesh. Int J Environ Res Public Health. 2009 Jan;6(1):84–95.

Faye Andrews, MPH

andrewsf@oregonstate.edu

Deaggregation of infrastructure damages and functionality based on a joint earthquake/tsunami event: an application to Seaside, Oregon.

Research Question and Background

The Pacific Northwest is subject to a rupture of the Cascadia Subduction Zone (CSZ), which will produce both an earthquake and a tsunami. While all communities along the coast are vulnerable to the earthquake hazard (e.g. ground shaking), low-lying communities are particularly vulnerable to both the earthquake and the subsequent tsunami. Completely mitigating all damage from the joint earthquake/tsunami event is impossible. However, understanding the risks associated with each hazard individually can allow community planners and resource managers to isolate particularly vulnerable areas and infrastructure within the city.

The city of Seaside, Oregon is a low-lying community that is subject to both the earthquake and tsunami resulting from a rupture of the CSZ. The infrastructure at Seaside can be divided into four components: (1) buildings, (2) electric power system, (3) transportation system, and (4) water supply system. Similarly, the hazards can be viewed jointly (both earthquake and tsunami), as well as independently (just earthquake or tsunami).

Within this context, I’m particularly interested in looking at how the spatial pattern of infrastructure damage and functionality is related to individual earthquake and tsunami hazards via ground shaking and inundation respectively. Furthermore, I’m interested in looking at how these spatial patterns change as the intensity of the hazard increases.

Description of Dataset

The dataset I will be analyzing consists of two components: (1) spatial maps, and (2) infrastructure damage and functionality codes. Part of this analysis will be merging these two components to spatially view the infrastructure damage and functionality.

The spatial maps consist of:

  1. Building locations (represented as tax lots)
  2. Hazard maps: earthquake ground shaking and tsunami inundation hazard maps

The infrastructure damage and functionality codes implement Monte-Carlo methods to probabilistically define damages, losses, and connectivity. The four infrastructure codes consist of:

  1. Buildings: expected damage and economic losses to buildings.
  2. Electric power system: a connectivity analysis of each building to the electric substation. There is one electric substation within Seaside.
  3. Transportation system: a connectivity analysis of each building to critical infrastructure. Critical infrastructure at Seaside consists of two fire stations and one hospital.
  4. Water supply system: a connectivity analysis of each building to their respective pumping station. There are three water pumping stations within Seaside, and each building is assigned to a single pumping station.

Hypotheses

I hypothesize that infrastructure damage will not vary spatially for the earthquake hazard but will vary spatially for the tsunami hazard (e.g. with distance from the coast). The relative damages due to the tsunami will also increase as the intensity of the hazard increases. That is, for small events the damages will be dominated by the earthquake, whereas for larger events the damages will be dominated by the tsunami.

Approaches

While color-coding tax lots by economic loss provides a means to visualize damages throughout a study region, I am interested in learning about kernel density estimation and hot spot analysis to identify vulnerable regions (not just individual buildings). I am also interested in learning about different spatial network analysis methods, as only connectivity analyses within the infrastructure networks (electric, transportation, and water) have been considered so far.
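To make the kernel density idea concrete, here is a small sketch of a loss-weighted Gaussian kernel density evaluated at one location. It assumes tax-lot centroids in projected coordinates (meters) and a single fixed bandwidth; both the function name and the choice of a Gaussian kernel are illustrative assumptions rather than a description of any particular GIS tool.

```python
import math


def kernel_density(x, y, points, weights, bandwidth):
    """Weighted 2-D Gaussian kernel density at location (x, y).

    points    -- list of (x, y) centroids, e.g. one per tax lot
    weights   -- one weight per point, e.g. expected economic loss
    bandwidth -- kernel standard deviation, in the same units as the coordinates
    """
    # Normalising constant of a 2-D Gaussian with equal spread in x and y.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for (px, py), w in zip(points, weights):
        dist_sq = (x - px) ** 2 + (y - py) ** 2
        total += w * math.exp(-dist_sq / (2.0 * bandwidth ** 2))
    return norm * total
```

Evaluating this function over a regular grid gives a smooth surface in which clusters of high-loss lots stand out as regions rather than as individual buildings. The result is sensitive to the bandwidth, so that choice would need to be justified for the study area.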

Expected outcome

I’m hoping to produce maps showing how damages and economic losses relate to both joint hazards (earthquake and tsunami) and independent hazards (just earthquake or tsunami). I would also like to produce maps showing the connectivity of individual tax lots to critical infrastructure. Furthermore, I would like to investigate visualizing both the economic losses and the connectivity analysis through color-coded tax lots, kernel density estimation, and hot-spot analysis.

Significance

The ability to spatially isolate vulnerable areas will give community planners and resource managers a means to better prepare mitigation plans. Deaggregating the damages and losses by infrastructure and hazard will isolate the relative importance of each and can assist in mitigation measures. For example, if the earthquake is identified as the dominant cause of building damage within a specific region, planners and resource managers can support retrofit options for homeowners within that region.

Level of preparation

  1. Arc-info: novice
  2. ModelBuilder and/or GIS programming in Python: Although I haven’t done GIS programming in Python, I am highly proficient in Python and am comfortable working with GIS data. Learning how to combine Python and GIS should not be difficult.
  3. R: novice
  4. Image processing: novice
  5. Other relevant software: I’m proficient in QGIS.

Examining the Spatial Relationships between Seascapes and Forage Fishes

Description of Research Question

My objective is to study the spatial relationships between sea-surface conditions and assemblages of forage fish in the California Current System from 1998 to 2015. Forage fish are a class of fishes that are of importance to humans and resource managers, as they serve as the main diet for economically and recreationally valuable large-game fishes. Using a combination of remotely sensed and in-situ data, sea-surface conditions can be classified into distinct classes, known as “seascapes,” that change gradually over time. These seascapes, which are based on a conglomeration of measurable oceanographic conditions, can be used to infer conditions within the water column. My goal is to determine if any relationship exists between forage fish assemblages and certain seascape classes by examining the changes in the spatial patterns related to each over time. Forage fish assemblage may be related to seascapes as certain seascape classes may correspond to physical (temperature) or biological (chlorophyll concentration) conditions, either on the surface or in the water column, which happen to be favorable for a specific species or group of species.

My question can be formatted as: “How is the spatial pattern of forage fish assemblage in the California Current System related to the spatial pattern of seascapes based on the sea-surface conditions used to classify the seascapes (temperature, salinity, and chlorophyll)?”

Description of Data

Midwater trawls have been conducted annually by the National Oceanic and Atmospheric Administration’s (NOAA) Southwest Fisheries Science Center (SWFSC) in an attempt to monitor the recruitment of pelagic rockfish (Sebastes spp.) and other epipelagic micronekton at SWFSC stations off California. The trawls have informed a dataset that represents overall abundance of all midwater pelagic species that commonly reside along the majority of the nearshore coast of California from 1998 to 2015. Each trawl contains both fish abundance, recorded in absolute abundance, and location data, recorded in the form of latitude and longitude. The dataset also includes a breakdown of species by taxa, which will be used to determine if a fish is a “forage fish.”

Seascapes have been classified using a combination of in-situ data (from the trawls) and remotely sensed data from NASA’s MODIS program. Seascapes were classified using the methods described in Kavanaugh et al., 2014 and represent the seascape class in the immediate area in which each trawl occurred. Seascapes are classified at 1 km and 4 km spatial resolution and at 8-day and monthly temporal resolution. Each seascape has been assigned an ID number which is used to identify similar conditions throughout the dataset.

The map below shows the locations of every trawl over the course of the study.

Figure 1: Map showing all trawl sites contained in the dataset. Trawls occurred at a consistent depth using consistent methods from 1998 through 2015.

Hypotheses

I hypothesize that any measurable changes in the spatial extent of certain seascape classes will also be identifiable in the spatial variability of forage fish assemblage over time. Preliminary multivariate community structure analysis has shown some statistically significant relationships between certain species and certain seascape classes using these data. If spatial patterns do exist, I expect there to be some relationship between the surface conditions and the fish found at the depth of the midwater trawls.

Hypothesis: I expect the spatial distribution of forage fish species to be related to spatial distribution of seascape conditions based on the variables used to classify the seascapes (temperature, salinity, chlorophyll).

Potential Approaches

I hope to utilize the tools within both R and the ArcGIS Suite of products to identify and measure spatial patterns in both seascape classes and forage fish assemblages over the designated time period. I also aim to run analyses to determine if any relationship exists between the variability in spatial extent of each variable. These analyses will be used to supplement the previously completed multivariate community structure analyses done on these data.

For Exercise 1, I will identify and test for the spatial patterns of the forage fish family Gobiidae (Goby) and Seascape Class 10, as initial indicator species analyses indicated that there may be a relationship between the two. In Ex. 2, cross-correlation and/or GWR will examine relationships between these patterns.

Expected Outcome/Ideal Outcome

Ideally, I would like to determine and define the relationship between seascape classes and forage fishes in the California Current System over the designated period of time. Any sort of definitive answer, positive, negative, or none, provides valuable insight into the relationships between this remotely sensed data and these fishes. If that claim could be bolstered by a visual which outlines the relationship between my variables (or lack thereof), that would be icing on the theoretical cake.

Significance of Research

Measuring the predictability of forage fish assemblage has wide-ranging impacts and could be found useful by policymakers, fishermen, conservationists, and even members of the general public. Additionally, this research can be used to underscore the importance of seascape-based management or seascape approaches to ecology or management. This research could also be used as inspiration for future studies about different species, taxa, or geographic locations.

Level of Preparation

I completed a minor in GIS during my undergraduate studies, but have not had to use those skills for about 15 months. After some time, I believe that I will be extremely comfortable using the software. I have basic exposure to R (mostly in the context of statistical analysis) and have used Codecademy to further my understanding of Python. I did some image processing during my undergraduate studies as well, but am not particularly comfortable with that set of skills. I have used leaflet to embed my maps and create time series before, so that could be an option for this work.

WORKS CITED

Kavanaugh M. T., Hales B., Saraceno M., Spitz Y.H., White A. E., Letelier R. M. 2014. Hierarchical and dynamic seascapes: A quantitative framework for scaling pelagic biogeochemistry and ecology, Progress in Oceanography, Volume 120, Pages 291-304, ISSN 0079-6611, https://doi.org/10.1016/j.pocean.2013.10.013.

Sakuma, K., Lindley, S. 2017. Rockfish Recruitment and Ecosystem Assessment Cruise Report.  United States Department of Commerce: National Oceanic and Atmospheric Administration, National Marine Fisheries Service.

-Willem Klajbor, 2019

Seth Rothbard My Spatial Problem

A description of the research question that you are exploring

Of the 31 pathogens known to cause foodborne illness, Salmonella is estimated to contribute to the second highest number of illnesses, the most hospitalizations, and the highest number of deaths in the US when compared to other domestically acquired foodborne illnesses (1). Salmonellosis is the bacterial illness caused by Salmonella infection. There are an estimated 1.2 million cases of salmonellosis and around 450 deaths every year in the US due to Salmonella (1). Over time there has been marked variability in the number of reported cases per year. Salmonellosis is a mandatory reportable illness in Oregon, and available information indicates that incidence rates of this disease have been stable since the new millennium (2). The objective of this study is to perform a spatial analysis of lab-confirmed Salmonella cases in Oregon counties for 2008-2017, the years for which county-level data are available, and to determine whether some counties have a higher risk of Salmonella infection than others. I also wish to explore the socioeconomic factors associated with counties that have high incidence rates. My research question is: How are spatial patterns of Salmonella related to spatial patterns of socioeconomic factors? Certain socioeconomic patterns, such as lower levels of education and income, may increase rates of Salmonella in these populations as a result of improper food preparation and cooking, less strict sanitation practices, and/or more frequent consumption of high-risk foods.

A description of the dataset you will be analyzing, including the spatial and temporal resolution and extent

The Oregon Health Authority has created a database called the Oregon Public Health Epidemiology User System (ORPHEUS) as a repository for relevant exposure and geospatial data related to disease cases reported to public health departments all across the state. This database has been maintained by the state since 1989 and includes information regarding various diseases. The dataset I will be using is a collection of every single reported non-typhoidal Salmonella case within Oregon from 2008-2017. The distinction between typhoidal Salmonella and non-typhoidal is that the typhoidal variety of Salmonella causes typhoid fever while non-typhoidal Salmonella causes salmonellosis (a common gastrointestinal disease and a type of “food poisoning” as it is usually referred to). The spatial resolution of this data has been obscured to the county level to protect personal privacy and confidentiality. I will also be using data from the American Community Survey and the CDC’s Social Vulnerability Index. These datasets contain social vulnerability related variables for Oregon at the county level. In the case of the American Community Survey, data is available for the years 2009-2017 and the Social Vulnerability Index has data available for 2014 and 2016. Yearly county population estimates will also be used from Portland State University’s Population Research Center. Because of the high amounts of available data I will choose to start my exploratory analysis for Oregon in 2014 as all data is reported for that year.

Hypotheses: predict the kinds of patterns you expect to see in your data, and the processes that produce or respond to these patterns.

I expect counties with younger populations (higher proportions of infants and newborns), as well as counties with higher proportions of females, to have higher adjusted incidences of Salmonella. Prior surveillance suggests that children under the age of 5 are at the highest risk for Salmonella infection, likely because of their developing immune systems and how they interact with their environment. Specifically, many young children do not or cannot wash their hands before touching their mouths. Females are also known to have a higher risk of Salmonella infection. The mechanism is not well understood, though one explanation is that females are more likely to interact with young children. I also expect more socially vulnerable counties to have higher rates of Salmonella infection, since higher rates of poverty and lower levels of education are often associated with worse health outcomes.

Approaches: describe the kinds of analyses you ideally would like to undertake and learn about this term, using your data.

I would like to calculate age- and sex-adjusted rates of disease for each county in Oregon. I am also interested in undertaking cluster analysis and calculating spatial autocorrelation among Oregon counties over time. Finally, I would like to perform a regression of county disease incidence rates on the socioeconomic factors found in the American Community Survey and Social Vulnerability Index. I would be interested in learning about spatial Poisson regression to assess which variables are significantly associated with the presence of disease. I would also be interested in learning about hotspot analysis to evaluate whether there are areas of Oregon with significantly higher disease rates. Ideally, all of my analyses will be performed in R and ArcGIS.
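For readers unfamiliar with adjusted rates, the sketch below shows direct standardization, one common way to adjust county rates for age (or age and sex) structure. It is written in Python only for illustration and is not the planned R workflow. The function names, the strata, and the per-100,000 scaling are assumptions, and the choice of standard population is left open.

```python
def crude_rate(cases, population, per=100_000):
    """Overall rate, ignoring how cases and people are split across strata."""
    return sum(cases) / sum(population) * per


def direct_adjusted_rate(cases, population, standard_population, per=100_000):
    """Directly standardized rate.

    cases, population   -- county counts for each stratum (e.g. age-sex group)
    standard_population -- counts for the same strata in a chosen standard
                           population (which standard to use is an assumption)
    """
    std_total = sum(standard_population)
    adjusted = 0.0
    for c, p, s in zip(cases, population, standard_population):
        stratum_rate = c / p                        # rate within this stratum
        adjusted += stratum_rate * (s / std_total)  # weight by the standard share
    return adjusted * per
```

Because every county is weighted by the same standard population, differences in the adjusted rates reflect differences in stratum-specific rates rather than differences in age or sex composition.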

Expected outcome: what do you want to produce — maps? statistical relationships? other?

I would like to produce choropleth maps of adjusted Salmonella infection rates as well as for hotspot analysis. I want to produce regression models to describe how incidence rates of Salmonella vary across different socioeconomic indicators. I also want to create graphs to describe spatial autocorrelation patterns as well as to show disease rates over time.

Significance. How is your spatial problem important to science? to resource managers?

This analysis will be helpful to identify county populations which are at higher risk for Salmonella infections. The inclusion of social vulnerability variables will be useful for state/local policy makers. Reforms can be proposed or further studied to assess how addressing the needs of particularly vulnerable populations will affect the incidence of Salmonella. This research will be beneficial for further public health research as trends found here may also hold true for other foodborne illness. The aim of this research is to benefit the health of communities in Oregon by highlighting the association between social vulnerability and the risk of foodborne illness.

Your level of preparation: how much experience do you have with (a) Arc-Info, (b) Modelbuilder and/or GIS programming in Python, (c) R, (d) image processing, (e) other relevant software

I have no experience with Arc-Info, programming in Python, or image processing. I have some limited experience with Modelbuilder. I am very comfortable performing statistical analyses in R and have some experience using its packages to create maps.

References

  1. Estimates of Foodborne Illness in the United States. Centers for Disease Control and Prevention. https://www.cdc.gov/foodborneburden/2011-foodborne-estimates.html#modalIdString_CDCTable_0. Published July 15, 2016. Accessed July 31, 2018.
  2. Oregon Health Authority. Salmonellosis 2016 Report. Oregon Public Health Division. Available at: https://www.oregon.gov/OHA/PH/DISEASESCONDITIONS/COMMUNICABLEDISEASE/DISEASESURVEILLANCEDATA/ANNUALREPORTS/Documents/2016/2016-Salmon.pdf. Accessed July 31, 2018.

Natural Resource Governance Perceptions and Environmental Restoration

Research Question

How is the spatial pattern of individuals' perceptions of natural resource governance related to the spatial pattern of environmental restoration sites via distance and abundance of improved sites?

  

My Datasets

Puget Sound Partnership Environmental Outputs Data

The Puget Sound Partnership—a governmental monitoring entity—keeps records of environmental restoration projects throughout the Sound. There are GPS points for restoration site locations across their governing boundaries. I have downloaded the points, but I am still working on figuring out this dataset. There are over 12,000 entries, and many appear duplicative.

Puget Sound Partnership Social Data

I stratified a random sample (28% response rate, n = 2,323) of the general public in the Puget Sound region of Washington during a single time period. The data are from a survey of subjective wellbeing related to natural environments. I am specifically examining the first block of seven questions, which relate to perceptions of natural resource governance. These questions have been indexed into one perception score. Around 1,770 individuals gave location data (cross street and zip code), which have been converted to GPS points. I also have demographic information for individuals.

 

Hypotheses

Based on current research, there is a significant correlation between subjective wellbeing and environmental metrics such as green space and air pollution (Diener, Oishi, and Tay 2018). I hypothesize that 1) shorter distances between individuals and restoration sites, and a greater number of restoration sites near individuals, will correlate positively with governance perceptions, and 2) positive environmental outcomes will correlate positively with governance perceptions.

 

Approaches

I would like to test the statistical significance of distance from individual to restoration sites on governance perceptions, and test whether the number of sites within a radius moderates that relationship. I have previously created a plot of perception versus distance from other individuals, and perceptions are not spatially autocorrelated. To expand on this work, I would like to use spatial relationship modeling approaches, such as geographically weighted regression.
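The two predictors described here, distance to the nearest restoration site and the number of sites within a radius, can be derived from the GPS points before any modeling. The sketch below is one hedged way to do that with great-circle distances; the function names and the tuple layout are assumptions, and the radius would need to be chosen and justified.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def site_exposure(person, sites, radius_km):
    """Return (distance to nearest site in km, number of sites within radius_km).

    person -- (lat, lon) of one respondent
    sites  -- list of (lat, lon) restoration site locations
    """
    distances = [haversine_km(person[0], person[1], s[0], s[1]) for s in sites]
    nearest = min(distances)
    count = sum(1 for d in distances if d <= radius_km)
    return nearest, count
```

With roughly 1,770 respondents and 12,000 sites this brute-force loop is about 21 million distance calculations, which is slow but workable in plain Python. Duplicate site records would need to be removed first, since they would inflate the counts.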

  

Expected outcome

I would like to produce statistical relationships between my dependent and independent variables. My dependent variables are good governance and life satisfaction (collected with demographic information). My independent variables are age, sex, race, area (self-indicated urban, suburban, or rural), years lived in the Puget Sound, political ideology (a proxy from voting precincts), income, education, number of restoration sites, and environmental improvement score.

I expect my relationships to be correlational and to produce betas, p-values, and R-squared values, which I will display as tables. Given the large volume of points (n = 1,770 individuals and n = 12,000 restoration sites), I do not believe maps would provide visually useful images. I already have maps of both the perception points and the restoration points.

 

Significance

Incorporating aspects of subjective wellbeing and general public perspectives about natural resources into scientific assessment and decision-making processes could help managers improve human wellbeing and environmental outcomes simultaneously. The links between metrics of subjective wellbeing related to natural environments and metrics of ecosystem health have not been studied holistically, and there are gaps in our understanding of the connections among these systems. Research suggests that good governance plays an important role in improving wellbeing because governing systems provide goods and services that make people better off (Landman 2003). Current research on good governance perceptions has shown links to support for environmental improvement measures, but also shows that individuals care less about the environmental effectiveness of those measures (Bennett et al. 2017). It is not yet known whether positive perceptions are linked to environmental conditions. To understand the connections between natural systems and subjective wellbeing, further research is needed that includes case studies that can illuminate general trends, as well as analyses that can show connections spatially (Milner‐Gulland et al. 2014).

 

Level of preparation

  • Arc-Info

I have taken one class that used ArcPro; GEOG 560.

  • Modelbuilder and/or GIS programming in Python

In GEOG 560 we completed one exercise that used Modelbuilder.

  • R

I have taken one class on R (FW 599) as well as GEOG 561, which primarily used R, and I have been using R actively for my own analyses for a few months.

  • image processing

I took three digital photo classes using Adobe Photoshop and am very proficient in its use. I often use it to amend maps I make in Arc.

  • other relevant software

I do not believe I have expertise in any other relevant software.

 

Literature Cited

Bennett, Nathan J., Robin Roth, Sarah C. Klain, Kai Chan, Patrick Christie, Douglas A. Clark, Georgina Cullman, et al. 2017. “Conservation Social Science: Understanding and Integrating Human Dimensions to Improve Conservation.” Biological Conservation 205 (January): 93–108. https://doi.org/10.1016/j.biocon.2016.10.006.

Diener, Ed, Shigehiro Oishi, and Louis Tay. 2018. “Advances in Subjective Well-Being Research.” Nature Human Behaviour 2 (4): 253. https://doi.org/10.1038/s41562-018-0307-6.

Landman, Todd. 2003. “Map-Making and Analysis of the Main International Initiatives on Developing Indicators on Democracy and Good Governance.” Human Rights Centre University of Essex.

Milner‐Gulland, E. J., J. A. McGregor, M. Agarwala, G. Atkinson, P. Bevan, T. Clements, T. Daw, et al. 2014. “Accounting for the Impact of Conservation on Human Well-Being.” Conservation Biology 28 (5): 1160–66. https://doi.org/10.1111/cobi.12277.

 

Exploring spatial variation in drivers of soil CO2 efflux in HJ Andrews Forest

Description of Research Question

My objective is to capture modes of variance that exist between and within a subset of variables that I expect to correlate most strongly with soil CO2 efflux in the HJ Andrews (HJA) forest. By stratifying the forest, I plan to determine future sampling sites that will be used to explore the relationship between soil C inputs, soil C stocks and CO2 outputs. Aboveground and belowground biomass are major sources of soil carbon and drivers of soil respiration, so biomass will be used as a proxy for soil CO2 efflux for the purposes of this analysis.

Research Question: How is the spatial distribution of biomass in the HJA forest related to stand age, slope, aspect, elevation and geomorphon as a result of varying degrees of exposure to solar radiation, wind gusts, precipitation, humidity, etc.? What is the overall variance of the HJA forest along these vectors and is that variance spatially autocorrelated?
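One common way to quantify the spatial autocorrelation asked about here is Moran's I. The sketch below is a minimal NumPy version for a gridded variable, assuming rook contiguity (each cell's neighbors are the cells directly above, below, left, and right) and equal weights. The grids are toy arrays, not HJA data.

```python
import numpy as np

def morans_i_rook(grid):
    """Moran's I for a 2-D grid using rook (edge-sharing) neighbors with equal weights."""
    z = np.asarray(grid, dtype=float)
    z = z - z.mean()                               # deviations from the grid mean
    horiz = np.sum(z[:, :-1] * z[:, 1:])           # left-right neighbor products
    vert = np.sum(z[:-1, :] * z[1:, :])            # up-down neighbor products
    rows, cols = z.shape
    n_pairs = rows * (cols - 1) + (rows - 1) * cols
    # Each pair is counted once in both the numerator and n_pairs, so the
    # symmetric double counting in the usual formula cancels out.
    return (z.size / n_pairs) * (horiz + vert) / np.sum(z ** 2)

# Toy grids (assumption: stand-ins for a biomass raster)
gradient = np.add.outer(np.arange(10), np.arange(10))   # smooth trend
checker = np.indices((4, 4)).sum(axis=0) % 2            # alternating cells

print(morans_i_rook(gradient))   # positive: similar values cluster together
print(morans_i_rook(checker))    # negative: neighbors are dissimilar
```

Values near +1 indicate that neighboring cells resemble each other, values near 0 indicate no spatial structure, and negative values indicate that neighbors tend to differ.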

Description of Dataset

I will use LiDAR data from 2011 and 2014 at 0.3-0.5 m vertical resolution and 1 m² horizontal resolution covering the Lookout Creek Watershed and the Upper Blue River Watershed to the northern extent of HJA. These LiDAR data include a high hit model and a bare earth model. I will also use NAIP imagery, which has 1 m resolution and covers years between 2002 and 2018, to approximate forest stand age.

Hypotheses

I expect biomass to be greater on south-facing slopes, where more exposure to solar radiation results in faster rates of vegetation growth. I expect older stands to have greater biomass. I expect steeper slopes to correspond to less biomass due to more weathering and thinner soil horizons, which support less growth. I expect higher elevations to correspond to lower biomass due to greater sustained winds, higher windspeeds and more snow accumulation. Because geomorphon describes the geometric structure of the terrain, it integrates multiple factors that could positively or negatively correlate with biomass. For example, I expect a ridge to negatively correlate with biomass because of the combination of greater slopes and more exposure to winds, while I expect a valley to positively correlate with biomass due to more wind protection and thicker soil horizons with more organic matter and more water retention.

Approaches

With an end goal of identifying sampling sites, I’ll need to cluster or stratify the HJA forest. I’ll begin with a clustering analysis, then perform supervised and unsupervised classification, followed by sensitivity analysis comparing the results of the clustering analysis and both classifications. I will need to address spatial autocorrelation either as part of the clustering analysis or separately. I’ll need to plan sampling by accessibility, so I’ll examine an HJA roads layer as well.
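As a sketch of what the clustering step might look like, here is a small k-means implementation in NumPy applied to standardized terrain covariates, together with the share of total variance explained by the resulting strata. K-means is only one possible choice of clustering method, and the covariates below (elevation and slope for two synthetic groups of cells) are placeholders for the real raster layers.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def assign(X, centers):
    """Label each row of X with the index of its nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and center updates."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

def variance_explained(X, labels):
    """Fraction of total sum of squares accounted for by the cluster means."""
    total = ((X - X.mean(axis=0)) ** 2).sum()
    within = sum(((X[labels == j] - X[labels == j].mean(axis=0)) ** 2).sum()
                 for j in np.unique(labels))
    return 1.0 - within / total

# Synthetic cells (assumption: stand-ins for LiDAR-derived covariates)
rng = np.random.default_rng(42)
low = np.column_stack([rng.normal(500, 20, 50), rng.normal(10, 2, 50)])
high = np.column_stack([rng.normal(1200, 20, 50), rng.normal(35, 2, 50)])
X = standardize(np.vstack([low, high]))

labels, centers = kmeans(X, k=2)
print(variance_explained(X, labels))
```

Standardizing first keeps elevation (in meters) from dominating slope (in degrees) in the distance calculation. This version ignores cell location entirely, so spatial autocorrelation would still need to be handled separately, for example by adding coordinates as covariates or by thinning the samples.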

Expected Outcome

I plan to produce maps (or use/improve on already available ones) at a spatial scale relevant to my study so I can identify potential sampling sites. I plan to map biomass across the HJA forest and produce a stratification of factors most closely related to soil CO2 efflux. Depending on resolution of the data and the results of the stratification, I may need to constrain my analysis. I plan to produce a statistical summary of the strata relating and describing the covariates and how much variance is explained by the stratification.

Significance

Carbon sequestration is a highly relevant research area where many unknowns still exist. Given that soil is an enormous C reservoir, small changes in soil C stocks can have huge impacts on the rest of the C cycle. As CO2 is a potent greenhouse gas, more release of C from soil can cause greater warming of our planet and can lead to a positive feedback loop where the warming cycle is amplified. It is in our best interest to have a good understanding of current soil C stocks and fluxes in forested systems so we can hypothesize how they might change under different or future conditions. By using biomass as a proxy for soil CO2 efflux, I will identify locations that are likely to have greater CO2 efflux and I will be able to make informed predictions about which drivers are most significantly correlated to CO2 efflux. I will be able to test these analyses in the future using field sampling techniques.

My level of preparation/proficiency

I have limited experience with Arc-Info (GEOG 560) and I’ve used Modelbuilder a few times. I have no experience with GIS programming in either Python or R, but I am proficient with statistical coding in R (3 statistics courses and my own data analysis) and am comfortable seeking answers to questions in the R environment. I have no experience with image processing.

What is rural? Creation and comparison of health disparity-inclined rural indices in Texas

A description of the research question that you are exploring.

I am exploring rural classification of counties in the state of Texas by creating two rural indices and comparing them to one another to determine the effects of specific weighted measures on rurality index score. One of the indices will contain basic rural indicator variables, while the other will contain the basic variables plus more complex indicators of rurality. Specifically, I would like to compare the indicator variables to one another to see how much each contributes to an overall rurality score in Texas.

A description of the dataset you will be analyzing, including the spatial and temporal resolution and extent.

Various rural indicator variables will need to be obtained before combining them into indices. Previous geographical research has indicated that a variety of measures can indicate how rural or urban an area is. Some of these measures include population density, ethnic diversity, land use, household income, road density, percent of population with health insurance, and more. For the majority of these indicator variables, the sources will be basic 2010 US census data, 2014 US census TIGER/Line data, and 2011 national land cover database (NLCD) data. Basic US census data exists at the census block and census tract level in polygons, while NLCD data exists at 30m by 30m spatial resolution in raster grid form and TIGER/Line data exists at 1km spatial resolution in raster grid form.

Hypotheses: predict the kinds of patterns you expect to see in your data, and the processes that produce or respond to these patterns.

I expect there will be significant differences in rurality index score for Texas counties when comparing a basic rural index containing only population density, income, and land use to a more complex index that also measures rural/urban status via diversity, percent uninsured, and road density. Compared with urban areas, rural areas commonly have lower healthcare access, lower average socioeconomic status, and a higher percent Caucasian population, so these variables could help indicate what constitutes rural and urban. I also expect specific variables to contribute significantly more to rurality than others. For example, population density is likely to contribute strongly to rurality. In short, I expect the spatial and statistical pattern of rurality in Texas will become more dispersed and even across the state when including health-related variables because of the increased multidimensionality and contextual factors these variables will provide.

Approaches: describe the kinds of analyses you ideally would like to undertake and learn about this term, using your data.

I am planning to use various methods to convert the census block/tract and raster grid indicator variables to county data, likely via zonal statistics or similar methods in ArcGIS. I would also like to use statistical weighting procedures to create both the “basic” and “complex” rural indices. Some weighting procedures I have heard of that could work for this include principal component analysis (PCA) and factor analysis. PCA specifically could be used because it produces indices weighted by the proportion of variance attributable to each variable in the measurement of rurality.
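To make the PCA weighting idea concrete, here is a minimal NumPy sketch that builds a single index from the first principal component of standardized indicators. The county values are synthetic, the column names are placeholders, and using only the first component is an assumption; a real index might combine several components weighted by their explained variance.

```python
import numpy as np

def pca_index(X, orient_col=0, orient_sign=-1):
    """Return first-component scores, its loadings, and explained-variance ratios.

    orient_col / orient_sign fix the arbitrary sign of the component so that
    the chosen indicator loads in the intended direction.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)       # standardize each indicator
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]              # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    weights = eigvecs[:, 0]
    if np.sign(weights[orient_col]) != orient_sign:
        weights = -weights
    return Z @ weights, weights, eigvals / eigvals.sum()

# Synthetic indicators for 254 counties (assumption: not real census values)
rng = np.random.default_rng(7)
rurality = rng.normal(size=254)                    # hidden "true" rurality
log_density = -1.5 * rurality + rng.normal(0, 0.5, 254)
income = -0.8 * rurality + rng.normal(0, 0.5, 254)
pct_developed = -1.0 * rurality + rng.normal(0, 0.5, 254)
X = np.column_stack([log_density, income, pct_developed])

scores, weights, explained = pca_index(X, orient_col=0, orient_sign=-1)
print(weights, explained)
```

Here the sign is oriented so that higher scores mean more rural (population density loads negatively). The loadings in `weights` play the role of the variable weights, and `explained[0]` shows how much of the total variance the index captures.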

Expected outcome: what do you want to produce — maps? statistical relationships? other?

I would like to create maps of Texas comparing county rural index scores for the two indices for visual comparison. In addition, I would like to statistically compare the two indices and determine which specific indicator variables contributed most to the differences in county rurality scores between the two indices.
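One simple way to compare the two indices statistically is a rank correlation plus a per-county rank shift. The sketch below is only an illustration with synthetic scores; the names `basic` and `complex_` are placeholders for the two index vectors, and ties are broken by order of appearance rather than averaged.

```python
import numpy as np

def ranks(values):
    """Rank values from 0 (smallest) upward; ties broken by order of appearance."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(len(values))
    return r

def compare_indices(basic, complex_):
    """Spearman-style rank correlation and each county's change in rank."""
    rb, rc = ranks(basic), ranks(complex_)
    rho = np.corrcoef(rb, rc)[0, 1]                # Pearson correlation of ranks
    return rho, rc - rb

# Synthetic index scores (assumption: stand-ins for the two county indices)
rng = np.random.default_rng(3)
basic = rng.normal(size=254)
complex_ = basic + rng.normal(0, 0.5, 254)

rho, shift = compare_indices(basic, complex_)
most_changed = np.argsort(-np.abs(shift))[:10]     # counties whose rank moved most
print(rho, most_changed)
```

Counties with the largest absolute shift are the ones whose classification depends most on the added health-related indicators, which makes them natural candidates to highlight on the comparison maps.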

Significance. How is your spatial problem important to science? to resource managers?

This spatial problem is significant because rural/urban classification is inconsistent in rural health disparities research and is commonly an afterthought in comparison to the health outcome being studied. Existing measures of rurality were not created for health disparities research and are instead most useful for bureaucratic and economic purposes (Meilleur et al., 2013). This research will improve the classification methods for rurality by introducing a more scientific and health research-inclined method. Further, this research will statistically compare specific indicator variables within indices to determine those that are most significant for rural classification.

Your level of preparation: how much experience do you have with (a) Arc-Info, (b) Modelbuilder and/or GIS programming in Python, (c) R, (d) image processing, (e) other relevant software

I am an intermediate user of ArcGIS but have little experience in Modelbuilder and Python GIS programming. I am proficient in statistical programming in R, an intermediate user of ENVI for image processing, and have also used R for spatial analysis.

 

References

Meilleur, A., Subramanian, S. V., Plascak, J. J., Fisher, J. L., Paskett, E. D., & Lamont, E. B. (2013). Rural residence and cancer outcomes in the United States: issues and challenges. Cancer Epidemiology and Prevention Biomarkers, 22(10), 1657–1667.