My spatial problem deals mainly with determining what scale best fits both the birth registry point data in Texas (4.7 million births from 1996–2002) and the available spatial resolution of MODIS aerosol optical thickness data. A smaller cell size will more accurately capture ambient particulate matter exposure levels, but may leave too many cells with zero births. Increasing the cell size will give better coverage of the state, but may weaken the spatial statistical relationships with low birth weight (LBW) rates and reduce the accuracy of ambient air pollution exposure estimates. A model will need to be created that combines ground-based air monitor exposure levels and satellite data to accurately determine rural particulate matter exposures.
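As a toy illustration of this tradeoff, here is a sketch that grids made-up points at several candidate cell sizes and reports how many cells end up with zero births; the coordinates and counts are placeholders, not the actual birth registry data.

```python
# Minimal sketch of the cell-size tradeoff: count points per grid cell at
# several cell sizes and report the fraction of empty cells.
import numpy as np

rng = np.random.default_rng(0)
lons = rng.uniform(-106.6, -93.5, 100_000)   # stand-in for Texas birth points
lats = rng.uniform(25.8, 36.5, 100_000)

for cell in (0.1, 0.5, 1.0):                 # candidate cell sizes in degrees
    x_edges = np.arange(-106.6, -93.5 + cell, cell)
    y_edges = np.arange(25.8, 36.5 + cell, cell)
    counts, _, _ = np.histogram2d(lons, lats, bins=[x_edges, y_edges])
    empty = np.mean(counts == 0)
    print(f"{cell:.1f} deg cells: {counts.size} cells, {empty:.1%} empty")
```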

A temporal problem deals with creating a model that will determine ambient air pollution exposure levels in each cell during known susceptibility windows across all 4.7 million pregnancies. An analysis will need to be done to characterize the variability of particulate matter levels on multiple time scales and match it to pregnancy susceptibility windows.
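As a sketch of how the windowing might work for a single pregnancy, the snippet below averages a synthetic daily particulate matter series over illustrative trimester windows; the dates, series, and window definitions are all placeholders, not the study's.

```python
# Mean exposure over gestational-week windows for one (synthetic) pregnancy.
import numpy as np
import pandas as pd

days = pd.date_range("1996-01-01", "2002-12-31", freq="D")
pm25 = pd.Series(np.random.default_rng(1).gamma(2.0, 5.0, len(days)),
                 index=days)  # fake daily PM2.5 for one grid cell

def window_exposure(conception, start_week, end_week, series=pm25):
    """Mean PM2.5 between two gestational weeks (illustrative windows)."""
    start = conception + pd.Timedelta(weeks=start_week)
    end = conception + pd.Timedelta(weeks=end_week)
    return series.loc[start:end].mean()

conception = pd.Timestamp("1999-03-15")  # hypothetical conception date
for name, (w0, w1) in {"trimester 1": (0, 13),
                       "trimester 2": (13, 26),
                       "trimester 3": (26, 39)}.items():
    print(name, round(window_exposure(conception, w0, w1), 1))
```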

A combination of the spatial and temporal analyses will incorporate both time lags and spatial clustering. This aspect of the project should be relatively straightforward. This section will aim to determine 1) whether LBW cases are clustered in space and time, and 2) whether individual emitters (using the EPA TRI data set) are spatially and temporally correlated with LBW.

Below are some examples of different cell sizes and temporal scales.

1. A 2008–2009 LBW hot spot analysis based on Texas census tracts

[Figure: 2008–2009 LBW hot spot analysis by census tract]

2. A hot spot analysis of 2008–2009 LBW rates using 0.1° × 0.1° (roughly 10 km × 10 km) grid cells

[Figure: 2008–2009 LBW hot spot analysis on the 0.1-degree grid]

3. A hot spot analysis of 1996–2009 LBW rates using 1° × 1° (roughly 100 km × 100 km) grid cells.

[Figure: 1996–2009 LBW hot spot analysis on the 1-degree grid]
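For the gridded versions, one way to run the hot spot statistic outside ArcGIS is the Getis-Ord Gi* implementation in PySAL; the sketch below uses a synthetic 20 × 20 lattice of fake LBW rates and shows the mechanics only, not the maps above.

```python
# Getis-Ord Gi* hot spot sketch on a synthetic gridded rate surface.
import numpy as np
from libpysal.weights import lat2W
from esda.getisord import G_Local

rng = np.random.default_rng(2)
rates = rng.normal(0.08, 0.02, 400)   # fake LBW rates on a 20x20 grid
w = lat2W(20, 20, rook=False)         # queen-contiguity lattice weights
gi_star = G_Local(rates, w, star=True, permutations=999)

hot = (gi_star.Zs > 1.96) & (gi_star.p_sim < 0.05)
print(f"{hot.sum()} hot-spot cells of {rates.size}")
```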


Basic methods:

This experiment was designed to quantify the distribution and abundance of invasive lionfish and then determine whether the distribution and abundance of various native prey species are correlated with lionfish distribution and abundance. Eight reefs were selected and assigned low or high lionfish density. Discrete plots and/or transects were established on the eight reefs, and the local distributions of invasive lionfish and select prey species were monitored for ten weeks. Automated video cameras were also deployed on the reefs to capture the movement of fish across the reef. Behavioral observations were made on each of the high-lionfish-density reefs at dawn, midday, and dusk to record lionfish movement and behavior on the reef. Habitat structure of the reef was measured, along with rugosity, in all the plots and transects.

Here is an example of one of the high-lionfish-density reefs. The star represents the potential hotspot for lionfish presence. Right now the plots and transects are only representations of the actual spatial distribution.

[Figure: example high-lionfish-density reef; the star marks the potential lionfish hotspot]

These are examples of lionfish locations on different surveys.

[Figure: lionfish locations from survey PTVO1]

[Figure: lionfish locations from survey PTVO2]

TO DO NEXT:

For each parameter, list an explicit prediction from the hypothesis that lionfish live and forage mostly within hotspots on reefs.  The way to envision a prediction is:  “If the hypothesis is true, then…”.  Example predictions:

(1) There should be a significant correlation between plot rankings within a reef based on distance from the hotspot (1–9) and rankings based on mean distance to lionfish through time.

(2a) Lionfish should spend more time in plots close to the hotspot than in plots further away.

(2b) Lionfish paths of movement should be close to the hotspot.


Specific tasks for each example prediction:

(1) Run a Kendall's tau rank correlation analysis of plots within a reef based on distance to the hotspot vs. mean distance to lionfish through time (see the sketch after this list).

(2a) Calculate lionfish time per plot from time budgets.

(2b) Map lionfish paths of movement from time budgets (eventually analyze distance of path from hotspot).
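The rank correlation in task (1) is a one-liner in SciPy; a minimal sketch with made-up values for the nine plots:

```python
# Kendall's tau between plot rank by hotspot distance and mean distance
# to lionfish; the nine values are placeholders for the real measurements.
from scipy.stats import kendalltau

rank_by_hotspot_distance = [1, 2, 3, 4, 5, 6, 7, 8, 9]
mean_dist_to_lionfish = [0.8, 1.1, 2.5, 2.2, 3.0, 4.1, 3.9, 5.2, 6.0]

tau, p = kendalltau(rank_by_hotspot_distance, mean_dist_to_lionfish)
print(f"tau = {tau:.2f}, p = {p:.3f}")
```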


I don't think I need to/can use Arc to get this done. I have GPS points for the center of each reef, as well as measurements for the reefs (length, width, surface area, circumference, etc.).

I need to be able to calculate distances and create paths of movement, and then calculate distances traveled. It only needs to work at the within-reef scale. I am unfamiliar with programs that can do this sort of thing; however, I don't think it would be too hard once I figure out which one to use.
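The distance and path-length arithmetic is simple once positions are expressed in meters relative to the reef center (at the within-reef scale, a local projection of the GPS points should suffice); a sketch with made-up coordinates:

```python
# Path length and distance-to-hotspot from a sequence of positions,
# assuming coordinates are already in meters relative to the reef center.
import numpy as np

path = np.array([[0.0, 0.0], [1.5, 0.5], [2.0, 2.0], [1.0, 3.5]])  # x, y in m
steps = np.linalg.norm(np.diff(path, axis=0), axis=1)  # distance per move
print(f"total distance traveled: {steps.sum():.2f} m")

hotspot = np.array([2.0, 1.0])  # hypothetical hotspot location
dist_to_hotspot = np.linalg.norm(path - hotspot, axis=1)
print(f"mean distance from hotspot: {dist_to_hotspot.mean():.2f} m")
```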

I spent two full years of my life tromping through wilderness, sacrificing life and limb for the most complete data sets humanly possible. We covered miles of remote beaches on foot to install geodetic control monuments and take beach profiles, separated by as little as 10-meter spacing. We breathlessly operated our pricey new boat in less than one meter of water to collect just one more line of multibeam sonar bathymetric data, or to get the right angle to see a dock at the end of an inlet with our mobile LiDAR. One of the most trying, and perhaps most dangerous, tasks undertaken by our four-person team was the installation of large plywood targets before LiDAR scans. Boat-based LiDAR is not yet a commonly employed data collection method, and our team had been executing foot-based GPS surveys for years. We were dead set on ground truthing our new “high-accuracy” toys before we decided to trust them entirely.

A co-worker created large plywood targets of varying configurations: black and white crosses, X's, circles, bull's-eyes, and checkerboards. We tested them all and determined that the checkerboard showed up best after processing the intensity of the returns from a dry dock scan. For the next 12 months, we hiked dozens of these 60-centimeter-square plywood nightmares all over the Olympic Peninsula for every scan, placing them at the edges of 100-meter cliffs, then hiking to the bottom to be sure we had even spacing at all elevations. After placing each target (using levels and sledges), we took multiple GPS points at its center to compare with the spatial data obtained by LiDAR. We collected so much data, other research groups were worried about our sanity.

Then, we finally sat down to look for these targets in the miles and miles of bluff and beach topography collected. Perhaps you already know what's coming? The targets were almost impossible to find; generously, we could see about one of every ten targets placed. Imagine our devastation (or that of the co-worker who had done most of the hiking and target building).

So the spatial question is rather basic: where are my targets?

I hope to answer the question with a few different LiDAR data sets currently at my disposal. The first is a full LiDAR scan of Wing Point on Bainbridge Island, WA. It's one of the smaller scans, covering only a few miles of shoreline. Deeper water near the shoreline allowed the boat to come closer to shore, so the data density is expected to be high. We hope to find a few targets, and we have GPS data corresponding to their locations. Currently, the file is about five times the size recommended by Arc for processing in ArcMap, and on first attempts it will not open in the program. While dividing the file would be easy with the proprietary software used with the LiDAR, I'd like to figure out how to do it with our own tools. This will be one of the first mountains to climb.
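One possible way around the file-size wall, assuming the scan is a standard .las file, is to tile it by point count with the open-source laspy package; the filename and chunk size below are hypothetical, and this is a sketch rather than a tested workflow.

```python
# Split an oversized .las file into smaller tiles by point count (laspy 2.x).
import laspy

chunk_points = 5_000_000  # points per output tile; tune to what ArcMap accepts
with laspy.open("wing_point.las") as reader:          # hypothetical filename
    for i, points in enumerate(reader.chunk_iterator(chunk_points)):
        with laspy.open(f"wing_point_{i:02d}.las", mode="w",
                        header=reader.header) as writer:
            writer.write_points(points)
```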

The second data set is a more recent target test scan. Since my departure, and since the frustrating reality of the plywood targets became clear, the group has found some retired Department of Transportation (DOT) signs. They have used gorilla tape and spray paint to create target patterns similar to those tested with the original batch. I've been given one line of a scan of these new target hopefuls. My goal here is to ascertain ArcMap's abilities for processing target data and aligning it with GPS points, without the added trial of trying to find the darn targets. Of course, I'm already hitting blocks with this process as well. Primarily, finding the targets requires intensity analysis. Intensities should be included in the .LAS file I'm opening in ArcMap, but they are not currently revealing themselves. My expectation is that this is related to my inexperience with LiDAR in ArcMap, but that remains to be seen.
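Independent of ArcMap, a quick way to confirm that the intensities really are present in the .LAS file is to read them directly, again with laspy (the filename and the brightness threshold are guesses):

```python
# Check intensity values in a .las file and flag unusually bright returns.
import laspy
import numpy as np

las = laspy.read("target_test_line.las")  # hypothetical filename
intensity = np.asarray(las.intensity)
print(intensity.min(), intensity.max(), intensity.mean())

# Retroreflective-style targets should return bright; threshold is a guess.
bright = intensity > np.percentile(intensity, 99)
print(f"{bright.sum()} candidate target returns of {len(intensity)}")
```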

[Figure: panorama from the target test scan (PGB_Target_Test_pano)]

Writing this post, I'm realizing that my link to spatial statistics currently seems far in the future. Just viewing the data is going to be a challenge, since the whole process is so new to me. The processing will hopefully result in an error analysis of the resulting target positions compared with the confidence of the ground-collected points. Furthermore, the Wing Point data was collected for FEMA flood control maps, and that sort of hazard map could be constructed once rasters or DEMs are created.

A large part of me is horrified by how much I've taken on, deciding to figure out how to use ArcMap for LiDAR processing when my experience with the program is already rather primitive. However, I'm excited to be learning something new and somewhat innovative, not to mention helpful to the group for whom I spent so many hours placing targets.


My research is focused on developing a web-based forage species selection tool and estimating potential yield of key forage species grown in Oregon and Sichuan Province.

Our goal is to match appropriate species with each eco-region. Related to this class, the problem is how we can use the GIS spatial analyst tools to define and display a workable number of forage production eco-regions based on topography, climate, soil characteristics, land-use and land-cover, and agricultural systems.

Although there have been several important studies directed at defining ecoregions (Bailey, 1996; Thorson, 2003; Omernik, 2004), these have been based primarily on the Köppen Climate Classification System (1936) and broad groupings of species suitable for each zone. They are not helpful in quantifying the potential annual forage yield or seasonal production profiles required for rational management of forage-livestock systems.

To provide useful guidance to Oregon and Sichuan Province farmers and ranchers, our agroecozone classification systems will use a hierarchical approach beginning with climate, with modifications due to physiography and land-use/land-cover, and soil characteristics.

Level I: Climate (Thermal Units and Precipitation)

Climate was chosen as the foundational level of the classification system because temperature and precipitation are essential to plant growth and development. Base spatial layers for climate factors will include extreme monthly cold and hot temperatures, mean monthly maximum and minimum temperatures, mean annual temperature, and mean annual, seasonal, and monthly precipitation. Climate-based indices will be developed to predict forage crop growth and development. These will include solar radiation and photosynthetically active radiation, accumulated thermal units (with various base temperatures), growing season length, and vernalization chilling days. For agricultural systems that include irrigation, a soil water balance model will be applied.
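As one example of these indices, accumulated thermal units (growing degree days) reduce to a short calculation once daily temperatures are in hand; a sketch with synthetic data and illustrative base temperatures:

```python
# Accumulated thermal units (growing degree days) above a base temperature.
import numpy as np

rng = np.random.default_rng(3)
tmax = rng.normal(22, 6, 365)           # synthetic daily max, deg C
tmin = tmax - rng.uniform(5, 12, 365)   # synthetic daily min, deg C

def gdd(tmax, tmin, base=5.0):
    """Sum of daily mean temperature excess over the base temperature."""
    return np.maximum((tmax + tmin) / 2.0 - base, 0.0).sum()

for base in (0.0, 5.0, 10.0):
    print(f"base {base:>4.1f} C: {gdd(tmax, tmin, base):7.1f} degree-days")
```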

Level II: Physiography, Land-use/Land-cover [Topography (DEM), MODIS Images]

The second level of the classification system will involve physiography and land-use and land-cover. A DEM will be used to underlay the climate layers and identify terrain slope, with the following rankings: >60°, not useful; 50°–60°, 30% can be useful for livestock; 40°–50°, 50% can be useful; 30°–40°, can be used as pasture; and <30°, useful as grassland (Zhang, Grassland Management, 2010). Current land-use and land-cover will be characterized from current and historical MODIS satellite images, with particular focus on cropland, pastureland, and rangeland areas.
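The slope ranking above maps naturally onto a raster reclassification; below is a sketch with a synthetic slope array standing in for the DEM-derived slope raster.

```python
# Reclassify a slope raster (degrees) into the Level II usefulness classes.
import numpy as np

slope_deg = np.random.default_rng(4).uniform(0, 70, (100, 100))  # fake raster

# Class codes: 0 = not useful (>60), 1 = 30% useful (50-60),
# 2 = 50% useful (40-50), 3 = pasture (30-40), 4 = grassland (<30)
bins = [30, 40, 50, 60]
classes = 4 - np.digitize(slope_deg, bins)  # steeper slope -> lower class

values, counts = np.unique(classes, return_counts=True)
for v, c in zip(values, counts):
    print(f"class {v}: {c} cells")
```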

Level III: Soil Characteristics (pH, Drainage, Salinity)

Soil characteristics will be the final level of the hierarchy, since, to a large degree, these can be modified in agricultural systems. Spatial data layers will be obtained for soil type, pH, drainage classification, and salinity. Site-specific data will be obtained for more detailed fertility information.

As I started to describe in class, my project will deal with output from the model software ENVISION. ENVISION is a GIS-based tool for scenario-based community and regional planning and environmental assessment. It combines a spatially explicit polygon-based representation of a landscape (IDUs, or Individual Decision Units, in my case), a set of application-defined policies, landscape change models, and models of ecological, social, and economic services to simulate land use change and provide decision-makers, planners, and the public with information about the resulting effects.

The ENVISION project I am involved with is centered on Tillamook County and its coastal communities. Through a series of stakeholder meetings (which have included a range of people, from private landowners to state land use commissioners), our group identified several land use policies to implement in the ENVISION model. The policies were then grouped into three types of management responses: the baseline (or status quo), ReAlign, and two types of Hold the Line (high vs. low management) scenarios. These policy scenarios have been combined with current physical parameters of the coastline, such as dune height and beach width, and will also be linked with climate change conditions at low, medium, and high levels for the next 30, 50, and 100 years.

Since ENVISION is GIS-based already, I am having a tough time coming up with a problem that complements the project in ArcGIS. ENVISION does a great job of visualizing the changes expected for each location along the coast via videos and graphs (see below), and can even include economic estimates.

[Figure: county-wide ENVISION results]

Therefore, it may be best to explore the capabilities of software like R to analyze the output data. One idea would be to calculate the probability of occurrence for these different events and the total number of occurrences. I need to take a deeper look into how these events are calculated to begin with, and determine the inherent estimates of probability and uncertainty. This type of analysis would help determine whether this kind of exercise is beneficial for stakeholders and would help answer their questions about how far to trust the results.
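Though R is the likely tool here, the frequency-style estimate itself is simple; a sketch in Python/pandas, with entirely hypothetical ENVISION output columns and a made-up event threshold:

```python
# Empirical event probability and totals per scenario from tabular output.
import pandas as pd

runs = pd.DataFrame({  # hypothetical stand-in for ENVISION scenario output
    "scenario": ["baseline", "baseline", "ReAlign", "ReAlign"],
    "year": [2035, 2040, 2035, 2040],
    "flooded_structures": [12, 48, 10, 22],
})

threshold = 20  # made-up definition of a "large" flooding event
summary = runs.groupby("scenario")["flooded_structures"].agg(
    total="sum", p_large=lambda s: (s > threshold).mean())
print(summary)
```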

Another idea would be to focus on specific results from ENVISION and try to determine exactly how one policy is affecting the coastline and creating such disparate results. For instance, the graph below shows the numbers of flooded and eroded structures in Pacific City under three types of scenarios. What is causing the large number of eroded/flooded structures between 2035 and 2040? Why is there such a small difference between the ReAlign and Hold the Line strategies if they employ such different options? Some of these questions may be answered with a greater understanding of ENVISION; however, these are the types of questions stakeholders may ask, and it would be prudent to provide more quantitative answers, which ArcGIS or R could help supply.

[Figure: numbers of flooded and eroded structures in Pacific City under three scenarios]

My initial goal was to explore local food production over time near Corvallis, but I am getting ready to change topics because I cannot find enough information on farms in the area to discriminate crop types, either by visual assessment or by ownership. The federal data I could find on crop types did not list information more granular than the county level. Land cover data categorizes farmland as “herbaceous” or “barren” and is not much help. So I attempted visual assessment of orthographic imagery. Here is the Willamette Valley around Corvallis:

[Figure: orthoimagery of the Willamette Valley around Corvallis]

If I zoom in on a parcel, this is the level of detail:

[Figure: a single parcel, 2011 imagery]

Clearly agricultural, but I couldn't tell you what. That was 2011; here is the same land in 2005:

[Figure: the same parcel, 2005 imagery]

Is that supposed to be grass?  What degree of certainty do I have?  Not enough for analysis.

Here is the adjacent parcel:

[Figure: the adjacent parcel]

Clearly two different crop types, but is one hay and the other grass seed?  Don’t ask a city slicker like me.

The second strategy I tried was to determine ownership. Certain farms produce specific types of crops, and other farms have a reputation for selling their food locally. But I could not find the equivalent of the White and Yellow Pages for GIS, or even a shapefile with polygons divided by tax lots. Instead, I tried looking at water rights. Water rights data identify the owner in a set of point data, and also display a polygon associated with each right, showing the area of farmland using that right. I selected only water rights related to agriculture, so municipal and industrial water rights would not show up in the layer. Here is a close-up of water rights data layered on top of the orthographic data:

[Figure: agricultural water rights layered over the orthoimagery]

The water right for the parcel in the center on the right belongs to Donald G Hector for the purpose of irrigation.  An article in the Gazette-Times records the passing of Donald’s wife in 2004 from Alzheimer’s after being married to Donald for 53 years.  Businessprofiles.com lists the farm as currently inactive.  Other than that, I could not find much about Mr. Hector or his farm.

There is a more significant problem with using water rights data to determine farm ownership, which you might intuit from the picture above. Many parcels of land are not associated with water rights. In fact, only around 15% of Oregon's crops are irrigated. Once I zoom out, this becomes obvious:

[Figure: water rights coverage at a wider zoom]

The large blue area at the bottom left is the Greenberry Irrigation District, meaning a utility maintains the local irrigation infrastructure, and taxes farmers individually.

When I was interning at the WSDA, they had enough data to construct a map of the kind of information I want, but they could not publicize it because of privacy concerns, and I think that is the problem I am running into here. I need some NSA-style access.

Or a new spatial problem!


For my spatial problem, I will examine the role of spatial autocorrelation and seasonality in developing a land use regression (LUR) model. In particular, I am interested in how best to incorporate spatial autocorrelation and seasonality when predicting air pollution in the City of Eugene.

For those unfamiliar with LUR, it essentially combines GIS variables that are predictive of air pollution concentrations with actual air pollution measurements in order to predict air pollution at unmonitored locations using ordinary least squares (OLS) regression. The problem with a typical LUR model is that it doesn't account for spatial autocorrelation. Accounting for spatial autocorrelation is valuable because spatially referenced data, such as air pollution measurements, are typically spatially correlated.

This past quarter in my GEO580 course, I developed a LUR that did account for spatial autocorrelation by modeling the covariance of air pollutant concentrations in adjacent zip code boundaries using a spatial CAR (conditional autoregressive) model. For this class, I wish to develop this idea further by using multiple techniques, namely geographically weighted regression (GWR), a spatial CAR model, and OLS, and comparing the model results to actual air pollution measurements. This work will require me to use both the ArcGIS Spatial Analyst toolbox and the R statistical software.
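As a starting point for that comparison, here is a sketch of the OLS piece plus a residual Moran's I check, using statsmodels and PySAL; all of the data, predictors, and the choice of k for the weights are placeholders.

```python
# OLS LUR baseline on fake monitor data, then test residuals for spatial
# autocorrelation with Moran's I.
import numpy as np
import statsmodels.api as sm
from libpysal.weights import KNN
from esda.moran import Moran

rng = np.random.default_rng(5)
coords = rng.uniform(0, 10, (50, 2))   # fake monitor locations
X = rng.normal(size=(50, 3))           # fake GIS predictors
y = X @ [1.0, -0.5, 0.3] + rng.normal(scale=0.5, size=50)

ols = sm.OLS(y, sm.add_constant(X)).fit()
w = KNN.from_array(coords, k=5)        # k is an arbitrary choice here
mi = Moran(ols.resid, w)
print(f"R2 = {ols.rsquared:.2f}, residual Moran's I = {mi.I:.3f} "
      f"(p = {mi.p_sim:.3f})")
```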

As mentioned above, I am interested in including seasonal trends in air pollutant variation in order to see whether accounting for seasonal variation can improve model estimates. To do this, I propose to incorporate seasonal-to-annual ratios of air pollutant concentrations.
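A minimal sketch of computing those seasonal-to-annual ratios from a monthly series (synthetic data; the DJF/MAM/JJA/SON season definitions are just one option):

```python
# Seasonal mean concentration divided by the whole-series (annual) mean.
import numpy as np
import pandas as pd

idx = pd.date_range("2012-01-01", "2013-12-01", freq="MS")
conc = pd.Series(np.random.default_rng(6).gamma(3.0, 4.0, len(idx)), index=idx)

seasons = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
           6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}
season = [seasons[m] for m in conc.index.month]

ratios = conc.groupby(season).mean() / conc.mean()
print(ratios)
```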

To keep this work focused, I will use data on just one air pollutant, as opposed to last quarter, when I developed a LUR for seven different pollutants. By focusing on just one pollutant, I hope to keep the work efficient and effective toward achieving my goals in this class. Ideally, this work will help inform my dissertation proposal work.