The following is the abstract of the paper I presented earlier this month at AAG:
The specific geography of individual wine growing regions has long been understood to be a significant factor in predicting both a region’s success in producing high quality grapes, and the resulting demand for wines produced from that region’s fruit. In the American wine industry, American Viticultural Areas (AVAs) are increasingly being used to designate a uniqueness and specificity of place. This process is often predicated on the argument that these areas represent a certain degree of physiographic uniformity or homogeneity. This is particularly the case with regard to the phenomenon of sub-AVAs, wherein smaller areas within large, spatially heterogeneous AVAs seek to differentiate themselves based on the physiographic features that are purportedly unique to those smaller subregions. In many cases, there is a strong correlation between soil classes and AVA boundaries, whereas in other cases the correlation is not as strong. This suggests that there are factors other than physiographic homogeneity contributing to the designation of these sub-AVAs. This study employs GIS and spatial analysis to examine and potentially correlate the soil classes of Oregon’s northern Willamette Valley with the sub-AVAs in that area. In doing so, this study presents maps and statistical results in order to provide a quantitative summary of the geographic context of vineyards in this region with respect to both the soil classes present and the federally designated AVA boundaries in which they are located.
About my data and my spatial problem:
The data set that I am working with is a legacy Natural Resources Conservation Service (NRCS) data set detailing soil classes throughout Oregon’s Willamette Valley. Using metes and bounds descriptions provided by the United States Department of the Treasury’s Alcohol and Tobacco Tax and Trade Bureau (TTB), the federal entity tasked with approving AVA designation petitions, I have generated a series of polygons representing the Willamette Valley AVAs (Willamette Valley and its 6 sub-AVAs: Chehalem Mountains, Ribbon Ridge, Dundee Hills, Yamhill-Carlton, McMinnville, and Eola-Amity Hills). I also have a handful of raster data layers (slope, aspect, landform, lithology, and PRISM) that I am using to calculate zonal statistics. Many spatial statistical methods are designed around the use of point data; this poses a problem for me because all of my data is in either a vector polygon or raster format. I am interested in exploring which methods/tools within the Spatial Statistics toolbox are most appropriate to use with my data. I am also interested in getting feedback from others in this course so as to make my research more robust, defensible, and statistically sound.
-Doug
Hi Doug,
I also have concerns about using polygons within the Spatial Statistics toolbox, which is geared toward point data. I’m wondering if you’ve tried converting your polygon features to points and then running the statistics again? This might be interesting to do with something like hot spot analysis, to compare the outputs for polygon vs. point features.
Kate
Hi Doug,
With regard to polygons, while you can define relationships based on polygon contiguity, in the end polygon features are represented as points when analyzed by the Spatial Statistics tools.
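To make the "polygons as points" idea concrete, here is a minimal sketch of how one representative point, the area-weighted centroid, can be computed for a simple polygon. It assumes a single ring with no holes, and the function name `polygon_centroid` is made up for illustration; it is not how any particular ArcGIS tool is implemented.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...].

    Assumes one ring without holes or self-intersections (illustrative only).
    """
    ring = list(ring)
    # Close the ring if the first vertex is not repeated at the end.
    if ring[0] != ring[-1]:
        ring.append(ring[0])

    area2 = 0.0  # twice the signed area (shoelace formula)
    cx = 0.0
    cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)


# A 4 x 2 rectangle: its centroid is the middle of the rectangle.
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
centroid = polygon_centroid(rectangle)
```

Keep in mind that for long or crescent-shaped polygons the centroid can fall outside the polygon, which is one reason point-based results can differ from what the polygon map suggests.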
Some comments (in no particular order)…
1) One way you might analyze wine growing areas is to model something like the amount of wine produced or value of wine produced per unit area. This would be regression analysis and my guess is that it would be very difficult to find a properly specified model… but it is definitely worth a try if you have lots of potential explanatory variables. Include as many spatial variables as you can (like distance to …??) and use the Exploratory Regression tool in ArcGIS 10.1. You indicate you have polygons, rasters, and possibly points. For regression analysis you will need to get all of your variables into a single feature class with an identical geometry… suggestions next.
2) For cases where you need to get your environmental variables into a single geometry, consider defining/identifying the geometry that is most relevant to the question you are asking (1 mile fishnet grid polygons, soil polygons, census tracts <– not likely, or ??). Then for polygon data use Areal Interpolation (ArcGIS 10.1) to convert the polygon data to a surface and then to the ideal polygon geometry. Use zonal stats or sampling to convert the raster data to the ideal geometry.
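The zonal stats step boils down to summarizing raster cells by the zone they fall in. A minimal sketch of a zonal mean in plain Python, assuming the raster and the zone layer have already been aligned to the same grid; the function name `zonal_mean` and the tiny arrays are hypothetical, for illustration only:

```python
from collections import defaultdict


def zonal_mean(values, zones, nodata=None):
    """Mean raster value per zone.

    values and zones are equally sized 2-D lists (rows of cells); a zone of
    None means the cell lies outside every polygon. Illustrative only.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            # Skip cells outside all zones and cells flagged as nodata.
            if zone is None or value == nodata:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Made-up slope values (percent) and the polygon ID covering each cell.
slope = [[2, 4, 10],
         [6, 8, 12]]
zone_ids = [["A", "A", "B"],
            ["A", "B", "B"]]
mean_slope = zonal_mean(slope, zone_ids)
```

Each zone then carries one number per raster layer, which is the "single geometry" table that regression or grouping needs.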
3) If you don't find a properly specified model, another interesting analysis might be Grouping Analysis (ArcGIS 10.1). This multivariate tool can help you classify wine regions based on a number of attributes. The tool works best with numeric attributes (rather than nominal data like soil type) … however you can include nominal data by creating dummy variables or (better) by creating a rank for each soil type based on wine productivity (the higher the rank, the better that soil type is for growing wine).
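The two ways of turning a nominal soil type into numbers can be sketched as follows. The soil names and rank values are placeholders invented for illustration; a real ranking would have to come from someone who knows which soils suit wine grapes.

```python
def dummy_encode(soil_types):
    """One 0/1 column per soil type (dummy variables)."""
    categories = sorted(set(soil_types))
    return [
        {f"soil_{category}": int(soil == category) for category in categories}
        for soil in soil_types
    ]


def rank_encode(soil_types, rank_by_soil):
    """Replace each soil type with a single numeric rank."""
    return [rank_by_soil[soil] for soil in soil_types]


# Hypothetical soil labels for four polygons.
soils = ["a", "b", "a", "c"]

dummies = dummy_encode(soils)

# Hypothetical ranks: higher means better suited, per the comment above.
ranks = rank_encode(soils, {"a": 3, "b": 2, "c": 1})
```

Dummy variables add one column per soil type, so the variable count grows quickly; the rank keeps a single column but builds in a judgment about ordering.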
4) At present we don't have an exploratory grouping analysis tool that helps you find the very BEST variable combination … sooooo finding the right combination of variables corresponding to the ideal number of groups is a trial-and-error endeavor. Some strategies:
* use the Variable Significance information from Exploratory Regression (dependent variable would be amount of wine or value of wine produced) to identify the key variables to consider for grouping analysis.
* start with a few variables and add… look at the R2 values to determine which variables are most effective at differentiating areas (where the areas are the ideal polygons you've created).
* ask the tool to tell you the optimal number of groups (check ON the last parameter).
* you will likely want spatially constrained groups ?? … I like the K nearest neighbors method… play with different numbers of neighbors (the default is 8, but you can relax the contiguity constraint by increasing this value).
* If you run in the foreground (Geoprocessing Options … uncheck Enable Background Processing) you get good information while the tool is running (you may decide not to create a report for every run during the trial and error part … the report takes a bit of time to create).
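For a feel of what a per-variable R2 says about a grouping, here is a minimal sketch using a common definition: the share of a variable's total variation that is removed once each value is compared with its own group mean instead of the overall mean. This is an assumption about the kind of measure meant here, not a reproduction of the tool's internals; the function name `grouping_r2` and the sample numbers are invented.

```python
def grouping_r2(values, groups):
    """(TSS - WSS) / TSS for one variable under a given grouping.

    TSS: squared deviations from the overall mean.
    WSS: squared deviations from each value's own group mean.
    """
    overall_mean = sum(values) / len(values)
    tss = sum((v - overall_mean) ** 2 for v in values)
    if tss == 0:
        return 0.0  # a constant variable cannot differentiate groups

    members_by_group = {}
    for value, group in zip(values, groups):
        members_by_group.setdefault(group, []).append(value)

    wss = 0.0
    for members in members_by_group.values():
        group_mean = sum(members) / len(members)
        wss += sum((v - group_mean) ** 2 for v in members)

    return (tss - wss) / tss


# Made-up values for six polygons assigned to two groups.
r2 = grouping_r2([1, 2, 3, 11, 12, 13], [1, 1, 1, 2, 2, 2])
```

A value near 1 means the variable separates the groups cleanly; a value near 0 means the groups look alike on that variable.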
5) Okay, so I really like wine, but don't know even the first thing about wine growing… please forgive my ignorance here… If different areas focus on specific types of wine, a very simple analysis would be to create standard deviational ellipses for several wine varieties… use a weight field reflecting the quantity (not dollar amount)… this will show you the spatial distributions of each variety and where they overlap. This would answer questions like: are there discernible wine variety regions? Where do the varieties overlap and where are varieties spatially exclusive?
6) Just thought of another idea: quantifying the "biodiversity" of wine. If you can overlay your study area with a fishnet polygon grid and can identify for each grid cell the number (count) of different varieties of wine grown there, you can create a diversity map (hot spot analysis)… some places will produce lots of different varieties … others will produce very few… it would be interesting to compare diversity to amount produced or value of wine produced ??
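The counting step behind such a diversity map can be sketched in a few lines: assign each vineyard record to a square grid cell and count the distinct varieties per cell. The record layout, the function name `variety_richness`, and the sample coordinates are all hypothetical; a real workflow would use the fishnet polygons and a spatial join.

```python
import math
from collections import defaultdict


def variety_richness(records, cell_size):
    """Count distinct varieties per square grid cell.

    records: iterable of (x, y, variety) in projected map units.
    Returns {(column, row): number of distinct varieties}.
    """
    varieties_by_cell = defaultdict(set)
    for x, y, variety in records:
        # Integer column/row of the cell containing the point.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        varieties_by_cell[cell].add(variety)
    return {cell: len(names) for cell, names in varieties_by_cell.items()}


# Made-up vineyard locations, using a cell size of 1 map unit.
vineyards = [
    (0.5, 0.5, "Pinot noir"),
    (0.7, 0.2, "Chardonnay"),
    (0.9, 0.9, "Pinot noir"),
    (1.5, 0.5, "Pinot gris"),
]
richness = variety_richness(vineyards, cell_size=1.0)
```

The resulting count per cell is the numeric field that a hot spot analysis would then be run on.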
I hope this is helpful!
Best wishes!
Lauren