- Question that you asked
In analyzing the distribution of settlement boundaries that I obtained from OpenStreetMap, I wanted to know the general clustering and regionality of settlements in order to understand how other spatial statistics that I perform in my next step will behave based on the results from this explanatory step. The question I’m introducing for Exercise A centers around how similar or different nearby settlements are to each other: is there a regionality in settlement sizes? Some future questions that I’m considering are whether or not clustered settlements have higher detection in the World Settlement Footprint classification (which I’m using instead of GHSL because it’s a more localized classification). This would be determined using the percent of OSM settlement area that was detected by WSF.
- Name of the tool or approach that you used.
I decided to use spatial autocorrelation to answer this question, since the essence of my question centers around what the pattern of my data looks like, how clustered or not clustered are these settlements, and how that might affect my future analysis and considerations.
For this, I employed the Spatial Autocorrelation tool in ArcGIS Pro, which uses Global Moran I’s algorithm.
- Brief description of steps you followed to complete the analysis.
In order to identify the refugee settlements, I went through multiple rounds of querying with OpenStreetMap to extract the boundaries that are directly related to refugee camp boundaries. Hannah Friedrich and I have worked together on defining some of these boundaries, and she took some time to delineate boundaries for a separate study of hers. While I will incorporate these delineated ‘regions of interest,’ I will most likely not move forward with them in further analyses because it would present an issue with scaling up and manual interpretation. Below I list the three polygon data I’m analyzing here.
OSM Boundaries: This is a result of multiple query efforts within Overpass Turbo, a online server for downloading OpenStreetMap data. Various queries on attribute tags were performed in order to select relevant refugee boundary data.
Regions of Interest: This is a result from Hannah’s efforts of limiting areas within the OSM Boundaries that appear to be built up using high resolution imagery available on Google Satellite.
Selected OSM Boundaries: This is a selection from OSM Boundaries that merges polygons of the same settlement into multipolygons; this also a selection of boundary polygons that are less than 2000 hectares to account for some boundaries that are designated versus actually settled.
I then ingested these boundaries into Google Earth Engine, extracted the pixel areas for the WSF pixels within the three different boundary layers, and re-exported these.
I then imported these layers into ArcGIS Pro and performed the Spatial Autocorrelation tool and experimented with multiple parameters.
I ultimately chose the “row standardization” option given the possible bias in ‘sampling’ design, since this creation of this data is most often from the HOT Uganda team and resources might limit where they can travel and collect data.
- Brief description of results you obtained.
Ultimately, my results showed clustering within the OSM-identified settlements, which is to be expected. There are a variety of questions that arise from this that might confound my analysis that are still moderately troubling: the method of data recording, the spotlight effect, the presence of organizations able to record data, and the inherent clustering based on human behavior, or drivers such as conflict or environmental conditions. This initial analysis is really to understand the degree of clustering that I might expect in my future analyses – if there is high clustering, then I need to understand that clustering when I’m testing additional questions with my next exercise, like nearest road or nearest urban area. The images below demonstrates the result of an analysis settlement area in the “Selected OSM Boundaries,” “OSM Boundaries,” and “Regions of Interest.”
While OSM Boundaries and Selection from OSM Boundaries show high confidence of clustering, the “Regions of Interest” pattern shows a less confident result of clustering. This makes sense, given that both the OSM Selection and OSM Boundaries both have overlapping polygons and varying sizes.
Below is a map illustrating the distribution of refugee areas in northwest Uganda, where most of the settlements are concentrated.
- Critique of the method – what was useful, what was not?
Given my shortcomings with understanding statistical analyses, the interpretation of the results was most difficult for me. While my patterns appeared mostly clustered, the z-score and Moran’s Index showed changes when different parameters of comparison were chosen. Ultimately, I’m not sure if there would be a more effective spatial analysis to analyze aspects of these settlements that would prove more helpful in figuring out relevant information for moving forward with Exercise 2 and beyond.
Anna, this is a good step forward. Your Moran’s I analysis is of limited value when you have few polygons, as in the case of the green (and maybe blue) polygons, where you can visually assess that these few polygons are clustered. It is of more interest for the purple polygons, because there are many of them; here the Moran’s I analysis confirms what you can assess visually, which is that there are lots of little purple dots close together. Going forward for Exercises 2 and 3, it would be helpful to use a dataset with a larger number of points (i.e., the blue or purple datasets). Alternatively if you use the green polygons you would need to attribute them with population density so that there is some kind of spatial pattern to analyze. Thus, in Exercise 2, you would ask how the spatial pattern of people in refugee camps is related to the spatial distribution of surrounding resources (e.g., wood from forests) and/or the access to aid (i.e., proximity to roads). I look forward to seeing your next steps.