1 Research Question

Counties reacted differently to the Great Recession (officially December 2007 to June 2009). Economic resilience is defined as the regional capacity “to absorb and resist shocks as well as to recover from them” (Han and Goetz, 2015). Economic resilience has two dimensions, resistance and recovery. This research focuses on resistance.

The research question is: what factors are associated with resistance to the 2008 recession at the county level in the U.S.? The variable drop, developed by Han and Goetz (2015), is used to measure resistance (strictly speaking, vulnerability). It is the amount of impulse that a county experiences from a shock, that is, the percentage deviation of actual employment from expected employment during the Great Recession.


Figure 1 Regional economic change from a major shock and concepts of drop and rebound

dropequation
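As a rough illustration of the definition above, the sketch below computes a drop-like measure from a monthly employment series. The linear pre-recession trend, the function name `compute_drop` and the synthetic series are assumptions made for illustration only; Han and Goetz (2015) give the exact specification.

```python
import numpy as np

def compute_drop(employment, shock_index):
    """Percentage deviation of actual from expected employment at the trough.

    Assumption: expected employment is a linear trend fitted to the
    pre-shock months and projected forward (an illustrative choice).
    """
    employment = np.asarray(employment, dtype=float)
    months = np.arange(len(employment))
    # Fit a straight line to the pre-shock period only.
    slope, intercept = np.polyfit(months[:shock_index], employment[:shock_index], 1)
    expected = intercept + slope * months
    # Trough: lowest actual employment after the shock begins.
    trough = shock_index + int(np.argmin(employment[shock_index:]))
    return (expected[trough] - employment[trough]) / expected[trough]

# Synthetic county: steady growth of 1 job per month, then a decline.
pre = 100 + np.arange(24)                 # months 0..23
post = pre[-1] - np.linspace(5, 20, 12)   # months 24..35, falling
series = np.concatenate([pre, post])
drop = compute_drop(series, shock_index=24)
print(round(drop, 3))
```

A positive value means actual employment fell below the projected path, which matches the reading of drop as employment loss.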

Rising income inequality is considered one structural cause of the crisis (Rajan, 2012). Figure 2 shows a rising trend in income inequality. Rising pre-recession income inequality is therefore hypothesized to increase drop (reciprocal of resistance).


Figure 2 Income inequality in the U.S., 1967-2014 Source: Data from U.S. Census Bureau 2014a, 2014b.

In 2013 there are two values for four of the income inequality measures and for the income distribution indicator, because a portion of the 2014 CPS ASEC sample, about 30,000 of the 98,000 addresses, received redesigned questions on income and health insurance coverage. The values based on this portion are denoted 2013a. The remaining 68,000 addresses received income questions similar to those used in the 2013 CPS ASEC; the values based on this portion are labeled 2013b (U.S. Census Bureau, 2016a, 2016b).

Spatial autocorrelation is hypothesized to play a role in the relationship. The effect of income inequality or drop may not be confined to a single county but may attenuate with distance. Drop in a county might be affected by its own characteristics as well as by those of the surrounding counties, which motivates the spatial lag and spatial error models.
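For reference, the two specifications named above are usually written as follows. Here $y$ is drop, $X$ holds the county characteristics, $W$ is a spatial weights matrix (typically row-standardized), and $\rho$ and $\lambda$ are spatial autoregressive parameters. The notation is the standard textbook one and is not taken from the original analysis.

```latex
\begin{align}
  \text{Spatial lag model:}   \quad & y = \rho W y + X\beta + \varepsilon \\
  \text{Spatial error model:} \quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon
\end{align}
```

In the lag model a county's drop depends directly on its neighbors' drop; in the error model the spatial dependence sits in the unobserved disturbance.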

 

2 Description of the dataset:

Income inequality: the Gini coefficient

Income distribution: poverty rate and the share of aggregate income held by households earning $200,000 or more

Control variables: population growth rate from 2001-2005, % Black or African American (2000), % Hispanic or Latino (2000)

Capital Stock variables:

Human Capital: % population with Bachelor’s degree or higher, age group (20-29, 30-49), female civilian labor force participation rate (2000)

Natural Capital: natural amenity scale (1999)

Social Capital: social capital index (2005)

Built Capital(2000): median housing value (2000)

Financial Capital (2000): share of dividends, interest and rent(2000)

Economic structure:

Employment shares of 20 two-digit NAICS industries (manufacturing, construction, etc.); other services (except public administration) is the omitted category

 

Table 1 Summary Statistics
| Variable | Obs | Mean | Std. Dev. | Min | Max | Description | Data source |
| Drop, rebound and resilience | | | | | | | |
| drop | 2,839 | 0.186 | 0.116 | -0.236 | 0.895 | | Wu estimates based on 2003-2014 BLS QCEW |
| Income distribution | | | | | | | |
| Gini coefficient | 3,057 | 0.434 | 0.037 | 0.333 | 0.605 | | Census 2000 P052 |
| Poverty rate | 3,055 | 0.142 | 0.065 | 0.000 | 0.569 | | Census 2000 P087 |
| % Aggregate income held by HH earning 200K or more | 3,141 | 0.091 | 0.050 | 0.000 | 0.456 | | Census 2000 P054 |
| Control variables | | | | | | | |
| Population growth rate 2001-2005 | 3,055 | 0.020 | 0.056 | -0.203 | 0.428 | | BEA 2001-2005 CA5N |
| % Black or African American | 3,055 | 0.087 | 0.145 | 0.000 | 0.865 | | Census 2000 SF1 QT-P3 |
| % Hispanic or Latino | 3,055 | 0.063 | 0.121 | 0.001 | 0.975 | | |
| Capital stocks | | | | | | | |
| % persons with Bachelor’s degree or higher | 3,055 | 0.164 | 0.076 | 0.049 | 0.605 | Human capital | Census 2000 SF3 P037 |
| % Total: 20 to 29 years | 3,055 | 0.118 | 0.033 | 0.030 | 0.346 | | Census 2000 P012 |
| % Total: 30 to 49 years | 3,055 | 0.288 | 0.026 | 0.158 | 0.420 | | |
| % Female civilian labor force participation | 3,055 | 0.547 | 0.065 | 0.266 | 0.809 | | Census 2000 SF3 QT-P24 |
| Natural amenity scale | 3,055 | 0.056 | 2.296 | -6.400 | 11.170 | Natural capital | 1999 USDA-ERS Natural Amenity Index |
| Social capital index 2005 | 3,055 | -0.004 | 1.390 | -3.904 | 14.379 | Social capital | Rupasingha, Goetz and Freshwater, 2006 |
| Median value, all owner-occupied housing units | 3,055 | 80495.190 | 41744.080 | 12500 | 583500 | Built capital | Census 2000 H085 |
| Share of dividends, interest and rent | 3,110 | 0.188 | 0.053 | 0.059 | 0.561 | Financial capital | 2000 BEA CA5 |
| Economic structure | | | | | | | |
| Farm employment | 3,107 | 0.090 | 0.087 | 0 | 0.627 | | BEA 2001 CA25N |
| Forestry, fishing, and related activities | 3,107 | 0.006 | 0.016 | 0 | 0.232 | | |
| Mining, quarrying, and oil and gas extraction | 3,107 | 0.010 | 0.033 | 0 | 0.839 | | |
| Utilities | 3,107 | 0.002 | 0.006 | 0 | 0.200 | | |
| Construction | 3,107 | 0.058 | 0.032 | 0 | 0.260 | | |
| Manufacturing | 3,107 | 0.109 | 0.089 | 0 | 0.558 | | |
| Wholesale trade | 3,107 | 0.023 | 0.019 | 0 | 0.239 | | |
| Retail trade | 3,107 | 0.109 | 0.030 | 0 | 0.372 | | |
| Transportation and warehousing | 3,107 | 0.021 | 0.023 | 0 | 0.262 | | |
| Information | 3,107 | 0.011 | 0.010 | 0 | 0.137 | | |
| Finance and insurance | 3,107 | 0.027 | 0.017 | 0 | 0.201 | | |
| Real estate and rental and leasing | 3,107 | 0.022 | 0.016 | 0 | 0.135 | | |
| Professional, scientific, and technical services | 3,107 | 0.025 | 0.028 | 0 | 0.828 | | |
| Management of companies and enterprises | 3,107 | 0.003 | 0.006 | 0 | 0.147 | | |
| Administrative and support and waste management and remediation services | 3,107 | 0.026 | 0.024 | 0 | 0.192 | | |
| Educational services | 3,107 | 0.006 | 0.010 | 0 | 0.112 | | |
| Health care and social assistance | 3,107 | 0.052 | 0.048 | 0 | 0.305 | | |
| Arts, entertainment, and recreation | 3,107 | 0.012 | 0.018 | 0 | 0.726 | | |
| Accommodation and food services | 3,107 | 0.048 | 0.038 | 0 | 0.386 | | |
| Other services (except public administration) | 3,107 | 0.054 | 0.020 | 0 | 0.162 | | |
| Government and government enterprises | 3,107 | 0.170 | 0.073 | 0.026 | 0.888 | | |
Note: ACS- American Community Survey, BEA- Bureau of Economic Analysis, BLS- Bureau of Labor Statistics, ERS-Economic Research Service, QCEW-Quarterly Census of Employment and Wages.

Analysis units: U.S. counties

3 Hypotheses

  a) Pre-recession income inequality and other demographic, economic and industrial factors affect drop at the county level. Drop (reciprocal of resistance) is positively related to income inequality.
  b) Drop (reciprocal of resistance) is spatially autocorrelated.
  c) Drop (reciprocal of resistance) is related to the characteristics of a county and to the characteristics (here, drop) of neighboring counties.

4 Approaches

  a) Drop (reciprocal of resistance) is positively related to income inequality and related to other characteristics of a county.

Ordinary least squares regression

  b) Drop (reciprocal of resistance) is spatially autocorrelated.

Global Moran’s I and Anselin Local Moran’s I

  c) Drop (reciprocal of resistance) depends on drop in neighboring counties.

Spatial lag model in GeoDa

5 Results

a) OLS regression

From the table below, we can see that the Gini coefficient, the poverty rate, and the share of aggregate income held by households earning $200,000 or more are not significant predictors of drop. Pre-recession income inequality and income distribution are therefore not significantly associated with drop (reciprocal of resistance) once the other variables are controlled for.

Several other variables are significant, for example population growth rate from 2001-2005, % Black or African American, % Hispanic or Latino, % population with Bachelor’s degree or higher, natural amenity scale, and the employment shares of industries such as manufacturing, wholesale trade, retail trade, transportation and warehousing, information, finance and insurance, educational services, health care and social assistance, accommodation and food services, and government and government enterprises.

Table 2 Full model of drop  
Drop Full Model
Gini coefficient in 2000 -0.322
  (-1.60)
Poverty rate 0.153
  (1.69)
% Aggregate income held by HH earning  200K or more 0.166
  (1.48)
Population growth rate 2001 -2005 0.205***
  (3.61)
%Black or African American 0.0542*
  (2.33)
%Hispanic or Latino -0.113***
  (-5.16)
% persons with Bachelor’s degree or higher -0.132*
  (-2.09)
%Total: 20 to 29 years -0.0218
  (-0.21)
%Total: 30 to 49 years -0.0731
  (-0.58)
%Female Civilian Labor Force Participation -0.0260
  (-0.37)
Natural amenity scale 0.00728***
  (6.13)
Social capital index 2005 -0.00386
  (-1.32)
Median value  All owner-occupied housing units -4.13e-08
  (-0.47)
Share of dividend, interest and rent 0.0632
  (0.99)
Farm employment -0.0183
  (-0.29)
Forestry, fishing, and related activities 0.163
  (0.99)
Mining, quarrying, and oil and gas extraction -0.0548
  (-0.49)
Utilities 0.0237
  (0.06)
Construction 0.0631
  (0.64)
Manufacturing -0.112*
  (-2.32)
Wholesale trade -0.580***
  (-4.58)
Retail trade -0.620***
  (-5.99)
Transportation and warehousing -0.337***
  (-3.47)
Information -0.590**
  (-2.62)
Finance and insurance -0.710***
  (-5.04)
Real estate and rental and leasing 0.203
  (0.87)
Professional, scientific, and technical services 0.0355
  (0.15)
Management of companies and enterprises -0.207
  (-0.53)
Administrative and support and waste management and remediation services -0.0719
(-0.59)
Educational services -0.894***
  (-4.85)
Health care and social assistance -0.215***
  (-4.26)
Arts, entertainment, and recreation -0.00162
  (-0.01)
Accommodation and food services -0.244**
  (-2.92)
Government and government enterprises -0.231***
  (-4.12)
_cons 0.527***
  (5.19)
N 2771
adj. R2 0.199
Log likelihood 2383.06
AICc -4696.13
Schwarz criterion -4488.7
Jarque-Bera *** (non-normality)
Breusch-Pagan test *** (non-stationarity)
White test *** (heteroscedasticity)
For weight Matrix Queen Contiguity 1st order  
Moran’s I (error) 10.5178***
Lagrange Multiplier (lag) 506.9370***
Robust LM (lag) 19.9202***
Lagrange Multiplier (error) 1.#INF ***
Robust LM (error) 1.#INF ***
Lagrange Multiplier (SARMA) 1.#INF ***
For weight Matrix Queen Contiguity 1st and 2nd order  
Moran’s I (error) 7.9411***
Lagrange Multiplier (lag) 569.0604***
Robust LM (lag) 18.1515***
Lagrange Multiplier (error) 1.#INF ***
Robust LM (error) 1.#INF ***
Lagrange Multiplier (SARMA) 1.#INF ***

Significant results for the Jarque-Bera, Breusch-Pagan and White tests indicate non-normality, non-stationarity and heteroscedasticity. The value of Moran’s I shows that there is spatial autocorrelation in the residuals. Five Lagrange Multiplier test statistics are also reported in the diagnostic output. They differ along two lines. First, each plain Lagrange Multiplier (LM) test checks for one form of spatial dependence, while its Robust LM counterpart checks for that form in the possible presence of the other. Second, the tests refer to either lag or error dependence: lag refers to a spatially lagged dependent variable, whereas error refers to a spatial autoregressive process for the error term. The first two (LM-Lag and Robust LM-Lag) pertain to the spatial lag model as the alternative. The next two (LM-Error and Robust LM-Error) refer to the spatial error model as the alternative. In my results, LM-Lag and LM-Error are both significant, while Robust LM-Lag is less significant (p = 0.00002 with 1st order queen contiguity, 0.00001 with 1st and 2nd order) than Robust LM-Error (p = 0.00000). The last test, LM-SARMA, relates to the higher order alternative of a model with both spatial lag and spatial error terms. It is included only for the sake of completeness, since it is not that useful in practice.

According to Anselin (2005), I should use the spatial error model, but I chose the spatial lag model. The relevant guidance reads: “In the rare instance that both would be highly significant, go with the model with the largest value for the test statistic. However, in this situation, some caution is needed, since there may be other sources of misspecification. One obvious action to take is to consider the results for different spatial weights and/or to change the basic (i.e., not the spatial part) specification of the model.”

b) Spatial autocorrelation

Both the global Moran’s I and Anselin Local Moran’s I show that drop and the Gini coefficient are spatially autocorrelated.

Table 3 Spatial Autocorrelation
| Variable | Global Moran’s Index (1) | z-score | Pattern |
| The Gini coefficient 2000 | 0.4673 (2) | 91.45 | clustered |
| drop | 0.1253 | 23.82 | clustered |
Note: (1) Only the continental U.S. counties are used to calculate Global Moran’s I and Anselin Local Moran’s I. All the options are left as default. 2,840 counties are used for drop and resilience. Conceptualization of Spatial Relationships: Inverse distance. Standardization: None.
(2) This differs from my first calculation, which gave 0.453298 with a z-score of 88.71. Another copy of the same shapefile gives 0.44 with a z-score of 83.79.


Figure 3 and 4 Local Moran’s I of drop and the Gini coefficient 2000.
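To make the global statistic concrete, the sketch below computes Global Moran’s I with inverse-distance weights and no standardization, the settings listed in the note to Table 3. The function name `global_morans_i` and the toy 6x6 grid are assumptions for illustration; this is not the ArcGIS implementation and it does not compute the z-score.

```python
import numpy as np

def global_morans_i(values, coords):
    """Global Moran's I with inverse-distance weights, no row standardization."""
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    n = len(x)
    # Pairwise Euclidean distances between points.
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
    w = np.zeros_like(d)
    off_diag = ~np.eye(n, dtype=bool)
    w[off_diag] = 1.0 / d[off_diag]   # assumes no two points coincide
    z = x - x.mean()
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

# Toy 6x6 grid: a smooth west-east gradient versus a checkerboard.
gx, gy = np.meshgrid(np.arange(6), np.arange(6))
coords = np.column_stack([gx.ravel(), gy.ravel()])
clustered = gx.ravel().astype(float)
checker = ((gx + gy) % 2).ravel().astype(float)
i_clustered = global_morans_i(clustered, coords)
i_checker = global_morans_i(checker, coords)
print(i_clustered, i_checker)
```

The gradient surface gives a positive index (similar values sit near each other), while the checkerboard gives a negative one (neighbors are dissimilar).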

c) Spatial lag model

To deal with the spatial autocorrelation, I created spatial weights using queen contiguity. Contiguity-based weights are based on shared borders (or vertices) instead of distance. The queen criterion defines neighbors as units that have any point in common, including both common boundaries and common corners. Therefore, the number of neighbors for any given unit under the queen criterion is equal to or greater than that under the rook criterion, which excludes corner neighbors, i.e. units that do not share a full boundary segment.
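The difference between the two criteria is easy to see on a regular grid. The sketch below is a stand-in for county polygons (real counties would need a GIS library to detect shared borders); the function name `grid_neighbors` is hypothetical.

```python
def grid_neighbors(n_rows, n_cols, criterion="queen"):
    """Map each cell (row, col) of a regular grid to its contiguous neighbors."""
    rook_steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]       # shared edges
    corner_steps = [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # shared corners only
    steps = rook_steps + corner_steps if criterion == "queen" else rook_steps
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors[(r, c)] = [
                (r + dr, c + dc)
                for dr, dc in steps
                if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
            ]
    return neighbors

queen = grid_neighbors(3, 3, "queen")
rook = grid_neighbors(3, 3, "rook")
print(len(queen[(1, 1)]), len(rook[(1, 1)]))  # centre cell: 8 vs 4
```

Every cell has at least as many queen neighbors as rook neighbors, which is the property stated above.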

I tried 3 orders of queen contiguity.

queen contiguity

A county’s drop is affected by drop in neighboring counties: spatially weighted drop is significant in all three models. The coefficients are all positive, indicating that drop in a county is positively related to drop in its neighboring counties. Hence a county surrounded by high-drop counties is likely to be high in drop, while a county surrounded by low-drop counties is likely to be low in drop.

Moreover, the coefficient on weighted drop increases with the contiguity order. My explanation is that when the drop of a county is related to more counties in its neighborhood, the region as a whole tends to be vulnerable or robust, so the coefficient is larger.

| Model | Log likelihood | AICc | Schwarz criterion | W_drop |
| OLS | 2383.06 | -4696.13 | -4488.7 | |
| Spatial lag 1 (1st queen contiguity order) | 2416.53 | -4761.07 | -4547.71 | 0.215*** |
| Spatial lag 2 (2nd queen contiguity order) | 2426.12 | -4780.24 | -4566.88 | 0.356*** |
| Spatial lag 3 (3rd queen contiguity order) | 2425.8 | -4779.59 | -4566.24 | 0.436*** |

 

 

The log likelihood increases from 2383.06 (OLS) to 2416.53 (spatial lag with 1st order neighbors), 2426.12 (spatial lag with 1st and 2nd order neighbors) and 2425.8 (spatial lag with 1st, 2nd and 3rd order neighbors). Even after penalizing the improved fit for the added variable (the spatially lagged dependent variable), the AIC (from -4696.13 to -4761.07/-4780.24/-4779.59) and SC (from -4488.7 to -4547.71/-4566.88/-4566.24) both decrease relative to OLS, again suggesting an improvement in fit for the spatial lag specifications.

Among the three spatial lag models, the one with 1st and 2nd order neighbors has the highest log likelihood and the lowest AICc and Schwarz criterion, so spatially lagged drop over 1st and 2nd order neighbors fits the data best.
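The information criteria can be checked by hand from the log likelihoods. The sketch below assumes the usual definitions AIC = -2LL + 2k and SC = -2LL + k ln(n), with k = 35 for OLS (the 34 regressors in Table 2 plus the constant), one extra parameter for the spatial lag, and n = 2771 for both models; these parameter counts are my reading of the tables, not values reported by GeoDa.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion."""
    return -2.0 * log_likelihood + 2.0 * k

def schwarz(log_likelihood, k, n):
    """Schwarz (Bayesian) information criterion."""
    return -2.0 * log_likelihood + k * math.log(n)

N = 2771            # observations reported for the full model
K_OLS = 35          # assumption: 34 regressors plus the constant
K_LAG = K_OLS + 1   # assumption: adds the spatially lagged drop

ols_aic = aic(2383.06, K_OLS)
ols_sc = schwarz(2383.06, K_OLS, N)
lag1_aic = aic(2416.53, K_LAG)
lag1_sc = schwarz(2416.53, K_LAG, N)
print(round(ols_aic, 2), round(ols_sc, 2), round(lag1_aic, 2), round(lag1_sc, 2))
```

Under these assumptions the results agree with the reported values to within rounding of the log likelihood, which suggests the column labeled AICc follows the plain AIC formula.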

A limited number of diagnostics are provided with the ML estimation of spatial lag model 2 (using queen contiguity of 1st and 2nd order). The first is a Breusch-Pagan test for heteroscedasticity in the error terms. The highly significant value of 587.2494 suggests that heteroscedasticity is still a serious problem.

The second test is an alternative to the asymptotic significance test on the spatial autoregressive coefficient; it is not a test on remaining spatial autocorrelation. The Likelihood Ratio Test is one of the three classic specification tests comparing the null model (the classic regression specification) to the alternative spatial lag model. The value of 86.1133 confirms the strong significance of the spatial autoregressive coefficient.

The three classic tests are asymptotically equivalent, but in finite samples they should follow the ordering W > LR > LM. In my example, the Wald test is 9.63145^2 = 92.8 (rounded), the LR test is 86.1133 and the LM-Lag test is 506.9370 with 1st order weights (569.0604 with 1st and 2nd order weights), which is not compatible with the expected order. This probably suggests that other sources of misspecification may invalidate the asymptotic properties of the ML estimates and test statistics. Given the rather poor fit of the model to begin with, the high degree of non-normality and the strong heteroscedasticity, this is not surprising. Further consideration is needed of alternative specifications, either including new explanatory variables or incorporating different spatial weights.
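The arithmetic behind this comparison can be reproduced from the reported figures. The sketch below recomputes the Wald statistic as the squared z-value and the LR statistic as twice the gain in log likelihood, then checks the ordering; the variable names are illustrative, and the LM value is the one quoted in the text.

```python
ll_ols = 2383.06     # log likelihood, OLS
ll_lag2 = 2426.12    # log likelihood, spatial lag with 1st and 2nd order neighbors
z_rho = 9.63145      # z-value of the spatial autoregressive coefficient
lm_lag = 506.9370    # LM-Lag statistic as quoted in the text

wald = z_rho ** 2
lr = 2.0 * (ll_lag2 - ll_ols)
expected_order = wald > lr > lm_lag   # W > LR > LM

print(round(wald, 2), round(lr, 2), expected_order)
```

The LR value recomputed from the rounded log likelihoods matches the reported 86.1133 to two decimals, and the ordering check fails because the LM statistic is far larger than the other two.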

 

6 Significance

Understanding what factors affect county-level resistance could offer insights for local policies that promote regional resistance and help prevent future downturns.

My results show that pre-recession income inequality is not a significant factor in explaining drop. Drop, which refers to the percentage of employment loss during the Great Recession, is significantly explained by pre-recession population growth rate, race and ethnicity, educational attainment, natural amenity scale and the employment shares of several industries. The full model of drop shows that counties with a lower population growth rate from 2001 to 2005, a lower percentage of Black or African American population, a higher percentage of Hispanic or Latino population, a higher percentage of people with a Bachelor’s degree or higher, a lower natural amenity scale, and higher employment shares in manufacturing, wholesale trade, retail trade, transportation and warehousing, information, finance and insurance, educational services, health care and social assistance, accommodation and food services, and government and government enterprises were more resistant and experienced a smaller employment deviation from the expected growth path during the Great Recession.

Further investigation is needed into industries such as manufacturing, retail trade, information, finance and insurance, educational services, and health care and social assistance, which are associated with lower drop.

7 Learning

I learned principal components analysis in R, which reduces the number of explanatory variables, but I find the components hard to interpret in the GWR results. Moreover, my GWR still shows a low local R-squared (fewer than 20 counties have a local R-squared higher than 0.3), so there might be misspecification in my original model.

I also learned the spatial lag and spatial error models in GeoDa. I estimated spatial error models as well, but the results are not shown here. The spatial lag model with 1st and 2nd order queen contiguity yields a higher log likelihood than any of the spatial error models. However, according to my earlier LM tests for lag and error, the spatial error test is more significant, so the spatial error model should have given the better result. This needs more investigation. I find the tools very useful. I am curious to find a way to test for a spatial lag in the income inequality of neighboring counties.

Moreover, I learned a lot from the tutorials and projects from my classmates. They are wonderful!

 

8 Statistics

Hotspot:

Pros: identify the clustering of values, show patterns, easy to apply, fits both binary and continuous values

Cons: sensitive to geographic outliers, edge effects, needs a sample size above 30, univariate analysis, cannot identify outliers the way Local Moran’s I can, sensitive to the spatial distribution of points

Spatial autocorrelation (Local Moran’s I):

Pros: similar to hotspot analysis, could identify outliers (high-low outliers, low-high outliers)

Cons: similar to hotspot analysis

Regression (OLS, GWR):

OLS gives a single set of coefficients, that is, a global relationship. It is a multivariate analysis and can include many explanatory variables.

GWR builds a local regression equation for each feature in the dataset.

Cons: maps of coefficients can show only one relationship at a time; GWR cannot include a large number of explanatory variables.
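The contrast between a global and a local regression can be sketched in a few lines. The code below fits a distance-weighted least squares regression at every observation, which is the core idea of GWR; the Gaussian kernel, the fixed bandwidth, the function name `gwr_coefficients` and the synthetic data are all assumptions for illustration and do not reproduce the ArcGIS tool.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Local weighted least squares at every observation (Gaussian kernel)."""
    coords = np.asarray(coords, dtype=float)
    X = np.column_stack([np.ones(len(y)), X])   # add an intercept column
    betas = np.empty((len(y), X.shape[1]))
    for i, point in enumerate(coords):
        dist = np.linalg.norm(coords - point, axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)   # nearer points weigh more
        XtW = X.T * w                                # same as X.T @ diag(w)
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.normal(size=200)
# Synthetic data: the slope rises from west (about 0) to east (about 2).
true_slope = 0.2 * coords[:, 0]
y = 1.0 + true_slope * x + rng.normal(scale=0.1, size=200)

betas = gwr_coefficients(coords, x, y, bandwidth=2.0)
west = betas[coords[:, 0] < 3, 1].mean()
east = betas[coords[:, 0] > 7, 1].mean()
print(round(west, 2), round(east, 2))
```

A single OLS slope would average over the whole map, whereas the local slopes recover the west-east difference built into the synthetic data.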

Multivariate methods (PCA):

Pros: can reduce the number of explanatory variables; the resulting PCs (linear combinations of the candidate variables) capture most of the variation in the dataset

Cons: the coefficients/results are hard to interpret when more than one PC enters the regression.

Learning: the resulting principal components capture most of the variation in the dataset; however, they may not best capture the variation in the dependent variable.
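This point can be illustrated with synthetic data. In the sketch below, PCA is computed through a singular value decomposition of the centered (not standardized) data; the variables are constructed so that the first component carries almost all the variance but is unrelated to the dependent variable. The function name `pca` and the data-generating choices are assumptions for illustration.

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)          # share of variance per component
    scores = Xc @ Vt[:n_components].T            # component scores
    return scores, Vt[:n_components], explained

rng = np.random.default_rng(1)
n = 500
z1 = rng.normal(scale=3.0, size=n)   # high-variance factor, unrelated to y
z2 = rng.normal(scale=0.5, size=n)   # low-variance factor that drives y
X = np.column_stack([z1, z1 + 0.1 * rng.normal(size=n), z2])
y = 2.0 * z2 + 0.1 * rng.normal(size=n)

scores, loadings, explained = pca(X, n_components=3)
corr_with_y = [abs(np.corrcoef(scores[:, k], y)[0, 1]) for k in range(3)]
print(np.round(explained, 3), np.round(corr_with_y, 2))
```

Here the second component, not the first, is the one that tracks the dependent variable, so keeping only the leading component would discard the useful signal.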

Reference:

Anselin, Luc. “Exploring spatial data with GeoDaTM: a workbook.” Urbana 51 (2004): 61801.

Han, Yicheol, and Stephan J. Goetz. “The Economic Resilience of US Counties during the Great Recession.” The Review of Regional Studies 45, no. 2 (2015): 131.

Rajan, Raghuram. Fault lines. HarperCollins Publishers, 2012.

Rupasingha, Anil, Stephan J. Goetz, and David Freshwater. “The production of social capital in US counties.” The journal of socio-economics 35, no. 1 (2006): 83-101. Data Retrieved on March 23rd, 2016  http://aese.psu.edu/nercrd/community/social-capital-resources

U.S. Census Bureau, Current Population Survey, Annual Social and Economic Supplements. “H2 Share of aggregate income received by each fifth and top 5 percent of households, all races: 1967 to 2014”; “H4 Gini ratios for households by race and Hispanic origin of householder: 1967 to 2014.” (2014a). Retrieved on May 17th, 2016 from https://www.census.gov/hhes/www/income/data/historical/inequality/

U.S. Census Bureau, Current Population Survey, Annual Social and Economic Supplements. “Table 2 Poverty status of people by family relationship, race, and Hispanic origin: 1959 to 2014.” (2014b). Retrieved on May 17th, 2016 from https://www.census.gov/hhes/www/poverty/data/historical/people.html

U.S. Census Bureau, Historical Income Tables Footnote. (2016a). Retrieved on May 17th, 2016 from http://www.census.gov/hhes/www/income/data/historical/ftnotes.html

U.S. Census Bureau Historical Poverty Tables-Footnote. (2016b). Retrieved on May 17th, 2016 from https://www.census.gov/hhes/www/poverty/histpov/footnotes.html

My dependent variable is drop, developed by Han and Goetz (2015). It is the reciprocal of resistance, calculated as the amount of impulse that a county experiences from a shock (the percentage deviation of actual employment from expected employment during the Great Recession).

droprebound

dropequation

I want to find what factors affect drop, so I run an OLS regression using drop as the dependent variable and the following independent variables:

Income inequality: the Gini coefficient

Income distribution: poverty rate and the share of aggregate income held by households earning $200,000 or more

Control variables: population growth rate from 2001-2005, % Black or African American (2000), % Hispanic or Latino (2000)

Capital Stock variables:

Human Capital: % population with Bachelor’s degree or higher, age group (20-29, 30-49), female civilian labor force participation rate (2000)

Natural Capital: natural amenity scale (1999)

Social Capital: social capital index (2005)

Built Capital(2000): median housing value (2000)

Financial Capital (2000): share of dividends, interest and rent(2000)

Economic structure:

Employment share of 20 two-digit NAICS industries (manufacturing, construction, etc. other services (except public administration) omitted)

 

Significant variables and diagnostic test results are shown in the following table.

| Ind. var. | Coefficient | Ind. var. | Coefficient |
| Population growth rate 2001-2005 | 0.205*** | Transportation and warehousing | -0.337*** |
| %Black or African American | 0.0542* | Information | -0.590** |
| %Hispanic or Latino | -0.113*** | Finance and insurance | -0.710*** |
| % pop with Bachelor’s degree or higher | -0.132* | Educational services | -0.894*** |
| Natural amenity scale | 0.00728*** | Health care and social assistance | -0.215*** |
| Manufacturing | -0.112* | Accommodation and food services | -0.244** |
| Wholesale trade | -0.580*** | Government and government enterprises | -0.231*** |
| Retail trade | -0.620*** | | |
Adjusted R-squared: 0.199; AICc: -4693.152798; #obs: 2770
Koenker BP statistic ***; Joint Wald statistic ***; Jarque-Bera statistic ***

Among the significant variables, I want to explore population growth rate from 2001 to 2005 because it has the largest coefficient among the control variables.

Moreover, since the Koenker BP statistic is significant, the relationships are non-stationary, so I run GWR for population growth rate 2001-2005.

GWR

The following maps show the distribution and the GWR coefficients of population growth rate 2001-2005.

diistpopgrgwrpopgr2

In the map of population growth rate 2001-2005, blue regions show negative population growth, while all other counties are growing in population. Orange and red counties have high population growth rates.

In the map of GWR coefficients, a negative relationship between population growth rate and drop exists in the blue, grey and yellow regions: higher population growth, lower drop (employment loss).

A positive relationship exists in the orange and red regions: higher population growth, higher drop (employment loss).

My explanations are:

In an expanding economy (regions with high population growth), more people are marginally attached to the labor market. They were easily laid off during the Great Recession.

In regions with negative population growth, there are more deaths than births and the population is aging.

  • Older people have higher accumulated savings per head than younger people, but spend less on consumer goods.
  • Less available active labor
  • An increase in population growth will decrease employment loss.

However, my local R-squared ranges from 0 to 0.620 with a mean of 0.0247. Only in Orange County and San Diego, CA, is the local R-squared higher than 0.5, at 0.603 and 0.620 respectively. This low local R-squared occurs for all my other significant control and capital stock variables as well. There might be misspecification problems in my model, so I will try adding new explanatory variables.

Population growth rate

  • Local R-squared, 0-0.620, mean 0.0247
  • Only in 06059 Orange County and 06073 San Diego, CA, is the local R-squared higher than 0.5 (0.603 and 0.620 respectively)

Natural amenity scale

  • Local R-squared, 0-0.519, mean 0.0216
  • 12087 Monroe County, Florida 0.519

% Black or African American

  • Local R-squared, 0-0.347, mean 0.0326
  • 26095 Luce county, Michigan, 0.347

% Hispanic or Latino

  • Local R-squared, 0-0.379, mean 0.0276
  • 35029, Luna County, New Mexico, 0.379

% pop with Bachelor’s degree or higher

  • Local R-squared, 0-0.391, mean 0.055
  • 06073 San Diego, CA, 0.391

lomipopgr

Goal: Where do statistically significant spatial clusters of high values (hot spots) and low values (cold spots) of drop lie?

Variable: drop (employment loss)

  • Monthly employment Jan 2003- Dec.2014
  • Data available: 2840 counties
  • drop

Tool : Hot spot analysis

Step:

  • Add Mean center of population.shp (3141 counties), U.S. county.shp (3141 counties), drop value (2840 counties)
  • Join the drop to the mean center, export as a new layer file.
  • Project the data
  • Run Hot Spot Analysis (Default for Conceptualization of Spatial Relationship, and Distance Method)
  • Make a Map
1. Distribution of drop, using natural breaks

drop 3141

2. Hot spot analysis using 3141 counties

  • Low employment loss clusters in the north
  • High employment loss lies in the west and southeast
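For readers who want to see what the hot spot statistic measures, the sketch below computes a Getis-Ord Gi* z-score for each unit on a toy grid. It uses binary weights within a fixed distance band, with each unit counted as its own neighbor; the band, the function name `getis_ord_gi_star` and the toy values are assumptions for illustration and may differ from the defaults used in the analysis above.

```python
import numpy as np

def getis_ord_gi_star(values, coords, distance_band):
    """Getis-Ord Gi* z-scores with binary fixed-distance-band weights."""
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    n = len(x)
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
    w = (d <= distance_band).astype(float)   # includes the unit itself (d = 0)
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)
    w_sum = w.sum(axis=1)
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * (w ** 2).sum(axis=1) - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Toy 8x8 grid with a block of high drop values in one corner.
gx, gy = np.meshgrid(np.arange(8), np.arange(8))
coords = np.column_stack([gx.ravel(), gy.ravel()])
values = np.full(64, 0.10)
values[(gx.ravel() < 3) & (gy.ravel() < 3)] = 0.40
z = getis_ord_gi_star(values, coords, distance_band=1.5)
hot = z[coords.tolist().index([1, 1])]   # centre of the high-value block
far = z[coords.tolist().index([6, 6])]   # far from the block
print(round(hot, 2), round(far, 2))
```

A large positive z-score marks a unit surrounded by high values (a hot spot); units far from the block score below zero because their neighborhood is below the overall mean.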

Hotspotdrop 3141

 

Questions:

  • The previous map is made from 3141 counties; however, only 2840 counties have a drop value. Missing values are recorded as 0. Will this affect the result?
  • Hawaii and Alaska are included. Will excluding HI and AK affect the result?

Both: Yes

  • Redraw the map
  • Only keep the matched records – 2838 counties
  • Exclude Hawaii and Alaska

 

3. Hot spot Analysis using 2838 counties

Hotspot drop 2838

4. Hot spot Analysis using U.S. contiguous counties, Hawaii and Alaska removed.

Hotspot drop contiguous

Conclusion: Hot spot analysis is sensitive to spatial outliers. It identifies only low-value or high-value clusters. What if I want to find an outlier, for example a low-value unit within a high-value cluster?

I should use the spatial autocorrelation tool, Anselin Local Moran’s I.

1 Research Question

Counties reacted differently to the Great Recession from December 2007 to June 2009. Economic resilience is used to measure the performance of counties. This research focuses on what contributes to the economic resilience of a county and, in particular, on how income inequality affects community economic resilience.

Definition:

Income inequality refers to the uneven distribution of income. The Gini coefficient is used to measure income inequality in this analysis. It is the ratio of the area between the Lorenz curve and the 45-degree line (perfect equality) to the total area under the 45-degree line.
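The definition translates directly into a short calculation. The sketch below builds the Lorenz curve from individual incomes and measures the area with the trapezoid rule; it works on individual values rather than the grouped household income data used in this project, and the function name `gini` is illustrative.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient: area between the Lorenz curve and the equality line,
    divided by the total area under the equality line (0.5)."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = len(x)
    # Lorenz curve: cumulative income share, starting at 0.
    lorenz = np.concatenate([[0.0], np.cumsum(x) / x.sum()])
    # Trapezoid rule over n equal-width population steps.
    area_under_lorenz = np.sum((lorenz[1:] + lorenz[:-1]) / 2.0) / n
    return (0.5 - area_under_lorenz) / 0.5

g_equal = gini([10, 10, 10, 10])    # everyone earns the same
g_unequal = gini([0, 0, 0, 100])    # one person earns everything
print(g_equal, g_unequal)
```

Perfect equality gives 0, and concentrating all income in one of four people gives 0.75, the maximum for a group of that size.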

Economic resilience refers to the regional ability to absorb and adjust to an external shock (recession, natural disaster, etc.). Martin (2012) defined four dimensions of economic resilience, two of which offer guidance for measurement: resistance and recovery (the other two are reorientation and renewal). Resistance is the sensitivity of a region's reaction to an exogenous shock, and recovery is the speed and degree of its return afterwards.

Rising income inequality might help explain what lay behind the Great Recession and why regions reacted differently. Economic growth theory suggests that income inequality is related to economic growth.

  • Higher inequality retards growth in poor countries and encourages growth in richer places (Barro, 2000).
  • Fallah and Partridge (2007) conclude that the inequality-growth link is positive in the urban sample and the opposite in the nonmetro case.
  • Inequality leads to lower productivity, more instability, lower efficiency and lower growth (Stiglitz, 2012, Chapter 4).

The assumed relationship is

  1. Income inequality (pre-recession, 2000) affects economic resilience.
  2. Economic resilience affects income inequality later on (the five-year average for 2007-2011 includes the recession period). The rationale for this relationship seems weak, so it will not be tested in the analysis.
  3. There is a simultaneous relationship between income inequality and economic resilience. A suitable instrumental variable is hard to find, hence this relationship will not be tested.

The relationships have not been determined yet.

Spatial autocorrelation is assumed to play a role in the relationship. The effects of income inequality, resilience, and demographic or economic factors may not be confined to a single county but may attenuate with distance. Resilience in a county might be affected by its own characteristics as well as by those of the surrounding counties. (I am still unsure about the difference between GWR, the spatial lag model and the spatial error model.)

1) Identify if counties will be affected by neighboring counties, i.e. spatial clustering of Income Inequality for year 1990, 2000, and ACS 2007-2011, and Economic Resilience/Drop/Rebound.

2) Identify the impact of income inequality on economic resilience, or the inverse relationship. The simultaneous relationship is hard to test because I have not found one IV that affects income inequality but not economic resilience and another that affects economic resilience but not income inequality.

3) How do demographic, economic and industrial factors affect income inequality and economic resilience, and in particular what is the role of rural/urban location?

4) Spatial Lags or Spatial Errors model

2 Dataset

metadata

The cross-sectional data cover 3141 counties in the U.S. Data for the pre-recession period of 2000 and 2001 are used. Income inequality for the year 2000 and from the ACS 5-year estimates is used. Details are listed below.

The key variables measuring economic resilience are drop, rebound and resilience; the key variable measuring income inequality is the Gini coefficient.

Han and Goetz (2015) developed a single-number measure of economic resilience, a ratio combining drop (resistance) and rebound (recovery), which they call economic resilience. Monthly employment data from the Bureau of Labor Statistics for 2003-2014 are used to calculate it.


Formulations:

resilience formula

The Gini coefficient is calculated from household income (group means) from the 1990 and 2000 Decennial Census and the American Community Survey 2007-2011, in R using the package inequal. The Gini coefficient that the American Community Survey provides for the first time using individual data is also used. Income inequality calculated for the year 2000 and provided by the ACS for 2007-2011 is used in the research.

Key explanatory variable: economic structure (all twenty 3-digit NAICS industries):

Economic structure: location quotients for ten industries, chosen by looking at national-level annual employment data from 2000. They are high in growth.

 

3 Hypotheses

1) There is spatial clustering in income inequality and economic resilience.

2) Demographic economic and industrial factors affect income inequality and economic resilience. The relationship differs across rural/urban.

3) Not sure if it is spatial lag or spatial error.

4 Approaches

1) Identify the spatial clusterings of Income Inequality and Economic Resilience: Hot-spot analysis, Global Moran’s I and Anselin Local Moran’s I

2) Identify the impact of income inequality on economic resilience or inverse relationship: OLS, GWR

3) How demographic, economic and industrial factors affect income inequality and economic resilience, especially the role of rural/urban places matter: OLS, GWR

4) Spatial Lags or Spatial Errors model: GeoDa

 

5 Expected outcome

1) Maps of hot spots, clustering of income inequality and economic resilience

2) Statistical relationships between income inequality and economic resilience

3) Statistical relationships between income inequality, economic resilience and other demographic variables.

4) Spatial lag model or spatial error model which fits the statistical relationship of income inequality and economic resilience

 

6 Significance

There are papers discussing income inequality and papers discussing economic resilience, but little work has been done to explore the relationship between the two. No spatial analysis has been done so far.

7 Your level of preparation

(a) Arc-Info, medium

(b) Model builder and/or GIS programming in Python, none

(c) R, medium

Reference:

Barro, Robert J. “Inequality and Growth in a Panel of Countries.” Journal of economic growth 5, no. 1 (2000): 5-32.

Fallah, Belal N., and Mark Partridge. “The elusive inequality-economic growth relationship: are there differences between cities and the countryside?.” The Annals of Regional Science 41, no. 2 (2007): 375-400.

Stiglitz, Joseph E. The Price of Inequality. London: Allen Lane, 2012.

Peters, David J. “Income Inequality across Micro and Meso Geographic Scales in the Midwestern United States, 1979–2009.” Rural Sociology 77, no. 2 (2012): 171-202.