The first release, `v0.0.2`, of the CountyPlus dataset is available now! You are welcome to download and explore the dataset.
- Add a separate file of constructed county-level net worth shocks since 2004
- Add spatial weighting matrices for spatial analysis
  - First-neighbor adjacency weight
  - Inverse distance weight (Haversine formula using latitude and longitude)
- Release `v0.0.3` (expected: Jan 2025):
  - Add variables of local government expenditure, especially defense expenditure, which is a widely used instrument
  - Larger coverage of consumption estimates, distinguishing between durable and non-durable goods consumption
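As an illustration of the inverse distance weight listed above, the following is a minimal sketch, not the code used to build the released matrices. It assumes county centroids are given as latitude/longitude in degrees, uses a mean Earth radius of 6371 km, and row-normalizes the weights; the function names and the toy coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def inverse_distance_weights(centroids, row_normalize=True):
    """Build w[i][j] = 1 / distance(i, j) for i != j.

    centroids: dict mapping a county identifier to (lat, lon) in degrees.
    Returns a dict of dicts; the diagonal is omitted (treated as zero).
    """
    weights = {}
    for i, (lat_i, lon_i) in centroids.items():
        row = {}
        for j, (lat_j, lon_j) in centroids.items():
            if i == j:
                continue
            d = haversine_km(lat_i, lon_i, lat_j, lon_j)
            # Coincident centroids get weight 0 to avoid division by zero.
            row[j] = 1.0 / d if d > 0 else 0.0
        if row_normalize:
            total = sum(row.values())
            if total > 0:
                row = {j: w / total for j, w in row.items()}
        weights[i] = row
    return weights


# Toy example with hypothetical identifiers and coordinates.
toy_centroids = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (1.0, 0.0)}
toy_weights = inverse_distance_weights(toy_centroids)
```

Row normalization is a common convention in spatial econometrics, but whether the released matrices are normalized is not stated here, so treat it as an option rather than a description of the data.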
The CountyPlus dataset is an open-source county-level panel dataset for economic and social science research. It covers 3,000+ U.S. counties over the years 2003 to 2019 and includes a broad collection of county-level variables:
- County geographic and demographic characteristics
- Household balance sheets, e.g. holdings by asset type
- Household net worth shocks and a Bartik instrument
- Household income and consumption
- Local sales tax and taxable consumption
- Local labor market indicators such as wage and employment
- Local housing market indicators such as house prices and homeownership
- Local credit supply, e.g. home mortgage loans
- Local industry size, payroll and industry diversity
- Indicators of local economic frictions
  - Collateral constraint
  - Downward Nominal Wage Rigidity (DNWR)
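For readers unfamiliar with Bartik (shift-share) instruments, the general idea is to interact each county's initial exposure shares with aggregate shifts that are common to all counties. The sketch below illustrates only that generic formula. The asset names, shares, and growth rates are hypothetical, and the actual construction used in CountyPlus is documented in the corresponding sub-projects and may differ.

```python
def bartik_instrument(initial_shares, aggregate_growth):
    """Generic shift-share value for one county and one period.

    initial_shares: dict asset type -> county's initial portfolio share.
    aggregate_growth: dict asset type -> aggregate price growth rate.
    Returns the sum over asset types of share * aggregate growth.
    """
    return sum(
        share * aggregate_growth[asset]
        for asset, share in initial_shares.items()
    )


# Hypothetical county holding 60% equity and 40% bonds (illustrative only).
shares = {"equity": 0.6, "bond": 0.4}
# Hypothetical aggregate price growth rates for the same period.
growth = {"equity": 0.10, "bond": 0.02}
predicted_shock = bartik_instrument(shares, growth)
```

Fixing the shares at an initial period is what keeps the local component predetermined relative to later aggregate price movements; which base period is appropriate is a research design choice.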
The dataset uses only publicly available data sources for the best replicability. This GitHub repo provides detailed data documentation for the CountyPlus dataset, including:
- Cleaning instructions, institutional facts, explanations, and programming code for each data source
- Documentation and programs for merging the sources into the final dataset
One can replicate the whole dataset and every sub-project by following the instructions and using the program files. At the current stage, we are actively extending the dataset to cover more variables and time periods. If you have questions or comments on this dataset, or would like to see new variables related to your research, you are more than welcome to contact us. We greatly appreciate your valuable comments.
The rest of this README introduces how this repo is organized.
This repo has two main folders:
- `src/`: this directory stores the Stata do-files that merge the outputs of the sub-projects, run the after-merge processing, and export the final `CountyPlus.dta` dataset. One may use `main.do` to call the whole pipeline.
  - `id.do`: Table: ID (primary key)
  - `ap.do`: Table: Aggregate prices
  - `bs.do`: Table: Household balance sheet
  - `cp.do`: Table: Consumption
  - `cs.do`: Table: Credit supply
  - `dg.do`: Table: Demography
  - `lg.do`: Table: Land and Geography
  - `yl.do`: Table: Income, poverty, and labor market
  - `postproc.do`: Post-merging processes
- `by-data-source/`: this directory stores the documentation by data source and the corresponding program files. The outputs of these “sub-projects” are used in the data merging. The following is a list of all data sources; some have already been uploaded to GitHub and others are still on the way.
  - `American Community Survey/`: ACS data, to obtain the estimated median housing value in 2019
  - `County Land Areas/`: County area and latitude/longitude data
  - `CPI All Urban Consumers/`: Inflation
  - `Current Business Pattern/`: To construct employment at the county-industry level and at the tradable/non-tradable sector level. Also to construct the DNWR measure (fraction of wage cuts prevented, FWCP)
  - `Fed Flow of Funds - Balance Sheet of Household and Nonprofit Organizations 1952-2021/`: To obtain aggregate household balance sheet data
  - `Fed Flow of Funds - EFA - Household Debt/`: To estimate the household debt-to-income ratio
  - `Federal Housing Finance Agency/`: Housing Price Index (HPI) data
  - `FIPS/`: FIPS code, serving as the primary key of all sub-projects
  - `Home Mortgage Disclosure Act/`: HMDA data, for local credit supply data
  - `ICE BofA US Corporate Index/`: Aggregate bond price index, for constructing the Bartik instrument for the net worth shock
  - `Land Unavailability/`: Land unavailability index data by Lutz & Sand (2023), serving as an instrument for housing supply
  - `Local Area Unemployment Statistics/`: LAUS unemployment data, for employment population and rate
  - `Mian Sufi 2014 Tradabilty/`: Classification of industries into tradable, non-tradable, construction, and other. Codes are harmonized across all years.
  - `NAICS/`: NAICS codes; we harmonized the different versions to be consistent with Mian & Sufi (2014)
  - `NASDAQ Composite Index/`: Aggregate equity asset price, for constructing the Bartik instrument for the net worth shock
  - `National State and County Housing Unit Totals/`: Census Bureau housing and population statistics
  - `Personal Consumption Expenditure/`: BEA state-level consumption
  - `QCEW County-MSA-CSA Crosswalk/`: Crosswalk of county, MSA, and CSA. This is not part of the data release; it is provided so that users can aggregate the county-level data to the MSA or CSA level.
  - `Small Area Income and Poverty Estimates/`: SAIPE data, for family median income and poverty indicators
  - `Survey of Income/`: SOI data by IRS, for constructing the household balance sheet and net worth shock
  - `USDA Educational Attainment for adults/`: For education demographics
  - `Vintage Population Estimates for Demographics/`: Population estimates
  - `Sales Tax/`: County sales tax revenue, taxable consumption, and/or gross sales. These data are collected from state departments of revenue and are used to estimate county-level household consumption.
    - `1 Alabama/`
    - `4 Arizona/`
    - `5 Arkansas/`
    - `6 California/`
    - `8 Colorado/`
    - `12 Florida/`
    - `17 Illinois/`
    - `18 Indiana/`
    - `19 Iowa/`
    - `22 Louisiana/`
    - `27 Minnesota/`
    - `29 Missouri/`
    - `31 Nebraska/`
    - `32 Nevada/`
    - `36 New York/`
    - `37 North Carolina/`
    - `38 North Dakota/`
    - `39 Ohio/`
    - `42 Pennsylvania/`
    - `45 South Carolina/`
    - `47 Tennessee/`
    - `49 Utah/`
    - `50 Vermont/`
    - `51 Virginia/`
    - `53 Washington/`
    - `55 Wisconsin/`
    - `56 Wyoming/`
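Because FIPS codes serve as the primary key and a county-MSA-CSA crosswalk is provided, county-level variables can be rolled up to the MSA level. The following is a minimal sketch of that idea in plain Python, assuming county FIPS codes are five-digit strings (two-digit state code plus three-digit county code). The function names, the MSA codes, and all values are hypothetical and do not reflect the actual file layout of the crosswalk.

```python
from collections import defaultdict


def normalize_fips(code):
    """Return a county FIPS code as a zero-padded five-character string."""
    return str(code).strip().zfill(5)


def aggregate_to_msa(county_values, county_to_msa):
    """Sum a county-level variable within each MSA.

    county_values: dict county FIPS -> numeric value (a level, e.g. population).
    county_to_msa: dict county FIPS -> MSA code.
    Counties that do not appear in the crosswalk are skipped.
    """
    # Normalize crosswalk keys once so integer and string codes both match.
    crosswalk = {normalize_fips(k): v for k, v in county_to_msa.items()}
    totals = defaultdict(float)
    for fips, value in county_values.items():
        msa = crosswalk.get(normalize_fips(fips))
        if msa is None:
            continue  # county not assigned to any MSA in the crosswalk
        totals[msa] += value
    return dict(totals)


# Hypothetical inputs: integer keys lose the leading zero, strings keep it.
population = {1001: 100.0, "01003": 250.0, "04013": 400.0, "56045": 7.0}
crosswalk = {"01001": "MSA_X", "01003": "MSA_X", "04013": "MSA_Y"}
msa_population = aggregate_to_msa(population, crosswalk)
```

Summation is appropriate for level variables such as population or employment counts. Rates and medians would instead need a weighted average or a recomputation from the underlying levels.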
Figure: Household leverage ratio (debt/housing wealth) over time
Figure: Household balance sheet structure over time
This project is licensed under the MIT License. Users are welcome to create their own data builds if they need to adjust any underlying assumptions in the data processing. However, if CountyPlus is cited in research, all modifications should be explained to adhere to academic ethics. We are not responsible for any errors, bias, or potential data manipulation resulting from such modifications.
Please cite this dataset as:
@unpublished{
author = {Cheng Ding and Tianhao Zhao},
title = {Frictions, Net Worth Shocks, and Heterogeneous Impacts},
note = {Available at SSRN: \url{https://ssrn.com/abstract=4915272}},
month = aug,
year = {2024},
}