Setup and Configuration
To develop VisionEval scenarios and apply the models, users first need a runtime version, which can be built by cloning the SanVE repository to a local machine. The build process can proceed once the built-in data described below has been set up.
To customize SDRSPM, the following built-in data is replaced with local (San Diego) data:
- The default Census Public Use Microdata Sample (PUMS) data is for the Oregon region. For SDRSPM, it is replaced with San Diego PUMS data.
- The VENHTS2001 module package includes built-in transportation survey data used to predict household travel in VERSPM. For SDRSPM, the raw survey data is replaced with SANDAG 2016/17 HTS data.
ACS PUMS files contain state-level Census 2000 data: individual records of the characteristics of a 5 percent sample of people and housing units. The PUMS files use geographic units known as super-Public Use Microdata Areas (super-PUMAs) and Public Use Microdata Areas (PUMAs). Each super-PUMA has a minimum population of 400,000 and each PUMA has a minimum population of 100,000. Geographic equivalency files that show the relationship between PUMAs and standard Census 2000 geographic units (e.g., counties) are included.
- Download the California PUMS5 data from the ACS website.
- Process the raw PUMS5 data using Process_2000_PUMS.R. The location filter in this R script can be used to subset the PUMS5 data to San Diego County by super-PUMA code.
- San Diego super-PUMA codes: ("06701","06702","06703","06704","06705")
- After running the script, the processed PUMS output files will be created. These files are used to create the San Diego version of the Hh_Df dataset.
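The location filter described in the steps above can be illustrated with a minimal Python sketch. This is not the code in Process_2000_PUMS.R (which is an R script). It only shows the idea of keeping the records whose super-PUMA code is in the San Diego list. The column names `SERIALNO`, `SUPERPUMA`, and `PERSONS` and the sample rows are hypothetical stand-ins for parsed PUMS household records.

```python
import csv
import io

# San Diego super-PUMA codes listed in the steps above.
SAN_DIEGO_SUPERPUMAS = {"06701", "06702", "06703", "06704", "06705"}


def filter_by_superpuma(rows, codes, column="SUPERPUMA"):
    """Keep only the records whose super-PUMA code is in `codes`.

    `column` is a hypothetical column name; the real field name depends
    on how the raw PUMS file was parsed.
    """
    return [row for row in rows if row[column].strip() in codes]


# Made-up sample standing in for parsed PUMS household records.
sample_csv = """SERIALNO,SUPERPUMA,PERSONS
0000001,06701,3
0000002,06100,2
0000003,06705,1
"""

# csv.DictReader reads every field as a string, so leading zeros survive.
records = list(csv.DictReader(io.StringIO(sample_csv)))
san_diego = filter_by_superpuma(records, SAN_DIEGO_SUPERPUMAS)
print(len(san_diego), "of", len(records), "records kept")
```

The codes are compared as strings rather than integers so that the leading zero in values such as "06701" is preserved.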
- PUMS5: https://www.census.gov/data/datasets/2000/dec/microdata.html
- CreateEstimationDatasets.R: https://github.com/VisionEval/VisionEval/blob/master/sources/modules/VESimHouseholds/R/CreateEstimationDatasets.R
- Public Use Microdata Areas (PUMAs): https://www.census.gov/programs-surveys/geography/guidance/geo-areas/pumas.html
- Process_2000_PUMS.R: https://github.com/gregorbj/Process_2000_PUMS/blob/master/Process_2000_PUMS.R
- VisionEval GitHub: https://github.com/VisionEval/VisionEval/tree/master/sources/modules/VESimHouseholds/inst/extdata
The VERSPM model is estimated from the 2001 NHTS dataset. That survey collected 69,818 household samples nationwide, of which only 221 were from San Diego residents. By comparison, the region-wide household travel survey (HTS) that SANDAG conducted in 2016/2017 collected 6,199 samples. Given its larger local sample size, the HTS data is used to build the SDRSPM.
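To put the sample sizes in proportion, the short calculation below uses only the three counts quoted above. The variable names are illustrative. It shows that San Diego residents make up well under 1 percent of the 2001 NHTS sample, and that the SANDAG HTS provides roughly 28 times as many local samples.

```python
# Sample counts quoted in the paragraph above.
NHTS_2001_TOTAL = 69_818
NHTS_2001_SAN_DIEGO = 221
SANDAG_HTS_2016_17 = 6_199

# Share of the national 2001 NHTS sample that comes from San Diego.
san_diego_share = NHTS_2001_SAN_DIEGO / NHTS_2001_TOTAL

# How many SANDAG HTS samples exist per San Diego NHTS sample.
sample_ratio = SANDAG_HTS_2016_17 / NHTS_2001_SAN_DIEGO

print(f"San Diego share of 2001 NHTS: {san_diego_share:.2%}")
print(f"HTS samples per San Diego NHTS sample: {sample_ratio:.1f}")
```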