National Household Travel Survey
Four NHTS datasets were used for VERSPM modules: Hh_df.rda, Per_df.rda, Veh_df.rda, and Hhtours_df.rda. These datasets are output after the VE model is built (i.e., by “modules\VE2001NHTS\R\Make2001NHTSDataset.r”). Prior to the VisionEval model building process, five raw datasets, Hh_df.rda, Per_df.rda, Veh_df.rda, ToursByHh_df.rda, and Dt_df.rda, were used as inputs to produce the four NHTS datasets. In SDRSPM, the San Diego 2016 Household Travel Survey (HTS) data was used to produce the five raw datasets, as implemented in HTS-NHTS.ipynb. Note that some variables in the raw datasets were not used in VERSPM; a placeholder was used for each of these variables.
ToursByHh_df is the raw NHTS dataset from the package VE2001NHTS. It is expanded with attributes calculated during the NHTS dataset creation process (Make2001NHTSDataset) and is saved as HhTours_df.rda. This dataset is used in the modules DivertSOVTravel (VEHouseholdTravel) and AssignDemandManagement (VELandUse). The effective attributes include Distance, Persons, Trips, Trptrans, and IsHhVehTour; other attributes act as placeholders. ToursByHh.xlsx contains the relation between NHTS and HTS attribute values in Sheet1.
There are 27 tour modes in the raw NHTS TRPTRANS. The TRPTRANS produced from Hhtours_df has 12 tour modes, while the HTS has 11 tour modes. The following four steps align the 11 SANDAG HTS tour modes with the 12 NHTS 2001 tour modes.
- Step 1: Convert HTS tour mode to NHTS tour mode code based on Table 2-1.

Table 2-1 HTS-NHTS Tour Mode Conversion

| HTS Tour Mode | HTS Code | NHTS Tour Mode | NHTS Code |
| --- | --- | --- | --- |
| Auto SOV | 1 | Intermediate General Auto | 1 |
| Auto 2 Person | 2 | Intermediate General Auto | 1 |
| Auto 3+ Person | 3 | Intermediate General Auto | 1 |
| Walk | 4 | Walk | 26 |
| Bike/Moped | 5 | Bicycle | 25 |
| Walk Transit | 6 | Bus | 10 |
| PNR-Transit | 7 | Bus | 10 |
| KNR-Transit | 8 | Bus | 10 |
| TNC-Transit | 9 | Bus | 10 |
| MAAS (Taxi, TNC-Single, TNC-Shared) | 10 | Taxi | 22 |
| School Bus | 11 | School Bus | 12 |
- Step 2: Cross-reference HTS auto modes to NHTS tour modes for auto tours where a household vehicle is used (Table 2-2).

Table 2-2 Auto Tour Mode Conversion HTS vs NHTS

| HTS Converted Tour Mode | HTS Code | NHTS Tour Mode | NHTS Code |
| --- | --- | --- | --- |
| Intermediate General Auto | 1 | Auto* | 1 |
| Intermediate General Auto | 1 | LtTrk | 2 |
| Intermediate General Auto | 1 | Other Truck | 5 |
| Intermediate General Auto | 1 | RV | 6 |
| Intermediate General Auto | 1 | Motorcycle | 7 |

*Auto here refers to the passenger car vehicle type.
- Step 3: Building on the transit tour conversion in Step 1, transit tours were further transformed using the four rail modes in Table 2-3, as follows:
  - In the HTS raw trip data, every trip contains one or multiple modes, recorded as mode1, mode2, and so on. For simplification, mode1 represents the mode of the entire trip.
  - For every transit tour with at least one trip leg by Light / Intercity / Other Rail (i.e., HTS code 39, 41, or 42) or Coaster Line (i.e., HTS code 150), all trip modes in the tour are replaced with Rail or Coaster Line based on the hierarchy 150 > 41/42 > 39 (see Table 2-3). The updated trip modes are contained in the adj_mode column in trips_debug_newid.csv. The manual trip mode update is recorded in df_trip_rsg(change_mode_394142).xlsx in the same folder.

Table 2-3 Transit Tour Mode Conversion HTS vs NHTS

| HTS Trip Mode | HTS Code | NHTS Tour Mode | NHTS Code |
| --- | --- | --- | --- |
| Rail - Light | 39 | Street car/trolley | 18 |
| Rail - Intercity | 41 | Amtrak/inter city train | 15 |
| Rail - Other | 42 | Amtrak/inter city train | 15 |
| San Diego Coaster Line | 150 | Commuter train | 16 |

  - Finally, the transit tour mode is updated to NHTS tour mode 15, 16, or 18.
- Step 4: For auto tours that do not use a household vehicle, the NHTS data was used as a reference, and a Monte Carlo approach was used to impute the tour mode. The California auto trips made by California residents in a non-household vehicle in the NHTS tour dataset were selected to allow a larger HTS sample size. The output probability distributions by distance, number of persons per household, and tour mode are available in mode share.csv. The mode share generation is stored in the sheet imputaton data of HhTours_df.xlsx. The sample size is 242 when HhVehUsed is equal to 2. Four cities in California were surveyed for the 2001 NHTS: LA (MSA 4472), Sacramento (MSA 6922), San Diego (MSA 7320), and San Jose (MSA 7362). Those form the NHTS California resident samples. Other data estimations include:
  - Merge RSG tour mode with the first trip mode (mode1) of HTS trip data that was used to estimate the vehicle type (since trip modes 6 to 16 represent trips where a household vehicle was used).
  - Vehicle type is estimated in SDRTS_Vehicle_Data_20170731(2016HTS).csv. All vehicles that could not be classified, e.g., Honda Other, were classified as OthTrk.
  - Since vehtype was not used in VERSPM, car/van/SUV were all classified as car.
  - Vehid, Disttowk, and Whyto were each set to a uniform value since they are not used in VERSPM modules.
  - Vehicle type (VEHTYPE) is defined in rows 46-54 in Sheet1 of ToursByHh_df.xlsx.
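Steps 1, 3, and 4 above can be sketched in a few lines of Python. This is a minimal illustration, not the code in HTS-NHTS.ipynb: the lookup values are transcribed from Tables 2-1 and 2-3, while the function names, the non-rail trip mode code `OTHER`, and the share values passed to `impute_auto_mode` are assumptions made for the example (the real shares live in mode share.csv). Step 2 is omitted because it depends on the household vehicle type.

```python
import random

# Step 1: HTS tour mode code -> NHTS TRPTRANS code (Table 2-1).
HTS_TO_NHTS_TOUR_MODE = {1: 1, 2: 1, 3: 1, 4: 26, 5: 25,
                         6: 10, 7: 10, 8: 10, 9: 10, 10: 22, 11: 12}

# Step 3: rail trip modes ranked by the hierarchy 150 > 41/42 > 39
# (lower rank wins) and their NHTS tour mode codes (Table 2-3).
RAIL_RANK = {150: 0, 41: 1, 42: 1, 39: 2}
RAIL_TO_NHTS = {150: 16, 41: 15, 42: 15, 39: 18}

OTHER = 1  # ASSUMPTION: stand-in for any non-rail HTS trip mode code


def adjust_transit_tour(trip_modes):
    """Return (adjusted trip modes, NHTS tour mode) for one transit tour.

    Tours without a rail leg keep their trip modes and the Bus code (10)
    assigned in Step 1.
    """
    rail_legs = [m for m in trip_modes if m in RAIL_RANK]
    if not rail_legs:
        return list(trip_modes), 10
    # 41 and 42 share a rank; either one maps to NHTS code 15.
    dominant = min(rail_legs, key=RAIL_RANK.get)
    return [dominant] * len(trip_modes), RAIL_TO_NHTS[dominant]


def impute_auto_mode(shares, rng):
    """Step 4: draw one NHTS auto tour mode from a {code: probability} dict."""
    codes = list(shares)
    weights = [shares[c] for c in codes]
    return rng.choices(codes, weights=weights, k=1)[0]


print(HTS_TO_NHTS_TOUR_MODE[5])                  # 25 (Bike/Moped -> Bicycle)
print(adjust_transit_tour([OTHER, 39, 150]))     # ([150, 150, 150], 16)

# ASSUMPTION: made-up shares for one (distance, persons) cell.
rng = random.Random(2016)
print(impute_auto_mode({1: 0.6, 2: 0.3, 5: 0.1}, rng))
```

Passing a seeded `random.Random` instance keeps the imputation reproducible between runs, which helps when the resulting dataset needs to be QC'd.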
Per_df.rda is the raw person dataset from the 2016/2017 Household Travel Survey (HTS). This dataset is used to prepare a dataset of person characteristics and for the estimation of several VisionEval models.
- The NHTS and HTS variable value comparison, e.g., WRKTRANS (transportation mode to work last week), was saved in Sheet1 of Per_df_ori.xlsx.
- HTS age groups differ from NHTS age groups, as shown in Table 2-4.

Table 2-4 NHTS and HTS Age Group

| HTS Age Group | HTS Age Group Converted to Align with NHTS | NHTS Age Group |
| --- | --- | --- |
| 0-15 | 0-14 | 0-14 |
| 0-15 | 15 | 15-19 |
| 16-24 | 16-19 | 15-19 |
| 16-24 | 20-24 | 20-29 |
| 25-34 | 25-29 | 20-29 |
| 25-34 | 30-34 | 30-54 |
| 35-54 | 35-54 | 30-54 |
| 55-64 | 55-64 | 55-64 |
| 65+ | 65+ | 65+ |

In addition, to satisfy the Hh_df life cycle (LIF_CYC) definition, the 0-14 group was split into the subgroups 0-5 and 6-14, and the 15-19 group into 15, 16-17, 18, and 19. The split process can be found in Sheet1 of age bin calculation.xlsx and is based on ABM person output. Since the HTS data does not contain the subgroups, a Monte Carlo simulation process was used to split the age groups.
- Placeholders were used for COMMDRVR, NBIKETRP, NWALKTRP, USEPUBTR, WRKDRIVE, DTGAS, and DISTTOWK.
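The Monte Carlo split of the 0-14 and 15-19 age groups described above could look like the following sketch. The subgroup boundaries come from the text; the function name and the share values are placeholders invented for the example, since the actual shares are derived from ABM person output in age bin calculation.xlsx.

```python
import random

# Subgroups needed for the LIF_CYC definition.
SUBGROUPS = {
    "0-14": ["0-5", "6-14"],
    "15-19": ["15", "16-17", "18", "19"],
}

# ASSUMPTION: placeholder shares, not the values from age bin calculation.xlsx.
SUBGROUP_SHARES = {
    "0-14": [0.4, 0.6],
    "15-19": [0.2, 0.4, 0.2, 0.2],
}


def split_age_group(age_group, rng):
    """Draw a subgroup for groups that need splitting; return others unchanged."""
    if age_group not in SUBGROUPS:
        return age_group
    return rng.choices(SUBGROUPS[age_group],
                       weights=SUBGROUP_SHARES[age_group], k=1)[0]


rng = random.Random(42)
persons = ["0-14", "15-19", "65+"]
print([split_age_group(group, rng) for group in persons])
```

Age groups that are not listed in `SUBGROUPS` pass through untouched, so the function can be applied to every person record.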
Veh_df.rda is the raw vehicle dataset from the 2016/2017 HTS that is used to prepare a vehicle dataset for VisionEval vehicle model estimation.
- Placeholders were used for BESTMILE, EIADMPG, GSCOST, and VEHMILES.
Hh_df.rda contains 77 variables that are used across several VERSPM modules. Hh_df was converted from HTS data.
- The relation between NHTS and HTS variables can be found in Sheet1 of Hh_df_ori.xlsx.
- HBHRESDN was calculated by summing the number of household structures (hs) for each TAZ and dividing by the TAZ developable area, as specified in an ABM MGRA input file.
- NHTS household income (HHFAMINC) contains 22 values, while HTS contains 6 (Table 2-5).

Table 2-5 NHTS and HTS Income Group

| NHTS Household Income Group | HTS Household Income Group |
| --- | --- |
| -7=Refused | 99 Prefer not to answer |
| 01=<$5,000 | 1 - Under $30,000 |
| 02=$5,000-$9,999 | 1 - Under $30,000 |
| 03=$10,000-$14,999 | 1 - Under $30,000 |
| 04=$15,000-$19,999 | 1 - Under $30,000 |
| 05=$20,000-$24,999 | 1 - Under $30,000 |
| 06=$25,000-$29,999 | 1 - Under $30,000 |
| 07=$30,000-$34,999 | 2 - $30,000-$59,999 |
| 08=$35,000-$39,999 | 2 - $30,000-$59,999 |
| 09=$40,000-$44,999 | 2 - $30,000-$59,999 |
| 10=$45,000-$49,999 | 2 - $30,000-$59,999 |
| 11=$50,000-$54,999 | 2 - $30,000-$59,999 |
| 12=$55,000-$59,999 | 2 - $30,000-$59,999 |
| 13=$60,000-$64,999 | 3 - $60,000-$99,999 |
| 14=$65,000-$69,999 | 3 - $60,000-$99,999 |
| 15=$70,000-$74,999 | 3 - $60,000-$99,999 |
| 16=$75,000-$79,999 | 3 - $60,000-$99,999 |
| 17=$80,000-$99,999 | 3 - $60,000-$99,999 |
| 18=>=$100,000 | 4 - $100,000-$149,999; 5 - $150,000 or more |

Similarly, the NHTS data was used as a reference, and a Monte Carlo simulation process imputed the NHTS income categories for the three HTS income groups that each span several NHTS categories. There are 2112 California household samples in the NHTS. The output probability distributions by distance, number of persons per household, and tour mode are available in hhincome share.csv. The income share calculation procedure is stored in Sheet income of Hh_df_ori.xlsx.
- HBHUR contains five values derived from Area_Type in areatype_mgra.csv:
  - C=Second City
  - R=Rural
  - S=Suburban
  - T=Town
  - U=Urban
- Note that Suburban and Town refer to the same group in HTS, and NHTS HBHUR does not include Town (T). NHTS and HTS area types are listed in Table 2-6.

Table 2-6 NHTS and HTS Area Type

| HTS Area_Type | HTS Loc_Type | NHTS HBHUR |
| --- | --- | --- |
| Urban | Urban | U |
| National / State / Regional Park | Rural | R |
| Second City | Urban | C |
| Town Center | Urban | U |
| Suburban | Town | S |
| Major University | Urban | U |
| Rural | Rural | R |
| Military | Rural | R |
| Central Business District | Urban | U |
| Tourist Attraction | Rural | R |
| San Diego Bay | Rural | R |
| Port of Entry | Rural | R |
| Tribal | Rural | R |
| Village | Rural | R |

- A Monte Carlo simulation process was applied to HOMETYPE to split HTS res_type=2-Townhouse (attached house) into Hh_df 2=Duplex and 3=Rowhouse or townhouse.
- The definition of NHTS life cycle (LIF_CYC) is in Sheet1 of Hh_df_ori.xlsx, rows 116-127. Two approximations are applied:
  - Since the minimum adult age is 18, the "youngest child 16-21" category was modified to 16-17 to avoid a conflict.
  - Retirement is defined as age>=18 and EMPLY==4.
- The person, driver, and worker counts by age group (AGE_PX, DRV_PX, WKR_PX) are derived from the variables R_AGE, DRIVER, and WORKER in Per_df.
- The seven San Diego MSA codes are obtained from T:\ABM\ABM_FY22\RSM\VisionEval\Model\Update_Automations\SX\mgra13_msacat.csv.
- Race was consolidated from the eight individual attributes in the HTS person data and hardcoded in per_debug.csv as HHR_RACE, which needs to be QC'd as well. Table 2-7 lists the NHTS and HTS race definitions:

Table 2-7 NHTS and HTS Race

| NHTS Race | HTS Race |
| --- | --- |
| -7=Refused | ethnicity_prefernot |
| 01=White | ethnicity_white |
| 02=African American, Black | ethnicity_black |
| 03=Asian Only | ethnicity_asian |
| 04=American Indian, Alaskan Native | ethnicity_amindian_alaska |
| 05=Native Hawaiian, other Pacific Islander | ethnicity_hawaiian_pacific |
| 06=Hispanic/Mexican Only | ethnicity_hispanic |
| 17=Other specify | ethnicity_other |

- HH respondents are approximated as the first person (perid=1) of each household in the HTS person dataset; these are all adults.
- For simplicity, URBAN is approximated as 2 and 4 only. However, this attribute does not appear to be used in VERSPM. Table 2-8 lists the NHTS and HTS urban types:

Table 2-8 NHTS and HTS Urban

| NHTS URBAN | HTS URBAN |
| --- | --- |
| 1=In an Urban cluster | - |
| 2=In an urban area | 2 - Urban |
| 3=In an area surrounded by urban areas | - |
| 4=Not in urban area | 4 - Rural |

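Two of the household-variable derivations in the list above, HBHRESDN and HBHUR, are simple enough to sketch in Python. The area type lookup is transcribed from Table 2-6. The density function is an assumption-laden illustration: the field names `taz` and `dev_area`, and the choice to sum developable area over the MGRAs of a TAZ, are hypothetical and not taken from the ABM MGRA input file.

```python
from collections import defaultdict

# HTS Area_Type -> NHTS HBHUR code (Table 2-6).
AREA_TYPE_TO_HBHUR = {
    "Urban": "U",
    "National / State / Regional Park": "R",
    "Second City": "C",
    "Town Center": "U",
    "Suburban": "S",
    "Major University": "U",
    "Rural": "R",
    "Military": "R",
    "Central Business District": "U",
    "Tourist Attraction": "R",
    "San Diego Bay": "R",
    "Port of Entry": "R",
    "Tribal": "R",
    "Village": "R",
}


def housing_density_by_taz(mgra_rows):
    """Sum hs per TAZ and divide by the TAZ developable area.

    ASSUMPTION: each row is a dict with hypothetical keys "taz", "hs",
    and "dev_area"; TAZs with no developable area get None.
    """
    hs_total = defaultdict(float)
    area_total = defaultdict(float)
    for row in mgra_rows:
        hs_total[row["taz"]] += row["hs"]
        area_total[row["taz"]] += row["dev_area"]
    return {
        taz: (hs_total[taz] / area_total[taz] if area_total[taz] > 0 else None)
        for taz in hs_total
    }


# Made-up MGRA records for illustration only.
rows = [
    {"taz": 1, "hs": 10, "dev_area": 2.0},
    {"taz": 1, "hs": 30, "dev_area": 6.0},
    {"taz": 2, "hs": 5, "dev_area": 0.0},
]
print(housing_density_by_taz(rows))      # {1: 5.0, 2: None}
print(AREA_TYPE_TO_HBHUR["Suburban"])    # S
```

Returning None for a TAZ without developable area avoids a division by zero and leaves the choice of fallback value to the caller.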
In VERSPM, the raw data Dt_df.rda was prepared to create a data frame of tours by households, if not already created. In the SDRSPM, ToursByHh_df.rda already existed. As a result, Dt_df.rda, containing HOUSEID, PERSONID, and VEHID, was filled with placeholders to ensure the SDRSPM can be built.