
National Household Travel Survey

esanchez01 edited this page Jan 19, 2022 · 3 revisions

Four NHTS datasets were used for VERSPM modules: Hh_df.rda, Per_df.rda, Veh_df.rda, and HhTours_df.rda. These datasets are output after the VE model is built (i.e., “modules\VE2001NHTS\R\Make2001NHTSDataset.r”). Prior to the VisionEval model building process, five raw datasets, Hh_df.rda, Per_df.rda, Veh_df.rda, ToursByHh_df.rda, and Dt_df.rda, were used as inputs to produce the four NHTS datasets. In SDRSPM, the San Diego 2016 Household Travel Survey (HTS) data was used to produce the five raw datasets, as implemented in HTS-NHTS.ipynb. Note that some variables in the raw datasets are not used in VERSPM; a placeholder value was used for each of these variables.

ToursByHh_df

ToursByHh_df is the raw NHTS dataset from the package VE2001NHTS. It is expanded with attributes calculated during the NHTS dataset creation process (Make2001NHTSDataset) and is saved as HhTours_df.rda. This dataset is used in the modules DivertSOVTravel (VEHouseholdTravel) and AssignDemandManagement (VELandUse). The effective attributes are Distance, Persons, Trips, Trptrans, and IsHhVehTour; all other attributes act as placeholders. ToursByHh.xlsx contains the relation between NHTS and HTS attribute values in Sheet1.

There are 27 tour modes in the raw NHTS TRPTRANS. The TRPTRANS produced from HhTours_df has 12 tour modes, while the HTS has 11 tour modes. The following four steps align the 11 SANDAG HTS tour modes with the 12 NHTS 2001 tour modes.

  • Step 1: Convert HTS tour mode to NHTS tour mode code based on Table 2-1.

    Table 2-1 HTS-NHTS Tour Mode Conversion

    | HTS Tour Mode | HTS Code | NHTS Tour Mode | NHTS Code |
    |---|---|---|---|
    | Auto SOV | 1 | Intermediate General Auto | 1 |
    | Auto 2 Person | 2 | Intermediate General Auto | 1 |
    | Auto 3+ Person | 3 | Intermediate General Auto | 1 |
    | Walk | 4 | Walk | 26 |
    | Bike/Moped | 5 | Bicycle | 25 |
    | Walk Transit | 6 | Bus | 10 |
    | PNR-Transit | 7 | Bus | 10 |
    | KNR-Transit | 8 | Bus | 10 |
    | TNC-Transit | 9 | Bus | 10 |
    | MAAS (Taxi, TNC-Single, TNC-Shared) | 10 | Taxi | 22 |
    | School Bus | 11 | School Bus | 12 |

  • Step 2: Cross-reference HTS auto modes to NHTS tour modes for auto tours where a household vehicle is used (Table 2-2).

    Table 2-2 Auto Tour Mode Conversion HTS vs NHTS

    | HTS Converted Tour Mode | HTS Code | NHTS Tour Mode | NHTS Code |
    |---|---|---|---|
    | Intermediate General Auto | 1 | Auto* | 1 |
    | Intermediate General Auto | 1 | LtTrk | 2 |
    | Intermediate General Auto | 1 | Other Truck | 5 |
    | Intermediate General Auto | 1 | RV | 6 |
    | Intermediate General Auto | 1 | Motorcycle | 7 |

    *Auto here refers to the passenger car vehicle type.

  • Step 3: Building on the transit tour conversion in Step 1, transit tours were further reclassified using the four HTS rail trip modes in Table 2-3, as follows:

    • In the HTS raw trip data, every trip contains one or more modes, recorded as mode1, mode2, and so on. For simplicity, mode1 is taken to represent the mode of the entire trip.

    • For every transit tour with at least one trip leg on Light / Intercity / Other Rail (i.e., HTS code 39, 41, or 42) or the Coaster Line (i.e., HTS code 150), all trip modes in the tour are replaced with Rail or Coaster Line according to the hierarchy 150 > 41/42 > 39 (see Table 2-3). The updated trip modes are contained in the adj_mode column of trips_debug_newid.csv. The manual trip mode update is recorded in df_trip_rsg(change_mode_394142).xlsx in the same folder.

      Table 2-3 Transit Tour Mode Conversion HTS vs NHTS

      | HTS Trip Mode | HTS Code | NHTS Tour Mode | NHTS Code |
      |---|---|---|---|
      | Rail - Light | 39 | Street car/trolley | 18 |
      | Rail - Intercity | 41 | Amtrak/inter city train | 15 |
      | Rail - Other | 42 | Amtrak/inter city train | 15 |
      | San Diego Coaster Line | 150 | Commuter train | 16 |

    • Finally, the transit tour mode is updated to the corresponding NHTS tour mode (15, 16, or 18).

  • Step 4: For auto tours that do not use a household vehicle, the NHTS data was used as a reference, and a Monte Carlo approach was used to impute the tour mode. Auto trips made in California by California residents using a non-household vehicle were selected from the NHTS tour dataset to allow a larger sample size. The output probability distribution by distance, number of persons per household, and tour mode is available in mode share.csv. The mode share generation is stored in the sheet imputaton data of HhTours_df.xlsx. The sample size is 242 when HhVehUsed equals 2.

    Four California cities were surveyed for the 2001 NHTS: LA (MSA 4472), Sacramento (MSA 6922), San Diego (MSA 7320), and San Jose (MSA 7362). Together they form the NHTS California resident sample. Other data estimations include:

    • The RSG tour mode was merged with the first trip mode (mode1) of the HTS trip data, which was used to estimate the vehicle type (trip modes 6 to 16 represent trips where a household vehicle was used).
    • Vehicle type is estimated in SDRTS_Vehicle_Data_20170731(2016HTS).csv. All vehicles that could not be classified, e.g., Honda Other, were classified as OthTrk.
    • Since vehtype is not used in VERSPM, car/van/SUV were all classified as car.
    • Vehid, Disttowk, and Whyto were each set to a uniform value since they are not used in VERSPM modules.
    • Vehicle type (VEHTYPE) is defined in rows 46-54 in Sheet1 of ToursByHh_df.xlsx.
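The deterministic part of the tour mode alignment (Steps 1 and 3) can be sketched in a few lines of Python. This is a minimal illustration built only from Tables 2-1 and 2-3, not the code in HTS-NHTS.ipynb; the names `HTS_TO_NHTS_TOUR_MODE`, `RAIL_PRIORITY`, and `convert_tour_mode` are hypothetical, and `leg_modes` is assumed to hold the mode1 code of each trip in the tour.

```python
# Table 2-1: HTS tour mode code -> NHTS tour mode code (TRPTRANS).
HTS_TO_NHTS_TOUR_MODE = {
    1: 1,    # Auto SOV       -> Intermediate General Auto
    2: 1,    # Auto 2 Person  -> Intermediate General Auto
    3: 1,    # Auto 3+ Person -> Intermediate General Auto
    4: 26,   # Walk           -> Walk
    5: 25,   # Bike/Moped     -> Bicycle
    6: 10,   # Walk Transit   -> Bus
    7: 10,   # PNR-Transit    -> Bus
    8: 10,   # KNR-Transit    -> Bus
    9: 10,   # TNC-Transit    -> Bus
    10: 22,  # MAAS           -> Taxi
    11: 12,  # School Bus     -> School Bus
}

# Table 2-3, ordered by the hierarchy 150 > 41/42 > 39.
# Each entry is (HTS trip mode code, NHTS tour mode code).
RAIL_PRIORITY = [(150, 16), (41, 15), (42, 15), (39, 18)]

NHTS_BUS = 10


def convert_tour_mode(hts_tour_mode, leg_modes):
    """Return the NHTS tour mode for one HTS tour.

    hts_tour_mode: HTS tour mode code (1-11).
    leg_modes: HTS trip mode codes (mode1) of the trips in the tour.
    """
    nhts_mode = HTS_TO_NHTS_TOUR_MODE[hts_tour_mode]
    if nhts_mode != NHTS_BUS:
        # Only transit tours (mapped to Bus in Step 1) are refined in Step 3.
        return nhts_mode
    legs = set(leg_modes)
    for hts_trip_mode, nhts_rail_mode in RAIL_PRIORITY:
        if hts_trip_mode in legs:
            return nhts_rail_mode
    return nhts_mode
```

For example, `convert_tour_mode(6, [39, 150])` gives 16 (Commuter train), because the Coaster Line outranks light rail in the hierarchy. The household-vehicle split in Step 2 and the Monte Carlo imputation in Step 4 are not covered by this sketch.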

Per_df.rda

Per_df.rda is the raw person dataset from the 2016/2017 Household Travel Survey (HTS). It is used to prepare a dataset of person characteristics for the estimation of several VisionEval models.

  • The NHTS and HTS variable value comparison, e.g., WRKTRANS - transportation mode to work last week, was saved in Sheet1 of Per_df_ori.xlsx.

  • HTS age groups differ from NHTS age groups, as shown in Table 2-4.

    Table 2-4 NHTS and HTS Age Group

    | HTS Age Group | HTS Age Group Converted to Align with NHTS | NHTS Age Group |
    |---|---|---|
    | 0-15 | 0-14 | 0-14 |
    | 0-15 | 15 | 15-19 |
    | 16-24 | 16-19 | 15-19 |
    | 16-24 | 20-24 | 20-29 |
    | 25-34 | 25-29 | 20-29 |
    | 25-34 | 30-34 | 30-54 |
    | 35-54 | 35-54 | 30-54 |
    | 55-64 | 55-64 | 55-64 |
    | 65+ | 65+ | 65+ |

In addition, to satisfy the Hh_df life cycle (LIF_CYC) definition, the 0-14 group was split into the subgroups 0-5 and 6-14, and the 15-19 group was split into the subgroups 15, 16-17, 18, and 19. The split process can be found in Sheet1 of age bin calculation.xlsx and is based on ABM person output. Since the HTS data does not contain these subgroups, a Monte Carlo simulation was used to assign them.
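A minimal sketch of such a Monte Carlo split is shown below. The subgroup shares are hypothetical placeholders chosen only so that each set sums to 1; the actual shares come from the ABM person output recorded in age bin calculation.xlsx. The names `SUBGROUP_SHARES` and `split_age_group` are likewise illustrative.

```python
import random

# HYPOTHETICAL shares for illustration only; the real values are derived
# from ABM person output (see age bin calculation.xlsx).
SUBGROUP_SHARES = {
    "0-14": {"0-5": 0.4, "6-14": 0.6},
    "15-19": {"15": 0.2, "16-17": 0.4, "18": 0.2, "19": 0.2},
}


def split_age_group(age_group, rng):
    """Draw a subgroup for age_group; groups without subgroups pass through."""
    shares = SUBGROUP_SHARES.get(age_group)
    if shares is None:
        return age_group
    subgroups = list(shares)
    weights = list(shares.values())
    # random.Random.choices draws with replacement, proportional to weights.
    return rng.choices(subgroups, weights=weights, k=1)[0]


# A fixed seed keeps the imputation reproducible between runs.
rng = random.Random(2016)
persons = ["0-14", "15-19", "35-54", "0-14"]
print([split_age_group(group, rng) for group in persons])
```

Seeding the generator is a design choice rather than something the source specifies; without it, each run would assign different subgroups to the same persons.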

  • Placeholders were used for COMMDRVR, NBIKETRP, NWALKTRP, USEPUBTR, WRKDRIVE, DTGAS, and DISTTOWK.

Veh_df.rda

Veh_df.rda is the raw vehicle dataset from the 2016/2017 HTS that is used to prepare a vehicle dataset for VisionEval vehicle model estimation.

  • Placeholders were used for BESTMILE, EIADMPG, GSCOST, and VEHMILES.

Hh_df.rda

Hh_df.rda contains 77 variables that are used across several VERSPM modules. The Hh_df was converted from HTS data.

  • The relation between NHTS and HTS variables can be found in Sheet1 of Hh_df_ori.xlsx.

  • HBHRESDN was calculated by summing the number of household structures (hs) for each TAZ and dividing by the TAZ developable area, as specified in an ABM MGRA input file.

  • NHTS household income (HHFAMINC) contains 22 values, while HTS contains 6 (Table 2-5).

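    Table 2-5 NHTS and HTS Income Group

    | NHTS Household Income Group | HTS Household Income Group |
    |---|---|
    | -7=Refused | 99 Prefer not to answer |
    | 01=<$5000 | 1 - Under $30,000 |
    | 02=$5,000-$9,999 | 1 - Under $30,000 |
    | 03=$10,000-$14,999 | 1 - Under $30,000 |
    | 04=$15,000-$19,999 | 1 - Under $30,000 |
    | 05=$20,000-$24,999 | 1 - Under $30,000 |
    | 06=$25,000-$29,999 | 1 - Under $30,000 |
    | 07=$30,000-$34,999 | 2 - $30,000-$59,999 |
    | 08=$35,000-$39,999 | 2 - $30,000-$59,999 |
    | 09=$40,000-$44,999 | 2 - $30,000-$59,999 |
    | 10=$45,000-$49,999 | 2 - $30,000-$59,999 |
    | 11=$50,000-$54,999 | 2 - $30,000-$59,999 |
    | 12=$55,000-$59,999 | 2 - $30,000-$59,999 |
    | 13=$60,000-$64,999 | 3 - $60,000-$99,999 |
    | 14=$65,000-$69,999 | 3 - $60,000-$99,999 |
    | 15=$70,000-$74,999 | 3 - $60,000-$99,999 |
    | 16=$75,000-$79,999 | 3 - $60,000-$99,999 |
    | 17=$80,000-$99,999 | 3 - $60,000-$99,999 |
    | 18=>=$100000 | 4 - $100,000-$149,999; 5 - $150,000 or more |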

Similarly, the NHTS data was used as a reference, and a Monte Carlo simulation process imputed the three HTS income groups. There are 2112 California household samples in the NHTS. The output probability distribution by distance, number of persons per household, and tour mode is available in hhincome share.csv. The income share calculation procedure is stored in the sheet income of Hh_df_ori.xlsx.
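As a rough illustration of how such shares could drive the imputation, the sketch below derives within-group shares from a reference sample of NHTS income codes and then draws an NHTS code for one HTS household. It rests on several assumptions: the candidate codes are read directly from Table 2-5, the conditioning variables used in hhincome share.csv are ignored, and the names `CANDIDATE_NHTS_CODES`, `income_shares`, and `impute_income` are hypothetical. Under this reading of Table 2-5, only HTS groups 1 to 3 span more than one NHTS code and therefore need a random draw.

```python
import random
from collections import Counter

# Table 2-5 read in reverse: HTS income group -> candidate NHTS HHFAMINC codes.
CANDIDATE_NHTS_CODES = {
    99: [-7],
    1: list(range(1, 7)),    # 01-06
    2: list(range(7, 13)),   # 07-12
    3: list(range(13, 18)),  # 13-17
    4: [18],
    5: [18],
}


def income_shares(nhts_sample, hts_group):
    """Share of each candidate NHTS code within one HTS group.

    nhts_sample: NHTS income codes of the reference households.
    """
    candidates = CANDIDATE_NHTS_CODES[hts_group]
    counts = Counter(code for code in nhts_sample if code in candidates)
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no reference households for HTS group {hts_group}")
    return {code: counts[code] / total for code in candidates}


def impute_income(hts_group, nhts_sample, rng):
    """Draw an NHTS income code for a household in the given HTS group."""
    candidates = CANDIDATE_NHTS_CODES[hts_group]
    if len(candidates) == 1:
        # Groups 4, 5, and 99 map to a single NHTS code; nothing to draw.
        return candidates[0]
    shares = income_shares(nhts_sample, hts_group)
    weights = [shares[code] for code in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With a reference sample of `[1, 1, 2, 6, 7]`, `income_shares(..., 1)` assigns one half to code 1 and one quarter each to codes 2 and 6, so a household in HTS group 1 would be given code 1 about half of the time.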

  • HBHUR contains five values derived from Area_Type in areatype_mgra.csv:

    • C=Second City
    • R=Rural
    • S=Suburban
    • T=Town
    • U=Urban
  • Note that Suburban and Town refer to the same group in HTS, and NHTS HBHUR does not include Town (T). NHTS and HTS area types are listed in Table 2-6.

    Table 2-6 NHTS and HTS Area Type

    | HTS Area_Type | HTS Loc_Type | NHTS HBHUR |
    |---|---|---|
    | Urban | Urban | U |
    | National / State / Regional Park | Rural | R |
    | Second City | Urban | C |
    | Town Center | Urban | U |
    | Suburban | Town | S |
    | Major University | Urban | U |
    | Rural | Rural | R |
    | Military | Rural | R |
    | Central Business District | Urban | U |
    | Tourist Attraction | Rural | R |
    | San Diego Bay | Rural | R |
    | Port of Entry | Rural | R |
    | Tribal | Rural | R |
    | Village | Rural | R |

  • A Monte Carlo simulation process was applied to HOMETYPE to split HTS res_type=2-Townhouse (attached house) into Hh_df 2=Duplex and 3=Rowhouse or townhouse.

  • The definition of NHTS Life cycle (LIF_CYC) is in Sheet1 of Hh_df_ori.xlsx, rows 116-127. Two approximations are applied:

    • Since the minimum adult age is 18, the "youngest child 16-21" category was modified to 16-17 to avoid a conflict.
    • Retirement is defined as age>=18 and EMPLY==4.
  • The person, driver, and worker counts by age group (AGE_PX, DRV_PX, WKR-PX) are derived from the variables R_AGE, DRIVER, and WORKER in Per_df.

  • The seven San Diego MSA codes are obtained from T:\ABM\ABM_FY22\RSM\VisionEval\Model\Update_Automations\SX\mgra13_msacat.csv.

  • Race was consolidated from the eight individual attributes in the HTS person data and hardcoded in per_debug.csv as HHR_RACE, which needs to be QC'd as well. Table 2-7 lists the NHTS and HTS race definitions:

    Table 2-7 NHTS and HTS Race

    | NHTS Race | HTS Race |
    |---|---|
    | -7=Refused | ethnicity_prefernot |
    | 01=White | ethnicity_white |
    | 02=African American, Black | ethnicity_black |
    | 03=Asian Only | ethnicity_asian |
    | 04=American Indian, Alaskan Native | ethnicity_amindian_alaska |
    | 05=Native Hawaiian, other Pacific Islander | ethnicity_hawaiian_pacific |
    | 06=Hispanic/Mexican Only | ethnicity_hispanic |
    | 17=Other specify | ethnicity_other |

  • The household respondent is approximated as the first person (perid=1) in the HTS person dataset; all such persons are adults.

  • For simplicity, URBAN takes only the values 2 and 4. However, this attribute does not appear to be used in VERSPM. Table 2-8 lists the NHTS and HTS urban types:

    Table 2-8 NHTS and HTS Urban

    | NHTS URBAN | HTS URBAN |
    |---|---|
    | 1=In an Urban cluster | - |
    | 2=In an urban area | 2 - Urban |
    | 3=In an area surrounded by urban areas | - |
    | 4=Not in urban area | 4 - Rural |
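The HBHRESDN calculation described earlier in this section (household structures summed per TAZ, divided by the TAZ developable area) can be sketched as follows. This is an illustration only: the row keys `taz`, `hs`, and `dev_area` and the function name are hypothetical, the TAZ developable area is assumed to be the sum of the MGRA-level areas, and the result is expressed in whatever area unit the input file uses.

```python
from collections import defaultdict


def housing_density_by_taz(mgra_rows):
    """Housing density per TAZ from MGRA-level records.

    mgra_rows: iterable of dicts with hypothetical keys
        'taz'      - TAZ identifier of the MGRA
        'hs'       - number of household structures in the MGRA
        'dev_area' - developable area of the MGRA
    Returns {taz: density}; density is None when the TAZ has no
    developable area, to avoid dividing by zero.
    """
    hs_total = defaultdict(float)
    area_total = defaultdict(float)
    for row in mgra_rows:
        hs_total[row["taz"]] += row["hs"]
        area_total[row["taz"]] += row["dev_area"]
    return {
        taz: (hs_total[taz] / area_total[taz] if area_total[taz] > 0 else None)
        for taz in hs_total
    }
```

How TAZs with zero developable area are actually handled is not stated in the source; returning `None` here simply flags them for review.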

Dt_df.rda

In VERSPM, the raw data Dt_df.rda was prepared to create a data frame of tours by households, if not already created. In the SDRSPM, the ToursByHh_df.rda already existed. As a result, Dt_df.rda, containing HOUSEID, PERSONID, and VEHID, was imputed with placeholders to ensure the SDRSPM can be built.
