Michael Reilly edited this page Oct 12, 2018 · 13 revisions

This task is to improve the two residential hedonic models: rsh and rrh. We want to bring in one or two logsum-based measures to represent accessibility. The new measures to test are cml15 (Combined Mandatory Logsum 2015) and cnml15 (Combined Non-Mandatory Logsum 2015). It is probably best to remove a number of variables and then try a model with only cml15 or with both cml15 and cnml15. The variables to remove are: bart1, bart2, bart3a, lrt1, embarcadero, jobs_45, and stanford. With that simpler model, the cml15 coefficient should be positive. Ideally, the cnml15 coefficient would also be positive (but the two measures are fairly correlated, so it may not work). Then try re-entering some of the variables taken out: bart1 and bart2.

Computing the Combined Logsum

  • data for logsums for HH segments defined by income and auto-sufficiency are at (in the logsums branch) data/2015_06_002_mandatoryAccessibilities.csv and data/2015_06_002_nonMandatoryAccessibilities.csv
  • all of the logsum columns in the first table are scaled, divided by the mandatory in-vehicle time coefficient, and combined based on each HH type's regional share to create a single weighted metric for mandatory accessibility; the same calculation (but with a different in-vehicle time coefficient) is applied to the second table to create a single non-mandatory accessibility metric
  • use 0.0134 for the mandatory in-vehicle time coefficient and use 0.0175 for the non-mandatory in-vehicle time coefficient
  • We may want to try different subsets of the columns (you might expect wealthier, more car-rich HHs to drive sales prices more than rents, and vice versa), and we may want to combine the two metrics in some way (to eliminate correlation)
  1. More specifically: create a new variable for each column by subtracting that column's minimum value from each value AND dividing the result by the in-vehicle time coefficient. A one-column example in R: ls10_mandatory$lowInc_0_autos_t <- (ls10_mandatory$lowInc_0_autos - min(ls10_mandatory$lowInc_0_autos)) / 0.0134
  2. Do a weighted sum of all the transformed columns in the table using the hh weights from data/accessibilities_segmentation.csv . For example in R: ls10_mandatory$cml15 <- 0.061 * ls10_mandatory$lowInc_0_autos_t + 0.005 * ls10_mandatory$lowInc_autos_lt_workers_t + 0.211 * ls10_mandatory$lowInc_autos_ge_workers_t + 0.016 * ls10_mandatory$medInc_0_autos_t + 0.013 * ls10_mandatory$medInc_autos_lt_workers_t + 0.204 * ls10_mandatory$medInc_autos_ge_workers_t + 0.009 * ls10_mandatory$highInc_0_autos_t + 0.020 * ls10_mandatory$highInc_autos_lt_workers_t + 0.208 * ls10_mandatory$highInc_autos_ge_workers_t + 0.007 * ls10_mandatory$veryHighInc_0_autos_t + 0.026 * ls10_mandatory$veryHighInc_autos_lt_workers_t + 0.220 * ls10_mandatory$veryHighInc_autos_ge_workers_t
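
The two steps above can also be sketched in plain Python. This is a minimal illustration, not the project's implementation: the weights and the 0.0134 coefficient are copied from the R example, while the function name `combined_logsum`, the list-of-dicts layout, and the three toy zones are assumptions made up for the example.

```python
# Household-segment shares copied from the R example above (they sum to 1.0).
MANDATORY_WEIGHTS = {
    "lowInc_0_autos": 0.061,
    "lowInc_autos_lt_workers": 0.005,
    "lowInc_autos_ge_workers": 0.211,
    "medInc_0_autos": 0.016,
    "medInc_autos_lt_workers": 0.013,
    "medInc_autos_ge_workers": 0.204,
    "highInc_0_autos": 0.009,
    "highInc_autos_lt_workers": 0.020,
    "highInc_autos_ge_workers": 0.208,
    "veryHighInc_0_autos": 0.007,
    "veryHighInc_autos_lt_workers": 0.026,
    "veryHighInc_autos_ge_workers": 0.220,
}
MANDATORY_IVT = 0.0134  # mandatory in-vehicle time coefficient from the text


def combined_logsum(rows, weights, ivt_coefficient):
    """Return one combined logsum per row (zone).

    rows: list of dicts mapping segment column name -> raw logsum.
    Step 1: subtract each column's minimum and divide by the coefficient.
    Step 2: sum the transformed columns, weighted by household share.
    """
    minimums = {col: min(row[col] for row in rows) for col in weights}
    return [
        sum(
            weight * (row[col] - minimums[col]) / ivt_coefficient
            for col, weight in weights.items()
        )
        for row in rows
    ]


# Made-up logsums for three zones, for illustration only.
toy_rows = [
    {col: value for col in MANDATORY_WEIGHTS} for value in (1.0, 1.5, 2.0)
]
cml15 = combined_logsum(toy_rows, MANDATORY_WEIGHTS, MANDATORY_IVT)
```

The same function would produce cnml15 if given the non-mandatory table, its own weights, and the 0.0175 coefficient. Because each column is shifted by its own minimum, the zone with the lowest logsum in every segment gets a combined value of zero.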

Joining

These values are tied to zones defined in a new map: ** This map should be used to assign the values to individual parcels/buildings.
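
Once each parcel carries the ID of the zone it falls in, applying the zone values is a simple lookup. The sketch below assumes that spatial assignment has already been done. The field names `zone_id` and `parcel_id`, the function `attach_zone_values`, and all of the numbers are hypothetical and do not come from the map referenced above.

```python
def attach_zone_values(parcels, zone_values, fields=("cml15", "cnml15")):
    """Copy zone-level accessibility values onto each parcel record.

    parcels: list of dicts, each with a (hypothetical) "zone_id" key.
    zone_values: dict mapping zone id -> dict holding the fields.
    Parcels whose zone is missing from zone_values get None, so gaps
    stay visible instead of being silently filled.
    """
    joined = []
    for parcel in parcels:
        zone = zone_values.get(parcel["zone_id"], {})
        record = dict(parcel)  # leave the input record untouched
        for field in fields:
            record[field] = zone.get(field)
        joined.append(record)
    return joined


# Made-up values, for illustration only.
zone_values = {
    10: {"cml15": 12.5, "cnml15": 9.0},
    11: {"cml15": 30.0, "cnml15": 21.5},
}
parcels = [
    {"parcel_id": "a", "zone_id": 10},
    {"parcel_id": "b", "zone_id": 11},
    {"parcel_id": "c", "zone_id": 99},  # zone not present in the table
]
parcels_with_access = attach_zone_values(parcels, zone_values)
```

Counting the records that end up with None is a quick check that the zone map covers every parcel.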

Forecasting

We want to build the hedonic with year 2015 logsums. We will want to use other logsums for various forecast years. So use the 2015 coefficients along with updated CML15 and CNML15 values to model future-year home prices. We would want to specify which logsum will be used in which year in settings.yaml, in a totally flexible manner, like:

  • 2015 2015_06_02
  • 2020 2015_06_02
  • 2025 2030_03_01
  • 2030 2030_03_01
  • 2035 2030_03_01
  • 2040 2045_07_04
  • 2045 2045_07_04
  • 2050 2045_07_04

We would also like to get the segmentation variables for each future year from the segmentation table above (data/accessibilities_segmentation.csv). All rows are identical now, but they will change.
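
One way to keep the year-to-logsum mapping flexible is to treat it as a plain dictionary, which is what a YAML mapping in settings.yaml would parse to. The sketch below hard-codes the table from the list above. The names `LOGSUM_RUN_BY_YEAR` and `logsum_run_for_year` are assumptions for illustration, not existing settings keys.

```python
# Forecast year -> logsum run, copied from the list above. In practice this
# would be read from settings.yaml; the name used here is hypothetical.
LOGSUM_RUN_BY_YEAR = {
    2015: "2015_06_02",
    2020: "2015_06_02",
    2025: "2030_03_01",
    2030: "2030_03_01",
    2035: "2030_03_01",
    2040: "2045_07_04",
    2045: "2045_07_04",
    2050: "2045_07_04",
}


def logsum_run_for_year(year, mapping=LOGSUM_RUN_BY_YEAR):
    """Return the logsum run configured for a forecast year.

    Fails loudly on an unconfigured year rather than guessing a neighbour,
    so every simulated year has to be listed explicitly.
    """
    if year not in mapping:
        raise KeyError(f"no logsum run configured for year {year}")
    return mapping[year]


run_2025 = logsum_run_for_year(2025)
```

Requiring an explicit entry for every year keeps the mapping fully editable, at the cost of a longer settings block. The same per-year lookup could select the matching row of segmentation weights once those start to differ by year.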