Different outputs when using MultiProcessing #150

jamiecook · 2022-01-28T07:13:02Z

@binnympaul I'm not sure if this is the right place to post this, but I'm seeing strange differences in my outputs when using multiprocessing.

I've set up a run that only processes 2 SA3s (Australian Bureau of Statistics areas, ~50k population).

If I use num_processes=1 or set multiprocess: False, I get outputs that match well to controls.

But when I use mp=2 the output seems to get corrupted: only one of the two SA3s is generated, and the persons/households within that SA3 are not quite right (10% high on number of people; households are good but sometimes bad; see image below).

Do you have any thoughts on this?

PS: is @bstabler still contributing here?

CONFIG FILE

####################################################################
# PopulationSim Properties
####################################################################


# Algorithm/Software Configuration
# ------------------------------------------------------------------
INTEGERIZE_WITH_BACKSTOPPED_CONTROLS: True
SUB_BALANCE_WITH_FLOAT_SEED_WEIGHTS: False
GROUP_BY_INCIDENCE_SIGNATURE: False
USE_SIMUL_INTEGERIZER: True
USE_CVXPY: False
max_expansion_factor: 50
MAX_BALANCE_ITERATIONS_SEQUENTIAL: 100000


# Geographic Settings
# ------------------------------------------------------------------
geographies: [Region, SA3, SA1]
seed_geography: Region
  
# Data Directory
# ------------------------------------------------------------------
data_dir: data

# Input Data Tables
# ------------------------------------------------------------------
# input_pre_processor input_table_list
input_table_list:
- tablename: households
  filename : seed_households.csv
  index_col: household_id
  rename_columns:
    hhnum: household_id
- tablename: persons
  filename : seed_persons.csv
  rename_columns:
    hhnum: household_id
    SPORDER: person_id
- tablename: geo_cross_walk
  filename : geo_cross_walk.csv
- tablename: SA1_control_data
  filename : control_totals_SA1.csv
- tablename: SA3_control_data
  filename : control_totals_SA3.csv
- tablename: Region_control_data
  filename : control_totals_Region.csv

# Reserved Column Names
# ------------------------------------------------------------------
household_weight_col: weight
household_id_col: household_id
total_hh_control: total_Total_households

# Control Specification File Name
# ------------------------------------------------------------------
control_file_name: controls_interim2_2.with_income.csv

# Output Tables
# ------------------------------------------------------------------
# output_tables can specify either a list of output tables to include or to skip
# if neither is specified, then no tables will be written

output_tables:
  action: include
  tables:
    - summary_SA1

# Synthetic Population Output Specification
# ------------------------------------------------------------------
#

output_synthetic_population:
  household_id: household_id
  households:
    filename: synthetic_households.csv
    columns:
      - household_structure
      - dwelling_type
      - hh_size
      - num_cars
      - num_child_0_4
      - num_child_5_12
      - num_child_13_17
      - hh_inc_band
  persons:
    filename: synthetic_persons.csv
    columns:
      - person_id
      - gender
      - age
      - anzsic_1
      - anzsco_1
      - student_type
      - study_stat
      - primary_status
      - income

models:
  - input_pre_processor
  - setup_data_structures
  - initial_seed_balancing
  - meta_control_factoring
  - final_seed_balancing
  - integerize_final_seed_weights
  - sub_balancing.geography=SA3
  - sub_balancing.geography=SA1
  - expand_households
  - write_tables
  - write_synthetic_population
  - summarize
  
slice_geography: SA3
multiprocess: True
multiprocess_steps:
  - name: mp_seed_balancing
    begin: input_pre_processor
  - name: mp_sub_balancing_SA3
    begin: sub_balancing.geography=SA3
    num_processes: 2
    slice:
      tables:
        - slice_crosswalk
        - crosswalk
      # don't slice any tables not explicitly listed above in slice.tables
      except: True
      # the following tables are added by sub_balancer and should be coalesced
      coalesce:
        - SA3_weights
        - SA3_weights_sparse
        - trace_SA3_weights
  - name: mp_summarize
    begin: expand_households

bstabler · 2022-01-29T00:00:44Z

Hi @jamiecook. I'm no longer actively supporting this project, but when we added the multiprocessing component, we did several comparisons / validation for the Oregon statewide model implementation. This was for @bettinardi at ODOT and was done by @goreaditya at RSG. Maybe @bettinardi can help investigate?

jfdman · 2022-01-31T22:24:30Z

Thanks @bstabler! It's my understanding that our tests showed exactly the same results? There's nothing that stands out in your configuration file that seems problematic. @goreaditya or @bettinardi - any ideas?

jamiecook · 2022-02-02T08:37:00Z

Versions I'm using

λ pip list | grep sim
activitysim                       1.0.4
populationsim                     0.5.1

I've uploaded two tarballs with the simple test I'm running. The first one works correctly because its run.py disables MP; the second one removes that line and generates the strange output.

github_issue_mp=1.tar.gz
github_issue_mp=2.tar.gz

The easiest way to see the difference is to count the persons by their SA3.

☢ cut -d, -f2 github_issue_mp\=1/output/synthetic_persons.csv | sort | uniq -c
  57958 30204
  45277 30402
      1 SA3
20220202 18:36:17   jamie@hikaru:/mnt/hdd_data/jamie_data/move2.0/runs/InterimResults/population_synthesis/domestic/processing
λ cut -d, -f2 github_issue_mp\=2/output/synthetic_persons.csv | sort | uniq -c
  63870 30204
      1 SA3
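
For anyone who wants to repeat that comparison without the shell pipeline, here is a minimal Python sketch of the same count. It assumes only that synthetic_persons.csv has a header row containing an SA3 column, as the `1 SA3` line in the output above suggests; the function names are made up for illustration.

```python
import csv
from collections import Counter


def count_by_column(csv_path, column="SA3"):
    """Count rows per distinct value of `column` in a CSV file with a header row."""
    with open(csv_path, newline="") as f:
        return Counter(row[column] for row in csv.DictReader(f))


def compare_counts(path_a, path_b, column="SA3"):
    """Return {value: (count_a, count_b)} for every value whose counts differ."""
    a = count_by_column(path_a, column)
    b = count_by_column(path_b, column)
    return {
        key: (a.get(key, 0), b.get(key, 0))
        for key in sorted(set(a) | set(b))
        if a.get(key, 0) != b.get(key, 0)
    }
```

Calling `compare_counts("github_issue_mp=1/output/synthetic_persons.csv", "github_issue_mp=2/output/synthetic_persons.csv")` lists each SA3 whose person count differs between the two runs, including an SA3 that is missing entirely from one of them (reported with a count of 0).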

bettinardi · 2022-02-02T14:41:27Z

I'm hoping @goreaditya can weigh in. I have reviewed overall results at a higher level than this discussion and do not have anything immediate to contribute to this issue. I am thankful that @jamiecook is flagging this and hope that we can find the issue(s), if they exist, and end up with a cleaner product if there is a bug here.

jamiecook · 2022-02-09T22:42:35Z

Any update on this issue? At the moment I'm pushing ahead by wrapping my own multiprocess Pool around multiple calls to activitysim.cli.run - but that seems less than ideal in the long run.
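
A rough sketch of that kind of workaround, for readers in the same position. This is not the code used in this thread: instead of a multiprocessing Pool around activitysim.cli.run, it uses a thread pool that launches one separate single-process run per directory via subprocess, which keeps the runs fully isolated from each other. It assumes a hypothetical layout where each SA3 (or group of SA3s) has its own run directory containing a run.py with multiprocessing disabled; the function names and the default command are illustrative only.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(run_dir, command):
    """Run one single-process job inside run_dir; return (run_dir, exit code)."""
    completed = subprocess.run(command, cwd=run_dir)
    return run_dir, completed.returncode


def run_all(run_dirs, command=(sys.executable, "run.py"), max_workers=2):
    """Launch one independent job per directory, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda d: run_one(d, list(command)), run_dirs))
    # Fail loudly if any run exited with a non-zero status.
    failed = [d for d, code in results if code != 0]
    if failed:
        raise RuntimeError(f"runs failed in: {failed}")
    return results
```

The per-directory outputs still have to be concatenated afterwards, and household/person ids may need to be made unique across runs; neither step is shown here.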

janzill · 2022-10-05T00:38:33Z

I am using PopulationSim on a different project, but with the same geographies (household travel survey at Region level, controls at SA3 and SA1, where SA1 is the smallest level). If you do the multiprocessing like the test example, i.e. only parallelise over the lowest level (TAZ there, here SA1), then the results look correct for me.

In terms of mp settings, the last 20 lines of the yaml Jamie attached would then read

slice_geography: SA3
multiprocess: True
multiprocess_steps:
  - name: mp_seed_balancing
    begin: input_pre_processor
  - name: mp_sub_balancing_SA1
    begin: sub_balancing.geography=SA1
    num_processes: 2
    slice:
      tables:
        - slice_crosswalk
        - crosswalk
      # don't slice any tables not explicitly listed above in slice.tables
      except: True
      # the following tables are added by sub_balancer and should be coalesced
      coalesce:
        - SA1_weights
        - SA1_weights_sparse
        - trace_SA1_weights
  - name: mp_summarize
    begin: expand_households
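
To spot the difference between the two configurations programmatically, a small check along these lines could help. It is only a heuristic drawn from this thread (slicing at the lowest geography looked correct, slicing at SA3 did not), not a documented PopulationSim rule, and the function name is hypothetical. It takes the settings as a plain dict, for example as loaded from the YAML file.

```python
def sliced_steps_above_lowest(settings):
    """Return names of sliced sub_balancing steps that do not start at the lowest geography.

    A step counts as sliced if it has a `slice` entry. The lowest geography is
    taken to be the last item of `geographies`.
    """
    lowest = settings["geographies"][-1]
    prefix = "sub_balancing.geography="
    flagged = []
    for step in settings.get("multiprocess_steps", []):
        begin = step.get("begin", "")
        if "slice" in step and begin.startswith(prefix):
            if begin[len(prefix):] != lowest:
                flagged.append(step["name"])
    return flagged
```

With the settings from the original post this returns `["mp_sub_balancing_SA3"]`; with the settings above it returns an empty list.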

Also, @jamiecook is no longer working on this project. Matt, do you have any further updates on this (sorry for the link; I cannot tag m-richards but sent him a message)?

m-richards · 2022-10-05T02:28:43Z

@janzill Thanks for checking. For context, I have picked up the work Jamie was doing using populationsim to produce the above outputs. The code is now at a point where I haven't been able to replicate the problem documented in this issue.

I'm seeing reasonable, comparable results using both a manual multiprocessing pool and multiprocessing at the SA1 (smallest geography) level.

jamiecook · 2022-10-06T06:43:54Z

So ... was this a Jamie problem all along? Or was anyone else actually able to reproduce the example that I uploaded?

bettinardi · 2023-10-06T16:20:40Z

Could we have this reviewed and finalized under Phase 9: either closed as not an issue, resolved if there is an issue, or, if the bug is large, with a clear issue established on what it will take to fix?

bettinardi added this to the Phase 9 Priorities milestone Oct 6, 2023

bettinardi added the bug label Oct 6, 2023

xiex0055 mentioned this issue Nov 9, 2023

Randomness of PopulationSim outputs related to API calls #182

Open
