
Add EPMC residual allocation and make allocation method explicit #384

Open
jpvelez wants to merge 20 commits into main from add-epmc-residual-allocation


jpvelez commented Mar 26, 2026

Summary

  • Adds equi-proportional marginal cost (EPMC) residual allocation to CAIRO via monkey-patching
  • Disables broken peak residual allocation
  • Makes build_master_bat.py detect available BAT metrics at runtime
  • Separates delivery and supply allocation methods in the subclass RR YAML — each run picks its delivery and supply allocation independently
  • Fixes a subtle bug where EPMC supply RR was inflated by $16M due to mismatched EB weights between delivery-only and delivery+supply runs
  • Adds heating_type_v2 column (heat_pump, electrical_resistance, natgas, delivered_fuels, other) with informational breakdown in the RR YAML
  • create_scenario_yamls.py reads run_includes_subclasses, residual_allocation_delivery, and residual_allocation_supply from the Google Sheet
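
The EPMC rule itself is stated in the first commit message of this PR as R_i = R * EB_i / sum(EB_j * w_j). A minimal sketch of that formula in plain Python follows. It is not the monkey-patch in utils/mid/patches.py; the function name `epmc_residual_allocation` and the toy inputs are hypothetical.

```python
def epmc_residual_allocation(residual, burdens, weights):
    """Allocate `residual` in proportion to each customer's economic burden.

    Implements R_i = R * EB_i / sum(EB_j * w_j), where w_j is the sample
    weight of customer j. Hypothetical helper, not CAIRO's API.
    """
    if len(burdens) != len(weights):
        raise ValueError("burdens and weights must have the same length")
    weighted_total = sum(eb * w for eb, w in zip(burdens, weights))
    if weighted_total == 0:
        raise ValueError("weighted economic burden sums to zero")
    return [residual * eb / weighted_total for eb in burdens]


# Toy inputs (hypothetical): two sampled customers, the first with weight 2.
toy_weights = [2.0, 1.0]
allocations = epmc_residual_allocation(100.0, burdens=[10.0, 30.0], weights=toy_weights)

# The weighted allocations should add back up to the residual being allocated.
recovered = sum(r * w for r, w in zip(allocations, toy_weights))
print(allocations, recovered)
```

Dividing by the weighted sum is what makes the weighted per-customer allocations recover the full residual.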

The supply subtraction bug (fixed in this PR)

When EPMC was used for both delivery and supply, the supply RR was derived by subtraction: total_RR(run 2 EPMC) - delivery_RR(run 1 EPMC). But EPMC weights each subclass by its share of economic burden (EB), and that share differs between runs: HP is 2.67% of delivery EB vs 4.83% of combined EB. The subtraction therefore produced a phantom $15.9M supply residual for HP instead of the correct ~$6.4M. The fix: separate delivery and supply into independent YAML blocks, each with its own allocation methods.
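
A toy illustration of why the subtraction misbehaves under EPMC, using made-up numbers rather than the RI figures. The EB values, residuals, and the helper `epmc_share` are all hypothetical. The "direct" supply allocation is only one way to define a reference value and is not necessarily how the ~$6.4M figure was computed.

```python
def epmc_share(residual, group_eb, total_eb):
    """Residual assigned to a group in proportion to its share of EB."""
    return residual * group_eb / total_eb


# Hypothetical EB by subclass. HP holds 10% of delivery EB but 30% of supply EB.
delivery_eb = {"hp": 10.0, "non-hp": 90.0}
supply_eb = {"hp": 30.0, "non-hp": 70.0}
delivery_residual = 1000.0
supply_residual = 500.0

# The delivery+supply run sees the combined EB, so the HP share becomes 20%.
combined_eb = {k: delivery_eb[k] + supply_eb[k] for k in delivery_eb}

hp_delivery = epmc_share(delivery_residual, delivery_eb["hp"], sum(delivery_eb.values()))
hp_total = epmc_share(
    delivery_residual + supply_residual, combined_eb["hp"], sum(combined_eb.values())
)

# Old approach: supply = total run minus delivery-only run.
hp_supply_by_subtraction = hp_total - hp_delivery

# Reference: allocate the supply residual using supply EB shares only.
hp_supply_direct = epmc_share(supply_residual, supply_eb["hp"], sum(supply_eb.values()))

print(f"subtraction={hp_supply_by_subtraction:.0f}, direct={hp_supply_direct:.0f}")
```

With these toy inputs the subtraction yields 200 while the direct allocation yields 150. The two agree only when the HP share of EB is the same in both runs, which is the condition that fails here.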

New YAML structure

subclass_revenue_requirements:
  delivery:
    passthrough: {hp: ..., non-hp: ...}    # no cross-subsidy correction
    percustomer: {hp: ..., non-hp: ...}    # corrects MC + residual
    epmc: {hp: ..., non-hp: ...}           # allocates residual by MC share
    volumetric: {hp: ..., non-hp: ...}     # allocates residual by kWh
  supply:
    passthrough: {hp: ..., non-hp: ...}    # same supply rate as default
    percustomer: {hp: ..., non-hp: ...}    # corrects supply MC + residual
    volumetric: {hp: ..., non-hp: ...}     # corrects supply MC, not residual

Each scenario run picks independently:

residual_allocation_delivery: epmc        # or percustomer
residual_allocation_supply: passthrough   # or percustomer, volumetric
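
As a rough sketch of how a run might resolve these two fields against the nested YAML, assuming both files have already been loaded into dicts: the function name `pick_subclass_rr` and the placeholder dollar values are hypothetical, and the real parser in this PR may differ. The explicit `ValueError` mirrors the "no defaults" rule described in the commit messages.

```python
def pick_subclass_rr(rr_config, scenario):
    """Return the delivery and supply RR blocks selected by a scenario.

    Hypothetical helper: raises ValueError when a field is missing or when
    the requested method is absent from the RR config. No defaults.
    """
    blocks = rr_config["subclass_revenue_requirements"]
    picked = {}
    for component in ("delivery", "supply"):
        field = f"residual_allocation_{component}"
        method = scenario.get(field)
        if not method:
            raise ValueError(f"{field} is required for subclass runs")
        available = blocks.get(component, {})
        if method not in available:
            raise ValueError(
                f"{field}={method!r} not found; available: {sorted(available)}"
            )
        picked[component] = available[method]
    return picked


# Placeholder values only; these are not the RI revenue requirements.
rr_config = {
    "subclass_revenue_requirements": {
        "delivery": {"epmc": {"hp": 1.0, "non-hp": 9.0}},
        "supply": {"passthrough": {"hp": 2.0, "non-hp": 8.0}},
    }
}
scenario = {
    "residual_allocation_delivery": "epmc",
    "residual_allocation_supply": "passthrough",
}
selected = pick_subclass_rr(rr_config, scenario)
print(selected)
```

Because supply has no `epmc` block, a scenario that requests `residual_allocation_supply: epmc` fails loudly instead of silently falling back to another method.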

Verification steps after running RI batch

Prerequisites

  1. Update Google Sheet (Runs & Charts):

    • Replace residual_allocation column with residual_allocation_delivery and residual_allocation_supply
    • RI runs 5-6: delivery=percustomer, supply=passthrough
    • RI runs 9-10, 13-14: delivery=percustomer, supply=percustomer
    • RI runs 17-18: delivery=epmc, supply=passthrough
    • All other runs: leave both blank
  2. On server: git pull, then just s ri compute-rev-requirements

  3. Verify YAML structure:

    cat ri/config/rev_requirement/rie_hp_vs_nonhp.yaml
    • Should have delivery: and supply: blocks
    • delivery.epmc.hp should be ~$14.5M
    • supply.passthrough.hp should be ~$21.1M (NOT $31.6M)
  4. Delete old run 17-20 S3 outputs, re-run 17-20, rebuild master bills+bat

Verification checks

import polars as pl
from data.eia.hourly_loads.eia_region_config import get_aws_storage_options
opts = get_aws_storage_options()

S3_BASE = "s3://data.sb/switchbox/cairo/outputs/hp_rates/ri/all_utilities"
OLD = "ri_20260324_r1-20_fixedcharges_hpflat"
NEW = "ri_20260326_r1-20_epmc"
JOIN_KEYS_BAT = ["bldg_id", "sb.electric_utility"]
JOIN_KEYS_BILLS = ["bldg_id", "sb.electric_utility", "month"]

# CHECK 1: BAT_epmc present, BAT_peak absent, LMI cols in bills (all pairs)
for pair in ["1+2", "3+4", "5+6", "7+8", "9+10", "11+12", "13+14", "15+16", "17+18", "19+20"]:
    bat = pl.read_parquet(f"{S3_BASE}/{NEW}/run_{pair}/cross_subsidization_BAT_values/", storage_options=opts)
    bills = pl.read_parquet(f"{S3_BASE}/{NEW}/run_{pair}/comb_bills_year_target/", storage_options=opts)
    assert "BAT_epmc_delivery" in bat.columns
    assert "BAT_peak_delivery" not in bat.columns
    assert "elec_total_bill_lmi_32" in bills.columns

# CHECK 2: Runs 1-8 BAT unchanged vs old batch
for pair in ["1+2", "3+4", "5+6", "7+8"]:
    old = pl.read_parquet(f"{S3_BASE}/{OLD}/run_{pair}/cross_subsidization_BAT_values/", storage_options=opts)
    new = pl.read_parquet(f"{S3_BASE}/{NEW}/run_{pair}/cross_subsidization_BAT_values/", storage_options=opts)
    shared = [c for c in old.columns if c in new.columns and c not in JOIN_KEYS_BAT and old[c].dtype.is_numeric()]
    diff = old.join(new, on=JOIN_KEYS_BAT, suffix="_new")
    for col in shared:
        assert (diff[col] - diff[f"{col}_new"]).abs().max() < 1e-6, f"run_{pair} {col} changed"

# CHECK 3: Runs 17+18 changed — electric bills differ, gas unchanged
old_b17 = pl.read_parquet(f"{S3_BASE}/{OLD}/run_17+18/comb_bills_year_target/", storage_options=opts)
new_b17 = pl.read_parquet(f"{S3_BASE}/{NEW}/run_17+18/comb_bills_year_target/", storage_options=opts)
diff_b17 = old_b17.join(new_b17, on=JOIN_KEYS_BILLS, suffix="_new")
assert (diff_b17["elec_delivery_bill"] - diff_b17["elec_delivery_bill_new"]).abs().max() > 1.0  # changed
assert (diff_b17["gas_total_bill"] - diff_b17["gas_total_bill_new"]).abs().max() < 1e-6  # unchanged

# CHECK 4: Supply portion is now neutral — HP supply bill under HP flat ≈ HP supply bill under default
bat_12 = pl.read_parquet(f"{S3_BASE}/{NEW}/run_1+2/cross_subsidization_BAT_values/", storage_options=opts)
bat_1718 = pl.read_parquet(f"{S3_BASE}/{NEW}/run_17+18/cross_subsidization_BAT_values/", storage_options=opts)
# HP supply bills should be very similar between default and HP flat
hp_12 = bat_12.filter(pl.col("postprocess_group.has_hp"))
hp_1718 = bat_1718.filter(pl.col("postprocess_group.has_hp"))
default_supply = (hp_12["annual_bill_supply"] * hp_12["weight"]).sum()
hpflat_supply = (hp_1718["annual_bill_supply"] * hp_1718["weight"]).sum()
print(f"HP supply bills: default=${default_supply:,.0f}, HP flat=${hpflat_supply:,.0f}")
# These should be close (supply is pass-through)
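
CHECK 4 only prints the two totals. If an automated pass/fail is wanted, one option is a relative-tolerance comparison like the sketch below. The helper name `supply_is_neutral` and the 1% tolerance are assumptions, not values taken from this PR.

```python
import math


def supply_is_neutral(default_supply, hpflat_supply, rel_tol=0.01):
    """True when the two weighted supply totals agree within rel_tol.

    The 1% default is a hypothetical threshold; pick one that matches how
    close pass-through supply is expected to be in practice.
    """
    return math.isclose(default_supply, hpflat_supply, rel_tol=rel_tol)


# Example with round placeholder totals (not real batch outputs).
print(supply_is_neutral(100.0, 100.5))
print(supply_is_neutral(100.0, 150.0))
```

In the verification script this could be used as `assert supply_is_neutral(default_supply, hpflat_supply)` after the print statement.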

Expected bill delta chart behavior

The HP flat rate bar should be mostly green (HP flat delivery is a big discount from default). It should be slightly less green than the old per-customer batch because EPMC delivery RR is ~$3M higher. Supply should be neutral (same rate as default).

jpvelez added 20 commits March 26, 2026 16:45
- Add equi-proportional marginal cost (EPMC) residual allocation to CAIRO
  via monkey-patches in utils/mid/patches.py. EPMC allocates residual costs
  in proportion to each customer's economic burden (R_i = R * EB_i / sum(EB_j * w_j)),
  equivalent to scaling all MC-based rates by K = TRR / MC_Revenue.

- Disable broken peak residual allocation to free up compute.

- Make build_master_bat.py detect available BAT metrics at runtime instead of
  hardcoding — gracefully handles missing metrics (logs and skips).

- Refactor compute_subclass_rr to compute both per-customer and EPMC subclass
  RRs in a single pass. Revenue requirement YAML now uses nested format keyed
  by allocation method (percustomer, epmc).

- Add required residual_allocation field to scenario YAML for subclass runs.
  Parser raises ValueError if field is missing or allocation method not found
  in the YAML. No defaults — every subclass run must be explicit.

- Wire RI runs 17-18 to use EPMC allocation; runs 5-14 explicitly tagged
  as percustomer. NY runs 5-14 also tagged percustomer.

- Migrate all *_hp_vs_nonhp.yaml files (RI + NY) to nested format.

Made-with: Cursor
- create_scenario_yamls.py: read run_includes_subclasses directly from the
  Google Sheet column instead of re-deriving from tariff key count. Add
  residual_allocation column handling (optional-if-non-empty pattern).

- Tests updated: fixture adds run_includes_subclasses column, derivation
  test replaced with sheet-read test, new test for residual_allocation.

Made-with: Cursor
Makes the disabled peak allocator visible in the code for future reference.

Made-with: Cursor
…epmc

The Justfile recipe and its callers explicitly passed "BAT_percustomer",
overriding the Python function's default. This caused the RR YAML to only
contain the percustomer block, making run 17 crash with "epmc not found".

Made-with: Cursor
The Python CLI default (BAT_percustomer,BAT_epmc) always computes both
allocation methods. No reason for the Justfile layer to pass or override it.

Made-with: Cursor
- identify_heating_type.py: add postprocess_group.heating_type_v2 with
  five categories (heat_pump, electrical_resistance, natgas,
  delivered_fuels, other) alongside the existing heating_type column.

- run_scenario.py: include heating_type_v2 in metadata columns so it
  flows through to CAIRO's customer_metadata.csv.

- compute_subclass_rr: compute an informational 5-way breakdown by
  heating_type_v2 and write it under heating_type_breakdown in the
  RR YAML. Gracefully skips if the column doesn't exist in metadata.

Made-with: Cursor
Runs build-master-bills-with-lmi + build-master-bat for all 10 run pairs
(1+2 through 19+20) in one command.

Usage: just s ri build-all-master <batch>
Made-with: Cursor
Runs build-master-bills-with-lmi + build-master-bat for all 8 NY run pairs
(1+2 through 15+16) in one command.

Usage: just s ny build-all-master <batch>
Made-with: Cursor
The run 1/run 2 subtraction architecture produces incorrect supply RRs
when EPMC is used, because EB weights differ between runs (HP = 2.67% in
delivery-only run vs 4.83% in delivery+supply). This created a phantom
$15.9M "supply EPMC residual" that inflated HP total RR by $16M.

Fix: restructure the YAML into independent delivery and supply blocks.
Each run picks one delivery method and one supply method independently.

Delivery methods: passthrough, percustomer, epmc, volumetric
Supply methods: passthrough, percustomer, volumetric
(Supply EPMC omitted — broken by subtraction; volumetric gives same answer)

Supply passthrough = actual supply bills (no BAT adjustment) = same
supply rate as default. This is correct for delivery-only rate designs
(seasonal, flat HP) where supply shouldn't change.

Supply percustomer/volumetric = BAT-adjusted supply (clean subtraction,
weights are constant). Correct for TOU runs that address supply cross-sub.

Made-with: Cursor