
Auto-validate state transitions; make log_level the runtime-validation policy #360

Open
hmgaudecker wants to merge 14 commits into refactor/phase-1-validation-cleanup from feat/phase-1b-auto-state-transition-validation

@hmgaudecker hmgaudecker commented May 18, 2026

Stacked on top of #359. Merge that one first.

Summary

pylcm ran almost all validation checks internally. The exception was state transition probabilities: users had to call the exposed validate_transition_probs themselves, even though regime transition probabilities already had an internal mechanism. In addition, which validation ran depended on an obscure, ad-hoc mix of settings.

This PR unifies validation and control over it by adding runtime-validation of state transition probabilities and controlling all costly validation via log_level (off < warning < progress < debug) and log_path:

| log_level | log_path | Runtime validation | Console output | Snapshots to disk |
|---|---|---|---|---|
| "off" | (ignored) | not run | silent | none |
| "warning" | None | runs → failures warn | warnings | none |
| "warning" | set | runs → failures warn | warnings | one per warned failure, capped at log_keep_n_latest |
| "progress" | None | runs → failures warn | warnings + timing | none |
| "progress" | set | runs → failures warn | warnings + timing | one per warned failure, capped at log_keep_n_latest |
| "debug" | None | runs → failures raise | warnings + timing + V_arr stats | none |
| "debug" | set | runs → failures raise | warnings + timing + V_arr stats | one per solve and on raise, capped at log_keep_n_latest |

validate_transition_probs is removed from the public API.

log_level governs only the costly runtime numerical validation — the transition-probability sweeps, the initial-conditions check, and the NaN/Inf check on the value-function arrays. The cheap construction-time sanity checks (regime/model structure, grid definitions, function signatures, the probs_array subscript-order check) always run when a Regime or Model is built, regardless of log_level.

log_level is a required argument

solve() and simulate() no longer default log_level — the caller picks it deliberately. Start every project at "debug" (fail early, gather full diagnostics) and ease to "warning" / "off" only once the model is trusted and the run needs the speed or the non-raising behaviour. A loose default would hide that "debug" exists; a "debug" default would make pylcm look slow.

simulate()'s check_initial_conditions flag is removed too: initial-conditions validation now follows the same log_level policy as every other runtime check.

Runtime validation

log_level is the single knob. The logger it produces carries the policy — there is no separate validation_mode value threaded around the engine. It governs four checks, run before backward induction:

  • State transition probabilities — every MarkovTransition state transition is swept over the regime's grid; outcome-axis size, [0, 1] range, and sum-to-1 are checked.
  • Regime transition probabilities — finiteness, [0, 1] range, sum-to-1, no probability mass to inactive regimes, and no mass to targets with incomplete stochastic transitions.
  • Initial conditions (simulate() only) — states on-grid, regime IDs valid, at least one feasible action combination per subject.
  • Value function — NaN/Inf check on each period's V_arr, with the offending (regime, period) localised.

"off" skips all four; "warning" / "progress" log a warning and let the run continue (the returned solution may carry NaN); "debug" raises on the first failure.
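The range and sum-to-1 checks, together with the warn-versus-raise policy, can be illustrated with a minimal pure-Python sketch. `check_prob_rows` and its `raises` flag are hypothetical names chosen for this illustration, not pylcm's API, and plain lists stand in for the arrays the real validators operate on.

```python
import math
import warnings


def check_prob_rows(rows, n_outcomes, *, raises, atol=1e-8):
    """Return a list of problems found in a list of probability rows.

    Hypothetical helper: each row stands in for the last axis of a
    transition-probability array; raises=True mimics log_level="debug".
    """
    problems = []
    for i, row in enumerate(rows):
        if len(row) != n_outcomes:
            problems.append(f"row {i}: expected {n_outcomes} outcomes, got {len(row)}")
        elif any(not math.isfinite(p) or p < 0.0 or p > 1.0 for p in row):
            problems.append(f"row {i}: entries outside [0, 1]")
        elif not math.isclose(sum(row), 1.0, abs_tol=atol):
            problems.append(f"row {i}: probabilities sum to {sum(row)}, not 1")
    if problems:
        message = "; ".join(problems)
        if raises:
            # "debug": stop at the first failing validation.
            raise ValueError(message)
        # "warning" / "progress": report and let the run continue.
        warnings.warn(message, stacklevel=2)
    return problems
```

With `raises=False` the caller gets the problem list back and a warning is emitted; with `raises=True` the same input aborts with an exception.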

Test plan

  • pixi run -e tests-cpu tests
  • pixi run ty
  • prek run --all-files

🤖 Generated with Claude Code

Validate state transition probability functions automatically — both
statically at process time and numerically at solve time — so users no
longer need to call `lcm.validate_transition_probs` manually for state
transitions. Plan: `Phase 1b — Automatic State Transition Validation.md`.

What runs when:
- **Process time** (during `process_regimes`, always on, cheap):
  AST subscript-order check on every `MarkovTransition.func` —
  permissive: skipped when the function doesn't use the
  `probs_array[...]` pattern. Outcome-axis size is derived from the
  state's `DiscreteGrid` and cached on the canonical `Regime` via the
  new `stochastic_state_transitions` field. For per-target dicts, the
  target regime's grid wins (cross-grid state spaces).
- **Solve / simulate time** (gated by `log_level != "off"`):
  new `validate_state_transitions_all_periods` evaluates each
  `MarkovTransition` function on the Cartesian product of the
  function's accepted grid args (via vmap) and checks outcome-axis
  size, [0, 1] range, and sum-to-1 along the last axis. Raises a new
  `InvalidStateTransitionProbabilitiesError` on failure.

Fast-exits when no regime has any `MarkovTransition` state transition.
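As a rough illustration of why the solve-time sweep covers the full Cartesian product of grid arguments, the sketch below evaluates a transition function at every grid combination and collects the failing points. It uses a plain Python loop instead of vmap, and `sweep_transition`, `health_transition`, and the `grids` layout are hypothetical stand-ins, not pylcm code.

```python
import itertools
import math


def sweep_transition(func, grids, atol=1e-8):
    """Evaluate `func` on every grid combination; return the failing points.

    `grids` maps argument names to lists of grid values (assumed layout).
    """
    names = list(grids)
    failures = []
    for values in itertools.product(*(grids[name] for name in names)):
        point = dict(zip(names, values))
        probs = func(**point)
        # NaN fails the range comparison, so it is reported as well.
        in_range = all(0.0 <= p <= 1.0 for p in probs)
        sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=atol)
        if not (in_range and sums_to_one):
            failures.append(point)
    return failures


def health_transition(health, wealth):
    # Valid for low wealth, but the row sums to 1.1 once wealth exceeds 5.
    p_good = 0.6 if health == 1 else 0.3
    bonus = 0.1 if wealth > 5 else 0.0
    return [1.0 - p_good, p_good + bonus]


failures = sweep_transition(
    health_transition,
    {"health": [0, 1], "wealth": [0.0, 5.0, 10.0]},
)
```

Here only the two points with `wealth == 10.0` fail, which a spot check at a single grid point could easily miss.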

The slimmed `lcm.validate_transition_probs` (Phase 1) is deprecated
with a `DeprecationWarning` pointing at the automatic validator. It
will be removed in a subsequent phase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot commented May 18, 2026

Benchmark comparison (main → HEAD)

Comparing 629ac442 (main) → d8131c87 (HEAD)

| Benchmark | Statistic | before | after | Ratio | Alert |
|---|---|---|---|---|---|
| aca-baseline | execution time | 26.281 s | 15.222 s | 0.58 | |
| | peak GPU mem | 639 MB | 4.14 GB | 6.48 | |
| | compilation time | 297.53 s | 288.40 s | 0.97 | |
| | peak CPU mem | 7.40 GB | 7.26 GB | 0.98 | |
| Mahler-Yum | execution time | 4.679 s | 4.312 s | 0.92 | |
| | peak GPU mem | 529 MB | 529 MB | 1.00 | |
| | compilation time | 14.15 s | 12.81 s | 0.91 | |
| | peak CPU mem | 1.71 GB | 1.69 GB | 0.99 | |
| Precautionary Savings - Solve | execution time | 46.3 ms | 25.7 ms | 0.55 | |
| | peak GPU mem | 101 MB | 101 MB | 1.00 | |
| | compilation time | 2.66 s | 2.20 s | 0.83 | |
| | peak CPU mem | 1.14 GB | 1.12 GB | 0.98 | |
| Precautionary Savings - Simulate | execution time | 120.2 ms | 95.9 ms | 0.80 | |
| | peak GPU mem | 349 MB | 349 MB | 1.00 | |
| | compilation time | 5.01 s | 5.08 s | 1.01 | |
| | peak CPU mem | 1.34 GB | 1.32 GB | 0.99 | |
| Precautionary Savings - Solve & Simulate | execution time | 159.7 ms | 129.3 ms | 0.81 | |
| | peak GPU mem | 586 MB | 586 MB | 1.00 | |
| | compilation time | 7.22 s | 6.71 s | 0.93 | |
| | peak CPU mem | 1.30 GB | 1.28 GB | 0.99 | |
| Precautionary Savings - Solve & Simulate (irreg) | execution time | 288.0 ms | 275.6 ms | 0.96 | |
| | peak GPU mem | 2.20 GB | 2.20 GB | 1.00 | |
| | compilation time | 7.53 s | 6.94 s | 0.92 | |
| | peak CPU mem | 1.36 GB | 1.34 GB | 0.99 | |

hmgaudecker changed the title from "Phase 1b: auto-validate stochastic state transition probabilities" to "Auto-validate stochastic state transition probabilities" on May 19, 2026
hmgaudecker and others added 3 commits May 19, 2026 14:46
…checks.py

phase-1 split runtime_checks.py into solution/validate_V.py and
lcm/_transition_checks.py. phase-1b added a third runtime family (state
transition probability validation) to the old runtime_checks.py. After
this merge, that family lives in lcm/_transition_checks.py beside the
regime-prob family.

Resolution details:
- runtime_checks.py: take the phase-1 deletion; state-prob functions land
  in _transition_checks.py alongside the regime-prob family.
- model.py: import both validate_regime_transitions_all_periods and
  validate_state_transitions_all_periods from lcm._transition_checks.
- Docstrings in exceptions.py, interfaces.py, regime_building/static_checks.py,
  and user_regime.py updated to reference lcm/_transition_checks.py.
- Test file renamed: tests/regime_building/test_state_transition_validation.py
  → tests/test_transition_checks.py (the source it covers is no longer in
  regime_building/).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-auto-state-transition-validation

# Conflicts:
#	tests/test_regime.py
Resolves the PR review's main finding: `log_level="off"` silently
disabled state-transition validation while regime-transition validation
still ran unconditionally — an asymmetric footgun.

`log_level` now governs all runtime validation uniformly:

- `"off"` — validation does not run.
- `"warning"` / `"progress"` — validation runs; failures are logged as
  warnings and the run continues.
- `"debug"` — validation runs and raises on the first failure.

The default `log_level` moves from `"progress"` to `"debug"`, so the
default `solve()` / `simulate()` validates and raises (secure default).
The mode applies to state-transition checks, regime-transition checks
(`validate_regime_transitions_all_periods` — previously unconditional),
and the `validate_V` NaN check.

`debug` no longer requires `log_path`: the `_validate_log_args` rule is
removed (it would make the new default unusable without a path).
`log_path` is optional everywhere; snapshots are written only when set.

Warn-mode disk safety: in warn mode an invalid model keeps running, so a
diagnostic snapshot is written on each warned NaN failure (when log_path
is set), retention-capped at `log_keep_n_latest`. `_enforce_retention`
now orders snapshot directories by parsed integer counter rather than
lexically, so retention stays correct past 999 iterations.
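The difference between lexical and integer ordering can be seen in a small sketch. The `snapshot_<n>` naming scheme and both helper functions are assumptions made for this illustration; the actual directory layout used by `_enforce_retention` is not shown in this PR description.

```python
import re


def snapshot_counter(name):
    """Parse the trailing integer of a snapshot directory name."""
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"no counter in {name!r}")
    return int(match.group(1))


def to_delete(names, keep_n_latest):
    """Return the directory names that fall outside the retention cap."""
    ordered = sorted(names, key=snapshot_counter)
    return ordered[:-keep_n_latest] if keep_n_latest > 0 else ordered


names = ["snapshot_998", "snapshot_999", "snapshot_1000", "snapshot_1001"]

by_counter = to_delete(names, keep_n_latest=2)
# A plain string sort puts "snapshot_1000" before "snapshot_998",
# so it would discard the two newest snapshots instead.
by_string = sorted(names)[:-2]
```

Sorting by the parsed counter deletes the two oldest directories, while the lexical order picks the two newest ones for deletion once the counter gains a digit.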

Review fixes:

- `_check_subscript_order` runs after the `DiscreteGrid` guard, so a
  continuous-state `MarkovTransition` no longer gets a spurious
  process-time raise.
- `_find_state_grid` returns `None` for a per-target dict whose target
  lacks the state, rather than sizing `n_outcomes` off the source grid.
- `_validate_state_transition_single` warns instead of silently
  skipping a transition with an unrecognized parameter.
- Docstrings drop "now" history wording, the rST `.. deprecated::`
  directive, and hard-coded internal module paths.

Tests: a hidden-invalidity test (valid at some grid points, invalid at
others, swept via the continuous grid), warn/raise-per-level coverage,
and a parametrized check pinning the `log_level` x `log_path` snapshot
table. Docs updated with the full behaviour table.
hmgaudecker changed the title from "Auto-validate stochastic state transition probabilities" to "Auto-validate state transitions; make log_level the runtime-validation policy" on May 20, 2026
State and regime transition probabilities are validated automatically
during solve()/simulate(), gated by log_level. The standalone
validate_transition_probs entry point and its helpers are redundant, so
drop them along with their tests and doc references.

Also trim a redundant diagnostics_enabled guard in solve_brute.py: the
"raise" validation mode already implies diagnostics are enabled.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

hmgaudecker and others added 2 commits May 20, 2026 11:50
The per-period NaN/Inf tracking in solve() exists to feed runtime
validation. Gating it on logger.isEnabledFor(WARNING) duplicated the
log-level partition that validation_mode already encodes. Derive the
gate from validation_mode != "off" so its source matches its purpose;
behaviour is unchanged.
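For readers unfamiliar with what the per-period tracking feeds into, here is a minimal sketch of localising non-finite values to a (regime, period) pair. The nested-dict layout of `V_by_period` and the function name are hypothetical; pylcm's own data structures are not described here.

```python
import math


def find_non_finite(V_by_period):
    """Return (regime, period) pairs whose value array holds NaN or Inf.

    `V_by_period` uses an assumed layout: {period: {regime: flat list}}.
    """
    offending = []
    # Walk periods from last to first, as backward induction would.
    for period in sorted(V_by_period, reverse=True):
        for regime, values in V_by_period[period].items():
            if not all(math.isfinite(v) for v in values):
                offending.append((regime, period))
    return offending


V = {
    0: {"working": [1.0, 2.0]},
    1: {"working": [float("nan"), 0.0], "retired": [0.5]},
}
bad = find_non_finite(V)
```

In this toy input only the `"working"` regime in period 1 is reported.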

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Re-array notebook cell sources in stochastic_transitions.ipynb so each
  line is its own JSON element (one-string sources produce noisy diffs).
- Drop the stale per-level table from debugging.md; it duplicated and
  had drifted from the canonical log_level x log_path table in
  solving_and_simulating.md, which debugging.md already links to.
- Trim "per-period timing" to "timing" in the behaviour table.
- Document the notebook cell-formatting check in AGENTS.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
hmgaudecker and others added 7 commits May 20, 2026 13:52
The file tests the user-facing `Regime`; `user_regime` disambiguates it
from the canonical `Regime` and the `regime_building` / regime-template
modules.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`derive_stochastic_state_transitions` becomes
`collect_stochastic_state_transitions`, mirroring `collect_state_transitions`
(the structurally identical walk over `state_transitions`). Both collectors
now live in `regime_building/transitions.py`; `static_checks.py` is removed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The validation policy (off / warn / raise) was threaded as a separate
`validation_mode` argument alongside `logger` through solve(), both
transition validators, and _solve_compiled — carrying the same
information twice, since both derive from `log_level`.

The logger is now the single source of truth. Two named predicates,
`validation_enabled()` and `validation_raises()`, read the policy off
the logger's level; `raise_or_warn()` drops its `mode` parameter.
`ValidationMode`, `_VALIDATION_MODE_MAP`, and `get_validation_mode` are
removed.
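One way to picture predicates that read the policy off the logger's level is the sketch below, built on the standard `logging` module. The mapping of `"progress"` to `INFO` and of `"off"` to a level above `CRITICAL` is an assumption for illustration, as are `LEVELS` and `make_logger`; only the predicate names come from the commit message, and their bodies here are guesses rather than pylcm's implementation.

```python
import logging

# Assumed mapping from log_level strings to stdlib logging levels.
LEVELS = {
    "off": logging.CRITICAL + 1,
    "warning": logging.WARNING,
    "progress": logging.INFO,
    "debug": logging.DEBUG,
}


def make_logger(log_level):
    """Build a logger whose level encodes the validation policy."""
    logger = logging.getLogger(f"sketch.{log_level}")
    logger.setLevel(LEVELS[log_level])
    return logger


def validation_enabled(logger):
    # Validation runs whenever warnings would be emitted.
    return logger.isEnabledFor(logging.WARNING)


def validation_raises(logger):
    # Only the most verbose level turns failures into exceptions.
    return logger.isEnabledFor(logging.DEBUG)


policy = {
    level: (validation_enabled(make_logger(level)), validation_raises(make_logger(level)))
    for level in LEVELS
}
```

Because both predicates take only the logger, no separate mode value needs to travel alongside it.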

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`log_level` no longer defaults to `"debug"`. Forcing the caller to pass
it makes the choice deliberate: start at `"debug"` (fail early, full
diagnostics) and ease to `"warning"` / `"off"` only once the model is
trusted and the run needs the speed or non-raising behaviour. A loose
default would hide that `"debug"` exists; a `"debug"` default would make
pylcm look slow.

Sweeps every solve/simulate call site in the test suite and docs to
pass `log_level` explicitly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drop the `check_initial_conditions` parameter of `simulate()`. Initial-
conditions validation now follows the same `log_level` policy as the
transition checks: `"off"` skips it, `"warning"` / `"progress"` warn,
`"debug"` raises. One knob governs all runtime validation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The benchmark suite still passed the removed `check_initial_conditions`
keyword. The calls already pass `log_level="off"`, which skips
initial-conditions validation in the usual way.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@hmgaudecker hmgaudecker requested review from mj023 and timmens and removed request for timmens May 21, 2026 19:00