
Replace models with composite calculator pipeline#34

Open
zubatyuk wants to merge 4 commits into NVIDIA:main from zubatyuk:models-composite-pipeline

Conversation


@zubatyuk zubatyuk commented Mar 31, 2026

ALCHEMI Toolkit Pull Request

Description

This PR replaces the previous monolithic nvalchemi.models wrapper architecture with a composable calculator API. The main motivation is to make the physical pipeline explicit instead of hiding it inside wrappers. The central model flow is now naturally expressed as a sequence of steps, like NL -> MLIP -> NL -> Coulomb -> AutoGrad, with direct physical terms such as DFT-D3 and DSF fitting into the same ordered step model. This improves both composability and correctness. In the previous monolithic implementation, compositions like (NL -> MLIP -> AutoGrad) -> Coulomb produced incorrect force semantics for charge-coupled Coulomb terms. The new pipeline implementation makes the derivative path explicit and easier to reason about.
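As an illustration of the ordered-step idea, a pipeline can be reduced to steps that each read accumulated results and contribute their own. This is toy code only: the names and dict-based plumbing below are invented for this sketch and are not the actual nvalchemi API.

```python
# Illustrative sketch of an ordered calculator pipeline (NL -> MLIP -> ...).
# Step names and the dict-based result passing are NOT the real nvalchemi API.
from typing import Callable

Step = Callable[[dict], dict]

def run_pipeline(steps: list[Step], inputs: dict) -> dict:
    """Run steps in order; each step reads prior results and adds its own."""
    results: dict = dict(inputs)
    for step in steps:
        results.update(step(results))
    return results

def neighbor_list(r: dict) -> dict:
    # Placeholder neighbor construction: pair each atom index with the next.
    n = r["num_atoms"]
    return {"pairs": [(i, i + 1) for i in range(n - 1)]}

def mlip(r: dict) -> dict:
    # Toy "MLIP" energy: one unit of energy per neighbor pair.
    return {"energy": float(len(r["pairs"]))}

out = run_pipeline([neighbor_list, mlip], {"num_atoms": 4})
print(out["energy"])  # 3.0
```

Because each step only sees the accumulated results dict, the ordering NL before MLIP is expressed directly in the list, which is the explicitness the PR is after.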

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Performance improvement
  • Documentation update
  • Refactoring (no functional changes)
  • CI/CD or infrastructure change

Related Issues

#22

Changes Made

  • Models: replace the old wrapper-oriented nvalchemi.models surface with CompositeCalculator, EnergyDerivativesStep, cards / profiles / metadata / results / registry, and concrete built-in calculator steps.
  • Models: make force and stress derivation explicit through EnergyDerivativesStep, keep DSF as the hybrid Coulomb term, and expose PME / Ewald direct vs autograd participation explicitly.
  • Models: make external neighbor requirements explicit through neighbor_requirement and expose neighbor_list_builder_config(...) as the user-facing bridge from contract to builder config.
  • Models: add registry-backed artifact resolution for known names such as aimnet2 and mace-mp-0b3-medium.
  • Dynamics: update dynamics to consume calculator-style models directly instead of BaseModelMixin, and replace the demo wrapper path with DemoPotential.
  • Docs and tests: replace the public docs entrypoints so reviewers see the new nvalchemi.models story in docs/userguide/models.md, docs/modules/models.rst, and docs/models/index.md, and replace the old wrapper-oriented test/models/* architecture surface with a smaller root-package test surface.

Testing

  • Unit tests pass locally (PYTHONDONTWRITEBYTECODE=1 .venv/bin/pytest -q test/models/test_composable.py test/models/test_mace.py test/models/test_neighbor_filter.py test/models/test_physical_models.py test/models/test_direct_potentials.py test/models/test_registry.py)
  • Linting passes (make lint)
  • New tests added for new functionality; coverage expectations met

Checklist

  • I have read and understand the Contributing Guidelines
  • I have updated the CHANGELOG.md
  • I have performed a self-review of my code
  • I have added docstrings to new functions/classes
  • I have updated the documentation (if applicable)

Additional Notes

The main change is the explicit pipeline model. Neighbor lists and cached tensors k_vectors / k_squared can already be reused in the new pipeline implementation when they are present on the incoming batch / inputs and the step is configured to reuse them. In the previous monolithic implementation, that state was stored and invalidated inside the wrapper. In the new pipeline implementation, roles are separated more cleanly: calculators consume and optionally reuse provided state, while hooks are the natural place to manage invalidation.

One important current limitation remains: stages exchange named values by key, but there is no enforced shape / dtype contract between stages yet.
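For illustration, such a contract could eventually look like a per-key spec validated at stage boundaries. This is purely hypothetical: `KeySpec` and `check_contract` do not exist in this PR, and the nested-list "tensor" stands in for real tensor metadata.

```python
# Hypothetical sketch of a per-key shape/dtype contract between stages.
# Nothing like this exists in the PR yet; names are invented.
from dataclasses import dataclass

@dataclass(frozen=True)
class KeySpec:
    dtype: type
    ndim: int  # expected number of dimensions of the nested-list "tensor"

def check_contract(results: dict, contract: dict[str, KeySpec]) -> None:
    for key, spec in contract.items():
        value = results[key]
        # Walk nested lists to count the dimensions of the toy "tensor".
        ndim, inner = 0, value
        while isinstance(inner, list):
            ndim, inner = ndim + 1, inner[0]
        if ndim != spec.ndim or not isinstance(inner, spec.dtype):
            raise TypeError(f"{key}: expected ndim={spec.ndim} of {spec.dtype}")

contract = {"forces": KeySpec(dtype=float, ndim=2)}
check_contract({"forces": [[0.0, 0.0, 0.0]]}, contract)  # passes silently
```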

The implementation is also intentionally future-proof in a few important ways. The same composite model structure can support training-oriented objectives that include charges, forces, and stresses in the loss. The step/result model also leaves room for MLIP outputs beyond energies, such as embeddings and other learned properties, without redesigning the pipeline again.

Tip

This repository uses Greptile, an AI code review service, to help conduct
pull request reviews. We encourage contributors to read and consider suggestions
made by Greptile, but note that human maintainers will provide the necessary
reviews for merging: Greptile's comments are not a qualitative judgement
of your code, nor are they an indication that the PR will be accepted or rejected.
We encourage the use of emoji reactions to Greptile comments, depending on
their usefulness and accuracy.

Signed-off-by: Roman Zubatyuk <rzubatiuk@nvidia.com>

copy-pr-bot bot commented Mar 31, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


greptile-apps bot commented Mar 31, 2026

Greptile Summary

This PR replaces the monolithic nvalchemi.models wrapper architecture with an explicit composable calculator pipeline. The central idea is that the physical computation is now expressed as an ordered sequence of _CalculationStep instances — e.g. [NeighborListBuilder, MACEPotential, EnergyDerivativesStep] — collected inside a CompositeCalculator, which handles backward dependency analysis, additive result merging, and derivative-target setup in a single forward() pass. The dynamics layer is updated in parallel to consume the new DynamicsCalculator protocol instead of BaseModelMixin.

  • Core pipeline (composite.py, autograd.py, base.py): CompositeCalculator._resolve_child_requests performs backward dependency analysis and additive-key expansion cleanly. EnergyDerivativesStep and the ensure_derivative_targets helper make the autograd differentiation path explicit.
  • Contracts (contracts.py): Layered Pydantic models (StepCard → StepProfile, PotentialCard → PotentialProfile, NeighborListCard → NeighborListProfile) provide frozen, validated per-instance contracts with clear inheritance.
  • Registry (registry.py): Refactored to a family-keyed registry with ArtifactProcessor/ArtifactResolver protocols; SHA-256 streaming is in place but MD5 verification still loads entire files via read_bytes().
  • ModelOutputs type alias (_typing.py): Hessian is silently removed from the union and NodeCharges | NodeSpins is added — this is a breaking change not called out in the PR description.
  • _validate_external_inputs (composite.py): Uses hasattr(batch, key) for dotted neighbor-list keys, which always returns False in Python and can raise false KeyErrors.
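The hasattr pitfall from the last bullet is easy to reproduce in isolation; `batch` and `has_dotted` below are illustrative stand-ins, not the actual nvalchemi objects.

```python
# Demonstrates the flagged pitfall: hasattr(obj, "a.b") never resolves a
# dotted path, so dot-namespaced neighbor-list keys always look "missing".
from types import SimpleNamespace

batch = SimpleNamespace(neighbors=SimpleNamespace(pairs=[(0, 1)]))

print(hasattr(batch, "neighbors.pairs"))  # False: the dot is not traversed

def has_dotted(obj, dotted: str) -> bool:
    """Resolve each dotted component with getattr instead."""
    for part in dotted.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True

print(has_dotted(batch, "neighbors.pairs"))  # True
```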

Important Files Changed

  • nvalchemi/_typing.py — ModelOutputs type alias drops Hessian and adds NodeCharges | NodeSpins.
  • nvalchemi/models/composite.py — New CompositeCalculator implementing the ordered pipeline composition; backward dependency analysis and additive-key expansion are well-structured, but _validate_external_inputs uses hasattr for dotted neighbor-list keys, which will not resolve correctly if data is stored directly on the batch.
  • nvalchemi/models/autograd.py — New EnergyDerivativesStep that materialises forces and stresses from accumulated energy via torch.autograd.grad; derivative target setup is delegated cleanly to the prepare() phase; no issues found.
  • nvalchemi/models/contracts.py — New Pydantic-based step/potential/neighbor-list card and profile hierarchy with cross-field validators; definitions are clean and internally consistent.
  • nvalchemi/models/base.py — Major refactoring from BaseModelMixin to _CalculationStep/Potential/_NeighborListStep; introduces RuntimeState, ForwardContext, and the singleton _UNSET sentinel; solid implementation with no critical issues found.
  • nvalchemi/models/registry.py — Registry refactored from flat ModelRegistryEntry to family-keyed KnownArtifactEntry with ArtifactProcessor/ArtifactResolver protocols; SHA-256 uses streaming, but MD5 verification uses read_bytes(), which loads entire files into memory.
  • nvalchemi/models/results.py — New thin MutableMapping-backed CalculatorResults container with additive merge semantics; straightforward and correct.
  • nvalchemi/dynamics/base.py — Replaces the BaseModelMixin dependency with the DynamicsCalculator protocol; model_is_conservative uses string-name matching (flagged in prior threads); validation weakened to callable() (also in prior threads); clean overall migration.
  • nvalchemi/models/__init__.py — Switches from TYPE_CHECKING-guarded lazy imports to eager imports; safe because optional third-party deps (aimnet, mace) are deferred inside the respective class constructors, not at module level.
  • nvalchemi/models/dsf.py — New DSF Coulomb potential step; correctly dispatches between COO and matrix neighbor formats, handles periodic/non-periodic cases, and computes virial-to-stress conversion; no issues found.
  • nvalchemi/models/neighbors.py — New explicit neighbor-list builder step with NeighborListBuilderConfig/AdaptiveNeighborListConfig; reuse-if-available optimisation and dot-namespaced result keys are well-implemented.
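On the registry finding above, MD5 verification can mirror the streaming SHA-256 path using hashlib's incremental API. The function name `md5_streaming` is illustrative, not the registry's actual function.

```python
# Sketch of streaming MD5 verification so large artifacts never sit fully
# in memory (mirroring the SHA-256 path the review says already streams).
import hashlib
from pathlib import Path

def md5_streaming(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        # iter(callable, sentinel) keeps reading until fh.read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The same pattern works for any hashlib algorithm, so both checksum paths could share one helper parameterised by `hashlib.new(name)`.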

Reviews (2): Last reviewed commit: "docs: add design codument"

Comment on lines +290 to +301
all_resolved: set[str] = set()
for request in raw_requests:
    all_resolved |= request
expand_keys = all_resolved & self.pipeline_contract.additive_result_keys

if not expand_keys:
    return raw_requests

expanded_requests: list[frozenset[str]] = []
for request, step in zip(raw_requests, steps, strict=True):
    step_expand = expand_keys & step.profile.result_keys
    expanded_requests.append(request | step_expand)

P1 Additive-key expansion causes force double-counting in hybrid pipelines

The expansion logic on lines 293–301 asks every step that can produce an additive key (e.g. "forces") to do so, regardless of whether an EnergyDerivativesStep already accounts for that step's energy contribution.

Concrete scenario — pipeline [NL_mlip, MLIP, NL_dsf, DSF, AutoGrad] with active_outputs = {"forces", "energies"}:

  1. Backward resolution assigns {"forces"} to AutoGrad and {"energies"} to DSF; MLIP is re-added via expansion only.

  2. expand_keys = {"forces", "energies"} ∩ additive_result_keys = {"forces", "energies"}.

  3. Because "forces" is in DSFCoulombPotential.profile.result_keys, DSF is expanded to produce direct forces (compute_forces=True inside DSFCoulombPotential.compute).

  4. MLIP's prepare() writes gradient-tracked positions_scaled into runtime_state.input_overrides["positions"]. DSF calls require_input(batch, "positions", ctx), which resolves to positions_scaled, so DSF's energy is part of the autograd graph.

  5. EnergyDerivativesStep differentiates the accumulated total energy (E_mlip + E_dsf) through positions_scaled, producing forces −∂(E_mlip + E_dsf)/∂r.

  6. The additive merge then sums DSF's direct forces with the autograd forces:

    F_final = F_dsf_direct + F_autograd(E_mlip + E_dsf)
            ≈ F_dsf + F_mlip + F_dsf
            = 2 × F_dsf + F_mlip
    

The same double-counting occurs for stresses.

The root cause is that the expansion step does not know whether a given step's energy is already captured in the autograd graph of a later derivative step. Consider distinguishing "energy-only" mode (when an EnergyDerivativesStep follows) from "direct-output" mode so that a direct-force potential can participate in one way or the other, but not both simultaneously.
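The double count is easy to verify numerically with scalar stand-ins for the energy terms (toy values, not real DSF/MLIP physics): with E_mlip = a·r² and E_dsf = b·r², each force is −dE/dr = −2c·r.

```python
# Numeric toy reproducing the flagged double count with scalar energies
# E_mlip = a*r^2 and E_dsf = b*r^2, so each force term is -dE/dr = -2*c*r.
a, b, r = 1.0, 2.0, 3.0

f_mlip = -2.0 * a * r                  # force from the MLIP energy alone
f_dsf_direct = -2.0 * b * r            # DSF's direct (compute_forces=True) force
f_autograd_total = -2.0 * (a + b) * r  # derivative of the *total* energy

# The additive merge sums the direct force with the autograd force:
f_merged = f_dsf_direct + f_autograd_total
f_correct = f_mlip + f_dsf_direct

print(f_merged)              # -30.0: the DSF contribution is counted twice
print(f_merged - f_correct)  # -12.0, exactly one extra copy of f_dsf_direct
```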

Collaborator Author

@zubatyuk zubatyuk Mar 31, 2026


In the current implementation, the reported double-counting does not occur for DSF or Ewald as implemented. Correctness of differentiation is intentionally the responsibility of the calculator-step author, not something the composite planner infers automatically or implicitly.

Comment on lines 1469 to +1477
 def model_is_conservative(self) -> bool:
-    """Returns whether or not the model uses conservative forces"""
-    return self.model_card.forces_via_autograd
+    """Return a best-effort conservative-force indicator."""
+
+    steps = getattr(self.model, "steps", None)
+    if steps is None:
+        return False
+    return any(
+        type(step).__name__ == "EnergyDerivativesStep" for step in steps.values()
+    )

P2 model_is_conservative uses fragile class-name string matching

type(step).__name__ == "EnergyDerivativesStep"

This returns False for any subclass of EnergyDerivativesStep, and also returns False for non-CompositeCalculator callables that happen to derive conservative forces through other means, since getattr(self.model, "steps", None) would be None.

A more robust approach is to use isinstance(). The only thing preventing it is a potential circular import (autograd.py → base.py), which can be handled with a TYPE_CHECKING-guarded or deferred import:

@property
def model_is_conservative(self) -> bool:
    from nvalchemi.models.autograd import EnergyDerivativesStep  # deferred

    steps = getattr(self.model, "steps", None)
    if steps is None:
        return False
    return any(isinstance(step, EnergyDerivativesStep) for step in steps.values())

Comment on lines 1444 to 1451
     in the MRO (for cooperative multiple inheritance).
     """
     super().__init__(**kwargs)
-    if not isinstance(model, BaseModelMixin):
+    if not callable(model):
         raise TypeError(
-            f"Expected a `BaseModelMixin` instance, got {type(model).__name__}."
-            " Please wrap your model with a `BaseModelMixin` subclass."
+            f"Expected a callable dynamics calculator, got {type(model).__name__}."
         )
     self.model = model

P2 Validation weakened from structural check to callable() only

The old guard was isinstance(model, BaseModelMixin), which verified the full structural contract (model card, compute_embeddings, etc.). The replacement is:

if not callable(model):
    raise TypeError(...)

DynamicsCalculator is already @runtime_checkable, so it supports isinstance checks at runtime (verifying both the model_card attribute and the __call__ signature). Using it here would keep validation meaningful without re-introducing the old coupling:

if not isinstance(model, DynamicsCalculator):
    raise TypeError(
        f"Expected a DynamicsCalculator, got {type(model).__name__}. "
        "The model must be callable and expose a `model_card` attribute."
    )
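A minimal, self-contained demonstration of the runtime-checkable-Protocol isinstance pattern the suggestion relies on — `Calculator`, `Good`, and `Bad` are illustrative names, not the real DynamicsCalculator:

```python
# Demo of @runtime_checkable Protocol isinstance checks: the check verifies
# that each declared member (attribute or method) is present on the instance.
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class Calculator(Protocol):
    model_card: Any

    def __call__(self, batch: Any) -> Any: ...

class Good:
    model_card = {"name": "demo"}

    def __call__(self, batch):
        return {}

class Bad:
    pass  # has neither model_card nor __call__

print(isinstance(Good(), Calculator))  # True
print(isinstance(Bad(), Calculator))   # False
```

Note that runtime_checkable isinstance only checks member presence, not signatures, so it is a structural guard rather than a full contract check.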

Comment on lines +158 to +165
def __call__(
    self,
    batch: Batch,
    *,
    outputs: Iterable[str] | None = None,
) -> ModelOutputs:
    """Compute model outputs for one batch."""
    ...

P2 Protocol return type annotation is structurally inconsistent with CompositeCalculator

DynamicsCalculator.__call__ is declared to return ModelOutputs (an OrderedDict[str, Tensor | None]), but CompositeCalculator.forward returns CalculatorResults (a MutableMapping[str, Any]). While both types support .get() at runtime and the dynamics code works correctly, static type checkers will flag any CompositeCalculator instance as not conforming to DynamicsCalculator. Updating the protocol to use the common base type would make both nominal and structural conformance correct:

Suggested change
 def __call__(
     self,
     batch: Batch,
     *,
     outputs: Iterable[str] | None = None,
-) -> ModelOutputs:
+) -> Mapping[str, Any]:
     """Compute model outputs for one batch."""
     ...

…cstrings

Signed-off-by: Roman Zubatyuk <rzubatiuk@nvidia.com>
Signed-off-by: Roman Zubatyuk <rzubatiuk@nvidia.com>
Comment on lines 91 to +93
 Fmax: TypeAlias = Float[torch.Tensor, "B 1"]  # noqa: F722
 ModelOutputs: TypeAlias = OrderedDict[
-    str, Energy | Forces | Hessian | Stress | Virials | Dipole | None
+    str, Energy | Forces | Stress | Virials | Dipole | NodeCharges | NodeSpins | None

P1 Hessian silently removed from ModelOutputs

The previous definition included Hessian:

# before
ModelOutputs: TypeAlias = OrderedDict[
    str, Energy | Forces | Hessian | Stress | Virials | Dipole | None
]

The new definition drops Hessian entirely:

Suggested change
 Fmax: TypeAlias = Float[torch.Tensor, "B 1"]  # noqa: F722
 ModelOutputs: TypeAlias = OrderedDict[
-    str, Energy | Forces | Stress | Virials | Dipole | NodeCharges | NodeSpins | None
+    str, Energy | Forces | Hessian | Stress | Virials | Dipole | NodeCharges | NodeSpins | None
 ]  # noqa: F722

Any downstream model or step that currently produces Hessian outputs (second-order derivatives) and declares its return value as ModelOutputs will silently violate the type contract after this change. If removing Hessian support from the public API is intentional, it should be called out explicitly in the PR description and changelog so callers know to migrate. If it was an accidental omission, adding it back to the union is the safer choice.

