
refac: Harmonize linopy operations and introduce a new predictable and strict convention#591

Open

FBumann wants to merge 81 commits into master from harmonize-linopy-operations-mixed

Conversation

FBumann commented Feb 20, 2026

Harmonize linopy arithmetic with legacy/v1 convention transition

Why: silent bugs in the current (legacy) arithmetic

linopy's current coordinate alignment has several classes of silent correctness bugs. None of them raise errors — they produce results that look reasonable but are quietly wrong.

1. Positional alignment ignores labels (#586, #550, #257)

When two operands have the same shape on a shared dimension, linopy matches them by position, ignoring coordinate labels entirely:

x = m.add_variables(coords=[["costs", "penalty"]], name="x")
factors = xr.DataArray([2.0, 1.0], dims=["effect"],
                       coords={"effect": ["penalty", "costs"]})
x * factors
# Result: x["costs"] * 2.0, x["penalty"] * 1.0  — SWAPPED!
# Expected: x["costs"] * 1.0, x["penalty"] * 2.0

This affects all arithmetic operators and constraint creation. The optimization solves without error, but with the wrong constraints.
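To make the label-vs-position distinction concrete, here is what label-aware alignment (the behavior the v1 convention adopts) gives in plain xarray; the numeric values stand in for variables and are purely illustrative:

```python
import xarray as xr

# Same labels in different storage order, as in the example above.
x_vals = xr.DataArray([100.0, 7.0], dims=["effect"],
                      coords={"effect": ["costs", "penalty"]})
factors = xr.DataArray([2.0, 1.0], dims=["effect"],
                       coords={"effect": ["penalty", "costs"]})

# xarray pairs by coordinate label, not by position:
result = x_vals * factors
print(result.sel(effect="costs").item())    # 100.0  (100 * 1.0)
print(result.sel(effect="penalty").item())  # 14.0   (7 * 2.0)
```

Positional matching would instead have paired "costs" with 2.0 and "penalty" with 1.0 — exactly the silent swap described above.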

2. Subset constants break associativity (#572)

When a constant has different-sized coordinates, a left-join drops coordinates that might be needed by a later operation:

# a has 3 time steps, b has 5, factor has 5
a + factor + b    # factor at time=3,4 → 40, 50  ✓
a + b + factor    # factor at time=3,4 → LOST     ✗

(a + factor) + b ≠ a + (b + factor) — the result depends on operand order.
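Explicit coordinate handling, which v1 forces, removes the order dependence. A plain-xarray sketch of the same shapes (values are illustrative):

```python
import xarray as xr

time5 = {"time": [0, 1, 2, 3, 4]}
a = xr.DataArray([1.0, 2.0, 3.0], dims=["time"], coords={"time": [0, 1, 2]})
b = xr.DataArray([10.0] * 5, dims=["time"], coords=time5)
factor = xr.DataArray([10.0, 20.0, 30.0, 40.0, 50.0], dims=["time"], coords=time5)

# Extend `a` to the full index up front, so no later operand
# can silently drop coordinates:
a_full = a.reindex(time=b.time, fill_value=0.0)

left = (a_full + factor) + b
right = a_full + (b + factor)
assert left.equals(right)       # associativity holds
print(left.sel(time=3).item())  # 50.0 — the factor at time=3 survives
```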

3. User NaN silently swallowed (#620)

NaN in user-supplied data (often indicating missing or erroneous data) is silently filled with inconsistent neutral elements:

data = xr.DataArray([1.0, np.nan, 3.0], ...)
x + data    # NaN → 0 (silent)
x * data    # NaN → 0 (kills the variable!)
x / data    # NaN → 1 (leaves variable unchanged!)

The fill values differ across operations with no consistent principle.

4. Absent slots indistinguishable from zero (#620)

Variable.to_linexpr() does not mark absent variable slots (from .shift(), .where()) as NaN, so multiplication and fillna() cannot distinguish "absent" from "zero":

xs = x.shift(time=1)   # time=0 is absent
xs * 3                  # absent slot looks like 0 — not NaN
xs.fillna(42)           # no-op — nothing to fill

5. Inconsistent Variable vs Expression paths (#569, #571)

x * subset_constant and (1 * x) * subset_constant previously gave different results (one crashed, one didn't).


What: the v1 convention

This PR introduces linopy.options["arithmetic_convention"] with two modes and a non-breaking transition:

  • "legacy" (default) — reproduces current master behavior exactly. Emits LinopyDeprecationWarning on every legacy codepath.
  • "v1" — strict coordinate matching, explicit NaN handling. Silent bugs become loud errors.

linopy.options["arithmetic_convention"] = "v1"

Strict coordinate matching

Arithmetic operators (+, -, *, /) require matching coordinates on shared dimensions. Mismatched coordinates raise ValueError with suggestions:

x[i=0,1,2] + y[i=1,2,3]  # ValueError: use .add(y, join="inner")

Named methods with explicit join — .add(), .sub(), .mul(), .div(), .le(), .ge(), .eq() accept join= parameter:

x.add(y, join="inner")      # intersection
x.add(y, join="outer")      # union with fill
x.add(y, join="left")       # keep x's coordinates
x.add(y, join="override")   # positional alignment (opt-in only)
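These four modes presumably build on xarray's own align joins; for intuition, this is how they behave in plain xarray (illustrative values, not linopy objects):

```python
import xarray as xr

x = xr.DataArray([1.0, 2.0, 3.0], dims=["i"], coords={"i": [0, 1, 2]})
y = xr.DataArray([10.0, 20.0, 30.0], dims=["i"], coords={"i": [1, 2, 3]})

xi, yi = xr.align(x, y, join="inner")     # intersection: i=[1, 2]
xo, yo = xr.align(x, y, join="outer")     # union: i=[0, 1, 2, 3], gaps -> NaN
xl, yl = xr.align(x, y, join="left")      # keep x's index: i=[0, 1, 2]
xv, yv = xr.align(x, y, join="override")  # positional: y takes x's labels

print(list(xi.i.values))  # [1, 2]
print(list(xo.i.values))  # [0, 1, 2, 3]
print(list(yv.i.values))  # [0, 1, 2] — values unchanged, labels replaced
```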

Named methods with fill_value — .add(), .sub(), .mul(), .div() accept fill_value= for explicit NaN filling before the operation (see PR #620):

expr.add(5, fill_value=0)      # fill NaN const with 0 before adding
expr.mul(factor, fill_value=1) # fill NaN const with 1 before multiplying

Free broadcasting — Constants can introduce new dimensions without restriction.

Algebraic laws

All standard algebraic laws hold under v1 and under legacy for same-coordinate operands:

  • Commutativity: a + b == b + a, a * c == c * a
  • Associativity: (a + b) + c == a + (b + c)
  • Distributivity: c * (a + b) == c*a + c*b, (a + b) / c == a/c + b/c
  • Identity, negation, zero

Legacy breaks associativity only when operands have mismatched coordinate ranges (the subset-constant case above). The v1 convention prevents this by requiring explicit coordinate handling.
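For same-coordinate operands the laws are easy to spot-check numerically. A plain-xarray stand-in for the property tests (not the linopy test suite itself):

```python
import xarray as xr

coords = {"i": [0, 1, 2]}
a = xr.DataArray([1.0, 2.0, 3.0], dims=["i"], coords=coords)
b = xr.DataArray([4.0, 5.0, 6.0], dims=["i"], coords=coords)
c = 2.0

assert (a + b).equals(b + a)                      # commutativity
assert ((a + b) + c).equals(a + (b + c))          # associativity
assert (c * (a + b)).equals(c * a + c * b)        # distributivity
assert ((a + b) / c).equals(a / c + b / c)        # division distributes
assert (a + 0.0).equals(a) and (-(-a)).equals(a)  # identity, negation
```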


v1 NaN convention

NaN means "absent term" — never a numeric value.

NaN enters only from mask= at construction or structural operations (.shift(), .where(), .reindex(), .reindex_like(), .unstack()). Operations like .roll(), .sel(), .isel() do not produce NaN.
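The same structural/non-structural split exists in plain xarray, which appears to be where the convention inherits it from:

```python
import xarray as xr

y = xr.DataArray([1.0, 2.0, 3.0], dims=["time"], coords={"time": [0, 1, 2]})

shifted = y.shift(time=1)                    # time=0 becomes NaN -> "absent"
rolled = y.roll(time=1, roll_coords=False)   # values wrap around, no NaN

assert bool(shifted.isnull().sel(time=0))    # absent slot at time=0
assert not bool(rolled.isnull().any())       # roll never creates absence
```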

Arithmetic with absent slots:

| Operation | Absent slot | Why |
|---|---|---|
| `shifted + 5` | `+5` | Fills const with 0 (additive identity), required for associativity |
| `shifted - 5` | `-5` | Same — subtraction is addition of negation |
| `shifted * 3` | absent | NaN propagates — no correct implicit fill (0 kills, 1 preserves) |
| `shifted / 2` | absent | Same as multiplication |

When merging expressions (e.g., x + y.shift(time=1)), NaN marks individual terms, not entire coordinates. A coordinate is only fully absent when all terms are absent — isnull() checks this.

User-supplied NaN raises ValueError — users must handle NaN explicitly before arithmetic:

x + data.fillna(0)           # NaN = "no offset"
x * factor.fillna(1)         # NaN = "no scaling"
expr.mul(3, fill_value=0)    # fill_value= shorthand on named methods

fillna() — Variable.fillna(numeric) returns a LinearExpression, Variable.fillna(Variable) returns a Variable, and Expression.fillna(value) fills the const at absent slots.


Source changes

| File | Change |
|---|---|
| config.py | LinopyDeprecationWarning, arithmetic_convention setting |
| expressions.py | All arithmetic paths branch on convention; merge() pre-validates user-dim coords under v1; merge() preserves NaN const when all inputs NaN; _add_constant fills const with additive identity; to_constraint has separate legacy/v1 paths; NaN validation at API boundaries; fill_value= on .add()/.sub()/.mul()/.div() |
| common.py | align() reads convention (legacy→inner, v1→exact) |
| variables.py | Scalar fast path in `__mul__`, explicit TypeError in `__div__`, .reindex() methods, Variable.fillna(numeric) returns LinearExpression, to_linexpr() sets const=NaN at absent slots in v1 |
| piecewise.py | .fillna(0) on breakpoint data for v1 compatibility |
| monkey_patch_xarray.py | DataArray/Dataset arithmetic with linopy types |
| model.py | Convention-aware model methods |

Documentation

| Notebook | Content |
|---|---|
| arithmetic-convention.ipynb | Coordinate alignment rules, join parameter, migration guide |
| missing-data.ipynb | NaN convention principles, fillna patterns, masking with .sel() and mask=, legacy comparison |
| _nan-edge-cases.ipynb | Dev notebook: shift, roll, where, reindex, isnull, absent slot arithmetic, fillna/fill_value API, FILL_VALUE internals |

Test structure

  • Marker-based separation: @pytest.mark.v1_only and @pytest.mark.legacy_only for convention-specific tests
  • Shared fixtures in conftest.py with auto-convention switching
  • test_algebraic_properties.py (92 tests): formal specification of algebraic laws — commutativity, associativity, distributivity, identity, negation, zero, division/subtraction distributivity, multi-step constant folding, mixed-type commutativity, expression-expression laws. All pass under both conventions except 8 NaN-propagation tests (v1 only).
  • test_legacy_violations.py (23 tests): catalog of concrete legacy bugs with paired legacy/v1 tests — positional alignment, subset associativity, user NaN handling, variable/expression inconsistency, absent-slot propagation. Each test traces to a specific issue number.
  • Additional convention-specific tests in test_linear_expression.py, test_constraints.py, test_convention.py

Rollout plan

  1. This PR: Default "legacy" — nothing breaks
  2. Downstream: Users opt in with linopy.options["arithmetic_convention"] = "v1"
  3. linopy v1: Flip default to "v1" and drop legacy mode

Open questions

  • from_tuples / linexpr() — Currently follows the global convention. In practice always called with same-coord variables, so convention doesn't matter. Low-priority.
  • Pipe operator — Only linopy objects, or also constants? (follow-up PR)

Sub-PRs

Test plan

  • All tests pass under both conventions
  • Legacy tests validate backward compatibility
  • v1 tests validate strict coordinate matching and NaN raises
  • Algebraic properties verified (92 tests): all laws hold under both conventions
  • Legacy violations documented (23 tests): concrete bugs with issue references
  • Piecewise and SOS tests run under both conventions
  • Documentation notebooks execute cleanly

🤖 Generated with Claude Code

FabianHofmann and others added 19 commits February 9, 2026 14:28
Add le(), ge(), eq() methods to LinearExpression and Variable classes,
mirroring the pattern of add/sub/mul/div methods. These methods support
the join parameter for flexible coordinate alignment when creating constraints.
Consolidate repetitive alignment handling in _add_constant and
_apply_constant_op into a single _align_constant method. This
eliminates code duplication and makes the alignment behavior
(handling join parameter, fill_value, size-aware defaults) testable
and maintainable in one place.
numpy_to_dataarray no longer inflates ndim beyond arr.ndim, fixing
lower-dim numpy arrays as constraint RHS. Also reject higher-dim
constant arrays (numpy/pandas) consistently with DataArray behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use "exact" join for +/- (raises ValueError on mismatch), "inner" join
for *// (intersection), and "exact" for constraint DataArray RHS.
Named methods (.add(), .sub(), .mul(), .div(), .le(), .ge(), .eq())
accept explicit join= parameter as escape hatch.

- Remove shape-dependent "override" heuristic from merge() and
  _align_constant()
- Add join parameter support to to_constraint() for DataArray RHS
- Forbid extra dimensions on constraint RHS
- Update tests with structured raise-then-recover pattern
- Update coordinate-alignment notebook with examples and migration guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FBumann commented Feb 20, 2026

@FabianHofmann I'm quite happy with the notebook now. It showcases the convention and its consequences.
Tests need some work though. And migration as well.
Looking forward to your opinion on the convention.

FBumann and others added 2 commits February 20, 2026 13:51
…ords. Here's what changed:

  - test_linear_expression_sum / test_linear_expression_sum_with_const: v.loc[:9].add(v.loc[10:], join="override") → v.loc[:9] + v.loc[10:].assign_coords(dim_2=v.loc[:9].coords["dim_2"])
  - test_add_join_override → test_add_positional_assign_coords: uses v + disjoint.assign_coords(...)
  - test_add_constant_join_override → test_add_constant_positional: now uses different coords [5,6,7] + assign_coords to make the test meaningful
  - test_same_shape_add_join_override → test_same_shape_add_assign_coords: uses + c.to_linexpr().assign_coords(...)
  - test_add_constant_override_positional → test_add_constant_positional_different_coords: expr + other.assign_coords(...)
  - test_sub_constant_override → test_sub_constant_positional: expr - other.assign_coords(...)
  - test_mul_constant_override_positional → test_mul_constant_positional: expr * other.assign_coords(...)
  - test_div_constant_override_positional → test_div_constant_positional: expr / other.assign_coords(...)
  - test_variable_mul_override → test_variable_mul_positional: a * other.assign_coords(...)
  - test_variable_div_override → test_variable_div_positional: a / other.assign_coords(...)
  - test_add_same_coords_all_joins: removed "override" from loop, added assign_coords variant
  - test_add_scalar_with_explicit_join → test_add_scalar: simplified to expr + 10

FBumann commented Feb 27, 2026

The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have — also for all operations.

Why "exact" instead of "inner" for * and /

"exact" still broadcasts freely over dimensions that only exist on one side — it only enforces strict matching on shared dimensions. So the common scaling pattern works fine:

cost = xr.DataArray([10, 20], coords=[("tech", ["wind", "solar"])])
capacity  # dims: (tech=["wind", "solar"], region=["A", "B"])

cost * capacity  # ✓ tech matches exactly, region broadcasts freely

"inner" is dangerous: if coords on a shared dimension don't match due to a typo or upstream change, it silently drops values. The explicit and safe way to subset before multiplying is:

capacity.sel(tech=["wind", "solar"]) * renewable_cost
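The silent-drop hazard is visible in plain xarray, whose default arithmetic join is already "inner": a typo in a label shrinks the result instead of raising. A minimal demonstration (the typo "solr" is deliberate):

```python
import xarray as xr

cost = xr.DataArray([10.0, 20.0], coords=[("tech", ["wind", "solar"])])
capacity = xr.DataArray(
    [[1.0, 2.0], [3.0, 4.0]],
    coords=[("tech", ["wind", "solr"]), ("region", ["A", "B"])],  # typo: "solr"
)

out = cost * capacity  # xarray's default arithmetic join is "inner"
# "solar" and "solr" don't match, so that row silently disappears:
assert dict(out.sizes) == {"tech": 1, "region": 2}
assert list(out.tech.values) == ["wind"]
```

Under an "exact" join this mismatch would raise instead of quietly dropping half the problem.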

No operation should introduce new dimensions

Neither side of any arithmetic operation should be allowed to introduce dimensions the other doesn't have. The same problem applies to + and - as to * and / — new dimensions silently expand the optimization problem in unintended ways:

cost_expr      # dims: (tech, time)
regional_expr  # dims: (tech, time, region)

cost_expr + regional_expr  # ✗ silently expands to (tech, time, region)

capacity  # dims: (tech, region, time)
risk      # dims: (tech, scenario)
risk * capacity  # ✗ silently expands to (tech, region, time, scenario)

An explicit pre-check on all operations:

asymmetric_dims = set(other.dims).symmetric_difference(set(self.dims))
if asymmetric_dims:
    raise ValueError(f"Operation introduces new dimensions: {asymmetric_dims}")

Summary

| Operation | Convention |
|---|---|
| `+`, `-`, `*`, `/` | "exact" on shared dims; neither side may introduce dims the other doesn't have |


coroa commented Feb 27, 2026

The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have — also for all operations.

Let's clearly differentiate between dimensions and labels.

labels

I agree with "exact" for labels by default, but we need an easy way to have inner or outer joining characteristics. I found the pyoframe conventions
strange at the beginning, but they grew on me:

x + y.keep_extras() to say that an outer join is in order and mismatches should fill with 0.

x + y.drop_extras() to say that you want an outer inner join.
x.drop_extras() + y does the same, though.

In a different project I have used | 0 to indicate keep_extras, i.e. (x + y | 0).

dimensions

I am actually fond of the ability to auto-broadcast over different dimensions and would want to keep that (it is actually my main problem with pyoframe).

Your first example actually implicitly assumes broadcasting.


FBumann commented Feb 28, 2026

Dimensions and broadcasting

I agree that auto broadcasting is helpful in some cases.
I'm happy with allowing broadcasting of constants. We could allow this always...?
But I would enforce that the constant never has more dims than the variable/expression.
Or is there a use case for this?

So the full convention requires two separate things:
1. "exact" join — shared dims must have matching coords (xarray handles this)
2. Subset dim check — the constant side's dims must be a subset of the variable/expression's dims (custom pre-check needed)

labels

I'm not sure I like this approach, as it needs careful state management of the flags on expressions; the flag (keep or drop extras) has to be tracked through every operation.
I would rather require reindexing or filling data to the correct index.
I think aligning is the correct approach:

import linopy

# outer join — fill gaps with 0 before adding
x_aligned, y_aligned = linopy.align(x, y, join="outer", fill_value=0)
x_aligned + y_aligned

# inner join — drop non-matching coords before adding
x_aligned, y_aligned = linopy.align(x, y, join="inner")
x_aligned + y_aligned

Combining disjoint expressions would then still need the explicit methods, though.
I'm interested in your take on this.


FBumann commented Feb 28, 2026

The proposed convention for all arithmetic operations in linopy:
1. "exact" join by default — shared coords must match exactly, raises on mismatch
2. Subset dim check — constants may not introduce dimensions the variable/expression doesn't have
3. No implicit inner join — use .sel() explicitly instead
4. Outer join with fill — use x + (y | 0) or .add(join="outer", fill_value=0)

The escape hatches, in order of preference: .sel() for subsetting, | 0 for inline fill, the named method .add(join=...) for everything else. No context manager needed.

I'm not sure how to implement the | operator yet. It might need some sort of flag/state for deferred indexing.
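One possible shape for that deferred state, sketched with toy classes — every name here (`Filled`, `Expr`) is hypothetical, not linopy API: `__or__` returns a tagged operand, and the next arithmetic operation unwraps the tag into an outer join with the given fill.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Filled:
    """Tag produced by `expr | fill_value`; consumed by the next arithmetic op."""
    obj: Any
    fill_value: float

class Expr:
    """Toy stand-in for a linopy expression: maps coordinate label -> value."""

    def __init__(self, data: dict):
        self.data = data

    def __or__(self, fill_value: float) -> Filled:
        return Filled(self, fill_value)

    def __add__(self, other):
        if isinstance(other, Filled):
            # Outer join: union of labels, absent slots take the fill value.
            keys = self.data.keys() | other.obj.data.keys()
            return Expr({k: self.data.get(k, other.fill_value)
                            + other.obj.data.get(k, other.fill_value)
                         for k in keys})
        # Default: exact join, raise on mismatch.
        if self.data.keys() != other.data.keys():
            raise ValueError("coordinates do not match; use `x + (y | 0)`")
        return Expr({k: self.data[k] + other.data[k] for k in self.data})

x = Expr({0: 1.0, 1: 2.0})
y = Expr({1: 10.0, 2: 20.0})
assert (x + (y | 0)).data == {0: 1.0, 1: 12.0, 2: 20.0}
try:
    x + y  # exact join: mismatch raises
except ValueError:
    pass
```

Because `|` binds looser than `+`, the parentheses in `x + (y | 0)` are required — arguably a feature, since the fill is then visibly scoped to one operand.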


FBumann commented Feb 28, 2026

I thought about the pipe operator:
I think it should only work with linopy-internal types (Variables/Expressions), not constants (scalar, numpy, pandas, DataArray), as supporting those would require a lot of monkey patching and would be hard to keep stable.

Would this be an issue for you?

Use drop=True on scalar isel calls to prevent residual scalar coordinates
from causing exact-join mismatches under the v1 arithmetic convention.
Also align binary_hi coordinates with delta_lo in incremental PWL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FBumann commented Mar 11, 2026

@brynpickering As I understand your comment, you are mostly OK with the convention but have thoughts on NaN data. Am I right? This is the part I'm currently not that opinionated about, so I'm happy to hear your suggestions.

FBumann and others added 5 commits March 12, 2026 08:22
NaN in linopy v1 means "absent term" — it marks individual terms as
missing without masking entire coordinates. User-supplied NaN at API
boundaries (constants, factors, constraint RHS) raises ValueError;
masking must be explicit via .sel() or mask=.

Implementation:
- FILL_VALUE["coeffs"] changed from NaN to 0 (structural "no term")
- NaN validation added in _add_constant, _apply_constant_op, to_constraint
- Piecewise internals use .fillna(0) on breakpoint data
- Tests updated to expect ValueError for NaN operands under v1

Key design decisions:
- NaN enters only via mask= or structural ops (shift, reindex, where)
- Combining expressions: absent terms do not poison valid terms
  (xr.sum skipna=True preserves valid contributions)
- A coordinate is fully absent only when ALL terms have vars=-1 AND
  const is NaN — this is what isnull() checks
- lhs >= rhs ≡ lhs - rhs >= 0, so RHS follows the same rules as constants

Documentation:
- New missing-data.ipynb: convention principles, fillna patterns,
  masking with .sel() and mask=, migration guide from legacy
- New nan-edge-cases.ipynb: investigation of shift, roll, where,
  reindex, isnull, arithmetic on shifted expressions, sanitize_missings
- arithmetic-convention.ipynb: updated to reference missing-data notebook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shift, where, reindex, reindex_like, unstack produce absent terms.
roll, sel, isel, drop_sel, expand_dims, broadcast_like do not.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Coeffs=0 was an implicit choice about the neutral element for
multiplication. NaN is more honest — it means "absent", which is
what FILL_VALUE is for. Both NaN and 0 coeffs get filtered by
filter_nulls_polars at solve time, so behavior is unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New cases: x + y.shift(1) + 5, x + (y+5).shift(1) + 5 (shifted
const is lost), x.shift(1) + y.shift(1) (fully absent coordinate).
Updated FILL_VALUE docs to reflect coeffs=NaN (not 0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FBumann commented Mar 12, 2026

@brynpickering Thanks for the detailed feedback.

Fill values depend on context (efficiency=1 vs cost=0) — this is why I'd like to make it a user choice via .fillna(value) rather than picking a default.

Misalignment doesn't have to raise. The escape hatches are lightweight:

total = cost_a.add(cost_b, join="outer")  # absent terms → zero at solve time
scaled = var * efficiency.fillna(1)        # you choose the fill

As far as I know, memory overhead is the same either way — fillna() on a parameter is not much larger than the parameter itself, and join="outer" does alignment inside xarray's concat anyway. But maybe I'm wrong here.

For expressions with mismatched labels, join="outer" already works: absent terms carry NaN coefficients that are filtered out when flattening for the solver. No explicit fill value needed.

Maybe adding a fill_value= parameter on .mul() / .add() resolves your issue if the fillna + join= pattern proves too verbose in practice?

FBumann requested review from FabianHofmann and coroa, March 12, 2026 08:45

FBumann commented Mar 12, 2026

@coroa @FabianHofmann I'm quite happy with this. But I think it desperately needs discussion.

FBumann marked this pull request as ready for review March 12, 2026 08:46
FBumann changed the title from "refac: Harmonize linopy operations with breaking convention" to "refac: Harmonize linopy operations and introduce a new predictable and strict convention", Mar 12, 2026
@brynpickering

@FBumann your fillna suggestion is fine for parameters, but it doesn't address NaN filling further down the line on variables or expressions. As far as I understand linopy objects, it isn't possible to NaN-fill them in the same way. This risks NaNs making their way into arithmetic when one would rather have some numeric value with no effect in their place.

Rewrite notebook to be more concise: show expression objects directly
instead of printing individual fields, use markdown for explanations.
Add section addressing why fillna on expressions is unnecessary for
outer-join NaN (structural markers, not numeric gaps). Fix summary
table for x.shift(1)+y.shift(1) case (const=0, not NaN).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

FBumann commented Mar 14, 2026

@brynpickering

I added a (temporary?) notebook to document the behaviour more clearly. It would be great if you could give it a look and check whether you find something inconsistent.
The mental model of NaN in linopy objects isn't trivial, imo. The internal representation is NaN, but it doesn't behave like NaN!

As far as I understand, it boils down to a single design question.
Take this setup:

import pandas as pd
import linopy

linopy.options["arithmetic_convention"] = "v1"
m = linopy.Model()
time = pd.RangeIndex(5, name="time")
x = m.add_variables(lower=0, coords=[time], name="x")

shift introduces an empty slot. Should arithmetic revive it (with a fill value defined by linopy, i.e. zero)?

expr = x.shift(time=1) + 2
print(expr)

Option A: revived

LinearExpression [time: 5]:
---------------------------
[0]: +2
[1]: +5 x[0] + 2
[2]: +5 x[1] + 2
[3]: +5 x[2] + 2
[4]: +5 x[3] + 2

Option B: NaN propagation

LinearExpression [time: 5]:
---------------------------
[0]: None
[1]: +5 x[0] + 2
[2]: +5 x[1] + 2
[3]: +5 x[2] + 2
[4]: +5 x[3] + 2

Did I understand you correctly?

Currently, Option A (revived, i.e. no NaN propagation) is implemented.

FBumann and others added 6 commits March 14, 2026 18:18
When merging expressions where all input constants are NaN at a
coordinate (e.g. x.shift(1) + y.shift(1) at time=0), the const
sum with fill_value=0 turned NaN+NaN into 0+0=0, making the slot
appear as a valid zero expression instead of absent.

Detect all-NaN constants before filling and restore NaN after sum.

Also streamline _nan-edge-cases notebook with less verbose output,
more markdown, and a new section on why fillna on expressions is
unnecessary for outer-join NaN.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fix NaN propagation consistency and add fill_value API

Bug fix: Variable.to_linexpr() now sets const=NaN where labels=-1
in v1 mode, so Variable and Expression paths produce identical
results for shifted/masked variables.

New features:
- Variable.fillna(numeric) returns LinearExpression with constant fill
- .add()/.sub()/.mul()/.div() accept fill_value= parameter for
  explicit NaN filling before the operation

Design: bare operators (+, *, etc.) propagate NaN — absent stays
absent. Users choose how to revive via .fillna(value) or
.add(value, fill_value=). No implicit neutral element.

Tests: 21 new tests in test_algebraic_properties.py covering NaN
propagation, fillna, fill_value, and path consistency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Use additive identity (0) for NaN const fill in addition

Addition/subtraction now fill NaN const with 0 before operating,
both in v1 and legacy modes. This preserves associativity:
  (x.shift(1) + 5) + y == x.shift(1) + (5 + y)

Multiplication/division still propagate NaN — no implicit fill,
since the neutral element depends on context (0 kills, 1 preserves).
Use .fillna(value) or .mul(v, fill_value=) for explicit control.

Revised test_algebraic_properties.py as a thorough specification:
51 tests covering commutativity, associativity, distributivity,
identity, negation, zero, absent slot behavior (add revives,
mul propagates, merge semantics), fillna, and fill_value param.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… tests

NaN handling now preserves algebraic conventions (associativity,
distributivity, commutativity):

- Addition/subtraction fill const with 0 (additive identity),
  preserving (a + b) + c == a + (b + c)
- Multiplication/division propagate NaN — user decides via
  .fillna(value) or .mul(v, fill_value=)
- Variable.to_linexpr() sets coeffs=NaN and const=NaN at absent
  slots in v1 mode (consistent with expression path)
- merge() preserves const=NaN when all input constants are NaN
- Variable.fillna(numeric) returns LinearExpression
- .add()/.sub()/.mul()/.div() accept fill_value= parameter

62 algebraic property tests covering all conventions including
absent slots, division, subtraction, and multi-dimensional cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…butivity

Add 30 new v1-only tests covering gaps in algebraic property verification:
- Expression+expression commutativity and associativity (including shifted)
- Division distributivity: (a+b)/c == a/c + b/c
- Subtraction distributivity: c*(a-b) == c*a - c*b
- Negative scalar distributivity: -s*(a+b) == -s*a + (-s*b)
- Multi-step constant folding: (x+3)*2+1 == 2*x+7 and longer chains
- Mixed-type commutativity: Variable + Expression == Expression + Variable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ests

Only 8 tests that rely on v1's NaN propagation for absent slots need the
v1_only marker. All algebraic law tests (commutativity, associativity,
distributivity, identity, negation, zero, constant folding) pass under
both legacy and v1 conventions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents 5 categories of legacy issues with paired legacy/v1 tests:

1. Positional alignment (#586, #550): same-shape operands with different
   labels silently paired by position, producing wrong results
2. Subset constant associativity (#572): left-join drops coordinates,
   making (a+c)+b != a+(c+b)
3. User NaN swallowed (#620): NaN in user data silently filled with
   inconsistent neutral elements (0 for add/mul, 1 for div)
4. Variable vs Expression inconsistency (#569, #571): x*c and (1*x)*c
   previously gave different results
5. Absent slot propagation (#620): legacy can't distinguish absent
   variables from zero, fillna() is a no-op

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

FBumann commented Mar 14, 2026

@brynpickering I think the current state (updated description) should satisfy your concerns.

FBumann and others added 3 commits March 15, 2026 11:27
The arithmetic-convention notebook now covers the full NaN convention
inline: absent slot behavior table, arithmetic propagation rules,
user-NaN raises, fillna/sel patterns, and a code demo.

The missing-data notebook is repositioned as "Edge Cases and Details"
for NaN internals, propagation mechanics, and masking patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Clarify that 84/92 algebraic tests pass under both conventions, and
the 8 v1-only tests are about NaN propagation semantics, not algebraic
law violations. Add division/subtraction/negative distributivity to the
properties table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Show both conventions side-by-side in the table: addition behaves the
same, but multiplication/division differs — v1 propagates NaN while
legacy silently fills with zero.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>