-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Generalize noop_with_empty_axes handling across all Reduce operators #26436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@microsoft-github-policy-service agree |
|
@microsoft-github-policy-service agree |
|
@microsoft-github-policy-service agree |
…h for all reduction operators) Fixes microsoft#26288 - Handle all Reduce* operators consistently when axes are empty and noop_with_empty_axes=1 - Apply both Pre-op and Post-op elementwise without performing reduction - Added comprehensive unit tests for ReduceSumSquare, ReduceL1, ReduceL2, ReduceLogSum, and ReduceLogSumExp covering empty-axes scenarios
…ith_empty_axes behavior
34ce393 to
70c5112
Compare
|
@skottmckay @snnn could you please review this PR (#26436)? |
Description
This PR fixes the behavior of the reduction operators so it's aligned with the ONNX specification.
See ONNX ReduceSumSquare Specification
for the definition of noop_with_empty_axes and expected behavior when axes=[] or when the axes input is not provided.
Main changes:
This function performs elementwise operations according to the aggregator type:
If the aggregator defines Pre/Post operations (e.g., abs, square, sqrt, log), they are applied elementwise on each element of the input without reduction.
Otherwise, a direct memory copy (memcpy) is performed to efficiently produce an identical output.
Introduced a compile-time trait system ReduceAggTraits to detect whether each aggregator defines a PreOp and/or PostOp.
This allows compile-time specialization and avoids redundant runtime checks.
Updated the generic reduction paths (CommonReduce1Loop and CommonReduce2Loops)
to invoke ApplyNoopEmptyAxesElementwise() when axes=[] and noop_with_empty_axes=1.
These paths are used by all reduction operators, ensuring consistent handling across the entire operators.
Fixed the conditional inside CommonFastReduceCopy: replaced the previous unconditional memcpy and return true with a safe return false, since empty-axes cases are now fully handled upstream by ApplyNoopEmptyAxesElementwise().
This keeps the control flow explicit and prevents unintended fallback copies.
Added unit tests for all affected reduction operators (ReduceSumSquare, ReduceL1, ReduceL2, ReduceLogSum, ReduceLogSumExp),and corrected one test that previously expected identity output but now correctly applies Pre/Post operations per the ONNX specification.
These updates ensure spec-compliant behavior for all Reduce ops and slightly improve performance for identity cases.
Motivation and Context
Before this fix, some Reduce operators (e.g., ReduceSumSquare, ReduceL1, ReduceL2, ReduceLogSum, ReduceLogSumExp) did not follow the ONNX spec when axes=[] and noop_with_empty_axes=1.Per the ONNX specification:
No reduction should occur if axes is empty and noop_with_empty_axes=1.
Operators with Pre/Post (e.g., abs, square, sqrt, log) must apply them elementwise.
Others should return the input unchanged.
Fixes #26288
Fixes #25095