[Relax] Add affine_grid operator with PyTorch and ONNX frontend support #18933

Merged
tlopex merged 6 commits into apache:main from Aharrypotter:relax-affine-grid
Mar 27, 2026

Conversation

@Aharrypotter
Contributor

Summary

Add relax.image.affine_grid operator for Spatial Transformer Networks, along with PyTorch and ONNX frontend integration.

TOPI compute (topi.image.affine_grid) already exists. This PR completes the Relax-level registration and frontend support, following the existing resize2d / grid_sample pattern.

Changes

Relax op registration:

  • C++ op function, FFI registration, and struct info inference (resize.h, resize.cc)
  • Python wrapper with flexible size parameter handling (image.py)
  • Legalization to topi.image.affine_grid with PrimExpr-to-int conversion
  • Op-level tests (struct info inference + e2e numerical correctness) and legalization test

PyTorch frontend:

  • Converter for aten.affine_grid_generator.default
  • Layout conversion from TVM [N,2,H,W] to PyTorch [N,H,W,2] via permute_dims
  • Single-kernel path is 5.6x faster than the decomposed path (30+ ops)
  • Structural IR test and numerical correctness test

ONNX frontend:

  • AffineGrid converter with _impl_v20 (opset 20, when the op was first introduced)
  • Support for constant size tensor [N,C,H,W]
  • Layout conversion from TVM [N,2,H,W] to ONNX [N,H,W,2]
  • End-to-end correctness test against ONNX Runtime

Limitations

  • Only align_corners=True is supported (matches current TOPI implementation)
  • Only 2D affine grid is supported
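
The layout conversion and the align_corners=True restriction above can be illustrated with a small reference computation. This is a dependency-free sketch, not the TOPI kernel: it assumes the usual align_corners=True convention (border samples land exactly on -1 and 1), and the helper names `affine_grid_nchw` and `nchw_to_nhwc` are hypothetical.

```python
def affine_grid_nchw(theta, height, width):
    """Reference 2D affine grid with align_corners=True.

    theta is a nested list shaped [N][2][3]; the result is shaped [N][2][H][W].
    Requires height >= 2 and width >= 2 so the step size is well defined.
    """
    # Normalized coordinates: endpoints map exactly to -1 and 1.
    xs = [-1.0 + 2.0 * j / (width - 1) for j in range(width)]
    ys = [-1.0 + 2.0 * i / (height - 1) for i in range(height)]
    return [
        [
            # Row 0 of the matrix yields x', row 1 yields y'.
            [[row[0] * x + row[1] * y + row[2] for x in xs] for y in ys]
            for row in mat
        ]
        for mat in theta
    ]


def nchw_to_nhwc(grid):
    """Permute [N][2][H][W] to [N][H][W][2], like permute_dims with axes [0, 2, 3, 1]."""
    return [
        [
            [[sample[c][h][w] for c in range(len(sample))]
             for w in range(len(sample[0][h]))]
            for h in range(len(sample[0]))
        ]
        for sample in grid
    ]


identity = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]]
grid_tvm = affine_grid_nchw(identity, height=2, width=3)   # [1][2][2][3]
grid_frontend = nchw_to_nhwc(grid_tvm)                      # [1][2][3][2]
```

With the identity matrix, each output location simply holds its own normalized (x, y) coordinate, which makes the two layouts easy to compare by eye.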

Validation

pytest tests/python/relax/test_op_image.py -k "affine_grid" -v           # 8 passed
pytest tests/python/relax/test_transform_legalize_ops_image.py -k "affine_grid" -v  # 1 passed
pytest tests/python/relax/test_frontend_from_exported_program.py -k "affine_grid" -v  # 2 passed
pytest tests/python/relax/test_frontend_onnx.py -k "affine_grid" -v     # 1 passed

All 12 tests passed.

Aharrypotter and others added 3 commits March 26, 2026 00:55
Add relax.image.affine_grid operator with:
- C++ op function, FFI registration, and struct info inference
- Python wrapper with flexible size parameter handling
- Legalization to topi.image.affine_grid
- Op-level tests (struct info inference + e2e numerical correctness)
- Legalization test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add converter for aten.affine_grid_generator.default with:
- Layout conversion from TVM [N,2,H,W] to PyTorch [N,H,W,2]
- Structural IR test and numerical correctness test
- Requires run_ep_decomposition=False for single-kernel path (5.6x faster)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add AffineGrid converter with:
- Support for constant size tensor [N,C,H,W]
- Layout conversion from TVM [N,2,H,W] to ONNX [N,H,W,2]
- End-to-end correctness test against ONNX Runtime

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the affine_grid operator to Relax, enabling support for Spatial Transformer Networks within the TVM ecosystem. It provides the necessary infrastructure for the operator, including C++ and Python bindings, and integrates it with both PyTorch and ONNX frontends. This enhancement allows for more efficient and direct compilation of models utilizing affine grid generation, improving performance and expanding the range of supported models.

Highlights

  • Relax Operator Registration: The relax.image.affine_grid operator has been registered, including C++ op function, FFI registration, struct info inference, and a Python wrapper with flexible size parameter handling. Legalization to topi.image.affine_grid with PrimExpr to int conversion is also implemented.
  • PyTorch Frontend Integration: A converter for aten.affine_grid_generator.default has been added, including layout conversion from TVM's [N,2,H,W] to PyTorch's [N,H,W,2] via permute_dims. This single-kernel path is significantly faster than the decomposed path.
  • ONNX Frontend Integration: An AffineGrid converter with _impl_v20 (for opset 20) has been implemented, supporting constant size tensors [N,C,H,W] and handling layout conversion from TVM's [N,2,H,W] to ONNX's [N,H,W,2].
  • Limitations: Currently, only align_corners=True and 2D affine grids are supported, matching the existing TOPI implementation.
  • Validation: Comprehensive tests have been added for op-level functionality, struct info inference, e2e numerical correctness, and frontend conversions, with all 12 tests passing.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the affine_grid operator to TVM Relax, including its Python API, C++ backend implementation, and frontend converters for ONNX and PyTorch. Comprehensive tests for the operator, its legalization, and frontend integration have also been added. Feedback from the review suggests improving input validation in the ONNX frontend by using ValueError instead of assert, addressing potential runtime errors when converting symbolic PrimExpr to int during legalization, clarifying the implicit behavior of the size parameter in the Python API's docstring, and resolving or explaining the # type: ignore comment in the Python API.

raise NotImplementedError(f"Dynamic size of type {type(size)} is not supported")

# Only 2D is supported: size = [N, C, H, W]
assert len(size_vals) == 4, "Only 2D AffineGrid (size=[N,C,H,W]) is supported"
Contributor

high

Using assert for input validation is generally discouraged in production code, as assertions can be optimized out, leading to unexpected behavior. It's better to raise a ValueError or TVMError to ensure consistent error handling for invalid inputs.

Suggested change
- assert len(size_vals) == 4, "Only 2D AffineGrid (size=[N,C,H,W]) is supported"
+ if len(size_vals) != 4:
+     raise ValueError("Only 2D AffineGrid (size=[N,C,H,W]) is supported")

Member

This makes sense.
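
The reason the suggestion matters is that Python drops `assert` statements when run with `-O`, so an explicit exception is the only check guaranteed to fire. A minimal sketch of the validation, assuming a static `size` list; the helper name `parse_affine_grid_size` is hypothetical and is not the converter's actual code:

```python
def parse_affine_grid_size(size_vals):
    """Return (H, W) from a static ONNX-style size list [N, C, H, W]."""
    # An explicit raise survives `python -O`, unlike an assert.
    if len(size_vals) != 4:
        raise ValueError(
            "Only 2D AffineGrid (size=[N,C,H,W]) is supported, "
            f"got {len(size_vals)} values"
        )
    _, _, height, width = (int(v) for v in size_vals)
    return height, width


target_hw = parse_affine_grid_size([1, 3, 4, 5])
```

A 5-element size (the 3D case, [N,C,D,H,W]) is rejected with a ValueError instead of silently passing under optimized bytecode.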


@register_legalize("relax.image.affine_grid")
def _image_affine_grid(bb: BlockBuilder, call: Call) -> Expr:
    target_shape = [int(v) for v in call.args[1].values]
Contributor

high

Converting PrimExpr values to int using int(v) can lead to runtime errors if v is a symbolic tir.Var that cannot be evaluated to a concrete integer at this stage. If topi.image.affine_grid requires concrete integer shapes, a check should be added to ensure v is not symbolic, or a more robust conversion mechanism should be used if topi can handle symbolic shapes.

Member

int(v) will crash with an unhelpful error on symbolic shapes. Consider adding an explicit check with a clear message, e.g.:

for v in call.args[1].values:
    if not isinstance(v, (int, tir.IntImm)):
        raise ValueError(
            "affine_grid legalization requires static target_shape, "
            f"got symbolic value: {v}"
        )
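
One way to fold the check above into the conversion itself is to validate and convert in a single pass. The sketch below runs without TVM: `IntImmStandIn` and `SymbolicStandIn` are hypothetical stand-ins for `tir.IntImm` and a symbolic `tir.Var`, and `to_static_shape` is an illustrative helper, not part of the PR.

```python
class IntImmStandIn:
    """Stand-in for tir.IntImm: a constant integer exposing `.value`."""

    def __init__(self, value):
        self.value = value


class SymbolicStandIn:
    """Stand-in for a symbolic tir.Var with no concrete value."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


def to_static_shape(values, int_imm_type=IntImmStandIn):
    """Convert shape values to Python ints, rejecting symbolic entries."""
    static = []
    for v in values:
        if isinstance(v, int):
            static.append(v)
        elif isinstance(v, int_imm_type):
            static.append(int(v.value))
        else:
            raise ValueError(
                "affine_grid legalization requires static target_shape, "
                f"got symbolic value: {v}"
            )
    return static


shape = to_static_shape([4, IntImmStandIn(8)])
```

In real legalization code the stand-in type would be replaced by `tir.IntImm`; the point of the sketch is only that the error names the offending value before any `int(...)` call is attempted.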

Comment on lines +206 to +207
if isinstance(size, int | PrimExpr):
    size = (size, size)
Contributor

medium

The behavior of treating a single int or PrimExpr for size as (size, size) (i.e., square dimensions) is an implicit convention. It would be beneficial to explicitly document this behavior in the docstring for clarity, or add a comment here.

Suggested change
- if isinstance(size, int | PrimExpr):
-     size = (size, size)
+ if isinstance(size, int | PrimExpr):  # Assume square dimensions if a single value is provided
+     size = (size, size)
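
The same convention can also be stated in a docstring, as the comment suggests. A minimal sketch restricted to plain Python ints (the real wrapper also accepts `PrimExpr` and wraps the result in `ShapeExpr`, both omitted here); the function name `normalize_size` is hypothetical:

```python
def normalize_size(size):
    """Normalize `size` to an (H, W) tuple.

    A single int is treated as a square target, so ``size=16`` means
    ``(16, 16)``. A 2-element tuple or list is taken as (H, W).
    """
    if isinstance(size, int):
        size = (size, size)
    size = tuple(size)
    if len(size) != 2:
        raise ValueError(f"size must be an int or (H, W), got {len(size)} values")
    return size


square = normalize_size(16)
rect = normalize_size([8, 12])
```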

if isinstance(size, tuple | list):
    size = ShapeExpr(size)

return _ffi_api.affine_grid(data, size) # type: ignore
Contributor

medium

The # type: ignore comment indicates that the type checker is reporting an issue. It's generally best to resolve type issues directly rather than ignoring them, as it can hide potential bugs. If the type system cannot express this correctly, a more specific explanation for the ignore would be helpful.

Member

@tlopex tlopex left a comment

Some things to improve:
InferStructInfoAffineGrid only checks ndim == 3 right now, but does not validate shape[1] == 2 and shape[2] == 3. Might be worth adding compile-time checks here so invalid inputs do not fail only at runtime.

align_corners=False is rejected in both ONNX and PyTorch frontends, but this is not mentioned in the Python docstring in image.py. Could we document this limitation explicitly?

The legalization test only verifies output shape at the moment. Could we also check structural equality against an expected TVMScript module, similar to test_resize2d?
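
The shape validation requested in the first point above can be sketched in Python as follows. This mirrors the logic only; the actual check would live in the C++ `InferStructInfoAffineGrid`. The helper name `check_theta_shape` and the use of `None` for a dimension that is not statically known are assumptions of this sketch.

```python
def check_theta_shape(shape):
    """Validate that the affine matrix input is shaped [N, 2, 3].

    `shape` is a sequence of ints; None marks a dimension that is not
    known statically and is therefore skipped.
    """
    if len(shape) != 3:
        raise ValueError(
            f"affine_grid expects 3-D data [N, 2, 3], got ndim={len(shape)}"
        )
    for axis, expected in ((1, 2), (2, 3)):
        dim = shape[axis]
        if dim is not None and dim != expected:
            raise ValueError(
                f"affine_grid expects data.shape[{axis}] == {expected}, got {dim}"
            )


check_theta_shape([4, 2, 3])     # valid, static batch
check_theta_shape([None, 2, 3])  # valid, unknown batch
```

Checking both trailing dimensions at inference time means an input such as [N, 3, 3] is reported when the graph is built rather than when the kernel runs.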

.set_attr<TMixedPrecisionPolicy>("TMixedPrecisionPolicy", MixedPrecisionPolicyKind::kFollow)
.set_attr<Bool>("FPurity", Bool(true));

/* relax.affine_grid */
Member

Wrong comment. Should be relax.image.affine_grid

@Aharrypotter
Contributor Author

Updated in a follow-up commit.

  • Changed affine_grid legalization test to use structural equality
  • Clarified the affine_grid docstring for single-value size
  • Revalidated the related Relax image tests locally

@tlopex
Member

tlopex commented Mar 26, 2026

Please make sure CI passes; if CI happens to be down, you can retrigger it. Also, please resolve the merge conflicts.

Member

@tlopex tlopex left a comment

LGTM, thanks.

@tlopex tlopex merged commit 2f2469e into apache:main Mar 27, 2026
6 checks passed