[Tracking Issue] Add missing vision/image operators #18928

### Background                                                                                                                     
                                               
Since we phased out Relay in early 2025, several vision/image operators that Relay supported are missing from Relax, blocking deployment of common detection and segmentation models (Faster R-CNN, Mask R-CNN, Spatial Transformer Networks, etc.).

Relax currently has only three vision/image ops: `all_class_non_max_suppression`, `resize2d`, and `grid_sample`.

### Scope                                                                                                                          
                                                                  
Each operator needs:
1. **TOPI compute** (`python/tvm/topi/`) — if not already present
2. **Relax op registration** — C++ attrs + struct info inference (`src/relax/op/`, `include/tvm/relax/attrs/`) + Python wrapper (`python/tvm/relax/op/`)                                                                                                           
3. **Legalization** (`python/tvm/relax/transform/legalize_ops/`)                                                                   
4. **Frontend integration** — update ONNX/PyTorch/TFLite frontends to emit the new op                                              
5. **Tests** — op-level unit tests + frontend integration tests                     


Steps 1–3 (Relax op) and step 4 (frontend) can be done in separate PRs or combined — contributors can choose based on scope.                                                 
                                                                                                                                     
### Operators                                                                                                                      
                                                                                                                                     
#### Tier 1 — Unblocks mainstream detection/segmentation models 

- [x] `roi_align` — Core op for Faster R-CNN, Mask R-CNN, Cascade R-CNN. TOPI was removed during te.schedule phase-out, needs full reimplementation.
- [x] `affine_grid` — Generates sampling grid for Spatial Transformer Networks. TOPI already exists (`topi.image.affine_grid`), only needs Relax op + legalization. Pairs with existing `grid_sample`.                                                             
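
For orientation, here is a minimal pure-Python sketch of what an `affine_grid`-style op computes for a single 2x3 affine matrix. It assumes target coordinates normalized to [-1, 1] with both endpoints included, which may not match the conventions of `topi.image.affine_grid` or of a given frontend. The name `affine_grid_reference` is hypothetical and not part of TVM.

```python
def affine_grid_reference(theta, height, width):
    """Return [grid_x, grid_y], each a height x width nested list.

    theta is a 2x3 affine matrix [[a, b, tx], [c, d, ty]].
    Assumption: target pixels map to [-1, 1] with both endpoints
    included (corner-aligned); real implementations may differ.
    """
    def linspace(n):
        # n evenly spaced values from -1 to 1; a single pixel sits at 0.
        if n == 1:
            return [0.0]
        return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]

    xs, ys = linspace(width), linspace(height)
    # Apply the affine map to every normalized target coordinate (x, y).
    grid_x = [[theta[0][0] * x + theta[0][1] * y + theta[0][2] for x in xs]
              for y in ys]
    grid_y = [[theta[1][0] * x + theta[1][1] * y + theta[1][2] for x in xs]
              for y in ys]
    return [grid_x, grid_y]


# The identity transform reproduces the normalized target coordinates.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
grid = affine_grid_reference(identity, height=2, width=3)
```

A grid like this is what a `grid_sample`-style op would then consume to look up source pixels, which is why the two ops pair naturally.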
  
#### Tier 2 — Fills domain gaps and fixes broken paths                                                                             
                                                                  
- [x] `resize3d` — 3D volume resize for medical imaging (CT/MRI) and video. TOPI already exists (`topi.image.resize3d`), only needs Relax op + legalization. Note: ONNX frontend currently works around this by calling TOPI directly.
- [ ] `get_valid_counts` — Score-based bounding box filtering. Filters out low-score boxes and returns valid count per batch. Current TOPI is a no-op stub; needs full implementation.
- [ ] `non_max_suppression` (classic) — Flexible single-class NMS. Performs IoU-based suppression on filtered boxes from get_valid_counts. TOPI was removed; needs reimplementation. Complements existing all_class_non_max_suppression for custom post-processing pipelines. 
- [x] `multibox_transform_loc` — Decodes bounding box predictions using anchor priors + predicted offsets, with score thresholding. Only needed by TFLite DETECTION_POSTPROCESS. No existing TOPI or Relax op; needs full implementation.                                                          
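
The filtering and suppression steps described for `get_valid_counts` and classic `non_max_suppression` can be sketched in plain Python as follows. This is an illustrative reference for one batch element only. The box layout `(score, x1, y1, x2, y2)`, the function names, and the parameter names are assumptions made for this sketch; the tensor layouts, padding behavior, and attributes of the eventual Relax ops are not specified here.

```python
def get_valid_counts_reference(boxes, score_threshold):
    """Keep boxes whose score exceeds the threshold (one batch element).

    Each box is (score, x1, y1, x2, y2). Returns (valid_count, kept_boxes).
    """
    kept = [b for b in boxes if b[0] > score_threshold]
    return len(kept), kept


def iou(a, b):
    """Intersection over union of two (score, x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[1], b[1]), max(a[2], b[2])
    ix2, iy2 = min(a[3], b[3]), min(a[4], b[4])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[3] - a[1]) * (a[4] - a[2])
    area_b = (b[3] - b[1]) * (b[4] - b[2])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_reference(boxes, iou_threshold):
    """Greedy single-class NMS: visit boxes by descending score and drop
    any box that overlaps an already selected box by more than the
    threshold."""
    selected = []
    for box in sorted(boxes, key=lambda b: b[0], reverse=True):
        if all(iou(box, s) <= iou_threshold for s in selected):
            selected.append(box)
    return selected


candidates = [
    (0.9, 0.0, 0.0, 10.0, 10.0),
    (0.8, 1.0, 1.0, 11.0, 11.0),    # heavily overlaps the first box
    (0.7, 20.0, 20.0, 30.0, 30.0),
    (0.1, 0.0, 0.0, 5.0, 5.0),      # below the score threshold
]
count, valid = get_valid_counts_reference(candidates, score_threshold=0.5)
result = nms_reference(valid, iou_threshold=0.5)
```

A reference like this could serve as the expected-output oracle in op-level unit tests, independent of how the TOPI compute is written.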
                                                                  
#### Tier 3 — Frontend integration

- [x] ONNX `RoiAlign`: add `relax.op.vision.roi_align` op definition + legalization, implement converter class, uncomment registration, add tests
- [x] ONNX `AffineGrid`: add `relax.op.image.affine_grid` op definition + legalization, implement converter class, add tests
- [x] ONNX `Resize` (5-D input): add `relax.op.image.resize3d` op definition + legalization, replace `bb.emit_te` workaround in converter with `relax.op.image.resize3d`, add tests
- [x] ONNX `GridSample`: implement converter class (Relax op already exists), uncomment registration, add tests
- [x] PyTorch `torchvision.ops.roi_align`: register converter (depends on `relax.op.vision.roi_align`), add tests
- [x] PyTorch `torch.nn.functional.affine_grid`: register converter (depends on `relax.op.image.affine_grid`), add tests
- [x] PyTorch `torch.nn.functional.interpolate` 3D mode: add 5-D branch in converter using `relax.op.image.resize3d` (depends on the `resize3d` op), add tests
- [ ] TFLite `NON_MAX_SUPPRESSION_V5`: verify/add `get_valid_counts` + `non_max_suppression` Relax ops, implement converter, add tests
- [ ] TFLite `DETECTION_POSTPROCESS`: implement converter (anchor decoding + NMS), depends on the same Relax ops as above, add tests
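
As a rough illustration of the anchor-decoding step mentioned for `DETECTION_POSTPROCESS` and `multibox_transform_loc`, the sketch below decodes one box from an anchor prior and predicted offsets. It assumes the common SSD-style center-size encoding with variance scaling; the coordinate ordering and scale handling that TFLite actually uses may differ, and `decode_box` and its parameters are hypothetical names. Score thresholding and NMS are left out.

```python
import math


def decode_box(anchor, offsets, variances=(0.1, 0.1, 0.2, 0.2)):
    """Decode one box from an anchor prior and predicted offsets.

    anchor:  (cx, cy, w, h) in center-size form.
    offsets: (dx, dy, dw, dh) predicted by the network.
    Returns the box in corner form (x1, y1, x2, y2).
    Assumption: SSD-style encoding with variance scaling.
    """
    acx, acy, aw, ah = anchor
    dx, dy, dw, dh = offsets
    vx, vy, vw, vh = variances
    # Center offsets are scaled by the anchor size.
    cx = acx + dx * vx * aw
    cy = acy + dy * vy * ah
    # Size offsets are applied in log space.
    w = aw * math.exp(dw * vw)
    h = ah * math.exp(dh * vh)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Zero offsets return the anchor itself, converted to corner form.
box = decode_box((0.5, 0.5, 0.2, 0.4), (0.0, 0.0, 0.0, 0.0))
```

In a converter, the decoded boxes would then go through score filtering and NMS.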
                                                                                      
### Implementation reference                                                                                                       
                                                                                                                                     
- Existing vision op pattern: `src/relax/op/vision/nms.cc` + `python/tvm/relax/op/vision/nms.py`                                   
- Existing image op pattern: `src/relax/op/image/resize.cc` + `python/tvm/relax/op/image/image.py`
- Legalization: `python/tvm/relax/transform/legalize_ops/vision.py`, `image.py`                                                    
- `roi_align` reference implementation for testing: `python/tvm/topi/testing/roi_align_python.py`                                  
- Historical Relay implementations (design reference only; Relax has a different IR design, so these should not be copied directly): see the `v0.19.0` tag
                                                                                                                                     
`affine_grid` and `resize3d` are good first issues — TOPI already exists, just follow the existing `resize2d` / `grid_sample` pattern. If you're interested in contributing, please comment below to claim a task before starting work. 
