
feat: multiple optimization profiles for disjoint input shape regimes #4325

Draft
cehongwang wants to merge 1 commit into abose/dynamic-shapes-passthrough from cehongw/multi-optimization-profile

Conversation

@cehongwang

Collaborator

Add support for defining N optimization profiles at compile time via the list-based Input.profiles API and selecting the active profile at runtime (manual pin by index, or opt-in shape-based auto-selection).

  • AOT (torch.export) compile path builds one TRT optimization profile per declared profile index; submodules inherit the profile count via propagation across graph breaks.
  • Python and C++ runtimes expose a matching primitive engine API (set_active_profile / num_optimization_profiles / _active_profile_index / _auto_select_profiles) so the two runtimes remain interchangeable.
  • Profile selection is exposed through the optimization_profile context manager; auto-selection uses lazy (first-fitting) profile selection.
  • Backward compatible: engines without declared profiles keep the historical single-profile (dynamic) / no-profile (static) behavior.
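
The "lazy (first-fitting)" auto-selection mentioned above can be illustrated with a small, library-free sketch. Everything here is an assumption made for illustration: `ShapeRange` and `select_first_fitting` are hypothetical names, not part of this PR or of Torch-TensorRT, and the sketch models a single input with min/max bounds only. It shows one plausible reading of the policy: walk the declared profiles in order and return the index of the first one whose bounds contain the runtime shape.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class ShapeRange:
    """Hypothetical stand-in for one profile's min/max bounds on a single input."""

    min_shape: Shape
    max_shape: Shape

    def contains(self, shape: Shape) -> bool:
        # A shape fits only if it has the same rank and every dimension
        # lies within [min, max] for that dimension.
        if len(shape) != len(self.min_shape) or len(shape) != len(self.max_shape):
            return False
        return all(
            lo <= dim <= hi
            for lo, dim, hi in zip(self.min_shape, shape, self.max_shape)
        )


def select_first_fitting(profiles: Sequence[ShapeRange], shape: Shape) -> Optional[int]:
    """Return the index of the first profile that contains `shape`, else None."""
    for index, profile in enumerate(profiles):
        if profile.contains(shape):
            return index
    return None


# Two disjoint batch-size regimes (illustrative values, not taken from the PR).
profiles = [
    ShapeRange(min_shape=(1, 3, 224, 224), max_shape=(8, 3, 224, 224)),
    ShapeRange(min_shape=(16, 3, 224, 224), max_shape=(64, 3, 224, 224)),
]

small_batch_index = select_first_fitting(profiles, (4, 3, 224, 224))
large_batch_index = select_first_fitting(profiles, (32, 3, 224, 224))
gap_index = select_first_fitting(profiles, (12, 3, 224, 224))
```

With disjoint regimes, a shape that falls in the gap between profiles matches none of them. How the runtime in this PR handles that case is not stated in the description, so the sketch simply returns `None`.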

Includes an example and runtime tests covering dynamic submodule inputs.
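
As a rough illustration of pinning a profile by index through a context manager, the sketch below uses a toy engine object. `ToyEngine`, `pinned_profile`, and the restore-on-exit behavior are assumptions made for illustration. Only the names `set_active_profile` and `num_optimization_profiles` come from the description above, and their real signatures may differ.

```python
from contextlib import contextmanager


class ToyEngine:
    """Hypothetical engine exposing a profile count and an active-profile setter."""

    def __init__(self, num_profiles: int) -> None:
        self.num_optimization_profiles = num_profiles
        self.active_profile_index = 0

    def set_active_profile(self, index: int) -> None:
        # Reject indices outside the declared profile range.
        if not 0 <= index < self.num_optimization_profiles:
            raise IndexError(f"profile index {index} out of range")
        self.active_profile_index = index


@contextmanager
def pinned_profile(engine: ToyEngine, index: int):
    """Pin `index` for the duration of the block, then restore the previous profile."""
    previous = engine.active_profile_index
    engine.set_active_profile(index)
    try:
        yield engine
    finally:
        # Assumption: the previous selection is restored even if the body raises.
        engine.set_active_profile(previous)


engine = ToyEngine(num_profiles=2)
with pinned_profile(engine, 1):
    index_inside_block = engine.active_profile_index
index_after_block = engine.active_profile_index
```

Restoring the previous index in a `finally` clause keeps a failed inference call from leaving the engine pinned to an unexpected profile. Whether the PR's `optimization_profile` context manager behaves this way is not confirmed by the description.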

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

@meta-cla meta-cla Bot added the cla signed label Jun 8, 2026
@github-actions github-actions Bot added component: tests Issues re: Tests component: conversion Issues re: Conversion stage component: core Issues re: The core compiler component: api [Python] Issues re: Python API component: runtime component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths labels Jun 8, 2026
@cehongwang cehongwang force-pushed the cehongw/multi-optimization-profile branch from f32fed3 to 427643d Compare June 8, 2026 23:32
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Jun 8, 2026
@cehongwang cehongwang force-pushed the cehongw/multi-optimization-profile branch from 427643d to 2cd4797 Compare June 9, 2026 00:24
@cehongwang cehongwang requested review from apbose and narendasan June 9, 2026 00:28
