[#15344][fix] Add SM121 to MLA allowlists #15434
Signed-off-by: peter941221 <peter941221@gmail.com>
2c89f99 to cfb27eb
/bot run
No actionable comments were generated in the recent review.
Walkthrough
SM121 is added to the allowed SM version sets in two MLA compatibility guards.
Changes: SM121 MLA Feature Support
Estimated code review effort: 2 (Simple) | ~10 minutes
Pre-merge checks: 5 passed
/bot run
PR_Github #54790 [ run ] triggered by Bot. Commit:
PR_Github #54790 [ run ] completed with state
Signed-off-by: peter941221 <peter941221@gmail.com>
Fixed in c026457. The failure on cfb27eb was the pre-commit gate, not a behavior regression.
I pushed the formatter output only. No semantic change. An NVIDIA team member will need to rerun CI.
Description
py_executor_creator.py treats SM121 as unsupported for both MLA KV cache reuse and MLA chunked prefill. That fallback disables both features.
This patch adds SM121 to both MLA allowlists so SM121 follows the same MLA path as SM120.
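For readers unfamiliar with the gate pattern, here is a minimal sketch of how an SM allowlist guard of this kind might behave. It is not copied from py_executor_creator.py: the names MLA_CACHE_REUSE_SMS, MLA_CHUNKED_PREFILL_SMS, MlaFeatures, and apply_mla_guards are hypothetical, and the sets list only SM120 and SM121 because those are the only versions this description mentions.

```python
from dataclasses import dataclass

# Hypothetical allowlists. The real sets in py_executor_creator.py likely
# contain more SM versions; only 120 and 121 are named in this PR.
MLA_CACHE_REUSE_SMS = frozenset({120, 121})
MLA_CHUNKED_PREFILL_SMS = frozenset({120, 121})


@dataclass
class MlaFeatures:
    """Illustrative stand-in for the two MLA feature flags."""
    enable_block_reuse: bool
    enable_chunked_prefill: bool


def apply_mla_guards(sm_version: int, requested: MlaFeatures) -> MlaFeatures:
    """Keep each requested feature only if the SM version is allowlisted."""
    return MlaFeatures(
        enable_block_reuse=(
            requested.enable_block_reuse and sm_version in MLA_CACHE_REUSE_SMS
        ),
        enable_chunked_prefill=(
            requested.enable_chunked_prefill
            and sm_version in MLA_CHUNKED_PREFILL_SMS
        ),
    )


if __name__ == "__main__":
    wanted = MlaFeatures(enable_block_reuse=True, enable_chunked_prefill=True)
    print(apply_mla_guards(121, wanted))  # allowlisted: both flags stay on
    print(apply_mla_guards(89, wanted))   # not in this sketch's sets: both off
```

In this sketch, an SM version missing from a set silently turns the matching feature off, which mirrors the fallback described above for SM121 before the patch.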
The unit coverage in tests/unittest/_torch/executor/test_py_executor_creator_mla_cache_reuse_sync.py now checks both SM121 gates:
test_mla_sm121_supported_configuration_preserves_cache_reuse
test_mla_sm121_supported_configuration_preserves_chunked_prefill
Test Coverage
python -m py_compile tensorrt_llm/_torch/pyexecutor/py_executor_creator.py tests/unittest/_torch/executor/test_py_executor_creator_mla_cache_reuse_sync.py
tests/unittest/_torch/executor/test_py_executor_creator_mla_cache_reuse_sync.py
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment /bot help.
Summary by CodeRabbit
New Features
Tests