Signed-off-by: Jingyu Xin <jingyux@nvidia.com>
Codecov Report ❌

@@ Coverage Diff @@
## main #1166 +/- ##
==========================================
+ Coverage 74.28% 75.11% +0.83%
==========================================
Files 349 353 +4
Lines 39846 40122 +276
==========================================
+ Hits 29599 30139 +540
+ Misses 10247 9983 -264
Force-pushed 8151232 to 5873652.
What does this PR do?
Type of change: new feature, new example
Summary
- `flash_skip_softmax` with exponential model calibration (`scale_factor = a * exp(b * sparsity)`)
- Calibration (`F.softmax` patching) works on diffusion models that normally use `scaled_dot_product_attention`
- `forward_loop` support (required for non-LLM models)

Changes
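As a rough illustration of the exponential calibration model above (a sketch, not the shipped code): fitting `scale_factor = a * exp(b * sparsity)` reduces to linear least squares in log space, since `log(scale) = log(a) + b * sparsity`. The function names below are illustrative.

```python
import numpy as np

def fit_exponential_scale_model(sparsity, scale_factor):
    """Fit scale_factor = a * exp(b * sparsity) via linear least
    squares in log space: log(scale) = log(a) + b * sparsity."""
    b, log_a = np.polyfit(np.asarray(sparsity), np.log(scale_factor), 1)
    return float(np.exp(log_a)), float(b)

def predict_scale(a, b, sparsity):
    """Evaluate the fitted exponential model at a given sparsity."""
    return a * np.exp(b * sparsity)

# Synthetic calibration points generated from a known model,
# so the fit should recover a = 2.0, b = -1.5.
s = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
scale = 2.0 * np.exp(-1.5 * s)
a, b = fit_exponential_scale_model(s, scale)
```

The log-space reduction keeps the fit a one-liner; a nonlinear fit (e.g. `scipy.optimize.curve_fit`) would work too when the measured scale factors are noisy.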
- `diffusers_triton_attention.py`, `diffusers_eager_attention.py`, `ltx_triton_attention.py`, `ltx_eager_attention.py` — route diffusers/LTX attention through explicit `F.softmax` for calibration
- `kernels/__init__.py`: thread-local context management, lazy imports for diffusers/LTX backends
- `conversion.py`: auto-register diffusers backends on `sparsify()`, updated export config and summary
- `calibrate.py`: skip RULER dataset when `forward_loop` is provided (enables diffusion model calibration)
- `flash_skip_softmax.py`: enhanced context manager activates diffusers eager backend
- `plugins/huggingface.py`: support diffusers `ModelMixin` in model detection
- Example scripts: `ltx2_skip_softmax.py`, `wan22_skip_softmax.py`

Usage
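The `F.softmax` patching with thread-local context management described in the changes can be sketched roughly as follows. Everything here is a stand-in: the class `F` substitutes for `torch.nn.functional`, and `softmax_calibration` and the near-zero-probability sparsity proxy are illustrative, not the real `flash_skip_softmax.py` implementation.

```python
import math
import threading
from contextlib import contextmanager

class F:
    """Stand-in for torch.nn.functional; the real patch would
    target torch.nn.functional.softmax."""
    @staticmethod
    def softmax(scores):
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

# Thread-local state: calibration is active only in the thread
# that entered the context, so patches do not leak across threads.
_tls = threading.local()

@contextmanager
def softmax_calibration():
    """Swap F.softmax for an observing wrapper; restore on exit."""
    original = F.softmax
    _tls.observed = []

    def observing_softmax(scores):
        probs = original(scores)
        # Record a crude sparsity proxy: fraction of near-zero probs.
        _tls.observed.append(sum(p < 1e-3 for p in probs) / len(probs))
        return probs

    F.softmax = observing_softmax
    try:
        yield _tls.observed
    finally:
        F.softmax = original

# Any attention code calling F.softmax inside the context is observed.
with softmax_calibration() as stats:
    F.softmax([10.0, 0.0, 0.0, 0.0])
# stats[0] == 0.75: three of the four probabilities are near zero.
```

This is why the diffusers/LTX attention paths are rerouted through an explicit `F.softmax`: `scaled_dot_product_attention` fuses the softmax into a kernel, leaving nothing to patch.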
Example scripts
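The example scripts drive calibration through a caller-supplied `forward_loop`. A minimal sketch of the `calibrate.py` behavior, with all names hypothetical rather than the real API: when `forward_loop` is provided, the default RULER text dataset is skipped and the caller's loop runs the forward passes instead.

```python
class RecordingModel:
    """Minimal stand-in for a model: records every input it receives."""
    def __init__(self):
        self.inputs = []
    def __call__(self, batch):
        self.inputs.append(batch)

def calibrate(model, forward_loop=None):
    """Hypothetical sketch (names assumed): with forward_loop given,
    the default RULER dataset is skipped entirely and the caller
    drives the forward passes, e.g. a diffusion denoising loop."""
    if forward_loop is None:
        # LLM path: fall back to built-in RULER-style prompts (stubbed).
        def forward_loop(m):
            for prompt in ["default prompt 1", "default prompt 2"]:
                m(prompt)
    forward_loop(model)
    return model

# LLM-style call: no forward_loop, the default dataset is used.
llm = RecordingModel()
calibrate(llm)

# Diffusion-style call: custom forward_loop, RULER dataset skipped.
diffusion = RecordingModel()
calibrate(diffusion, forward_loop=lambda m: m("latent batch"))
```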
Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information