-
Notifications
You must be signed in to change notification settings - Fork 74
skip unsupported arches in python tests #5760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
!test |
Greptile SummaryThis PR restricts FP4 quantization and GEMM tests to specific GPU architectures. A new Key changes:
Main issue: Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Test as Test Suite
participant Utils as direct_utils/utils.py
participant Device as CUDA Device
Note over Test,Device: Test Execution Flow
Test->>Utils: Check architecture support
alt Using microarchitecture_is(10, 0)
Utils->>Device: Get device properties (major, minor)
Device-->>Utils: Return (10, 0) or other
Utils-->>Test: Return True only if exactly 10.0
Note over Test: Skip if not 10.0<br/>(test_cutlass_nvfp4_gemm.py,<br/>test_with_id_model_indexer.py,<br/>some tests in test_narrow_precision.py)
else Using is_blackwell()
Utils->>Device: Check multiple compute capabilities
Device-->>Utils: Return compute capability
Utils-->>Test: Return True if 10.0, 10.3, 12.0, or 12.1
Note over Test: Skip if not any Blackwell variant<br/>(some tests in test_narrow_precision.py)
end
alt Architecture supported
Test->>Test: Execute test (FP4 quantization, GEMM, etc.)
else Architecture not supported
Test->>Test: Skip test with reason
end
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 files reviewed, 1 comment
| @pytest.mark.skipif( | ||
| is_pre_blackwell(), reason="Only supported on blackwell and newer devices." | ||
| ) | ||
| @pytest.mark.skipif(not is_blackwell(), reason="Only supported on blackwell.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Inconsistent architecture filtering: this test uses is_blackwell() (supports 10.0, 10.3, 12.0, 12.1) while other tests in this same file use microarchitecture_is(10, 0) (only supports 10.0). This creates different test coverage across Blackwell variants.
|
Review updated until commit 55b5572 Description
|
| Relevant files | |||||||
|---|---|---|---|---|---|---|---|
| Tests |
| ||||||
| Enhancement |
|
PR Reviewer Guide
Here are some key observations to aid the review process:
| 🧪 PR contains tests |
| ⚡ Recommended focus areas for review |
Architecture Check Logic
microarchitecture_is(10, 0) check. Verify that this change correctly identifies the intended architectures and that the skip reason "Does not support blackwell compute 12.0, other arches are not tested" accurately reflects the test's requirements. |
Test failures
-
(Low, 1)
Minor numerical mismatch in Thunder vs Torch instance_norm cuda testsTest Name A100 Source thunder.tests.test_ops.test_core_vs_torch_consistency_instance_norm_nvfuser_cuda_thunder.dtypes.float32 ❌
|
The title suggests this has something to do with indexing. Can you clarify what it is? |
test_with_id_model_indexer.py
The title was not accurate, failure is not realated to indexing. Revised to |
|
@greptileai review this draft |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 files reviewed, 4 comments
| @pytest.mark.skipif( | ||
| is_pre_blackwell(), reason="Only supported on blackwell and newer devices." | ||
| ) | ||
| @pytest.mark.skipif( | ||
| not microarchitecture_is_pre(12), reason="Does not support blackwell compute 12.0" | ||
| not microarchitecture_is(10, 0), | ||
| reason="Does not support blackwell compute 12.0, other arches are not tested.", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Inconsistent architecture filtering within same file: this test uses microarchitecture_is(10, 0) (only supports 10.0), but other tests in this file (lines 66, 173, 320) use is_blackwell() which supports all Blackwell variants (10.0, 10.3, 12.0, 12.1). Are the _scaled_mm tests specifically incompatible with 12.x architectures while quantization tests support them? Do scaled_mm operations have specific incompatibilities with 12.x compute capabilities that quantization operations don't have?
| @pytest.mark.skipif( | ||
| is_pre_blackwell(), reason="Only supported on blackwell and newer devices." | ||
| ) | ||
| @pytest.mark.skipif( | ||
| not microarchitecture_is_pre(12), reason="Does not support blackwell compute 12.0" | ||
| not microarchitecture_is(10, 0), | ||
| reason="Does not support blackwell compute 12.0, other arches are not tested.", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Same inconsistency: microarchitecture_is(10, 0) here vs is_blackwell() at line 320 for similar grouped matmul tests
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| @pytest.mark.skipif( | ||
| is_pre_blackwell(), reason="Only supported on blackwell and newer devices." | ||
| ) | ||
| @pytest.mark.skipif( | ||
| not microarchitecture_is_pre(12), reason="Does not support blackwell compute 12.0" | ||
| not microarchitecture_is(10, 0), | ||
| reason="Does not support blackwell compute 12.0, other arches are not tested.", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Same pattern: microarchitecture_is(10, 0) here vs is_blackwell() elsewhere in file
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| @pytest.mark.skipif( | ||
| is_pre_blackwell(), reason="Only supported on blackwell and newer devices." | ||
| ) | ||
| @pytest.mark.skipif( | ||
| not microarchitecture_is_pre(12), reason="Does not support blackwell compute 12.0" | ||
| not microarchitecture_is(10, 0), | ||
| reason="Does not support blackwell compute 12.0, other arches are not tested.", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Same pattern: microarchitecture_is(10, 0) here vs is_blackwell() elsewhere in file
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
test_with_id_model_indexer.py|
!test |
|
Summary of failure reasons:
|
skip unsupported arches