
Conversation

@hila-f-qodo hila-f-qodo commented Dec 11, 2025

User description

Splits each torch library registration in the 2.10 folder into its own file. I had a script that parsed kernel.cpp to do this, but I felt that forcing this responsibility on the user might be less error prone.

Compiles each file targeting 2.9 and asserts that compilation fails. (There are two 2.9 kernels we use as negative tests, where compilation is expected to succeed.)

Stack from ghstack (oldest at bottom):


PR Type

Tests, Enhancement


Description

  • Split monolithic kernel.cpp into individual function files for better organization and maintainability

  • Added comprehensive test suite to verify TORCH_FEATURE_VERSION guards are properly used

  • Compiles each .cpp and .cu file with TORCH_TARGET_VERSION=2.9.0 and validates that compilation fails

  • Includes negative tests (mv_tensor_accessor_cpu.cpp and mv_tensor_accessor_cuda.cu) to verify test infrastructure correctness
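
The bullets above boil down to a compile-and-expect-failure loop. A minimal, hypothetical sketch of that loop (assuming g++ is on PATH; the function names and the stand-in preprocessor guard are illustrative, not the PR's actual code):

```python
# Hypothetical sketch: compile a file with TORCH_TARGET_VERSION pinned to 2.9.0
# and assert the compile fails, mirroring the PR's "guards working" check.
import shutil
import subprocess
import tempfile
from pathlib import Path

# 2.9.0 with major/minor packed into the top bytes, as in the PR's test
TORCH_VERSION_2_9 = "0x0209000000000000"

def compiles(source: Path) -> bool:
    """Return True if g++ can compile `source` to an object file."""
    with tempfile.TemporaryDirectory() as tmp:
        cmd = [
            "g++", "-c", "-std=c++17",
            f"-DTORCH_TARGET_VERSION={TORCH_VERSION_2_9}",
            str(source), "-o", str(Path(tmp) / "out.o"),
        ]
        return subprocess.run(cmd, capture_output=True, timeout=30).returncode == 0

if shutil.which("g++"):
    # Stand-in for a guarded kernel: the preprocessor rejects targets below 2.10.
    guarded = (
        "#if TORCH_TARGET_VERSION < 0x020a000000000000\n"
        "#error this kernel requires a 2.10 target\n"
        "#endif\n"
    )
    with tempfile.TemporaryDirectory() as d:
        src = Path(d) / "needs_2_10.cpp"
        src.write_text(guarded)
        assert not compiles(src)  # a 2.9 target must fail, as the suite asserts
```

The negative tests invert the final assertion: a file with no 2.10-only API is expected to compile cleanly under the 2.9 target.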


Diagram Walkthrough

flowchart LR
  A["kernel.cpp<br/>monolithic file"] -->|"split into"| B["Individual function files<br/>15+ .cpp/.cu files"]
  B -->|"tested by"| C["test_version_compatibility.py<br/>version guard validator"]
  C -->|"compiles with"| D["TORCH_TARGET_VERSION=2.9.0"]
  D -->|"expects failure"| E["Compilation fails<br/>guards working"]
  D -->|"negative tests"| F["mv_tensor_accessor files<br/>should succeed"]

File Walkthrough

Relevant files
Tests
3 files
test_version_compatibility.py
Comprehensive test suite for version compatibility validation
+300/-0 
mv_tensor_accessor_cpu.cpp
Added negative test file for CPU tensor accessor compatibility
+40/-0   
mv_tensor_accessor_cuda.cu
Added negative test file for CUDA tensor accessor compatibility
+47/-0   
Refactoring
16 files
kernel.cpp
Removed monolithic kernel file; contents split into individual files
+0/-205 
my__foreach_mul.cpp
Extracted foreach_mul function into separate file               
+20/-0   
my__foreach_mul_.cpp
Extracted foreach_mul_ function into separate file             
+19/-0   
make_tensor_clones_and_call_foreach.cpp
Extracted make_tensor_clones_and_call_foreach function into separate file
+41/-0   
my_empty.cpp
Extracted my_empty function into separate file                     
+25/-0   
my_reshape.cpp
Extracted my_reshape function into separate file                 
+17/-0   
my_view.cpp
Extracted my_view function into separate file                       
+20/-0   
test_tensor_device.cpp
Extracted test_tensor_device function into separate file 
+17/-0   
test_device_constructor.cpp
Extracted test_device_constructor function into separate file
+37/-0   
test_device_equality.cpp
Extracted test_device_equality function into separate file
+14/-0   
test_device_set_index.cpp
Extracted test_device_set_index function into separate file
+17/-0   
test_device_index.cpp
Extracted test_device_index function into separate file   
+14/-0   
test_device_is_cuda.cpp
Extracted test_device_is_cuda function into separate file
+14/-0   
test_device_is_cpu.cpp
Extracted test_device_is_cpu function into separate file 
+14/-0   
test_parallel_for.cpp
Extracted test_parallel_for function into separate file   
+49/-0   
test_get_num_threads.cpp
Extracted test_get_num_threads function into separate file
+14/-0   
Enhancement
1 file
tensor_accessor_kernel.h
Added shared tensor accessor kernel header for CPU and CUDA
+28/-0   

@qodo-code-review

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified. No security vulnerabilities detected by AI analysis; human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed


Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed


Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed


Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No Audit Logs: The new test and helper compilation functions perform critical compile/validation actions
without emitting structured audit logs of who ran them, when, or their outcomes.

Referred Code
def _compile_cpp_file(
    self, source_file: Path, output_file: Path
) -> tuple[bool, str]:
    """
    Compile a C++ file with TORCH_TARGET_VERSION=2.9.0.
    Returns (success, error_message).
    """
    torch_version_2_9 = "0x0209000000000000"

    cmd = [
        "g++",
        "-c",
        "-std=c++17",
        f"-DTORCH_TARGET_VERSION={torch_version_2_9}",
        f"-I{source_file.parent}",  # For includes in same directory
        *self.pytorch_includes,
    ]

    # Add CUDA flags if available
    if self.cuda_available:
        cmd.extend(self.cuda_includes)


 ... (clipped 75 lines)


Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Timeout Handling: Subprocess compilation uses a fixed 30s timeout, with no retry and no distinction between a missing toolchain and an actual compile error, which may cause flaky or misleading failures.

Referred Code
result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)

if result.returncode == 0:
    return True, ""
else:
    return False, result.stderr
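
One way to address this finding, sketched below with illustrative names (not code from the PR): classify a missing compiler and a timeout separately from a genuine nonzero exit status, so callers can skip rather than fail when the toolchain is absent.

```python
# Hedged sketch: differentiate failure modes of a compile subprocess.
import subprocess

def run_compiler(cmd: list[str], timeout: float = 30.0) -> tuple[bool, str]:
    """Return (success, message), classifying failure modes separately."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError:
        # Toolchain missing: a test harness would skip here, not report failure
        return False, f"toolchain not found: {cmd[0]!r}"
    except subprocess.TimeoutExpired:
        return False, f"compilation timed out after {timeout}s"
    if result.returncode == 0:
        return True, ""
    return False, result.stderr

ok, msg = run_compiler(["definitely-not-a-compiler", "-c", "x.cpp"])
# ok is False and msg names the missing toolchain, not a compile error
```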


Compliance status legend:
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Fix operator schema to match implementation

Update the operator schema for mv_tensor_accessor_cpu to match its C++ function
signature by removing the extra res argument from the m.def string.

test/cpp_extensions/libtorch_agnostic_2_10_extension/libtorch_agnostic_2_10/csrc/mv_tensor_accessor_cpu.cpp [34-36]

 STABLE_TORCH_LIBRARY_FRAGMENT(libtorch_agnostic_2_10, m) {
-  m.def("mv_tensor_accessor_cpu(Tensor res, Tensor m, Tensor v) -> Tensor");
+  m.def("mv_tensor_accessor_cpu(Tensor m, Tensor v) -> Tensor");
 }
Suggestion importance[1-10]: 8


Why: The suggestion correctly identifies a mismatch between the operator schema in m.def and the C++ function signature, which would cause a runtime registration error.

Medium
Avoid adding CUDA includes for C++ files

Remove the logic that adds CUDA include paths from the _compile_cpp_file
function to prevent potential compilation issues with standard C++ files.

test/cpp_extensions/libtorch_agnostic_2_10_extension/test_version_compatibility.py [66-95]

 def _compile_cpp_file(
     self, source_file: Path, output_file: Path
 ) -> tuple[bool, str]:
     """
     Compile a C++ file with TORCH_TARGET_VERSION=2.9.0.
     Returns (success, error_message).
     """
     torch_version_2_9 = "0x0209000000000000"
 
     cmd = [
         "g++",
         "-c",
         "-std=c++17",
         f"-DTORCH_TARGET_VERSION={torch_version_2_9}",
         f"-I{source_file.parent}",  # For includes in same directory
         *self.pytorch_includes,
     ]
 
-    # Add CUDA flags if available
-    if self.cuda_available:
-        cmd.extend(self.cuda_includes)
-
     cmd.extend([str(source_file), "-o", str(output_file)])
 
     result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
 
     if result.returncode == 0:
         return True, ""
     else:
         return False, result.stderr
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies that adding CUDA include paths during C++ compilation is unnecessary and potentially problematic, improving the robustness and correctness of the test compilation logic.

Medium
High-level
Avoid duplicating code for negative tests

To avoid maintenance issues from code duplication, modify the test script to
compile the original mv_tensor_accessor files from the
libtorch_agnostic_2_9_extension directory for negative testing, instead of using
copied versions.

Examples:

test/cpp_extensions/libtorch_agnostic_2_10_extension/libtorch_agnostic_2_10/csrc/mv_tensor_accessor_cpu.cpp [1-2]
// This is duplicated from the libtorch_agnostic_2_9_extension
// as a negative test for test_version_compatibility.py
test/cpp_extensions/libtorch_agnostic_2_10_extension/test_version_compatibility.py [163-170]
    def test_mv_tensor_accessor_cpu_works_with_2_9(self):
        """Test that mv_tensor_accessor_cpu.cpp compiles successfully with 2.9.0.

        This is a negative test - it ensures that a file we expect to work with 2.9.0
        actually does compile. This validates that our test infrastructure correctly
        distinguishes between files that require 2.10+ and those that don't.
        """
        cpp_file = self.csrc_dir / "mv_tensor_accessor_cpu.cpp"

Solution Walkthrough:

Before:

// test/.../libtorch_agnostic_2_10/csrc/mv_tensor_accessor_cpu.cpp
// This is duplicated from the libtorch_agnostic_2_9_extension
// as a negative test for test_version_compatibility.py
... (code for mv_tensor_accessor_cpu) ...

// test/.../test_version_compatibility.py
class FunctionVersionCompatibilityTest(TestCase):
    def setUpClass(cls):
        cls.csrc_dir = Path(__file__).parent / "libtorch_agnostic_2_10" / "csrc"
        ...

    def test_mv_tensor_accessor_cpu_works_with_2_9(self):
        # This file is a copy inside the 2_10 directory
        cpp_file = self.csrc_dir / "mv_tensor_accessor_cpu.cpp"
        success, _ = self._compile_cpp_file(cpp_file, ...)
        self.assertTrue(success)

After:

// test/.../libtorch_agnostic_2_10/csrc/mv_tensor_accessor_cpu.cpp
// This file is deleted.

// test/.../test_version_compatibility.py
class FunctionVersionCompatibilityTest(TestCase):
    def setUpClass(cls):
        cls.csrc_dir_2_10 = Path(__file__).parent / "libtorch_agnostic_2_10" / "csrc"
        # Path to the original 2_9 extension files
        cls.csrc_dir_2_9 = Path(__file__).parent.parent / "libtorch_agnostic_2_9_extension" / "csrc"
        ...

    def test_mv_tensor_accessor_cpu_works_with_2_9(self):
        # Reference the original file from the 2_9 directory
        cpp_file = self.csrc_dir_2_9 / "mv_tensor_accessor_cpu.cpp"
        success, _ = self._compile_cpp_file(cpp_file, ...)
        self.assertTrue(success)
Suggestion importance[1-10]: 7


Why: The suggestion correctly identifies code duplication for negative tests, which is a significant maintainability issue, and proposes a robust solution that improves the long-term quality of the test suite.

Medium

