-
Notifications
You must be signed in to change notification settings - Fork 73
CMake Dependency Validation & Reporting #5740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…d basic constraint requirements.
|
Review updated until commit a2c9053 Description
|
| Relevant files | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Enhancement | 27 files
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Additional files | 1 files
|
PR Reviewer Guide
Here are some key observations to aid the review process:
| 🧪 PR contains tests |
| 🔒 Security concerns No - The changes are primarily build system enhancements with no security implications. The JSON export includes proper escaping, and the Python script only reads JSON and prints formatted output without executing any external commands or processing untrusted input. |
| ⚡ Recommended focus areas for review |
CUDA Constraint Validation
|
Greptile SummaryThis PR implements a comprehensive CMake dependency validation and reporting system that transforms cryptic build errors into actionable guidance for users. The implementation is well-architected with clear separation of concerns: CMake handles validation logic while Python provides user-friendly formatting and platform-specific installation instructions. Key Improvements
Implementation QualityThe code demonstrates strong engineering practices:
Previous Review CommentsAll previously identified issues have been addressed or acknowledged by the development team. The main feedback points were style-related (JSON escaping completeness, variable scope in compiler.py, etc.) and have either been resolved or deemed acceptable trade-offs. Confidence Score: 5/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant User
participant CMake as CMake Build System
participant LogCapture as LogCapture.cmake
participant Handlers as Dependency Handlers
participant Utils as DependencyUtilities.cmake
participant Python as check_dependencies.py
participant Requirements as Requirement Classes
User->>CMake: Run cmake ..
CMake->>CMake: Load DependencyRequirements.cmake (version constraints)
CMake->>CMake: Load DependencyUtilities.cmake (status tracking)
CMake->>LogCapture: Load LogCapture.cmake (message override)
CMake->>CMake: Include all handler files
alt NVFUSER_ENABLE_DEPENDENCY_REPORT=ON
CMake->>LogCapture: start_capture()
LogCapture->>LogCapture: Set LOG_CAPTURE_MODE=TRUE
CMake->>Handlers: handle_compiler()
Handlers->>CMake: Check compiler version vs requirements
Handlers->>Utils: set_dependency_report_status(Compiler)
Utils->>Utils: Compare version, set status variables
Note over Handlers,LogCapture: Messages buffered, not printed
CMake->>Handlers: handle_python(), handle_cuda_toolkit()...
Handlers->>Utils: set_dependency_report_status() for each
Utils->>Utils: Update NVFUSER_DEPENDENCIES_OK flag
CMake->>Handlers: handle_torch()
Handlers->>Handlers: Find PyTorch via Python
Handlers->>Utils: set_dependency_report_status(Torch)
Handlers->>Handlers: Query torch.version.cuda
Handlers->>Handlers: Compare Torch CUDA vs CUDAToolkit version
Handlers->>Utils: Set Torch_CUDA_constraint_status
CMake->>LogCapture: stop_capture(DEP_LOGS)
LogCapture->>LogCapture: Set LOG_CAPTURE_MODE=FALSE
LogCapture->>CMake: Return buffered logs
CMake->>Utils: export_dependency_json()
Utils->>Utils: Write all CMake vars to JSON
CMake->>Python: Execute check_dependencies.py <json_file>
Python->>Python: Load cmake_vars from JSON
Python->>Requirements: Create requirement objects
Requirements->>Requirements: Extract status/version from cmake_vars
Python->>User: Print colored dependency report
alt Any failures found
Requirements->>Requirements: is_failure() checks
Requirements->>User: generate_help() with install instructions
end
Python->>CMake: Exit (status 0, doesn't fail build)
CMake->>LogCapture: dump_captured_logs(DEP_LOGS)
LogCapture->>User: Print detailed CMake output
alt NVFUSER_DEPENDENCIES_OK=FALSE
CMake->>User: FATAL_ERROR with instructions
end
else NVFUSER_ENABLE_DEPENDENCY_REPORT=OFF
CMake->>Handlers: handle_compiler() etc.
Handlers->>CMake: find_package(...REQUIRED)
Note over CMake,User: Fails immediately on first error
end
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
27 files reviewed, 3 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
27 files reviewed, 11 comments
4bbd913 to
4331d77
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
27 files reviewed, 1 comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
27 files reviewed, 7 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
27 files reviewed, 1 comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
27 files reviewed, 1 comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
27 files reviewed, 1 comment
|
!build |
| # All rights reserved. | ||
| # SPDX-License-Identifier: BSD-3-Clause | ||
| cmake_minimum_required(VERSION 3.18 FATAL_ERROR) | ||
| cmake_minimum_required(VERSION 3.18) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FATAL_ERROR here has been ignored since v3.6, this is always a hard requirement
| # We try to update which C++ standard we use together in lockstep across all | ||
| # built libraries, and these variables control which that is. Generally we are | ||
| # on C++20, but we still support a version of CUDA (11) that does not recognize | ||
| # C++20 and so we drop back to 17 there. Also, we allow all of these to be | ||
| # overridden by the user. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xwang233 is this comment still correct? is there a version of CUDA we support that does not recognize c++20 still?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
28 files reviewed, 4 comments
| set(cuda_toolkit_version "${CUDAToolkit_VERSION_MAJOR}.${CUDAToolkit_VERSION_MINOR}") | ||
|
|
||
| # Extract major.minor from Torch CUDA version | ||
| string(REGEX MATCH "^([0-9]+\\.[0-9]+)" torch_cuda_major_minor "${torch_cuda_version}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: If torch_cuda_version has unexpected format (e.g., no dots), torch_cuda_major_minor will be empty, causing silent comparison failure.
Add validation:
if(NOT torch_cuda_major_minor)
set(Torch_CUDA_constraint_status "not_available")
return()
endif()| if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU") | ||
| set(NVFUSER_REQUIREMENT_Compiler_VERSION_MIN ${NVFUSER_REQUIREMENT_GNU_VERSION_MIN}) | ||
| elseif(CMAKE_CXX_COMPILER_ID STREQUAL "Clang") | ||
| set(NVFUSER_REQUIREMENT_Compiler_VERSION_MIN ${NVFUSER_REQUIREMENT_Clang_VERSION_MIN}) | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Compilers other than GNU/Clang (e.g., MSVC, Intel) will have unset NVFUSER_REQUIREMENT_Compiler_VERSION_MIN, causing version check to pass with SUCCESS status.
The OPTIONAL=TRUE on line 23 makes this non-fatal, but users might not notice they're using an unsupported compiler. Consider adding an explicit check for supported compiler IDs.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
|
!build |
Summary
This PR aims to aid new users in setting up and installing nvFuser successfully. This is done by providing user a comprehensive python report (based on #5609) as early as possible in the build process:
Differences (This PR vs #5609)
The outcome of the report is determined by CMake's evaluation of the constraints we place on requirements. The report has no effect on the ability to build nvFuser. All failure logic is define by the CMake system.
The report scripts are used to aid in formatting and printing pertinent information to the user. This is done by directly referencing CMake variables in python and allowing python to handle complicated string manipulation and formatting (which CMake is really bad at...).
The contents of the help messages largely remains the same as #5609. Giving user guidance based on their build platform.
CMake Changes
cmake/DependencyRequirements.cmakeis the single source of truth for version requirements, components and the state ofOPTIONALfor each dependency.NVFUSER_ENABLE_DEPENDENCY_REPORTis by defaultON. If this is setOFFthen dependencies will be evaluated as "normal" in CMake and the build configuration will exit of the first failure.cmake/deps/handle_<name>.cmakefile for some organization/clarity.Success Case
LANGUAGESthe project is built for first - this cant be skipped AFAIK.NVFUSER_ENABLE_DEPENDECY_REPORT=Off)Failure Case (example : pybind11 version too low)
Report fails with installation instructions for users.
FATAL_ERRORwhen pybind11 mismatches.