@yucai-intel
Contributor

Fixes the following issues found by test/test_nn.py::TestNNDeviceTypeXPU::test_nll_loss_large_tensor_reduction_mean_xpu and test_nll_loss_large_tensor_reduction_sum_xpu:

  1. Segmentation faults caused by incorrect pointer type conversions that produce invalid memory addresses.
  2. Kernel call errors caused by incorrect conditional checks.

@yucai-intel
Contributor Author

Issue link: #2008

@yucai-intel
Contributor Author

image

@CuiYifeng CuiYifeng requested a review from Copilot September 30, 2025 01:37
Contributor

Copilot AI left a comment

Pull Request Overview

This PR fixes segmentation faults and kernel call errors in the NLLLoss kernel implementation for XPU devices. The changes refactor the kernel functors to use safer memory access patterns and more consistent parameter ordering.

Key changes include:

  • Complete rewrite of kernel functors with improved memory safety and bounds checking
  • Simplified function signatures with reordered parameters for better consistency
  • Addition of proper index validation and overflow protection
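The diff itself is not shown in this thread, so the following is only a minimal CPU-side sketch of what index validation and overflow protection can look like for an NLL loss reduction. It is not the PR's SYCL code. The names `NllPartial`, `nll_loss_reference`, and `fits_int32_indexing` are hypothetical, and the sketch assumes unweighted loss, a row-major `[n_rows, n_classes]` layout, and an ignore index of -100.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical result type: the caller divides loss_sum by count for a mean.
struct NllPartial {
  double loss_sum;  // sum of -log_probs[i, target[i]] over rows that are not ignored
  int64_t count;    // number of rows that contributed
};

// Returns true when every flat offset into an [n_rows, n_classes] buffer
// fits in a signed 32-bit integer. Uses division to avoid overflowing
// while performing the check itself.
inline bool fits_int32_indexing(int64_t n_rows, int64_t n_classes) {
  assert(n_classes > 0);
  return n_rows <= INT32_MAX / n_classes;
}

// Hypothetical CPU reference for an unweighted NLL loss sum.
inline NllPartial nll_loss_reference(const std::vector<float>& log_probs,
                                     const std::vector<int64_t>& target,
                                     int64_t n_classes,
                                     int64_t ignore_index = -100) {
  assert(n_classes > 0);
  const int64_t n_rows = static_cast<int64_t>(target.size());
  if (static_cast<int64_t>(log_probs.size()) != n_rows * n_classes) {
    throw std::invalid_argument("log_probs size must equal n_rows * n_classes");
  }
  NllPartial out{0.0, 0};
  for (int64_t i = 0; i < n_rows; ++i) {
    const int64_t t = target[i];
    if (t == ignore_index) {
      continue;  // ignored rows contribute nothing
    }
    // Index validation: reject class indices outside [0, n_classes).
    if (t < 0 || t >= n_classes) {
      throw std::out_of_range("target index out of range");
    }
    // Overflow protection: compute the flat offset in 64 bits, since
    // i * n_classes can exceed INT32_MAX for large tensors.
    const int64_t offset = i * n_classes + t;
    out.loss_sum -= log_probs[static_cast<std::size_t>(offset)];
    out.count += 1;
  }
  return out;
}
```

A helper like `fits_int32_indexing` could be used to decide whether a faster 32-bit indexing path is safe for a given shape; whether the PR takes that approach is not stated here.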

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/ATen/native/xpu/sycl/LossNLLKernel.h | Updated function signatures to reorder parameters and use consistent naming |
| src/ATen/native/xpu/sycl/LossNLLKernel.cpp | Major refactor of kernel implementations with improved memory safety and bounds checking |
| src/ATen/native/xpu/sycl/KernelUtils.h | Added utility constants and functions for kernel execution |
| src/ATen/native/xpu/LossNLL.cpp | Updated function calls to match new kernel signatures |

@yucai-intel
Contributor Author

yucai-intel commented Oct 15, 2025

Perf
image

Contributor

@guangyey guangyey left a comment

LGTM.

Contributor

@CuiYifeng CuiYifeng left a comment

The main part of this PR looks good to me.

@guangyey guangyey enabled auto-merge October 24, 2025 01:37
@guangyey guangyey added this pull request to the merge queue Oct 24, 2025
Merged via the queue into main with commit 87d9beb Oct 24, 2025
25 checks passed
@guangyey guangyey deleted the yucai/nll/fix branch October 24, 2025 01:38
@CuiYifeng CuiYifeng mentioned this pull request Oct 24, 2025
7 tasks