[BUG] numBlocks in Y dimension is larger than needed for FetchOnDemand_no_fusion #323

Open
yokosyun opened this issue Aug 12, 2024 · 0 comments
yokosyun commented Aug 12, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

fetch_on_demand_gemm_no_fusion has the wrong numBlocks in the Y dimension, so unnecessary blocks are launched.

cur_nnz is divided only by 16 (BLOCK_SIZE):

fetch_on_demand_gemm_no_fusion_fp32_1<16, 4, 8>
            <<<dim3(DIV_UP(out_channel, 16), DIV_UP(cur_nnz, 16), 1),
               dim3(16, 16, 1)>>>

Expected Behavior

It should be divided by 16 (BLOCK_SIZE) * 4 (N_LOOP) to get the correct numBlocks in the Y dimension:

fetch_on_demand_gemm_no_fusion_fp32_1<16, 4, 8>
            <<<dim3(DIV_UP(out_channel, 16), DIV_UP(cur_nnz, 16 * N_LOOP), 1),
               dim3(16, 16, 1)>>>
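
For reference, a minimal, self-contained sketch (not the TorchSparse kernel itself) of the grid-size arithmetic, assuming the kernel strides over N_LOOP tiles of BLOCK_SIZE rows per Y-block as implied above. DIV_UP, BLOCK_SIZE, and N_LOOP mirror the names in the snippets; demo_kernel and the sample sizes are hypothetical.

    #include <cstdio>

    #define DIV_UP(x, y) (((x) + (y) - 1) / (y))

    constexpr int BLOCK_SIZE = 16;
    constexpr int N_LOOP = 4;

    // Hypothetical stand-in: each Y-block loops N_LOOP times over BLOCK_SIZE rows,
    // so a single Y-block already covers BLOCK_SIZE * N_LOOP rows of cur_nnz.
    __global__ void demo_kernel(int cur_nnz) {
        for (int n = 0; n < N_LOOP; ++n) {
            int row = blockIdx.y * (BLOCK_SIZE * N_LOOP) + n * BLOCK_SIZE + threadIdx.y;
            if (row < cur_nnz) {
                // ... per-row work would go here ...
            }
        }
    }

    int main() {
        int cur_nnz = 1000, out_channel = 64;

        // Current launch: DIV_UP(1000, 16) = 63 Y-blocks, although 16 would suffice.
        dim3 grid_current(DIV_UP(out_channel, BLOCK_SIZE), DIV_UP(cur_nnz, BLOCK_SIZE), 1);

        // Proposed launch: DIV_UP(1000, 16 * 4) = 16 Y-blocks, no redundant blocks.
        dim3 grid_proposed(DIV_UP(out_channel, BLOCK_SIZE),
                           DIV_UP(cur_nnz, BLOCK_SIZE * N_LOOP), 1);

        printf("gridDim.y: current = %u, proposed = %u\n", grid_current.y, grid_proposed.y);
        return 0;
    }

Assuming the kernel bounds-checks rows against cur_nnz, the extra Y-blocks from the current launch do no useful work, which matches the unnecessary block execution described above; the output should still be numerically correct either way.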

Environment

- GCC:
- NVCC:
- PyTorch:
- PyTorch CUDA:
- TorchSparse:

Anything else?

Could we submit a bugfix PR for this?
