Pull requests: NVIDIA/Fuser
Tensor-parallelize the DeepSeek V3 transformer layer
#4062
opened Mar 12, 2025 by
wujingyue
add register count checks for warp specialization with register sharing
#4061
opened Mar 11, 2025 by
liqiangxl
project persistent buffer to its producer if the buffer is the output of an upcast
#4051
opened Mar 9, 2025 by
liqiangxl
getOutputShardings checks all TVs to decide single-GPU vs multi-GPU
#4046
opened Mar 7, 2025 by
wujingyue
[WIP] simple L2 model for setting grid swizzle and cta order
#4044
opened Mar 7, 2025 by
jacobhinkle
Draft
Don't revert upcast for persistent schedulers to avoid increasing persistent buffer sizes
#4040
opened Mar 7, 2025 by
liqiangxl
Automatically save MatmulParams in extra_info in benchmarks
#4031
opened Mar 6, 2025 by
jacobhinkle
Add nvfuser benchmark executor and unify test_matmul.py
#4021
opened Mar 6, 2025 by
jacobhinkle
Draft