[Feature] Integrate CUTLASS FP8 GEMM into sgl-kernel #2472

Open
2 tasks
zhyncs opened this issue Dec 12, 2024 · 3 comments

@zhyncs
Member

zhyncs commented Dec 12, 2024

Checklist

Motivation

ref
https://github.com/NVIDIA/cutlass/pull/1932/files

Related resources

No response

@zhyncs
Member Author

zhyncs commented Dec 12, 2024

Note: if the official merge does not occur by the end of the month, we will build against and use the f8_blockwise_scaling_pr_branch branch of sgl-project/cutlass.
ref https://github.com/sgl-project/cutlass/tree/f8_blockwise_scaling_pr_branch

@HaiShaw
Collaborator

HaiShaw commented Dec 12, 2024

@zhyncs this is supposed to be NV-specific.

@HaiShaw HaiShaw self-assigned this Dec 12, 2024
@zhyncs
Member Author

zhyncs commented Dec 13, 2024

@HaiShaw Yeah, we will use it for some models on NVIDIA H100 and H200.

@zhyncs zhyncs changed the title from "[Feature] Integrate FP8 GEMM into sgl-kernel" to "[Feature] Integrate CUTLASS FP8 GEMM into sgl-kernel" Dec 13, 2024
Projects
None yet
Development

No branches or pull requests

3 participants