jason-huang03
  • Tsinghua University, NVIDIA
  • Beijing, China

Highlights

  • Pro

Organizations

@thu-nics @thu-ml

Pinned

  1. thu-ml/SageAttention Public

    Quantized attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

    Cuda 586 27

  2. SPH_Project Public

    An SPH implementation of fluid simulation, featuring large-scale simulation, rigid-fluid coupling, and high-viscosity fluids.

    Python 138 10

  3. thu-nics/MoA Public

    The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression".

    Python 100 6

  4. mit-han-lab/qserve Public

    QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

    Python 455 25