@mit-han-lab

MIT HAN Lab

Efficient AI Computing. PI: Song Han

Pinned

  1. streaming-llm Public

    [ICLR 2024] Efficient Streaming Language Models with Attention Sinks

    Python · 6.9k stars · 386 forks

  2. llm-awq Public

    [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python · 3.1k stars · 255 forks

  3. efficientvit Public

    Efficient vision foundation models for high-resolution generation and perception.

    Python · 2.9k stars · 226 forks

  4. bevfusion Public archive

    [ICRA'23] BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

    Python · 2.7k stars · 478 forks

  5. temporal-shift-module Public

    [ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding

    Python · 2.1k stars · 420 forks

  6. once-for-all Public

    [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment

    Python · 1.9k stars · 342 forks

Repositories

Showing 10 of 60 repositories
  • nunchaku Public

    [ICLR 2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

    Python · 2,022 stars · Apache-2.0 · 105 forks · 52 open issues · 7 open pull requests · Updated Jun 16, 2025
  • ComfyUI-nunchaku Public

    ComfyUI plugin of Nunchaku

    Python · 1,246 stars · Apache-2.0 · 35 forks · 45 open issues · 0 open pull requests · Updated Jun 16, 2025
  • llm-awq Public

    [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python · 3,078 stars · MIT · 255 forks · 156 open issues · 12 open pull requests · Updated Jun 12, 2025
  • x-attention Public

    XAttention: Block Sparse Attention with Antidiagonal Scoring

    Python · 164 stars · 8 forks · 3 open issues · 0 open pull requests · Updated Jun 6, 2025
  • torchquantum Public

    A PyTorch-based framework for Quantum Classical Simulation, Quantum Machine Learning, Quantum Neural Networks, Parameterized Quantum Circuits with support for easy deployments on real quantum computers.

    Jupyter Notebook · 1,489 stars · MIT · 223 forks · 61 open issues (4 issues need help) · 9 open pull requests · Updated May 28, 2025
  • vila-u Public

    [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

    Python · 346 stars · MIT · 11 forks · 17 open issues · 0 open pull requests · Updated Apr 25, 2025
  • efficientvit Public

    Efficient vision foundation models for high-resolution generation and perception.

    Python · 2,915 stars · Apache-2.0 · 226 forks · 106 open issues · 0 open pull requests · Updated Apr 24, 2025
  • deepcompressor Public

    Model Compression Toolbox for Large Language Models and Diffusion Models

    Python · 498 stars · Apache-2.0 · 36 forks · 53 open issues · 1 open pull request · Updated Mar 27, 2025
  • omniserve Public

    [MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

    C++ · 698 stars · Apache-2.0 · 44 forks · 41 open issues · 4 open pull requests · Updated Mar 6, 2025
  • torchsparse Public

    [MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

    Cuda · 1,357 stars · MIT · 169 forks · 41 open issues · 3 open pull requests · Updated Feb 24, 2025