🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
Updated May 15, 2025 - Python
Distributed RL System for LLM Reasoning
[TMLR 2024] Efficient Large Language Models: A Survey
Infrastructures™ for Machine Learning Training/Inference in Production.
Curated collection of papers in machine learning systems
Learn how to design and implement effective Machine Learning systems from start to finish.
A collection of noteworthy MLSys bloggers (algorithms and systems)
Dive into machine learning systems, starting from reinventing the wheel.
Oort: Efficient Federated Learning via Guided Participant Selection
A curated list of high-quality papers on resource-efficient LLMs 🌱
My personal paper reading notes (covering cloud computing, resource management, systems, machine learning, deep learning, and other interesting topics).
[TMLR 2025] Efficient Diffusion Models: A Survey
Triton implementation of bidirectional (non-causal) linear attention
Machine Learning Compiler Road Map
CSCE 585 - Machine Learning Systems
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
A C++ implementation of the scalar-valued autograd engine micrograd
A curated list of resources to deep dive into the intersection of applied machine learning and threat detection.
[Long Term Support] [SIGCOMM 2023] Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference