harvard-edge/cs249r_book

MACHINE LEARNING SYSTEMS

Principles and Practices of Engineering Artificially Intelligent Systems


🎯 What Is This?

The open-source textbook that teaches you to build real-world AI systems β€” from edge devices to cloud deployment. Started as a Harvard University course (CS249r) by Prof. Vijay Janapa Reddi, now used by universities and students worldwide.

Our mission: Expand access to AI systems education worldwide β€” empowering learners, one chapter and one lab at a time.

💭 Why This Exists

"This grew out of a concern that while students could train AI models, few understood how to build the systems that actually make them work. It's like everyone can write an app, but few know how to build the smartphone that runs it. As AI becomes more capable and autonomous, the critical bottleneck won't be the algorithms - it will be the engineers who can build efficient, scalable, and sustainable systems that safely harness that intelligence. Richard Sutton's "The Bitter Lesson" taught us that general methods leveraging computation ultimately triumph over human-crafted approaches. The same principle applies here: the future belongs to those who can engineer the systems that unlock AI's computational potential. We're at an inflection point where we need an entirely new discipline - AI Engineering - focused not just on training models, but on the full systems stack that makes AI work in the real world. This book is my attempt to establish that foundation. We can't make this transformation happen overnight, but it has to start somewhere."

- Vijay


📚 GET STARTED

🎓 For Learners

👩‍🏫 For Educators

🛠️ For Contributors


🧠 What You'll Learn

We go beyond training models β€” this book teaches you to understand and build the full stack of real-world ML systems.

Core Topics:

  • ML system design & architecture β€” building scalable, maintainable systems
  • Data pipelines & engineering β€” collection, labeling, and processing at scale
  • Model optimization & deployment β€” from prototypes to production
  • MLOps & monitoring β€” keeping systems running reliably
  • Edge AI & resource constraints β€” deploying on mobile, embedded, and IoT devices

🚀 Quick Start

For Readers

# View the book online
open https://mlsysbook.ai

For Contributors

# Clone and setup
git clone https://github.com/harvard-edge/cs249r_book.git
cd cs249r_book
make setup-hooks  # Setup automated quality controls
make install      # Install dependencies

# Daily development
make clean build  # Clean and build
make preview      # Start development server

🀝 Contributing

We welcome contributions from students, educators, researchers, and practitioners worldwide.

Ways to Contribute

  • 📝 Content: Suggest edits, improvements, or new examples
  • 🛠️ Tools: Enhance development scripts and automation
  • 🎨 Design: Improve figures, diagrams, and visual elements
  • 🌍 Localization: Translate or adapt content for local needs
  • 🔧 Infrastructure: Help with build systems and deployment

Getting Started

  1. Read: contribute.md for detailed guidelines
  2. Setup: Follow the development workflow above
  3. Explore: Check existing GitHub Issues
  4. Connect: Join GitHub Discussions

Quality Standards

All contributions go through automated quality checks:

  • ✅ Pre-commit validation: Automatic cleanup and checks
  • 📋 Content review: Formatting and style validation
  • 🧪 Testing: Automated build and link verification
  • 👥 Peer review: Community and maintainer feedback

⭐ Support This Work

Show this matters: If you find this valuable, please star this repository ⭐. It signals to institutions and funding bodies that open AI education matters.

Fund the mission: Help us expand AI systems education globally. You can sponsor TinyML kits for students in developing countries, fund learning materials, support workshops, or sustain our open-source infrastructure.

Open Collective

From $15/month to sponsor a learner to $250 for a hands-on workshop, every contribution democratizes AI systems education worldwide.


🌐 Learn More


🛠️ Development Workflow & Technical Details

Development Workflow

The development workflow is automated through make targets, with pre-commit quality controls and scripts organized by category.

⚡ Quick Commands

# Building
make build          # Build HTML version
make build-pdf      # Build PDF version  
make preview        # Start development server

# Quality Control
make clean          # Clean build artifacts
make test           # Run validation tests
make lint           # Check for issues
make check          # Project health check

# Get help
make help           # Show all commands

🔧 Automated Quality Controls

  • 🧹 Pre-commit hooks: Automatically clean build artifacts before commits
  • 📋 Linting: Check for formatting and content issues
  • ✅ Validation: Verify project structure and dependencies
  • 🔍 Testing: Automated tests for content and scripts
  • 🗂️ Organized Structure: Professional script organization with clear categories
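
To make the first bullet concrete, here is a minimal sketch of what a pre-commit hook that cleans build artifacts could look like. It is not the hook that `make setup-hooks` installs, which may work differently. The function name `clean_artifacts` and the `ARTIFACT_DIRS` variable are hypothetical, and `_book` is assumed only because it is Quarto's default output directory for book projects.

```shell
#!/bin/sh
# Hypothetical pre-commit hook sketch; the project's real hook may differ.
# ARTIFACT_DIRS is an assumed, space-separated list of build output directories.
ARTIFACT_DIRS="${ARTIFACT_DIRS:-_book}"

# Remove each assumed artifact directory found directly under the given root.
clean_artifacts() {
  root="$1"
  for dir in $ARTIFACT_DIRS; do
    if [ -d "$root/$dir" ]; then
      # ${root:?} aborts instead of expanding to "/" if root is empty.
      rm -rf "${root:?}/$dir"
      echo "removed $root/$dir"
    fi
  done
}

# Only act when run inside a git work tree.
if top=$(git rev-parse --show-toplevel 2>/dev/null); then
  clean_artifacts "$top"
fi
```

Saved as `.git/hooks/pre-commit` and made executable, a script like this runs before each commit; exiting non-zero from it would block the commit.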

🗂️ Organized Development Tools

Our development tools are organized into logical categories:

tools/scripts/
├── build/           # Build and development scripts
├── content/         # Content management tools
├── maintenance/     # System maintenance scripts
├── testing/         # Test and validation scripts
├── utilities/       # General utility scripts
└── docs/            # Comprehensive documentation

Each category includes focused tools with clear naming and documentation. See tools/scripts/README.md for details.

📖 Documentation

🔧 Build the Book Locally

Prerequisites

  • Quarto (latest version)
  • Python 3.8+ with pip
  • Git
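
Before installing dependencies, it can help to confirm that the prerequisites above are available. The following is a small sketch, not a script from the repository: `have_cmd`, `python_ok`, and `check_prereqs` are hypothetical helper names. It only checks that `quarto` and `git` are on the PATH and that `python3` is at least 3.8 with pip; it does not verify that Quarto is the latest version.

```shell
#!/bin/sh
# Sketch: report which prerequisites are available on this machine.
# All function names here are hypothetical, not part of the repository.

# Succeeds when the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds when the given interpreter is Python 3.8 or newer.
python_ok() {
  "$1" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 8) else 1)' 2>/dev/null
}

# Prints one status line per tool; returns non-zero if anything is missing.
check_prereqs() {
  missing=0
  for tool in quarto git python3; do
    if have_cmd "$tool"; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  if have_cmd python3; then
    python_ok python3 || { echo "too old: python3 (need 3.8+)"; missing=1; }
    python3 -m pip --version >/dev/null 2>&1 || { echo "missing: pip"; missing=1; }
  fi
  return "$missing"
}

check_prereqs || echo "Install the missing tools before running make install."
```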

Quick Build

# Clone the repository
git clone https://github.com/harvard-edge/cs249r_book.git
cd cs249r_book

# Setup development environment
make setup-hooks  # Configure git hooks
make install      # Install dependencies

# Build and preview (runs from book/ directory)
make clean build  # Clean and build HTML
make preview      # Start development server

Advanced Development

# Full development setup
make clean-deep      # Deep clean
make install         # Install all dependencies
make build-all       # Build all formats (HTML, PDF, EPUB)

# Continuous development
make preview         # Auto-reload development server
make test            # Run validation tests
make lint            # Check content quality

See DEVELOPMENT.md for the complete development guide.

📊 Project Structure

MLSysBook/
├── book/                    # Main book content (Quarto)
│   ├── contents/            # Chapter content
│   │   ├── core/            # Core chapters
│   │   ├── labs/            # Hands-on labs
│   │   ├── frontmatter/     # Preface, acknowledgments
│   │   └── parts/           # Book parts and sections
│   ├── _quarto.yml          # Book configuration
│   ├── index.qmd            # Main entry point
│   └── assets/              # Images, styles, media
├── tools/                   # Development automation
│   ├── scripts/             # Organized development scripts
│   │   ├── build/           # Build and development tools
│   │   ├── content/         # Content management tools
│   │   ├── maintenance/     # System maintenance scripts
│   │   ├── testing/         # Test and validation scripts
│   │   ├── utilities/       # General utility scripts
│   │   └── docs/            # Script documentation
│   ├── dependencies/        # Package requirements
│   └── setup/               # Setup and configuration
├── config/                  # Build configuration
│   ├── _extensions/         # Quarto extensions
│   ├── lua/                 # Lua scripts
│   └── tex/                 # LaTeX templates
├── assets/                  # Global assets (covers, icons)
├── DEVELOPMENT.md           # Development guide
├── MAINTENANCE_GUIDE.md     # Daily workflow guide
├── Makefile                 # Development commands
└── README.md                # This file
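
As a quick sanity check after cloning, a short script can confirm that a checkout contains the main paths from the tree above. This is a sketch under the assumption that the tree is current; `check_layout` is a hypothetical helper, not a script shipped in `tools/scripts/`.

```shell
#!/bin/sh
# Sketch: confirm a checkout contains key paths from the project tree.
# check_layout is a hypothetical helper, not part of the repository.
check_layout() {
  root="${1:-.}"
  status=0
  for path in book/contents/core book/contents/labs book/_quarto.yml \
              book/index.qmd tools/scripts config Makefile; do
    if [ -e "$root/$path" ]; then
      echo "found:   $path"
    else
      echo "missing: $path"
      status=1
    fi
  done
  return "$status"
}

# Check the directory given as the first argument, or the current directory.
check_layout "${1:-.}" || echo "Some expected paths are missing; run this from the repository root."
```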

🎯 Features

  • πŸš€ Modern Development Workflow: Automated builds, quality checks, and deployment
  • πŸ—‚οΈ Organized Tooling: Professional script organization with comprehensive documentation
  • πŸ”§ Easy Contribution: One-command setup with automated quality controls
  • πŸ“š Comprehensive Docs: Detailed guides for development, building, and contribution
  • 🌐 Multi-format Output: HTML, PDF, and EPUB with consistent styling
  • ⚑ Fast Iteration: Live preview server with automatic reloading
  • βœ… Quality Assurance: Automated testing, linting, and validation
  • πŸ“ Clean Architecture: Well-organized project structure with clear separation of concerns
  • πŸ› οΈ Professional Tooling: Category-based script organization for easy maintenance
πŸ“‹ Project Information

📖 Citation

@inproceedings{reddi2024mlsysbook,
  title        = {MLSysBook.AI: Principles and Practices of Machine Learning Systems Engineering},
  author       = {Reddi, Vijay Janapa},
  booktitle    = {2024 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ ISSS)},
  pages        = {41--42},
  year         = {2024},
  organization = {IEEE},
  url          = {https://mlsysbook.org},
  note         = {Available at: https://mlsysbook.org}
}

🛡️ License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

You may share and adapt the material for non-commercial purposes, with appropriate credit and under the same license.