
TinyπŸ”₯Torch is a minimalist framework for building machine learning systems from scratchβ€”from tensors to systems.


TinyπŸ”₯Torch

Build your own ML framework. Start small. Go deep.


πŸ“š Read the Interactive Course β†’


πŸ—οΈ The Big Picture: Why Build from Scratch?

Most ML education teaches you to use frameworks. TinyTorch teaches you to build them.

Traditional ML Course:          TinyTorch Approach:
β”œβ”€β”€ import torch               β”œβ”€β”€ class Tensor:
β”œβ”€β”€ model = nn.Linear(10, 1)   β”‚     def __add__(self, other): ...
β”œβ”€β”€ loss = nn.MSELoss()        β”‚     def backward(self): ...
└── optimizer.step()           β”œβ”€β”€ class Linear:
                               β”‚     def forward(self, x):
                               β”‚       return x @ self.weight + self.bias
                               β”œβ”€β”€ def mse_loss(pred, target):
                               β”‚     return ((pred - target) ** 2).mean()
                               β”œβ”€β”€ class SGD:
                               β”‚     def step(self):
                               └──     param.data -= lr * param.grad
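
To make the contrast concrete, here is a minimal NumPy sketch of the right-hand column (illustrative only; the course modules define their own names and signatures):

import numpy as np

class Linear:
    """A dense layer: y = x @ W + b, built from plain arrays."""
    def __init__(self, in_features, out_features):
        self.weight = np.random.randn(in_features, out_features) * 0.01
        self.bias = np.zeros(out_features)

    def forward(self, x):
        return x @ self.weight + self.bias

def mse_loss(pred, target):
    """Mean squared error, written by hand."""
    return ((pred - target) ** 2).mean()

# A forward pass through your own layer and loss -- no torch anywhere.
layer = Linear(10, 1)
x = np.random.randn(4, 10)                 # a batch of 4 samples
print(mse_loss(layer.forward(x), np.zeros((4, 1))))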

Go from "How does this work?" 🀷 to "I implemented every line!" πŸ’ͺ

Result: You become the person others come to when they need to understand "how PyTorch actually works under the hood."


🌟 What Makes TinyTorch Different

πŸ”¬ Build-First Philosophy

  • No black boxes: Implement every component from scratch
  • Immediate ownership: Use YOUR code in real neural networks
  • Deep understanding: Know exactly how each piece works

πŸš€ Real Production Skills

  • Professional workflow: Development with tito CLI, automated testing
  • Real datasets: Train on CIFAR-10, not toy data
  • Production patterns: MLOps, monitoring, optimization from day one

🎯 Progressive Mastery

  • Start simple: Implement hello_world() function
  • Build systematically: Each module enables the next
  • End powerful: Deploy production ML systems with monitoring

⚑ Instant Feedback

  • Code works immediately: No waiting to see results
  • Visual progress: Success indicators and system integration
  • "Aha moments": Watch your ReLU power real neural networks

🎯 What You'll Build

  • One Complete ML Framework β€” Not 14 separate exercises, but integrated components building into your own PyTorch-style toolkit
  • Fully Functional System β€” Every piece connects: your tensors power your layers, your autograd enables your optimizers, your framework trains real networks
  • Real Applications β€” Train neural networks on CIFAR-10 using 100% your own code, no PyTorch imports
  • Production-Ready Skills β€” Complete ML lifecycle: data loading, training, optimization, deployment, monitoring
  • Deep Systems Understanding β€” Know exactly how every component works and integrates because you built it all

πŸš€ Quick Start (2 minutes)

πŸ§‘β€πŸŽ“ Students

git clone https://github.com/mlsysbook/TinyTorch.git
cd TinyTorch
pip install -r requirements.txt           # Install all dependencies (numpy, jupyter, pytest, etc.)
pip install -e .                          # Install TinyTorch package in editable mode
tito system doctor                         # Verify your setup
cd modules/source/01_setup
jupyter lab setup_dev.py                  # Launch your first module

πŸ‘©β€πŸ« Instructors

# System check
tito system info
tito system doctor

# Module workflow
tito export 01_setup
tito test 01_setup
tito nbdev build                          # Update package

πŸ“ Repository Structure

TinyTorch/
β”œβ”€β”€ modules/source/           # 16 educational modules
β”‚   β”œβ”€β”€ 01_setup/            # Development environment setup
β”‚   β”‚   β”œβ”€β”€ module.yaml      # Module metadata
β”‚   β”‚   β”œβ”€β”€ README.md        # Learning objectives and guide
β”‚   β”‚   └── setup_dev.py     # Implementation file
β”‚   β”œβ”€β”€ 02_tensor/           # N-dimensional arrays
β”‚   β”‚   β”œβ”€β”€ module.yaml
β”‚   β”‚   β”œβ”€β”€ README.md
β”‚   β”‚   └── tensor_dev.py
β”‚   β”œβ”€β”€ 03_activations/      # Neural network activation functions
β”‚   β”œβ”€β”€ 04_layers/           # Dense layers and transformations
β”‚   β”œβ”€β”€ 05_dense/            # Sequential networks and MLPs
β”‚   β”œβ”€β”€ 06_spatial/          # Convolutional neural networks
β”‚   β”œβ”€β”€ 07_attention/        # Self-attention and transformer components
β”‚   β”œβ”€β”€ 08_dataloader/       # Data loading and preprocessing
β”‚   β”œβ”€β”€ 09_autograd/         # Automatic differentiation
β”‚   β”œβ”€β”€ 10_optimizers/       # SGD, Adam, learning rate scheduling
β”‚   β”œβ”€β”€ 11_training/         # Training loops and validation
β”‚   β”œβ”€β”€ 12_compression/      # Model optimization and compression
β”‚   β”œβ”€β”€ 13_kernels/          # High-performance operations
β”‚   β”œβ”€β”€ 14_benchmarking/     # Performance analysis and profiling
β”‚   β”œβ”€β”€ 15_mlops/            # Production monitoring and deployment
β”‚   └── 16_capstone/         # Systems engineering capstone project
β”œβ”€β”€ tinytorch/               # Your built framework package
β”‚   β”œβ”€β”€ core/                # Core implementations (exported from modules)
β”‚   β”‚   β”œβ”€β”€ tensor.py        # Generated from 02_tensor
β”‚   β”‚   β”œβ”€β”€ activations.py   # Generated from 03_activations
β”‚   β”‚   β”œβ”€β”€ layers.py        # Generated from 04_layers
β”‚   β”‚   β”œβ”€β”€ dense.py         # Generated from 05_dense
β”‚   β”‚   β”œβ”€β”€ spatial.py       # Generated from 06_spatial
β”‚   β”‚   β”œβ”€β”€ attention.py     # Generated from 07_attention
β”‚   β”‚   └── ...              # All your implementations
β”‚   └── utils/               # Shared utilities and tools
β”œβ”€β”€ book/                    # Interactive course website
β”‚   β”œβ”€β”€ _config.yml          # Jupyter Book configuration
β”‚   β”œβ”€β”€ intro.md             # Course introduction
β”‚   └── chapters/            # Generated from module READMEs
β”œβ”€β”€ tito/                    # CLI tool for development workflow
β”‚   β”œβ”€β”€ commands/            # Student and instructor commands
β”‚   └── tools/               # Testing and build automation
└── tests/                   # Integration tests

How It Works:

  1. Develop in modules/source/ - Each module has a *_dev.py file where you implement components
  2. Export to tinytorch/ - Use tito export to build your implementations into a real Python package
  3. Use your framework - Import and use your own code: from tinytorch.core.tensor import Tensor (see the sketch below)
  4. Test everything - Run tito test to verify your implementations work correctly
  5. Build iteratively - Each module builds on previous ones, creating a complete ML framework
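
For example, after exporting the tensor module, a session might look like this (a sketch; the Tensor API shown here is an assumption, not the exact course interface):

# After `tito export 02_tensor`, your own implementation is importable.
from tinytorch.core.tensor import Tensor

a = Tensor([1.0, 2.0, 3.0])
b = Tensor([4.0, 5.0, 6.0])
print(a + b)   # exercises the __add__ you wrote yourself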

πŸ“š Complete Course: 16 Modules

Difficulty Progression: ⭐ Beginner β†’ ⭐⭐ Intermediate β†’ ⭐⭐⭐ Advanced β†’ ⭐⭐⭐⭐ Expert β†’ ⭐⭐⭐⭐⭐πŸ₯· Capstone

πŸ—οΈ Foundations (Modules 01-05)

  • 01_setup: Development environment and CLI tools
  • 02_tensor: N-dimensional arrays and tensor operations
  • 03_activations: ReLU, Sigmoid, Tanh, Softmax functions
  • 04_layers: Dense layers and matrix operations
  • 05_dense: Sequential networks and MLPs

🧠 Deep Learning (Modules 06-10)

  • 06_spatial: Convolutional neural networks and image processing
  • 07_attention: Self-attention and transformer components
  • 08_dataloader: Data loading, batching, and preprocessing
  • 09_autograd: Automatic differentiation and backpropagation
  • 10_optimizers: SGD, Adam, and learning rate scheduling

⚑ Systems & Production (Modules 11-15)

  • 11_training: Training loops, metrics, and validation
  • 12_compression: Model pruning, quantization, and distillation
  • 13_kernels: Performance optimization and custom operations
  • 14_benchmarking: Profiling, testing, and performance analysis
  • 15_mlops: Monitoring, deployment, and production systems

πŸŽ“ Capstone Project (Module 16)

  • 16_capstone: Advanced framework engineering specialization tracks

Status: All 16 modules complete with inline tests and educational content


πŸ”— Complete System Integration

This isn't 16 isolated assignments. Every component you build integrates into one cohesive, fully functional ML framework:

flowchart TD
    A[01_setup<br/>Setup & Environment] --> B[02_tensor<br/>Core Tensor Operations]
    B --> C[03_activations<br/>ReLU, Sigmoid, Tanh]
    C --> D[04_layers<br/>Dense Layers]
    D --> E[05_dense<br/>Sequential Networks]
    
    E --> F[06_spatial<br/>Convolutional Networks]
    E --> G[07_attention<br/>Self-Attention]
    F --> H[08_dataloader<br/>Data Loading]
    B --> I[09_autograd<br/>Automatic Differentiation]
    I --> J[10_optimizers<br/>SGD & Adam]
    
    H --> K[11_training<br/>Training Loops]
    G --> K
    J --> K
    
    K --> L[12_compression<br/>Model Optimization]
    K --> M[13_kernels<br/>High-Performance Ops]
    K --> N[14_benchmarking<br/>Performance Analysis]
    K --> O[15_mlops<br/>Production Monitoring]
    
    L --> P[16_capstone<br/>Framework Engineering]
    M --> P
    N --> P
    O --> P

🎯 How It All Connects

Foundation (01-05): Build your core data structures and basic operations
Deep Learning (06-10): Add neural networks and automatic differentiation
Production (11-15): Scale to real applications with training and production systems
Mastery (16): Optimize and extend your complete framework

The Result: A complete, working ML framework built entirely by you, capable of:

  • βœ… Training CNNs on CIFAR-10 with 90%+ accuracy
  • βœ… Implementing modern optimizers (Adam, learning rate scheduling)
  • βœ… Deploying compressed models with 75% size reduction
  • βœ… Production monitoring with comprehensive metrics

πŸš€ Capstone: Optimize Your Framework

After completing the 15 core modules, you have a complete ML framework. The final challenge: make it better through systems engineering.

Choose Your Focus:

  • ⚑ Performance Engineering: GPU kernels, vectorization, memory-efficient operations
  • 🧠 Algorithm Extensions: Transformer layers, BatchNorm, Dropout, advanced optimizers
  • πŸ”§ Systems Optimization: Multi-GPU training, distributed computing, memory profiling
  • πŸ“Š Benchmarking Analysis: Compare your framework to PyTorch, identify bottlenecks
  • πŸ› οΈ Developer Tools: Better debugging, visualization, error messages, testing

The Constraint: No import torch allowed. Build on your TinyTorch implementation. This demonstrates true mastery of ML systems engineering and optimization.


🧠 Pedagogical Framework: Build β†’ Use β†’ Reflect

Example: How You'll Master Activation Functions

πŸ”§ Build: Implement ReLU from scratch

def relu(x):
    # YOU implement this function
    return ???  # What should this be?
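
For reference, one possible completed version (a sketch assuming NumPy-backed data, as the course modules use):

import numpy as np

def relu(x):
    # Keep positive values; clamp negatives to zero.
    return np.maximum(x, 0)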

πŸš€ Use: Immediately use your own code

from tinytorch.core.activations import ReLU  # YOUR implementation!
layer = ReLU()
output = layer.forward(input_tensor)  # Your code working!

πŸ’‘ Reflect: See it working in real networks

# Your ReLU is now part of a real neural network
model = Sequential([
    Dense(784, 128),
    ReLU(),           # <-- Your implementation
    Dense(128, 10)
])

This pattern repeats for every component β€” you build it, use it immediately, then see how it fits into larger systems.


πŸŽ“ Teaching Philosophy

No Black Boxes

  • Build every component from scratch
  • Understand performance trade-offs
  • See how engineering decisions impact ML outcomes

Production-Ready Thinking

  • Use real datasets (CIFAR-10, MNIST)
  • Implement proper testing and benchmarking
  • Learn MLOps and system design principles

Iterative Mastery

  • Each module builds on previous work
  • Immediate feedback through inline testing
  • Progressive complexity with solid foundations

πŸ“– Documentation

Interactive Jupyter Book

  • Live Site: https://mlsysbook.github.io/TinyTorch/
  • Auto-updated from source code on every release
  • Complete course content with executable examples
  • Real implementation details with solution code

Development Workflow

  • dev branch: Active development and experiments
  • main branch: Stable releases that trigger documentation deployment
  • Inline testing: Tests embedded directly in source modules
  • Continuous integration: Automatic building and deployment

πŸ› οΈ Development Workflow

Module Development

# Work on dev branch
git checkout dev

# Edit source modules  
cd modules/source/02_tensor
jupyter lab tensor_dev.py

# Export to package
tito export 02_tensor

# Test your implementation
tito test 02_tensor

# Build complete package
tito nbdev build

Release Process

# Ready for release
git checkout main
git merge dev
git push origin main        # Triggers documentation deployment

πŸ“ Project Structure

TinyTorch/
β”œβ”€β”€ modules/source/XX/               # 16 source modules with inline tests
β”œβ”€β”€ tinytorch/core/                  # Your exported ML framework
β”œβ”€β”€ tito/                           # CLI and course management tools
β”œβ”€β”€ book/                           # Jupyter Book source and config
β”œβ”€β”€ tests/                          # Integration tests
└── docs/                           # Development guides and workflows

πŸ§ͺ Tech Stack

  • Python 3.8+ β€” Modern Python with type hints
  • NumPy β€” Numerical foundations
  • Jupyter Lab β€” Interactive development
  • Rich β€” Beautiful CLI output
  • NBDev β€” Literate programming and packaging
  • Jupyter Book β€” Interactive documentation
  • GitHub Actions β€” Continuous integration and deployment

βœ… Verified Learning Outcomes

Students who complete TinyTorch can:

βœ… Build complete neural networks from tensors to training loops
βœ… Implement modern ML algorithms (Adam, dropout, batch norm)
βœ… Optimize performance with profiling and custom kernels
βœ… Deploy production systems with monitoring and MLOps
βœ… Debug and test ML systems with proper engineering practices
βœ… Understand trade-offs between accuracy, speed, and resources


πŸƒβ€β™€οΈ Getting Started

Option 1: Interactive Course

πŸ‘‰ Start Learning Now β€” Complete course in your browser

Option 2: Local Development

git clone https://github.com/mlsysbook/TinyTorch.git
cd TinyTorch
pip install -r requirements.txt           # Install all dependencies (numpy, jupyter, pytest, etc.)
pip install -e .                          # Install TinyTorch package in editable mode  
tito system doctor
cd modules/source/01_setup
jupyter lab setup_dev.py

Option 3: Instructor Setup

# Clone and verify system
git clone https://github.com/mlsysbook/TinyTorch.git
cd TinyTorch
tito system info

# Test module workflow
tito export 01_setup && tito test 01_setup

πŸ”₯ Ready to build your ML framework? Start with TinyTorch and understand every layer. Start Small. Go Deep.


❓ Frequently Asked Questions

πŸš€ "Why not just use PyTorch/TensorFlow? This seems like reinventing the wheel."

You're right - for production, use PyTorch! But consider:

πŸ€” Deep Understanding Questions:

  • Do you understand what loss.backward() actually does? Most engineers don't.
  • Can you debug when gradients vanish? You'll know why and how to fix it.
  • Could you optimize a custom operation? You'll have built the primitives.

πŸ’‘ The Learning Analogy:
Think of it like this: Pilots learn in small planes before flying 747s. You're learning the fundamentals that make you a better PyTorch engineer.


⚑ "How is this different from online tutorials that build neural networks?"

Most tutorials focus on isolated components - a Colab here, a notebook there. TinyTorch builds a fully integrated system.

πŸ—οΈ Systems Engineering Analogy:
Think of building a compiler or an operating system. You don't just implement a lexer or a scheduler in isolation - you design how every component works together, and each piece must integrate seamlessly with the whole.

πŸ“Š Component vs. System Approach:

Component Approach:          Systems Approach (TinyTorch):
β”œβ”€β”€ Build a neural network   β”œβ”€β”€ Build a complete ML framework
β”œβ”€β”€ Jupyter notebook demos   β”œβ”€β”€ Full Python package with CLI
β”œβ”€β”€ Isolated examples        β”œβ”€β”€ Integrated: tensors β†’ layers β†’ training
└── "Here's how ReLU works"  β”œβ”€β”€ Production patterns: testing, profiling
                             └── "Here's how EVERYTHING connects"

🎯 Key Insight:
You learn systems engineering, not just individual algorithms. Like understanding how every part of a compiler interacts to turn code into executable programs.


πŸ’‘ "Can't I just read papers/books instead of implementing?"

πŸ“š Reading vs. πŸ”§ Building:

Reading about neural networks:     Building neural networks:
β”œβ”€β”€ "I understand the theory"      β”œβ”€β”€ "Why are my gradients exploding?"
β”œβ”€β”€ "Backprop makes sense"         β”œβ”€β”€ "Oh, that's why we need gradient clipping"
β”œβ”€β”€ "Adam is better than SGD"      β”œβ”€β”€ "Now I see when each optimizer works"
└── Theoretical knowledge          └── Deep intuitive understanding

🌟 The Reality Check:
Implementation forces you to confront reality - edge cases, numerical stability, memory management, performance trade-offs that papers gloss over.
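
For instance, here is a minimal sketch of one such fix, global-norm gradient clipping (illustrative; not necessarily how the course implements it):

import numpy as np

def clip_grad_norm(grads, max_norm=1.0):
    """Rescale gradient arrays so their combined L2 norm stays below max_norm."""
    total = float(np.sqrt(sum((g ** 2).sum() for g in grads)))
    if total > max_norm:
        grads = [g * (max_norm / total) for g in grads]
    return grads

grads = [np.array([3.0, 4.0]), np.array([12.0])]    # global norm = 13
clipped = clip_grad_norm(grads, max_norm=1.0)       # rescaled to norm 1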


πŸ€” "Isn't everything a Transformer now? Why learn old architectures?"

Great question! Transformers are indeed dominant, but they're built on the same foundations you'll implement:

πŸ—οΈ Transformer Building Blocks You'll Build:

  • Attention is just matrix operations - which you'll build from tensors
  • LayerNorm uses your activations and layers
  • Adam optimizer powers Transformer training - you'll implement it
  • Multi-head attention = your Linear layers + reshaping

🎯 The Strategic Reality:
Understanding foundations makes you the engineer who can optimize Transformers, not just use them. Plus, CNNs still power computer vision, RNNs drive real-time systems, and new architectures emerge constantly.


πŸŽ“ "I'm already good at ML. Is this too basic for me?"

πŸ§ͺ Challenge Test - Can You:

  • Implement Adam optimizer from the paper? (Not just use torch.optim.Adam)
  • Explain why ReLU causes dying neurons and how to fix it?
  • Debug a 50% accuracy drop after model deployment?

πŸ’ͺ Why Advanced Engineers Love TinyTorch:
It fills the "implementation gap" that most ML education skips. You'll go from understanding concepts to implementing production systems.


πŸ§ͺ "Is this academic or practical?"

Both! TinyTorch bridges academic understanding with engineering reality:

πŸŽ“ Academic Rigor:

  • Mathematical foundations implemented correctly
  • Proper testing and validation methodologies
  • Research-quality implementations you can trust

βš™οΈ Engineering Practicality:

  • Production-style code organization and CLI tools
  • Performance considerations and optimization techniques
  • Real datasets, realistic scale, professional development workflow

⏰ "How much time does this take?"

πŸ“Š Time Investment: ~40-60 hours for complete framework

🎯 Flexible Learning Paths:

  • Quick exploration: 1-2 modules to understand the approach
  • Focused learning: Core modules (01-10) for solid foundations
  • Complete mastery: All 16 modules for full framework expertise

✨ Self-Paced Design:
Each module is self-contained, so you can stop and start as needed.


πŸ”„ "What if I get stuck or confused?"

πŸ›‘οΈ Built-in Support System:

  • Progressive scaffolding: Each step builds on the previous, with guided implementations
  • Comprehensive testing: 200+ tests ensure your code works correctly
  • Rich documentation: Visual explanations, real-world context, debugging tips
  • Professional error messages: Helpful feedback when things go wrong
  • Modular design: Skip ahead or go back without breaking your progress

πŸ’‘ Learning Philosophy:
The course is designed to guide you through complexity, not leave you struggling alone.


πŸš€ "What can I build after completing TinyTorch?"

πŸ—οΈ Your Framework Becomes the Foundation For:

  • Research projects: Implement cutting-edge papers on solid foundations
  • Specialized systems: Computer vision, NLP, robotics applications
  • Performance engineering: GPU kernels, distributed training, quantization
  • Custom architectures: New layer types, novel optimizers, experimental designs

🎯 Ultimate Skill Unlock:
You'll have the implementation skills to turn any ML paper into working code.

