# Model Training Guide

This document explains how to train the custom JoeyLLM GPT-2 model.
## Prerequisites

- Python 3.10+
- NVIDIA GPU with CUDA 12.4 and a PyTorch build with GPU support
- Required packages:

```bash
pip install -r requirements.txt
```
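To confirm that your PyTorch build can actually see the GPU before starting a training run, a quick check like the following can help. This uses only standard PyTorch APIs and involves no project code:

```python
import torch

# Confirm that this PyTorch build has CUDA support and can see the GPU.
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available:  {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU:          {torch.cuda.get_device_name(0)}")
    print(f"CUDA version: {torch.version.cuda}")
```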
## Project Structure

```text
JoeyLLM/
├── main.py                    # Entry point script: loads config and starts training
├── requirements.txt           # Project dependencies
├── README.md                  # Project overview and usage instructions
├── docker/
│   └── Dockerfile             # Docker image definition for environment setup
├── docs/
│   ├── CONTRIBUTING           # Contribution guidelines
│   ├── LICENSE                # License file (e.g. MIT)
│   └── train                  # Project documentation related to training
├── test/
│   └── pre_run_test.py        # Sanity checks or tests before training
└── src/
    ├── configs/
    │   └── config.yaml        # YAML configuration (Hydra) for model, training, logging
    ├── data/
    │   ├── dataset.py         # Loads and batches tokenized training data
    │   ├── test_data.py       # Loads/handles validation or test datasets
    │   └── chunk.py           # Preprocessing script to split long sequences into chunks
    ├── model/
    │   ├── joeyllm.py         # Custom GPT-2 model (transformer decoder blocks, attention)
    │   └── test_model.py      # Unit tests or evaluation scripts for model components
    ├── tokenizer/
    │   ├── train_tokenizer.py # Trains a tokenizer on a raw text corpus
    │   └── test_tokenizer.py  # Validates tokenizer output and decoding accuracy
    └── train/
        ├── loop.py            # Core training loop (epochs, logging, checkpointing)
        └── optimizer.py       # Optimizer setup (AdamW)
```
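Since configuration is handled by Hydra via `src/configs/config.yaml`, the entry point presumably wires the config into the training loop along the lines of the sketch below. This is a minimal illustration, not the repository's actual `main.py`, and the config keys (`cfg.train.epochs`, `cfg.train.lr`) are assumptions:

```python
import hydra
from omegaconf import DictConfig

# Hypothetical sketch of the entry point; the real main.py may differ.
@hydra.main(config_path="src/configs", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # Config groups such as cfg.model and cfg.train are assumed names,
    # standing in for whatever src/configs/config.yaml actually defines.
    print(f"Training for {cfg.train.epochs} epochs at lr={cfg.train.lr}")
    # ...build the model, dataloaders, and optimizer, then run the loop
    # defined in src/train/loop.py.

if __name__ == "__main__":
    main()
```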
## Running Training

Before running training, log in to Weights & Biases:

```bash
wandb login
```
Then start training:

```bash
python src/main.py
```
This will:

- Load the custom GPT-2 model with the chosen configuration
- Load 25% of the Project_Gutenberg_Australia dataset
- Begin training with full logging to Weights & Biases
- Save checkpoints after every epoch
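For orientation, here is a minimal sketch of what an epoch loop with W&B logging and per-epoch checkpointing can look like. It is an illustration under stated assumptions, not the code in `src/train/loop.py`: the model is assumed to return raw logits, and all names (`train`, `loader`, the checkpoint path) are placeholders:

```python
import torch
import torch.nn.functional as F
import wandb

def train(model, loader, optimizer, epochs, device="cuda"):
    # Hypothetical loop; the real logic lives in src/train/loop.py.
    wandb.init(project="JoeyLLM")  # assumes you have already run `wandb login`
    model.to(device).train()
    for epoch in range(epochs):
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            logits = model(inputs)  # assumed to return raw logits
            loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
            loss.backward()
            optimizer.step()
            wandb.log({"loss": loss.item(), "epoch": epoch})
        # Checkpoint after every epoch, matching the behaviour described above.
        torch.save(model.state_dict(), f"checkpoint_epoch_{epoch}.pt")
```

If the dataset is pulled via the Hugging Face `datasets` library, the 25% slice can be expressed with its split syntax, e.g. `load_dataset(name, split="train[:25%]")`, though the repository's `src/data/dataset.py` may load it differently.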