


English | 简体中文

Mini NanoGPT 🚀

Can training a GPT really be this simple?

Make GPT training fun and approachable! A visual training platform based on karpathy/nanoGPT.

📖 What is this?

Mini-NanoGPT is a tool that helps you get started with training GPT models effortlessly. Whether you're:

  • 🎓 A beginner in deep learning
  • 👨‍🔬 A researcher
  • 🛠️ A developer

or simply curious about large language models and eager to experience their magic,

you can train a model through an intuitive graphical interface!

For the original version of Mini NanoGPT (no longer updated), please check out the old branch.

✨ Key Features

1. Easy to Use

  • 📱 Visual Interface: Say goodbye to command line; point-and-click to start training
  • 🌍 Bilingual UI: Full support for both English and Chinese interfaces
  • 🎯 One-click Operations: Data preprocessing, training, and text generation — all in one click

2. Powerful Features

  • 🔤 Flexible Tokenization: Supports character-level and GPT-2/Qwen tokenizers, with multilingual support (a quick tokenization sketch follows this list)
  • 🚄 Efficient Training: Supports multi-process acceleration and distributed training
  • 📊 Real-time Feedback: Live display of training progress and performance
  • ⚙️ Parameter Visualization: All training parameters can be adjusted directly in the UI
  • 🧩 Model Database: Easily manage models and reuse training settings anytime
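
Curious what the tokenizer choice actually changes? Below is a minimal, illustrative sketch (not the project's own code) of character-level encoding next to the GPT-2 BPE encoder from the tiktoken package; install tiktoken separately if you want to run it.

# Illustrative only: character-level vs. GPT-2 BPE tokenization.
# Requires `pip install tiktoken` for the GPT-2 part.
import tiktoken

text = "Hello, GPT!"

# Character-level: the vocabulary is just every distinct character in your data.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
print("char-level ids:", [stoi[ch] for ch in text])

# GPT-2 BPE: a fixed vocabulary of 50,257 subword tokens shared across all texts.
enc = tiktoken.get_encoding("gpt2")
print("gpt2 bpe ids:", enc.encode(text))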

🚀 Getting Started

Option 1: Docker Deployment (Recommended) 🐳

The easiest way to get started!

# Clone the repository
git clone --depth 1 https://github.com/ystemsrx/mini-nanoGPT.git
cd mini-nanoGPT

# Start with Docker Compose (recommended)
docker-compose up --build

# Or build and run manually
docker build -t mini-nanogpt .
docker run --gpus all -p 7860:7860 -v $(pwd)/data:/app/data mini-nanogpt

This builds the Docker image and runs the container, which automatically detects your environment (CPU or GPU). The data, models, and assets directories in your working directory are mounted into the container, so everything you create is stored there directly.

Once started, visit http://localhost:7860 to access the application.

Option 2: Local Installation

1. Install Dependencies

# Clone the repository
git clone --depth 1 https://github.com/ystemsrx/mini-nanoGPT.git
cd mini-nanoGPT

# Install dependencies (Python 3.7+)
pip install -r requirements.txt

2. Launch the App

python app.py

Open the displayed link in your browser (usually http://localhost:7860) to see the training interface!
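
Port 7860 is Gradio's default, so the UI is presumably served by Gradio. If that port is already taken on your machine, the usual Gradio pattern for picking another one looks like the sketch below; the `demo` object and the port number are placeholders, and the project's real entry point is app.py.

# Hypothetical sketch of launching a Gradio app on a custom port; this is not
# the project's own code, just the standard Gradio pattern.
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("placeholder UI")

# server_name="0.0.0.0" exposes the app to other machines on your network;
# server_port overrides Gradio's default of 7860.
demo.launch(server_name="0.0.0.0", server_port=7861)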

🎮 User Guide

Step 1: Prepare Data

  • Open the "Data Processing" page, paste your training text, and choose a tokenization method. For better results, check the option to use a tokenizer — it will automatically build a vocabulary based on your text.
  • If you don't want to use a validation set, check the "Skip validation set" option.
  • Click "Start Processing" when you're done.

Here's a small example for demonstration:

[Screenshot: Data Processing]
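
Under the hood, preprocessing follows the spirit of nanoGPT's prepare scripts: the text is turned into token ids and written to binary train/validation files together with the vocabulary. The app does all of this for you, but a rough character-level sketch (file names illustrative) looks like this:

# Rough sketch of character-level preprocessing, in the spirit of nanoGPT.
# File names are illustrative; the UI handles this step for you.
import numpy as np

text = open("input.txt", "r", encoding="utf-8").read()
chars = sorted(set(text))                       # the vocabulary
stoi = {ch: i for i, ch in enumerate(chars)}

ids = np.array([stoi[ch] for ch in text], dtype=np.uint16)
split = int(0.9 * len(ids))                     # 90% train, 10% validation
ids[:split].tofile("train.bin")
ids[split:].tofile("val.bin")
print(f"vocab size: {len(chars)}, train tokens: {split}, val tokens: {len(ids) - split}")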

Step 2: Train the Model

  • Switch to the "Training" page, and adjust the parameters as needed (or leave them as default for a quick try).

  • The training and validation loss curves are displayed in real time. If you generated a validation set in Step 1, you should see two curves: blue for training loss, orange for validation loss.

  • If only one curve is shown, check the terminal output. If you see an error like:

    Error while evaluating val loss: Dataset too small: minimum dataset(val) size is 147, but block size is 512. Either reduce block size or add more data.
    

    it means your block size is too large for the validation set. Try reducing it, for example to 128 (the sketch after this step shows why the block size must fit inside the validation split).

  • You should now see both loss curves updating dynamically.

  • Click "Start Training" and wait for training to complete.

[Screenshot: Training]
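
The block size error above comes from how batches are built: every training or evaluation example is a contiguous window of block_size tokens cut from the token stream, so the validation split must contain at least one full window. Here is a minimal sketch in the style of nanoGPT's batch sampling; names and paths are illustrative.

# Why block size must fit inside the validation split: each example is a
# random window of block_size tokens, so the file needs more tokens than that.
import numpy as np
import torch

def get_batch(path, block_size=128, batch_size=4):
    data = np.memmap(path, dtype=np.uint16, mode="r")
    if len(data) <= block_size:
        raise ValueError(f"Dataset too small: {len(data)} tokens, block size {block_size}")
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([torch.from_numpy(data[i:i + block_size].astype(np.int64)) for i in ix])
    y = torch.stack([torch.from_numpy(data[i + 1:i + 1 + block_size].astype(np.int64)) for i in ix])
    return x, y  # inputs and next-token targets

# A 147-token val.bin cannot supply a 512-token window, but a 128-token one fits.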

Evaluation Mode Only?

  • This mode lets you evaluate the model's loss on the validation set. Set the Number of Evaluation Seeds to any value >0 to activate evaluation-only mode. You'll see how the model performs with different random seeds.
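
Conceptually, evaluating with several seeds just re-draws the random validation batches under different seeds and reports the loss for each, which tells you how much the reported number depends on batch sampling. A sketch with placeholder `model` and `get_batch` objects:

# Conceptual sketch of seed-based evaluation; `model` and `get_batch` stand in
# for the project's own objects.
import torch

@torch.no_grad()
def eval_with_seeds(model, get_batch, seeds=(0, 1, 2), iters=20):
    model.eval()
    results = {}
    for seed in seeds:
        torch.manual_seed(seed)
        losses = []
        for _ in range(iters):
            x, y = get_batch("val")
            _, loss = model(x, y)   # nanoGPT-style models return (logits, loss)
            losses.append(loss.item())
        results[seed] = sum(losses) / len(losses)
    return results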

Step 3: Generate Text

  1. Go to the "Inference" page
  2. Enter a prompt
  3. Click "Generate" and see what the model comes up with!

[Screenshot: Inference]
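
Generation follows the usual nanoGPT recipe: at each step the logits are divided by the temperature, optionally restricted to the top_k most likely tokens, and the next token is sampled from the result. That is why the temperature and top_k settings change how adventurous the output is. A sketch with a placeholder model:

# Sampling loop in nanoGPT's style; `model` is a placeholder for a trained model
# that maps token ids to logits.
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_new_tokens=100, temperature=0.8, top_k=200, block_size=256):
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]              # crop to the context window
        logits, _ = model(idx_cond)
        logits = logits[:, -1, :] / temperature      # higher temperature = more random
        if top_k is not None:
            v, _ = torch.topk(logits, min(top_k, logits.size(-1)))
            logits[logits < v[:, [-1]]] = -float("inf")
        probs = F.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        idx = torch.cat((idx, next_token), dim=1)
    return idx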

Step 4: Model Comparison

  1. Go to the "Comparison" page
  2. Select two models to compare — they can even be the same model with different settings
  3. Their configurations will be displayed automatically
  4. You can input the same prompt and see how both models generate text
  5. Or, apply different inference settings (temperature, top_k, etc.) to compare outputs

[Screenshot: Comparison]

📁 Project Structure

mini-nanogpt/
├── app.py           # App entry point
├── src/             # Configuration and core modules
├── data/            # Data storage
├── out/             # Model checkpoints
└── assets/          # Tokenizer files and other resources

❓ FAQ

Training is running too slowly?

  • 💡 Try reducing batch size or model size
  • 💡 Use a GPU to greatly improve speed
  • 💡 Increase the evaluation interval

The generated text isn’t good?

  • 💡 Try increasing the training data
  • 💡 Tune the model hyperparameters
  • 💡 Adjust the temperature during generation

Want to resume previous training?

  • 💡 On the "Training" page, select "resume" under Initialization
  • 💡 Point to the previous output directory
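
Conceptually, resuming loads the checkpoint saved in the previous output directory and continues training from there; nanoGPT's convention is a ckpt.pt file holding the weights, optimizer state, and iteration counter. A sketch of peeking into such a checkpoint before resuming (adjust the path to your own run; the keys shown are typical, not guaranteed):

# Peek into a saved checkpoint; the ckpt.pt name follows nanoGPT's convention.
import torch

checkpoint = torch.load("out/ckpt.pt", map_location="cpu")
print("iteration reached:", checkpoint.get("iter_num"))
print("best validation loss:", checkpoint.get("best_val_loss"))
# model.load_state_dict(checkpoint["model"])         # restore weights
# optimizer.load_state_dict(checkpoint["optimizer"]) # restore optimizer state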

🤝 Contributing

Suggestions and improvements are welcome! You can:

  • Submit an Issue
  • Open a Pull Request
  • Share your experience using the tool

📝 License

This project is open-sourced under the MIT License.


🎉 Start your GPT journey now!
