Language Modeling Example with PyTorch Lightning and 🤗 Hugging Face Transformers
Language modeling fine-tuning adapts a pre-trained language model to a new domain and benefits downstream tasks such as classification. The script here fine-tunes masked language modeling (MLM) models such as ALBERT, BERT, DistilBERT, and RoBERTa on a text dataset. Details about the models can be found in the Transformers model summary.
The Transformers part of the code is adapted from examples/language-modeling/run_mlm.py. Fine-tuning causal language modeling (CLM) models can be done in a similar way, following run_clm.py.
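At its core, such a script wraps the Transformers model in a LightningModule and lets the model compute the MLM loss. The sketch below is illustrative only; the class and argument names are made up here and are not taken from language_model.py:

```python
import pytorch_lightning as pl
import torch
from transformers import AutoModelForMaskedLM


class MLMFineTuner(pl.LightningModule):
    """Hypothetical LightningModule wrapping a Transformers MLM head."""

    def __init__(self, model_name_or_path="distilbert-base-cased", learning_rate=5e-5):
        super().__init__()
        self.save_hyperparameters()
        self.model = AutoModelForMaskedLM.from_pretrained(model_name_or_path)

    def training_step(self, batch, batch_idx):
        # The batch contains `input_ids`, `attention_mask`, and `labels`
        # produced by DataCollatorForLanguageModeling.
        outputs = self.model(**batch)
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def validation_step(self, batch, batch_idx):
        outputs = self.model(**batch)
        self.log("val_loss", outputs.loss, prog_bar=True)
        # Perplexity is exp(cross-entropy loss), the usual LM metric.
        self.log("val_ppl", torch.exp(outputs.loss), prog_bar=True)

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.hparams.learning_rate)
```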
PyTorch Lightning describes itself as "the lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate." Quoting its documentation:
Organizing your code with PyTorch Lightning makes your code:
- Keep all the flexibility (this is all pure PyTorch), but removes a ton of boilerplate
- More readable by decoupling the research code from the engineering
- Easier to reproduce
- Less error prone by automating most of the training loop and tricky engineering
- Scalable to any hardware without changing your model
Install the dependencies:

pip install -r requirements.txt
To fine-tune a language model, run:
python language_model.py \
--model_name_or_path="The model checkpoint for weights initialization" \
--train_file="The input training data file (a text file)." \
--validation_file="The input validation data file (a text file)."
For example:
python language_model.py \
--model_name_or_path="distilbert-base-cased" \
--train_file="data/wikitext-2/wiki.train.small.raw" \
--validation_file="data/wikitext-2/wiki.valid.small.raw"
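Under the hood, the raw text files passed via --train_file and --validation_file are typically loaded with the 🤗 Datasets library, tokenized, and batched with a masking collator, following run_mlm.py. A rough sketch (file paths taken from the example above; batch size, max length, and other names are illustrative):

```python
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")

# Each line of the raw files becomes one example in a "text" column.
raw_datasets = load_dataset(
    "text",
    data_files={
        "train": "data/wikitext-2/wiki.train.small.raw",
        "validation": "data/wikitext-2/wiki.valid.small.raw",
    },
)

def tokenize(examples):
    return tokenizer(examples["text"], truncation=True, max_length=128)

tokenized = raw_datasets.map(tokenize, batched=True, remove_columns=["text"])

# The collator randomly masks 15% of the tokens and builds the MLM labels;
# run_mlm.py additionally groups lines into fixed-length blocks.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

train_loader = DataLoader(tokenized["train"], batch_size=8, shuffle=True, collate_fn=collator)
val_loader = DataLoader(tokenized["validation"], batch_size=8, collate_fn=collator)
```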
To run a quick “unit test” that uses only 1 training batch and 1 validation batch:
python language_model.py --fast_dev_run
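Flags like --fast_dev_run and --gpus come straight from the Lightning Trainer. A common way to expose them in Lightning 1.x is to let the Trainer extend the script's argument parser, roughly as sketched below (language_model.py may wire this differently):

```python
import argparse
import pytorch_lightning as pl

parser = argparse.ArgumentParser()
parser.add_argument("--model_name_or_path", default="distilbert-base-cased")
# In Lightning 1.x this adds every Trainer flag (--fast_dev_run, --gpus, ...)
# to the script's own parser.
parser = pl.Trainer.add_argparse_args(parser)
args = parser.parse_args()

# Build the Trainer directly from the parsed flags.
trainer = pl.Trainer.from_argparse_args(args)
# trainer.fit(model, train_loader, val_loader)
```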
See language_model.py and the Transformers script for more options.
To run on a GPU:
python language_model.py --gpus=1
To launch tensorboard:
tensorboard --logdir lightning_logs/
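Lightning's default logger is TensorBoard, which is why event files appear under lightning_logs/ without any extra configuration. If you want a custom log directory or run name, a logger can be passed explicitly; the names below are only an illustration:

```python
import pytorch_lightning as pl
from pytorch_lightning.loggers import TensorBoardLogger

# "mlm_finetune" is an arbitrary run name chosen for this illustration.
logger = TensorBoardLogger(save_dir="lightning_logs", name="mlm_finetune")
trainer = pl.Trainer(logger=logger, max_epochs=3)
```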