This directory contains a script, convert_neox_pt_to_huggingface_neox.py, to convert PolyCoder checkpoints trained by gpt-neox into HuggingFace format, and a script, generate.py, to load the converted model and generate code from a given prompt.
Shoutout to @NinedayWang for implementing this!
Requirements: transformers 4.23.1
You can use the convert.sh script to convert a specified model to the HuggingFace format by running ./convert.sh 0-4B (or passing a different model size). This script in turn invokes convert_neox_pt_to_huggingface_neox.py, which you can also call directly as follows:
python convert_neox_pt_to_huggingface_neox.py \
--checkpoint_dir ../checkpoints/checkpoints-0-4B/global_step150000 \
--vocab_file ../Data/code-vocab.json \
--merge_file ../Data/code-merges.txt \
--hf_config_path ./polycoder/configs/config_0-4B.json \
--hf_save_dir ./polycoder/0-4B
HuggingFace configuration files for the different model sizes are provided in polycoder/configs/, including config_0-4B.json, config_2-7B.json and config_160M.json.
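As a rough illustration of how the direct call above generalizes to the other sizes, the following Python sketch assembles the argument list for a given size. The helper name build_convert_command is hypothetical, and the per-size naming of the output directory (./polycoder/<size>) is an assumption extrapolated from the 0-4B example; the checkpoint directory is passed in explicitly because its name may differ between models.

```python
import shlex
import sys

# Sizes for which a config file is listed in polycoder/configs/.
KNOWN_SIZES = ("160M", "0-4B", "2-7B")


def build_convert_command(size, checkpoint_dir,
                          vocab_file="../Data/code-vocab.json",
                          merge_file="../Data/code-merges.txt"):
    """Return the argv list for convert_neox_pt_to_huggingface_neox.py.

    Hypothetical helper: the config and save-dir naming follows the
    0-4B example and is assumed to hold for the other sizes.
    """
    if size not in KNOWN_SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    return [
        sys.executable, "convert_neox_pt_to_huggingface_neox.py",
        "--checkpoint_dir", checkpoint_dir,
        "--vocab_file", vocab_file,
        "--merge_file", merge_file,
        "--hf_config_path", f"./polycoder/configs/config_{size}.json",
        "--hf_save_dir", f"./polycoder/{size}",
    ]


cmd = build_convert_command(
    "0-4B", "../checkpoints/checkpoints-0-4B/global_step150000")
# Print the command; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems if any path contains spaces.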
After the script finishes, the directory specified by --hf_save_dir contains a complete HuggingFace model. If the directory does not exist, it is created automatically.
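One quick way to sanity-check the output directory is to look for the files a HuggingFace model directory usually holds. The sketch below is only a rough check: the helper missing_files is hypothetical, and the file names are assumptions based on what transformers typically writes for a model with a GPT-2 style tokenizer, not a list confirmed by the conversion script. The weight files are left out because their names vary with sharding.

```python
from pathlib import Path

# Assumed file names; the conversion script may write a different set.
ASSUMED_FILES = ("config.json", "vocab.json", "merges.txt")


def missing_files(save_dir, expected=ASSUMED_FILES):
    """Return the expected file names that are absent from save_dir."""
    root = Path(save_dir)
    return [name for name in expected if not (root / name).is_file()]


# An empty list means every assumed file was found.
print(missing_files("./polycoder/0-4B"))
```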
The following example loads the converted 0.4B HuggingFace model and generates code from a given prompt:
python generate.py \
--model_name_or_path ./polycoder/0-4B \
--temperature 0.2 \
--top_p 0.95 \
--max_length 128
You can evaluate models of other sizes by pointing --model_name_or_path at the corresponding converted model directory.
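For readers who want to call the converted model from their own code, here is a minimal sketch of how the flags above could map onto the transformers generation API. It is not the contents of generate.py: the helpers sampling_kwargs and generate_code are hypothetical, and the sketch assumes the converted directory can be loaded through AutoTokenizer and AutoModelForCausalLM.

```python
def sampling_kwargs(temperature=0.2, top_p=0.95, max_length=128):
    """Map the command-line flags to keyword arguments for model.generate."""
    return {
        "do_sample": True,           # sample instead of greedy decoding
        "temperature": temperature,  # lower values give more conservative output
        "top_p": top_p,              # nucleus sampling threshold
        "max_length": max_length,    # prompt plus generated tokens
    }


def generate_code(model_dir, prompt, **overrides):
    """Load a converted model from model_dir and complete the prompt."""
    # Imported here so the helper above can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(inputs.input_ids, **sampling_kwargs(**overrides))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_code("./polycoder/0-4B", "def binary_search(arr, target):"))
```

To try a different size, pass another converted directory as model_dir, mirroring the --model_name_or_path flag.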