Muffin/reward_model at master · wangjs9/Muffin


Download the llama-7b-hf model from Hugging Face (the commands below refer to it as decapoda-research/llama-7b-hf).
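One way to script the download is with the `huggingface_hub` package. The following is a minimal sketch, not part of this repository: the helper name `download_base_model`, the `downloader` parameter, and the local directory `./llama-7b-hf` are hypothetical, and it assumes that the repository ID used in the commands of this README is still available on the Hub.

```python
from pathlib import Path
from typing import Callable, Optional


def download_base_model(
    repo_id: str = "decapoda-research/llama-7b-hf",  # ID taken from the commands in this README
    local_dir: str = "./llama-7b-hf",                # hypothetical target directory
    downloader: Optional[Callable[..., str]] = None,
) -> Path:
    """Fetch a model snapshot and return the directory it was saved to."""
    if downloader is None:
        # Imported lazily so the helper can be defined even when
        # huggingface_hub is not installed.
        from huggingface_hub import snapshot_download
        downloader = snapshot_download
    # snapshot_download returns the path of the downloaded snapshot.
    saved_path = downloader(repo_id=repo_id, local_dir=local_dir)
    return Path(saved_path)
```

Calling `download_base_model()` with no arguments starts the actual download, which needs network access and several gigabytes of disk space. Whether `--base_model` accepts the resulting local path instead of the Hub ID is an assumption to verify against `finetune.py`.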

cd dataset

Download the dataset for empathetic response classification from behavioral-data/Empathy-Mental-Health.

Download the dataset for strategy classification from Motivational-Interviewing-Dataset.

Process the datasets (this step also produces the dataset for coherence classification) using the following commands:

python data_process.py
cd ../

Fine-tune the model, writing the weights to ./lora-alpaca:

python finetune.py --base_model decapoda-research/llama-7b-hf --output_dir ./lora-alpaca

Test the fine-tuned model, loading the weights from ./lora-alpaca:

python test.py --base_model decapoda-research/llama-7b-hf --lora_weight ./lora-alpaca
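The fine-tuning and testing commands share the same base model and the same weight directory, so they can be generated from one place to keep the two in sync. The following is a minimal sketch, not part of this repository: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes the script is run from the directory that contains `finetune.py` and `test.py`.

```python
import subprocess
import sys
from typing import List


def build_commands(
    base_model: str = "decapoda-research/llama-7b-hf",
    lora_dir: str = "./lora-alpaca",
) -> List[List[str]]:
    """Return the fine-tuning and testing commands as argument lists."""
    python = sys.executable  # reuse the interpreter that runs this script
    return [
        [python, "finetune.py", "--base_model", base_model, "--output_dir", lora_dir],
        [python, "test.py", "--base_model", base_model, "--lora_weight", lora_dir],
    ]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it as well when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the pipeline if a step exits with an error.
            subprocess.run(argv, check=True)


commands = build_commands()
run_pipeline(commands, dry_run=True)
```

With `dry_run=True` the script only prints the two commands. Passing `dry_run=False` runs them in order, and testing uses exactly the directory that fine-tuning wrote to.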