Consensus Group Relative Policy Optimization for Text Generation

Scripts for MBR/C-GRPO experiments.

The experiments were conducted using NVIDIA A100 GPUs with 80 GB of VRAM.

Structure

Run the setup script first:

bash scripts/setup.sh

Then run C-GRPO:

bash scripts/run_c_grpo.sh

To run MBR decoding instead, use scripts/run_mbr.sh. You can edit the arguments at the top of each script.
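For readers unfamiliar with MBR (minimum Bayes risk) decoding, the following is a minimal, self-contained sketch of the general idea: among several sampled outputs, pick the one that agrees most with the others. It is not the code in scripts/run_mbr.sh or src. The names `token_f1` and `mbr_decode` are hypothetical, and the token-overlap utility is a toy stand-in for whatever metric the experiments actually use.

```python
from collections import Counter


def token_f1(hyp: str, ref: str) -> float:
    """Toy utility (assumption): F1 over whitespace tokens."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_decode(candidates, utility=token_f1):
    """Return the candidate with the highest average utility against the others."""
    if not candidates:
        raise ValueError("candidates must be non-empty")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i):
        # Treat every other candidate as a pseudo-reference.
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], ref) for ref in others) / len(others)

    best_index = max(range(len(candidates)), key=expected_utility)
    return candidates[best_index]


# Hypothetical samples standing in for model outputs.
samples = [
    "the cat sat on the mat",
    "a cat sat on the mat",
    "the cat is on the mat",
    "dogs bark loudly",
]
best = mbr_decode(samples)
print(best)
```

The selection compares every candidate with every other one, so the cost grows quadratically with the number of samples. In practice the utility would be a task-appropriate metric rather than token overlap.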
