SFT configs for Qwen coder models #438

edbeeching · 2025-02-26T09:13:21Z

Saving for reference, no need to review:

LR scan

sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v00.00 zero3 "--learning_rate 5e-6 --hub_model_revision v00.02 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v00.02"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v00.00 zero3 "--learning_rate 1e-5 --hub_model_revision v00.03 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v00.03"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v00.00 zero3 "--learning_rate 2e-5 --hub_model_revision v00.04 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v00.04"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v00.01 zero3 "--learning_rate 5e-6 --hub_model_revision v00.05 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v00.05"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v00.01 zero3 "--learning_rate 1e-5 --hub_model_revision v00.06 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v00.06"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v00.01 zero3 "--learning_rate 2e-5 --hub_model_revision v00.07 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v00.07"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v00.01 zero3 "--learning_rate 4e-5 --hub_model_revision v00.08 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v00.08"

sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v01.00 zero3 "--learning_rate 5e-6 --hub_model_revision v01.02 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v01.02"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v01.00 zero3 "--learning_rate 1e-5 --hub_model_revision v01.03 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v01.03"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v01.00 zero3 "--learning_rate 2e-5 --hub_model_revision v01.04 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v01.04"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v01.01 zero3 "--learning_rate 5e-6 --hub_model_revision v01.05 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v01.05"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v01.01 zero3 "--learning_rate 1e-5 --hub_model_revision v01.06 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v01.06"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v01.01 zero3 "--learning_rate 2e-5 --hub_model_revision v01.07 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v01.07"
sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v01.01 zero3 "--learning_rate 4e-5 --hub_model_revision v01.08 --output_dir data/open-r1/Qwen2.5-Coder-7B-Instruct-SFT-v01.08"
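
For reference, a grid like the one above could be generated with a small loop instead of being written out by hand. This is only a dry-run sketch: `build_cmd` and `scan` are hypothetical helpers (not part of the repo), and it assumes revisions are numbered consecutively from a given start index, as in the commands above. It prints the commands rather than submitting them.

```shell
#!/usr/bin/env sh
# Dry-run sketch: prints sbatch commands for one config of the LR scan.
# build_cmd and scan are hypothetical helpers, not part of open-r1.

build_cmd() {
  # $1 = recipe config version, $2 = learning rate, $3 = hub revision
  local config="$1" lr="$2" rev="$3"
  local model="Qwen2.5-Coder-7B-Instruct"
  printf 'sbatch --job-name or1_sft_Qwen2.5-Coder-7B --nodes 1 slurm/train.slurm %s sft %s zero3 "--learning_rate %s --hub_model_revision %s --output_dir data/open-r1/%s-SFT-%s"\n' \
    "$model" "$config" "$lr" "$rev" "$model" "$rev"
}

scan() {
  # $1 = config version, $2 = revision prefix, $3 = first revision index,
  # remaining arguments = learning rates to scan
  local config="$1" prefix="$2" idx="$3" lr
  shift 3
  for lr in "$@"; do
    build_cmd "$config" "$lr" "$(printf '%s.%02d' "$prefix" "$idx")"
    idx=$((idx + 1))
  done
}

# Reproduces the four v00.01 runs (revisions v00.05 to v00.08).
scan v00.01 v00 5 5e-6 1e-5 2e-5 4e-5
```

Piping the output to `sh` would submit the jobs; reviewing the printed commands first makes revision/output-dir mismatches easier to spot.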

lewtun · 2025-03-03T15:21:48Z

My scans

# V02.0X = lr scan, packing=true
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v02.00 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v02.00 zero3 '--learning_rate=1.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v02.00 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v02.00 --run_name=qwen-7B_codeforces-cot_v02.00 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v02.01 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v02.00 zero3 '--learning_rate=2.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v02.01 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v02.01 --run_name=qwen-7B_codeforces-cot_v02.01 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v02.02 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v02.00 zero3 '--learning_rate=4.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v02.02 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v02.02 --run_name=qwen-7B_codeforces-cot_v02.02 --wandb_entity huggingface --wandb_project open-r1'
# V02.1X = lr scan, packing=false
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v02.10 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v02.00 zero3 '--learning_rate=1.0e-5 --packing=false --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v02.10 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v02.10 --run_name=qwen-7B_codeforces-cot_v02.10 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v02.11 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v02.00 zero3 '--learning_rate=2.0e-5 --packing=false --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v02.11 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v02.11 --run_name=qwen-7B_codeforces-cot_v02.11 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v02.12 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v02.00 zero3 '--learning_rate=4.0e-5 --packing=false --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v02.12 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v02.12 --run_name=qwen-7B_codeforces-cot_v02.12 --wandb_entity huggingface --wandb_project open-r1'

# V03.0X = lr scan, packing=true
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v03.00 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v03.00 zero3 '--learning_rate=1.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v03.00 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v03.00 --run_name=qwen-7B_codeforces-cot_v03.00 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v03.01 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v03.00 zero3 '--learning_rate=2.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v03.01 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v03.01 --run_name=qwen-7B_codeforces-cot_v03.01 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v03.02 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v03.00 zero3 '--learning_rate=4.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v03.02 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v03.02 --run_name=qwen-7B_codeforces-cot_v03.02 --wandb_entity huggingface --wandb_project open-r1'

# V04.0X = lr scan, packing=true
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v04.00 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v04.00 zero3 '--learning_rate=1.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v04.00 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v04.00 --run_name=qwen-7B_codeforces-cot_v04.00 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v04.01 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v04.00 zero3 '--learning_rate=2.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v04.01 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v04.01 --run_name=qwen-7B_codeforces-cot_v04.01 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v04.02 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v04.00 zero3 '--learning_rate=4.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v04.02 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v04.02 --run_name=qwen-7B_codeforces-cot_v04.02 --wandb_entity huggingface --wandb_project open-r1'

# V05.0X = lr scan, packing=true
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v05.00 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v05.00 zero3 '--learning_rate=1.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v05.00 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v05.00 --run_name=qwen-7B_codeforces-cot_v05.00 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v05.01 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v05.00 zero3 '--learning_rate=2.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v05.01 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v05.01 --run_name=qwen-7B_codeforces-cot_v05.01 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v05.02 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v05.00 zero3 '--learning_rate=4.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v05.02 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v05.02 --run_name=qwen-7B_codeforces-cot_v05.02 --wandb_entity huggingface --wandb_project open-r1'

# V06.0X = lr scan, packing=true
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v06.00 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v06.00 zero3 '--learning_rate=1.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v06.00 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v06.00 --run_name=qwen-7B_codeforces-cot_v06.00 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v06.01 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v06.00 zero3 '--learning_rate=2.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v06.01 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v06.01 --run_name=qwen-7B_codeforces-cot_v06.01 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v06.02 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v06.00 zero3 '--learning_rate=4.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v06.02 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v06.02 --run_name=qwen-7B_codeforces-cot_v06.02 --wandb_entity huggingface --wandb_project open-r1'
# V06.1X = lr scan, packing=false
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v06.10 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v06.00 zero3 '--learning_rate=1.0e-5 --packing=false --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v06.10 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v06.10 --run_name=qwen-7B_codeforces-cot_v06.10 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v06.11 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v06.00 zero3 '--learning_rate=2.0e-5 --packing=false --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v06.11 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v06.11 --run_name=qwen-7B_codeforces-cot_v06.11 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v06.12 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v06.00 zero3 '--learning_rate=4.0e-5 --packing=false --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v06.12 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v06.12 --run_name=qwen-7B_codeforces-cot_v06.12 --wandb_entity huggingface --wandb_project open-r1'

# V07.0X = lr scan, packing=true
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v07.00 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v07.00 zero3 '--learning_rate=1.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v07.00 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v07.00 --run_name=qwen-7B_codeforces-cot_v07.00 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v07.01 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v07.00 zero3 '--learning_rate=2.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v07.01 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v07.01 --run_name=qwen-7B_codeforces-cot_v07.01 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v07.02 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v07.00 zero3 '--learning_rate=4.0e-5 --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v07.02 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v07.02 --run_name=qwen-7B_codeforces-cot_v07.02 --wandb_entity huggingface --wandb_project open-r1'
# V07.1X = lr scan, packing=false
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v07.10 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v07.00 zero3 '--learning_rate=1.0e-5 --packing=false --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v07.10 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v07.10 --run_name=qwen-7B_codeforces-cot_v07.10 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v07.11 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v07.00 zero3 '--learning_rate=2.0e-5 --packing=false --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v07.11 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v07.11 --run_name=qwen-7B_codeforces-cot_v07.11 --wandb_entity huggingface --wandb_project open-r1'
sbatch --mail-type=ALL [email protected]  --output=/fsx/h4/logs/%x-%j.out --err=/fsx/h4/logs/%x-%j.err --job-name=qwen-7B_codeforces-cot_v07.12 --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v07.00 zero3 '--learning_rate=4.0e-5 --packing=false --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=v07.12 --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-v07.12 --run_name=qwen-7B_codeforces-cot_v07.12 --wandb_entity huggingface --wandb_project open-r1'
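
The revision naming used in these scans (`vMM.PN`: `MM` is the recipe version, `P` is 0 for the recipe's packing setting or 1 for `--packing=false`, `N` indexes the learning rate) can be sketched as a loop. This is a dry-run illustration only: `scan_recipe` is a hypothetical helper, the mail and log flags are left out for brevity, and it assumes packing is enabled by the recipe when the flag is omitted, as the comments above suggest.

```shell
#!/usr/bin/env sh
# Dry-run sketch of the vMM.PN naming scheme; prints commands, submits nothing.
# scan_recipe is a hypothetical helper, not part of open-r1.

scan_recipe() {
  # $1 = recipe major version (e.g. 07), $2 = packing digit (0 or 1)
  local major="$1" packing_digit="$2" extra="" i=0 lr rev
  if [ "$packing_digit" -eq 1 ]; then
    extra=" --packing=false"
  fi
  for lr in 1.0e-5 2.0e-5 4.0e-5; do
    rev="v${major}.${packing_digit}${i}"
    printf '%s\n' "sbatch --job-name=qwen-7B_codeforces-cot_${rev} --nodes=1 slurm/train.slurm Qwen2.5-Coder-7B-Instruct sft v${major}.00 zero3 '--learning_rate=${lr}${extra} --hub_model_id=open-r1/Qwen2.5-Coder-7B-Instruct-SFT --hub_model_revision=${rev} --output_dir=data/Qwen2.5-Coder-7B-Instruct-SFT-${rev} --run_name=qwen-7B_codeforces-cot_${rev} --wandb_entity huggingface --wandb_project open-r1'"
    i=$((i + 1))
  done
}

scan_recipe 07 0   # v07.00 to v07.02, recipe packing setting
scan_recipe 07 1   # v07.10 to v07.12, --packing=false
```

Deriving the job name, hub revision, output directory and run name from a single `rev` variable keeps the four fields in sync across a scan.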

edbeeching and others added 5 commits February 26, 2025 09:11

configs

56f9257

Merge branch 'main' into qwen-coder-sft-configs

dae6e9a

Add codeforces recipes

2080600

Add v06

a6f44b2

Merge branch 'main' into qwen-coder-sft-configs

d9b7074

lewtun added 7 commits March 3, 2025 15:22

Add v07

ba27a99

Merge branch 'main' into qwen-coder-sft-configs

bc281a2

Add v08

f82658d

Add 32B recipe

0523624

Disable Liger

6bab2d8

Add fsdp

4bb2495

Fix optim

a6b8da7
