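Hello! I have recently been tuning the CHORD algorithm on my own dataset. CHORD exposes many configurable hyperparameters, and I am not sure where to start when tuning them. Do you have any suggestions? Thanks.
Also, are the hyperparameters in the sample YAML below the ones you ultimately used in your experiments?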
algorithm:
  algorithm_type: mix_chord
  repeat_times: 8  # or 16 for better performance in math related tasks
  kl_loss_fn_args:
    kl_coef: 0.0
  sample_strategy_args:
    expert_data_ratio: 0.20
  policy_loss_fn_args:  # feel free to change, we encourage you to try out different hyperparameters
    mu_warmup_steps: 200  # 0 for chord-mu and chord-phi
    mu_decay_steps: 400  # 200 for chord-mu and 0 for chord-phi
    mu_peak: 0.5  # 0.9 for chord-mu and 0.1 for chord-phi
    mu_valley: 0.02  # 0.05 for chord-mu and 0.1 for chord-phi
    enable_phi_function: true  # false for chord-mu and true for chord-phi
    clip_range: 0.2
    sft_loss_agg_mode: "token-mean"
    use_dynamic_bsz: true
    ppo_mini_batch_size: 320  # 320 = 256 + 64; if you set repeat_times = 16, then it should be 32 * 16 + 64
    ppo_micro_batch_size_per_gpu: 4
    ngpus_trainer: 4
    train_batch_size_expert: 64
    train_batch_size_usual: 256  # 32 batch size * 8 repeat times
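To check that I am reading the batch-size comments correctly, here is a minimal Python sketch of the arithmetic they describe. The function names and the `prompts_per_batch` argument are my own and hypothetical, not part of the library; the value 32 comes from the comment on `train_batch_size_usual`.

```python
def usual_batch_size(prompts_per_batch: int, repeat_times: int) -> int:
    """Rollout samples per step: each prompt is sampled `repeat_times` times."""
    return prompts_per_batch * repeat_times


def mini_batch_size(prompts_per_batch: int, repeat_times: int,
                    expert_batch_size: int) -> int:
    """Rollout samples plus expert samples, as in the comment '320 = 256 + 64'."""
    return usual_batch_size(prompts_per_batch, repeat_times) + expert_batch_size


def check_config(policy_args: dict, prompts_per_batch: int,
                 repeat_times: int) -> list:
    """Return a list of mismatches between the config values and the arithmetic.

    `policy_args` is assumed to be a plain dict holding the three batch-size
    keys from the sample YAML; how the real config is loaded is not shown here.
    """
    problems = []
    usual = usual_batch_size(prompts_per_batch, repeat_times)
    if policy_args["train_batch_size_usual"] != usual:
        problems.append(f"train_batch_size_usual should be {usual}")
    total = usual + policy_args["train_batch_size_expert"]
    if policy_args["ppo_mini_batch_size"] != total:
        problems.append(f"ppo_mini_batch_size should be {total}")
    return problems


sample = {
    "ppo_mini_batch_size": 320,
    "train_batch_size_expert": 64,
    "train_batch_size_usual": 256,
}
print(check_config(sample, prompts_per_batch=32, repeat_times=8))   # no mismatches
print(check_config(sample, prompts_per_batch=32, repeat_times=16))  # two mismatches
```

If this reading is right, switching to `repeat_times: 16` means changing `train_batch_size_usual` to 512 and `ppo_mini_batch_size` to 576, not just `repeat_times` itself.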
Thanks!