What are the recommended hyperparameter settings for chord? #355

@waltonfuture

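Hello! I have recently been tuning the chord algorithm on my own dataset. I noticed that chord exposes many configurable hyperparameters, and I am not sure where to start when tuning them. Do you have any suggestions? Thanks.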

Also, are the hyperparameters in the sample YAML below the ones you used for your final experiments?

algorithm:
  algorithm_type: mix_chord
  repeat_times: 8 # or 16 for better performance in math related tasks
  kl_loss_fn_args:
    kl_coef: 0.0
  sample_strategy_args:
    expert_data_ratio: 0.20
  policy_loss_fn_args: # feel free to change, we encourage you to try out different hyperparameters
    mu_warmup_steps: 200  # 0 for chord-mu and chord-phi
    mu_decay_steps: 400 # 200 for chord-mu and 0 for chord-phi
    mu_peak: 0.5 # 0.9 for chord-mu and 0.1 for chord-phi
    mu_valley: 0.02 # 0.05 for chord-mu and 0.1 for chord-phi
    enable_phi_function: true # false for chord-mu and true for chord-phi
    clip_range: 0.2
    sft_loss_agg_mode: "token-mean"
    use_dynamic_bsz: true
    ppo_mini_batch_size: 320 # 320 = 256 + 64; if you set repeat_times = 16, then it should be 32 * 16 + 64 = 576
    ppo_micro_batch_size_per_gpu: 4
    ngpus_trainer: 4
    train_batch_size_expert: 64
    train_batch_size_usual: 256 # 32 batchsize * 8 repeat times
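
For reference, the batch-size arithmetic in the YAML comments can be checked with a small sketch. The function name `chord_batch_sizes` and its parameter names are hypothetical helpers, not part of the library. The rule `ppo_mini_batch_size = train_batch_size_usual + train_batch_size_expert` is only my reading of the comments above.

```python
def chord_batch_sizes(prompts_per_step: int, repeat_times: int, expert_batch: int) -> dict:
    """Recompute the batch-size fields from the YAML comments (hypothetical helper).

    Assumes, following the comments in the sample config:
      train_batch_size_usual = prompts_per_step * repeat_times
      ppo_mini_batch_size    = train_batch_size_usual + train_batch_size_expert
    """
    usual = prompts_per_step * repeat_times
    mini = usual + expert_batch
    return {
        "train_batch_size_usual": usual,
        "train_batch_size_expert": expert_batch,
        "ppo_mini_batch_size": mini,
        # Share of expert samples in one mini-batch, for comparison with expert_data_ratio.
        "expert_fraction": expert_batch / mini,
    }


if __name__ == "__main__":
    for repeat_times in (8, 16):
        print(repeat_times, chord_batch_sizes(32, repeat_times, 64))
```

With `repeat_times: 8` this gives 256 + 64 = 320, and the expert share 64 / 320 is 0.20, the same value as `expert_data_ratio` in the sample. With `repeat_times: 16` it gives 512 + 64 = 576. I do not know whether `expert_data_ratio` has to be kept in sync with these sizes.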

Thanks!
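
To make the question about the `mu_*` parameters concrete, here is a minimal sketch of how I understand them from their names alone: a warmup from 0 to `mu_peak` over `mu_warmup_steps`, then a decay to `mu_valley` over `mu_decay_steps`. The function `mu_schedule`, the linear warmup, and the cosine decay are all assumptions of mine. The schedule actually implemented in the repository may have a different shape.

```python
import math


def mu_schedule(step: int, warmup_steps: int, decay_steps: int,
                peak: float, valley: float) -> float:
    """Hypothetical warmup-then-decay schedule for mu (shape is an assumption)."""
    # Assumed linear warmup from 0 up to `peak`.
    if warmup_steps > 0 and step < warmup_steps:
        return peak * step / warmup_steps
    decay_step = step - warmup_steps
    # After the decay window (or when there is none), stay at `valley`.
    if decay_steps <= 0 or decay_step >= decay_steps:
        return valley
    # Assumed cosine decay from `peak` down to `valley`.
    progress = decay_step / decay_steps
    return valley + 0.5 * (peak - valley) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Values taken from the sample YAML: warmup 200, decay 400, peak 0.5, valley 0.02.
    for step in (0, 100, 200, 400, 600, 1000):
        print(step, round(mu_schedule(step, 200, 400, 0.5, 0.02), 4))
```

Under this reading, the values suggested in the comments for chord-phi (warmup 0, decay 0, peak 0.1, valley 0.1) reduce to a constant mu of 0.1, which is why I am unsure which of these four parameters matter most when tuning.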
