
Tiny refactor and cleanup #75

Open · wants to merge 36 commits into main

Conversation

fzyzcjy
Contributor

@fzyzcjy fzyzcjy commented Jan 2, 2025

This is a very tiny code cleanup containing:

  1. Refactor update_sampling_params. It originally used a global variable and updated state back and forth; it now uses immutable local variables.
  2. Try to avoid mutating the config in place, which can cause confusion and subtle logic bugs (see the sketch after this list).
  3. Add val_before_train: true to the default config. Otherwise, users have to write +trainer.val_before_train=..., while all other options follow the style trainer.val_before_train=...
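
A minimal sketch of the motivation behind item 2, assuming a toy config object. TrainerConfig, normalize_inplace, and normalized_batch_size are invented for illustration and are not the helpers used in this PR (the PR's actual helper is config_normalize_batch_size):

```python
from dataclasses import dataclass

@dataclass
class TrainerConfig:
    train_batch_size: int = 256

def normalize_inplace(config: TrainerConfig, num_workers: int) -> None:
    # Mutating style: after this call it is unclear whether train_batch_size
    # still holds the user-provided value or an already-derived one.
    config.train_batch_size = config.train_batch_size // num_workers

def normalized_batch_size(config: TrainerConfig, num_workers: int) -> int:
    # Pure style: return the derived value and leave the config untouched.
    return config.train_batch_size // num_workers

config = TrainerConfig()
per_worker = normalized_batch_size(config, num_workers=8)  # 32
assert config.train_batch_size == 256  # original config unchanged
```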

Note: This PR starts from #74, so the GitHub diff page will be clearer after #74 is merged. (Ignore this line; that PR is already merged.)

@fzyzcjy fzyzcjy changed the title from "Tiny refactor / cleanup" to "[WIP] Tiny refactor / cleanup" Jan 2, 2025
@fzyzcjy fzyzcjy marked this pull request as ready for review January 2, 2025 13:06
@fzyzcjy fzyzcjy changed the title from "[WIP] Tiny refactor / cleanup" to "Tiny refactor and cleanup" Jan 2, 2025
Collaborator

@eric-haibin-lin eric-haibin-lin left a comment

thx for the contribution!

verl/trainer/config/ppo_megatron_trainer.yaml (review thread resolved)
@PeterSH6
Collaborator

PeterSH6 commented Jan 4, 2025

Nice work! I'll review it later today.

@fzyzcjy
Contributor Author

fzyzcjy commented Jan 4, 2025

> thx for the contribution!

You are welcome!

> Nice work!

Thanks!

> I'll review it later today.

Take your time :)

@@ -20,6 +20,8 @@

import os

from verl.utils.config import config_normalize_batch_size
Collaborator

Can you move this import above L35? That would be tidier.

Contributor Author

(will do that in the next batch, after the discussions on the other points)

verl/utils/config.py (review thread resolved)
@contextmanager
def update_sampling_params(self, **kwargs):
    # update sampling params
    old_sampling_params_args = {}
Collaborator

Our original implementation also used a local variable while maintaining only one SamplingParams.

It seems that your refactor introduces a deepcopy.
Could you clarify the benefit of your refactor a bit more? Thanks!

Contributor Author

Sure, I personally think the refactor is a little better, because we now avoid mutable shared state, which is generally not preferable in software engineering (it is a bit less readable and more error-prone). The deepcopy is slower, but the time cost seems negligible, especially compared with the very heavy generate-sequences step.

Anyway, as the title suggests, this is just a tiny refactor, so feel free to revert it if you do not want it :)
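
A minimal sketch of the two styles discussed above; this is not verl's actual code, and EngineSketch together with both method names are invented for illustration of the trade-off only:

```python
from contextlib import contextmanager
from copy import deepcopy

class EngineSketch:
    """Hypothetical stand-in for an engine holding a single shared sampling-params dict."""

    def __init__(self):
        self.sampling_params = {"temperature": 1.0, "top_p": 1.0}

    @contextmanager
    def update_sampling_params_mutating(self, **kwargs):
        # Original style: mutate the shared dict, remember the old values, restore on exit.
        old_args = {k: self.sampling_params[k] for k in kwargs}
        self.sampling_params.update(kwargs)
        try:
            yield self.sampling_params
        finally:
            self.sampling_params.update(old_args)

    @contextmanager
    def update_sampling_params_copying(self, **kwargs):
        # Refactored style: work on a temporary copy; the shared dict is never touched,
        # at the cost of one deepcopy per call.
        params = deepcopy(self.sampling_params)
        params.update(kwargs)
        yield params

engine = EngineSketch()
with engine.update_sampling_params_copying(temperature=0.0) as params:
    assert params["temperature"] == 0.0
assert engine.sampling_params["temperature"] == 1.0  # shared state untouched
```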

This reverts commit 0bf552d.