[ELECTRA/TensorFlow2] Minor: README Has Misleading Description Of Warmup #1328

Open
psharpe99 opened this issue Jul 13, 2023 · 0 comments
Labels
bug Something isn't working

Related to ELECTRA/TensorFlow2

Describe the bug
The README describes the warmup steps as a percentage of training steps, but the default values it shows are integer step counts, not percentages:

README.md:- <warmup_steps_p1> is the percentage of training steps used for warm-up at the start of training. Default is 2000.
README.md:- <warmup_steps_p2> is the percentage of training steps used for warm-up at the start of training. Default is 200.

and

--num_warmup_steps NUM_WARMUP_STEPS
- Number of steps of training to perform linear learning
rate warmup for. For example, 0.1 = 10% of training.

Both descriptions mix up a step count with a fraction of training, which is misleading.
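
To show why the wording matters, here is a minimal sketch of the two possible readings applied to the values quoted above. The function names and the 10,000-step total are hypothetical and chosen only for illustration; this is not how the ELECTRA scripts parse the flag.

```python
def warmup_as_fraction(value, total_steps):
    """Reading 1 (hypothetical): value is a fraction of total training steps."""
    return round(value * total_steps)


def warmup_as_step_count(value):
    """Reading 2 (hypothetical): value is already an integer number of steps."""
    return int(value)


total_steps = 10_000  # assumed total, for illustration only

# The README default of 2000 only makes sense as a step count.
print(warmup_as_step_count(2000))             # 2000
print(warmup_as_fraction(2000, total_steps))  # 20000000

# The help-text example of 0.1 only makes sense as a fraction.
print(warmup_as_fraction(0.1, total_steps))   # 1000
print(warmup_as_step_count(0.1))              # 0
```

Neither reading gives sensible results for both the documented defaults and the help-text example, so the documentation needs to pick one.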

To Reproduce
See README.md

Expected behavior
The text should be changed to make clear that this value is an integer number of steps.
It is also unclear whether the warmup steps are counted as part of the training steps, in which case the warmup steps must be strictly less than the training steps.
For example, with
training steps: 10,000
warmup steps: 2,000
does this leave 8,000 steps for actual training, or are the 2,000 warmup steps performed first, followed by 10,000 actual training steps?
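
The two interpretations can be sketched as follows. This is only an illustration of the question, with hypothetical function and parameter names; which behaviour the ELECTRA training script actually implements is exactly what the README should state.

```python
def step_budget(train_steps, warmup_steps, warmup_included):
    """Return how the steps split up under each interpretation (hypothetical helper)."""
    if warmup_included:
        # Warmup is carved out of the training steps, so it must be strictly smaller.
        if warmup_steps >= train_steps:
            raise ValueError("warmup_steps must be strictly less than train_steps")
        return {
            "warmup": warmup_steps,
            "post_warmup": train_steps - warmup_steps,
            "total": train_steps,
        }
    # Warmup runs first, then the full number of training steps follows.
    return {
        "warmup": warmup_steps,
        "post_warmup": train_steps,
        "total": train_steps + warmup_steps,
    }


def warmup_scale(step, warmup_steps):
    """Linear warmup multiplier for the learning rate, capped at 1.0."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, step / warmup_steps)


print(step_budget(10_000, 2_000, warmup_included=True))
print(step_budget(10_000, 2_000, warmup_included=False))
print(warmup_scale(1_000, 2_000))
```

Under the first interpretation the run is 10,000 steps in total; under the second it is 12,000.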

Environment
N/A

@psharpe99 psharpe99 added the bug Something isn't working label Jul 13, 2023