Describe the bug
The README describes the warmup steps as a percentage of the training steps, but the default values it lists are integer step counts rather than percentages:
README.md:- <warmup_steps_p1> is the percentage of training steps used for warm-up at the start of training. Default is 2000.
README.md:- <warmup_steps_p2> is the percentage of training steps used for warm-up at the start of training. Default is 200.
and
--num_warmup_steps NUM_WARMUP_STEPS
- Number of steps of training to perform linear learning
rate warmup for. For example, 0.1 = 10% of training.
This is misleading.
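To make the conflict concrete, here is a minimal sketch of how the same value is read under each description. It is not the repository's code; the function names `warmup_as_fraction` and `warmup_as_step_count` are hypothetical and exist only for illustration.

```python
def warmup_as_fraction(value: float, num_train_steps: int) -> int:
    """Hypothetical reading 1: value is a fraction of training (0.1 = 10%)."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"a fraction must lie in [0, 1], got {value}")
    return int(round(value * num_train_steps))


def warmup_as_step_count(value: float) -> int:
    """Hypothetical reading 2: value is an integer number of steps."""
    if value < 0 or value != int(value):
        raise ValueError(f"a step count must be a non-negative integer, got {value}")
    return int(value)


# The help text's example (0.1) only makes sense under reading 1.
print(warmup_as_fraction(0.1, num_train_steps=10_000))

# The README defaults (2000 and 200) only make sense under reading 2.
print(warmup_as_step_count(2000))
```

Under the fraction reading a default of 2000 is rejected outright, and under the step-count reading 0.1 is rejected, so the two descriptions cannot both be correct.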
To Reproduce
See README.md
Expected behavior
The text should be changed to reflect that this is intended to be an integer number of steps.
It is also unclear whether the warmup steps are counted as part of the training steps, in which case the number of warmup steps must be strictly less than the number of training steps.
For example, with
training steps: 10,000
warmup steps: 2,000
does this leave 8,000 steps for actual training, or are the 2,000 warmup steps performed first, followed by 10,000 actual training steps?
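The two readings can be sketched as follows. This is an illustration only, not the repository's implementation; `step_budget` and its `warmup_included` parameter are hypothetical names.

```python
def step_budget(train_steps: int, warmup_steps: int, warmup_included: bool) -> dict:
    """Return the step counts implied by each reading of the warmup setting.

    warmup_included=True:  warmup is carved out of train_steps.
    warmup_included=False: warmup runs first, then train_steps follow.
    """
    if warmup_included:
        # Warmup consumes part of the budget, so it must be strictly smaller.
        if warmup_steps >= train_steps:
            raise ValueError("warmup_steps must be strictly less than train_steps")
        post_warmup = train_steps - warmup_steps
        total = train_steps
    else:
        post_warmup = train_steps
        total = train_steps + warmup_steps
    return {"warmup": warmup_steps, "post_warmup": post_warmup, "total": total}


# Numbers from the example above.
print(step_budget(10_000, 2_000, warmup_included=True))
print(step_budget(10_000, 2_000, warmup_included=False))
```

Only the first reading imposes the "strictly less than" constraint; stating which reading applies in the README would settle both questions.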
Environment
N/A
Related to ELECTRA/TensorFlow2