#1527 adds a script to convert ElectraForPretrain parameters to the ElectraModel format. The SQuAD script doesn't allow specifying the generator dimension and layer scaling factors, so the checkpoints still can't be loaded there.
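For readers unfamiliar with what such a conversion involves, the sketch below shows the general idea: keep only the discriminator-backbone parameters from the pretraining checkpoint and strip their prefix so the names match a bare ElectraModel. This is not the actual #1527 script; the `backbone_model.` prefix and the flat `mx.nd.save` checkpoint layout are assumptions made for illustration.

```python
# Hypothetical sketch of converting an ElectraForPretrain checkpoint into
# parameters loadable by ElectraModel. The "backbone_model." prefix and the
# flat mx.nd.save layout are assumptions, not the layout used by #1527.
import mxnet as mx

def convert_pretrain_to_backbone(src_path, dst_path, prefix='backbone_model.'):
    params = mx.nd.load(src_path)          # {parameter name: NDArray}
    backbone = {}
    for name, value in params.items():
        if name.startswith(prefix):
            # Keep only the discriminator backbone, with the prefix stripped,
            # so the names line up with a standalone ElectraModel.
            backbone[name[len(prefix):]] = value
        # Generator and RTD-head parameters are dropped.
    mx.nd.save(dst_path, backbone)

convert_pretrain_to_backbone('electra_pretrain.params', 'electra_backbone.params')
```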
Labels: bug (Something isn't working), enhancement (New feature or request)
Description
As part of #1413, I was running the ELECTRA-base model and found several issues along the way.
- dataloader KeyError (error message)
- The pre-training script can't resume from the last checkpoint (Pre-training scripts should allow resuming from checkpoints #1526); a rough sketch of such resume logic follows this list
- SQuAD parameter loading (error message)
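The resume support #1526 asks for usually amounts to persisting the model parameters, the trainer (optimizer) states, and the step counter together, then picking up the newest checkpoint at startup. Below is a minimal sketch assuming a Gluon model and Trainer; the checkpoint directory layout and the `step-N` file naming are assumptions, not gluon-nlp's actual format.

```python
# Rough sketch of checkpoint save/resume for a Gluon training loop; the
# directory layout and "step-N" naming are assumptions for illustration.
import os
import glob

def save_checkpoint(ckpt_dir, step, model, trainer):
    # With Horovod one would typically call this from rank 0 only.
    os.makedirs(ckpt_dir, exist_ok=True)
    model.save_parameters(os.path.join(ckpt_dir, f'step-{step}.params'))
    trainer.save_states(os.path.join(ckpt_dir, f'step-{step}.states'))

def load_latest_checkpoint(ckpt_dir, model, trainer):
    """Return the step to resume from (0 if no checkpoint exists)."""
    ckpts = glob.glob(os.path.join(ckpt_dir, 'step-*.params'))
    if not ckpts:
        return 0
    step_of = lambda p: int(p.split('step-')[-1].split('.')[0])
    latest = max(ckpts, key=step_of)
    model.load_parameters(latest)
    trainer.load_states(latest.replace('.params', '.states'))
    return step_of(latest)
```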
To Reproduce
Follow the steps in https://github.com/dmlc/gluon-nlp/blob/09f343564e4f735df52e212df87ca073a824e829/scripts/pretraining/README.md. See below for the exact commands I used.
Steps to reproduce
Environment
I ran both scripts on a p4dn.24xlarge instance with an environment bootstrapped by this CloudFormation template. Details on some important dependencies:
Horovod 0.21.1, installed with:
HOROVOD_WITH_MXNET=1 HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_WITH_GLOO=1 python3 -m pip install --no-cache-dir horovod
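If the Horovod build is in question, running `horovodrun --check-build` should list which frameworks (e.g. MXNet) and collective backends (e.g. NCCL, Gloo) were compiled in, which is a quick way to confirm the flags above took effect.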