Hi, I'm training a QA model with the same data pipeline. With bert-wwm I can set the batch size to 12, but with albert-xxlarge-v2 I can only go up to 6. Yet the albert-xxlarge-v2 model file is only about 900 MB, while the bert-wwm file is 1400 MB. What could possibly cause this?
@fatmelon Is there really a 1400 MB bert-wwm model???
@lonePatient https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin
You're still missing what ALBERT's advantage actually is. A small model file does not mean a small GPU memory footprint. For example, albert-base is only 50-60 MB on disk, yet it uses roughly the same GPU memory as bert-base (300+ MB).
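The point above can be made concrete with a rough back-of-envelope sketch. ALBERT shares one set of transformer weights across all layers, so the checkpoint is small, but the forward/backward pass still stores activations for every layer, and albert-xxlarge's hidden size (4096) is 4x bert-large's (1024). The numbers below are approximate published config values; the formulas deliberately ignore attention maps, the FFN's wider intermediate activations, biases, and optimizer state, so treat them as an illustration, not an exact memory model.

```python
# Back-of-envelope sketch of why a smaller checkpoint can need MORE
# GPU memory at train time. Approximate configs:
#   albert-xxlarge-v2: 12 layers, hidden 4096, factorized 128-d embeddings,
#                      all layers share ONE set of weights
#   bert-large (wwm):  24 layers, hidden 1024, no sharing

def param_bytes(layers, hidden, shared, emb=None, vocab=30000, dtype_bytes=4):
    """Approximate checkpoint size: embeddings plus transformer weights.
    One transformer layer has roughly 12 * hidden^2 weights
    (Q/K/V/output projections plus the two FFN matrices)."""
    emb = hidden if emb is None else emb
    # factorized embedding: vocab -> emb, then a projection emb -> hidden
    emb_params = vocab * emb + (emb * hidden if emb != hidden else 0)
    unique_layers = 1 if shared else layers  # ALBERT stores one layer's weights
    return (emb_params + unique_layers * 12 * hidden ** 2) * dtype_bytes

def activation_bytes(layers, hidden, batch, seq_len, dtype_bytes=4):
    """Approximate activations kept for backprop: every layer's output
    must be stored, whether or not the weights are shared."""
    return layers * batch * seq_len * hidden * dtype_bytes

gb, mb = 1024 ** 3, 1024 ** 2
print(f"albert-xxlarge params ~ {param_bytes(12, 4096, True, emb=128) / gb:.2f} GB")
print(f"bert-large params    ~ {param_bytes(24, 1024, False) / gb:.2f} GB")
print(f"albert activations/example (seq 384) ~ "
      f"{activation_bytes(12, 4096, 1, 384) / mb:.0f} MB")
print(f"bert activations/example (seq 384)   ~ "
      f"{activation_bytes(24, 1024, 1, 384) / mb:.0f} MB")
```

Under these assumptions the parameter estimates roughly match the two file sizes in the question (~0.8 GB vs ~1.2 GB), while albert-xxlarge's per-example activation memory comes out about 2x bert-large's, which lines up with the batch size halving from 12 to 6.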