
ALBERT uses a large amount of GPU memory during training #28

Open
fatmelon opened this issue Nov 26, 2019 · 3 comments

Comments

@fatmelon

Hello:
I'm training a QA model with the same data pipeline in both cases. With bert-wwm I can set the batch size to 12, but with albert-xxlarge-v2 I can only go up to 6. Yet the albert-xxlarge-v2 model file is only about 900 MB, while the bert-wwm model file is 1400 MB. What could be causing this?

@lonePatient
Owner

@fatmelon There's a 1400 MB bert-wwm model???

@shawroad

> I'm training a QA model with the same data pipeline in both cases. With bert-wwm I can set the batch size to 12, but with albert-xxlarge-v2 I can only go up to 6. Yet the albert-xxlarge-v2 model file is only about 900 MB, while the bert-wwm model file is 1400 MB. What could be causing this?

You still haven't grasped ALBERT's advantage. A small model file does not mean a small GPU-memory footprint. For example, albert-base is only 50-60 MB, yet it uses roughly the same GPU memory as bert-base (which is 300+ MB).
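The reason is cross-layer parameter sharing: ALBERT reuses one set of layer weights at every depth, which shrinks the checkpoint but not the activations saved for backprop. Below is a minimal PyTorch sketch of this contrast; it is an illustration, not ALBERT's real architecture, and the 768/12-head sizes are placeholder values.

```python
import torch.nn as nn

def make_layer():
    # Illustrative sizes only, not ALBERT's actual config
    return nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)

class SharedEncoder(nn.Module):
    """ALBERT-style: one layer's weights reused at every depth."""
    def __init__(self, depth):
        super().__init__()
        self.layer = make_layer()   # a single parameter set
        self.depth = depth

    def forward(self, x):
        for _ in range(self.depth): # still `depth` layer applications, so
            x = self.layer(x)       # `depth` sets of activations are kept
        return x                    # for the backward pass

class UnsharedEncoder(nn.Module):
    """BERT-style: an independent layer at every depth."""
    def __init__(self, depth):
        super().__init__()
        self.layers = nn.ModuleList(make_layer() for _ in range(depth))

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

count = lambda m: sum(p.numel() for p in m.parameters())
print(f"shared params:   {count(SharedEncoder(12)):,}")    # ~1/12 the size
print(f"unshared params: {count(UnsharedEncoder(12)):,}")  # ~12x checkpoint
```

Both encoders run twelve layer applications per forward pass, so the activation memory that dominates training is essentially identical; only the parameter count (and hence the file on disk) differs. On top of that, albert-xxlarge-v2 is much wider than BERT (hidden size 4096 vs. 1024 for bert-large), so its per-layer activations are several times larger, which plausibly explains why the workable batch size halves even though the checkpoint is smaller.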
