Differences between small, medium and large models #3517
I know that the main differences between _sm, _md and _lg models are mostly statistical (the larger the model, the more accurate it is) and related to their respective size.
Replies: 1 comment
The central difference is the word vectors (and vocabulary). They're also the main reason for the difference in size – the rest of the components don't really change in size across models.
If you provide word vectors during training, those will be used by the model, which can give you a nice boost in accuracy. However, it also means that the word vectors have to be present at runtime. If you look at the vectors info in the models directory, you can see that the number of unique vectors differs between models. For example, the md model has 20k unique vectors, and the regular lg model has 685k.
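To get a feel for why the vectors dominate the size difference, here is a rough back-of-the-envelope sketch. The vector counts are the ones quoted above; the width of 300 dimensions and float32 storage are assumptions on my part, and `vector_table_mb` is just a hypothetical helper for this illustration, not part of spaCy.

```python
# Rough estimate of how much space a dense word-vector table takes.
# ASSUMPTIONS (not from the thread): 300-dimensional vectors stored as float32.
BYTES_PER_FLOAT32 = 4


def vector_table_mb(n_vectors: int, width: int = 300) -> float:
    """Approximate size of an n_vectors x width float32 table, in MB (10**6 bytes)."""
    return n_vectors * width * BYTES_PER_FLOAT32 / 1_000_000


# Unique vector counts as quoted in the answer above.
unique_vectors = {"md": 20_000, "lg": 685_000}

for name, count in unique_vectors.items():
    print(f"{name}: ~{vector_table_mb(count):.0f} MB of vectors")
```

Under those assumptions the gap between the two tables is several hundred MB, which lines up with the vectors being the main reason for the size difference. If you have a model loaded, `nlp.vocab.vectors.shape` should give you the actual number of rows and the width, so you can plug in real values instead of the assumed ones.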