Differences between small, medium and large models #3517

Locked Answered by ines

The central difference is the word vectors (and the vocabulary). They're also the main reason for the difference in size: the rest of the components don't really change in size across models.

If you provide word vectors during training, the model will use them, which can give you a nice boost in accuracy. However, it also means that the word vectors have to be present at runtime, so they have to ship with the model package. If you look at the vectors info in the models directory, you can see that the number of unique vectors differs between models. For example, the md model has 20k unique vectors, and the regular lg model has 685k.
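To make the size argument concrete, here is a rough sketch that reads the vectors info from a model's `meta.json` and estimates how much space the vector table alone would take. The `meta.json` layout (a `"vectors"` entry with `"vectors"` and `"width"` fields), the width of 300, and the 4-bytes-per-value (float32) storage are assumptions for illustration, and the helper names `vectors_info` and `approx_vector_bytes` are made up for this example. The counts are the rounded 20k and 685k figures from above.

```python
import json


def vectors_info(meta_json_text):
    """Return (unique_vectors, width) from the text of a model's meta.json.

    Assumes a layout like {"vectors": {"vectors": 20000, "width": 300}};
    returns zeros if the model has no vectors entry.
    """
    meta = json.loads(meta_json_text)
    vectors = meta.get("vectors", {})
    return vectors.get("vectors", 0), vectors.get("width", 0)


def approx_vector_bytes(n_vectors, width, bytes_per_value=4):
    """Rough size of the vector table, assuming float32 values."""
    return n_vectors * width * bytes_per_value


# Illustrative stand-ins for each package's meta.json (not copied from real files).
SAMPLE_META = {
    "md": '{"vectors": {"vectors": 20000, "width": 300}}',
    "lg": '{"vectors": {"vectors": 685000, "width": 300}}',
}

for size, text in SAMPLE_META.items():
    n_vectors, width = vectors_info(text)
    megabytes = approx_vector_bytes(n_vectors, width) / 1e6
    print(f"{size}: {n_vectors} unique vectors, roughly {megabytes:.0f} MB of vector data")
```

Under these assumptions, the lg vector table comes out more than thirty times larger than the md one, which is where most of the difference in package size would come from.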

This discussion was converted from issue #3517 on December 10, 2020 13:55.