Document-Level Embeddings in Transformer Model #11715
Unanswered · sunnyifan asked this question in Help: Model Advice
While using a Transformer-based model like `en_core_web_trf`, two tensors are exposed from `trf_data`:

- `num_docs * num_tokens * hidden_size`;
- `num_docs * hidden_size`.

How should we interpret the per-document embeddings?
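For concreteness, here is a minimal sketch of how those two tensors can be inspected. It assumes the `en_core_web_trf` pipeline is installed; `doc._.trf_data` and its `tensors` list come from spacy-transformers.

```python
# Minimal sketch: inspecting the tensors exposed on trf_data.
# Assumes the en_core_web_trf pipeline is installed.
import spacy

nlp = spacy.load("en_core_web_trf")
doc = nlp("Document-level embeddings are exposed on the transformer output.")

trf_data = doc._.trf_data
# tensors[0]: token-level output, (batch, num_wordpiece_tokens, hidden_size)
# tensors[1]: pooled output, (batch, hidden_size)
for i, tensor in enumerate(trf_data.tensors):
    print(i, tensor.shape)
```

Note that the first dimension counts the sequences actually sent to the transformer, which (per the reply below) need not equal the number of Docs once long texts are split into spans.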
From the model structure of RoBERTa, it's likely that the per-document embeddings are the last-layer embedding of `[CLS]` passed through a linear layer and then a tanh. Was this final Linear-tanh layer trained for a specific task?
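For reference, the sketch below re-implements the first-token Linear-plus-tanh pooler as it appears in HuggingFace transformers' BERT-style models; it illustrates the hypothesized computation and is not spaCy's or the pipeline's actual code.

```python
# Simplified re-implementation of a HuggingFace-style pooler head:
# take the last-layer embedding of the first token (<s>, RoBERTa's
# [CLS] equivalent) and pass it through a Linear layer and a tanh.
import torch
from torch import nn

class Pooler(nn.Module):
    def __init__(self, hidden_size: int) -> None:
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.Tanh()

    def forward(self, last_hidden_state: torch.Tensor) -> torch.Tensor:
        # last_hidden_state: (batch, seq_len, hidden_size)
        first_token = last_hidden_state[:, 0]  # (batch, hidden_size)
        return self.activation(self.dense(first_token))

# Example: pooled output for a batch of 2 sequences of length 10.
pooled = Pooler(768)(torch.randn(2, 10, 768))
print(pooled.shape)  # torch.Size([2, 768])
```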
Replies: 1 comment

Just double-checking: you're aware of how longer texts are split into overlapping strided spans with the span getter (https://spacy.io/api/transformer#span_getters)? So your … With the default …
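To make the span getter's behavior concrete, here is a conceptual sketch of strided slicing. The `strided_spans` helper is hypothetical (a paraphrase, not spacy-transformers' implementation), and the `window=128` / `stride=96` values are an assumption about the trained pipelines' defaults; the authoritative values live in the pipeline's config under `[components.transformer.model.get_spans]`.

```python
# Conceptual sketch of a strided span getter: slice a Doc into
# overlapping windows of `window` tokens, advancing by `stride`.
# window/stride values are assumed defaults; check your pipeline's config.
import spacy

def strided_spans(doc, window=128, stride=96):
    spans = []
    start = 0
    while start < len(doc):
        spans.append(doc[start : start + window])  # clipped at doc end
        if start + window >= len(doc):
            break
        start += stride
    return spans

nlp = spacy.blank("en")
doc = nlp(" ".join(["token"] * 300))
for span in strided_spans(doc):
    print(span.start, span.end)  # windows overlap by window - stride tokens
```

Each resulting span becomes one row in the batch fed to the transformer, which is presumably why the first dimension of `trf_data.tensors` counts spans rather than documents for long texts.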