Pooling question

I'm running some tests with StarEncoder, and I'm using your code as a starting point. When returning an embedding, you pool the input token embeddings into a single vector here:

https://github.com/bigcode-project/bigcode-encoder/blob/10ace393752f9ffdd16136516d1663f05fd18286/src/utils.py#L152

As I read the code, you simply pick the embedding of the last valid (non-masked) token as the pooled embedding for the entire sequence. If I understand correctly, this should be the vector corresponding to the `<sep>` separator token.
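To make sure I'm reading it right, here is a minimal NumPy sketch of what I understand the pooling to do. This is not the repository's code: the function name `pool_last_valid_token` is my own, and I'm assuming right padding and an attention mask with 1 for real tokens and 0 for padding.

```python
import numpy as np

def pool_last_valid_token(hidden_states, attention_mask):
    """Return the embedding of the last non-masked token of each sequence.

    hidden_states:  (batch, seq_len, dim) token embeddings
    attention_mask: (batch, seq_len), 1 = real token, 0 = padding
    Assumes right padding and at least one real token per sequence.
    """
    # With right padding, the number of real tokens minus one is the
    # position of the last valid token.
    last_idx = attention_mask.sum(axis=1) - 1
    batch_idx = np.arange(hidden_states.shape[0])
    # Advanced indexing picks one (dim,) vector per sequence.
    return hidden_states[batch_idx, last_idx]

# Toy batch: 2 sequences, 4 positions, 3-dimensional embeddings.
hidden = np.arange(24, dtype=float).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 0],   # last valid token at position 2
                 [1, 1, 1, 1]])  # last valid token at position 3
pooled = pool_last_valid_token(hidden, mask)
print(pooled.shape)  # (2, 3)
```

If the input always ends with the separator token before padding, the selected position would be that token's, which is what I mean above.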

Can you explain why you do this? Is this something similar to CLS-pooling from BERT? Do you think this leads to better results than other approaches (e.g., mean-pooling)?
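For reference, these are the two alternatives I have in mind, again as a hedged NumPy sketch rather than anything taken from your code. The names `pool_mean` and `pool_first_token` are hypothetical, and I'm assuming the same mask convention (1 for real tokens, 0 for padding).

```python
import numpy as np

def pool_mean(hidden_states, attention_mask):
    """Average the embeddings of the non-masked tokens of each sequence."""
    # Broadcast the mask over the embedding dimension: (batch, seq_len, 1).
    mask = attention_mask[:, :, None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)
    # Clip the token count to avoid dividing by zero on an all-padding row.
    counts = np.clip(mask.sum(axis=1), 1, None)
    return summed / counts

def pool_first_token(hidden_states):
    """CLS-style pooling: take the embedding at position 0."""
    return hidden_states[:, 0]

# Toy batch: 2 sequences, 4 positions, 3-dimensional embeddings.
hidden = np.arange(24, dtype=float).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 0],
                 [1, 1, 1, 1]])
mean_pooled = pool_mean(hidden, mask)
first_pooled = pool_first_token(hidden)
print(mean_pooled.shape, first_pooled.shape)  # (2, 3) (2, 3)
```

Mean pooling uses every valid token, while the first-token and last-token variants each rely on a single position to summarize the sequence.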


Pooling question #14
