🚀 Feature Request
Transformers can be a flexible embedding network for general data modalities. We currently have permutation-invariant networks, whereas plain transformers are permutation-equivariant (allowing support for data that is exchangeable but not independent). With suitable positional embeddings, a transformer can also serve as a general embedding network for ordered sequences.
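For illustration, here is a minimal PyTorch sketch of such a backbone (the class name `TransformerFeatureExtractor` and all hyperparameters are assumptions for this issue, not existing sbi code). Without positional embeddings the encoder is permutation-equivariant over the sequence axis; adding them makes it order-aware for general sequence data:

```python
import torch
import torch.nn as nn


class TransformerFeatureExtractor(nn.Module):
    """Sketch of a transformer backbone for embedding sequence-like data.

    Without positional embeddings the module is permutation-equivariant;
    with them it becomes order-aware. Names and defaults are placeholders.
    """

    def __init__(self, input_dim, d_model=64, nhead=4, num_layers=2,
                 max_len=256, use_positional_embedding=True):
        super().__init__()
        # Per-element lift from the raw feature dimension to d_model.
        self.tokenizer = nn.Linear(input_dim, d_model)
        self.pos_emb = (
            nn.Embedding(max_len, d_model) if use_positional_embedding else None
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, x):
        # x: (batch, seq_len, input_dim)
        tokens = self.tokenizer(x)
        if self.pos_emb is not None:
            positions = torch.arange(x.shape[1], device=x.device)
            tokens = tokens + self.pos_emb(positions)
        # Output is still a sequence: (batch, seq_len, d_model).
        return self.encoder(tokens)
```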
Describe the solution you'd like
To do so, the following steps have to be completed:

- Currently, all flows need a statically sized input, so the transformer's output sequence has to be "pooled" into a single vector of fixed dimension. There are multiple ways to do this, and it needs some testing / literature research to decide on a default (multiple methods can be implemented; see the sketch below).
- Add tests.
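As a rough sketch of pooling options that could be compared as defaults (the class name and the `method` switch are hypothetical, not existing sbi code), mean pooling and a learned-query attention pooling (in the spirit of a [CLS] token) are common choices:

```python
import torch
import torch.nn as nn


class SequencePooling(nn.Module):
    """Sketch of pooling heads mapping (batch, seq_len, d_model) -> (batch, d_model)."""

    def __init__(self, d_model, method="mean"):
        super().__init__()
        self.method = method
        if method == "attention":
            # Learned query that cross-attends over the sequence ("attention pooling").
            self.query = nn.Parameter(torch.randn(1, 1, d_model))
            self.attn = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)

    def forward(self, h):
        # h: (batch, seq_len, d_model)
        if self.method == "mean":
            # Simple average over the sequence axis.
            return h.mean(dim=1)
        query = self.query.expand(h.shape[0], -1, -1)  # (batch, 1, d_model)
        pooled, _ = self.attn(query, h, h)             # attend the query to the sequence
        return pooled.squeeze(1)                       # (batch, d_model)
```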
📌 Additional Context
Currently, other "sequence" models, such as the permutation-invariant networks, support learning on sequences of different sizes in parallel via NaN padding. This support could be added here as well (if not, please open a separate issue).
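For context, here is a small sketch of how NaN padding could be turned into a key-padding mask for the transformer so that attention ignores the padded positions (the helper name is hypothetical):

```python
import torch


def nan_padding_to_mask(x):
    """Derive a key-padding mask from a NaN-padded batch and clean the NaNs.

    x: (batch, seq_len, input_dim), where fully-NaN rows mark padded positions.
    Returns the cleaned batch and a boolean mask (True = padded / ignore).
    """
    padding_mask = torch.isnan(x).any(dim=-1)  # (batch, seq_len)
    x_clean = torch.nan_to_num(x, nan=0.0)     # zeros at padded slots
    return x_clean, padding_mask


# Usage with a transformer encoder (assumed to take the standard PyTorch mask):
#   h = encoder(tokens, src_key_padding_mask=padding_mask)
# Padded positions should then also be excluded when pooling,
# e.g. via a masked mean instead of a plain mean.
```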
The issues #1324 / #218 currently soft-block variable sequence lengths, but they should not affect this feature request.