🚀 Feature Request
Transformers can be a flexible embedding network for general data modalities. We currently have permutation-invariant networks, whereas plain transformers are permutation-equivariant (allowing support for data that is exchangeable but not independent). With suitable positional embeddings, a transformer can also serve as a general embedding network for ordered sequences.
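For illustration, here is a minimal PyTorch sketch of such a backbone (the class name `TransformerFeatureExtractor` and all hyperparameters are assumptions for this issue, not existing sbi code). Without positional embeddings the encoder is permutation-equivariant over the sequence axis; adding them makes it order-aware for general sequence data:

```python
import torch
import torch.nn as nn


class TransformerFeatureExtractor(nn.Module):
    """Sketch of a transformer backbone for embedding sequence-like data.

    Without positional embeddings the module is permutation-equivariant;
    with them it becomes order-aware. Names and defaults are placeholders.
    """

    def __init__(self, input_dim, d_model=64, nhead=4, num_layers=2,
                 max_len=256, use_positional_embedding=True):
        super().__init__()
        # Per-element lift from the raw feature dimension to d_model.
        self.tokenizer = nn.Linear(input_dim, d_model)
        self.pos_emb = (
            nn.Embedding(max_len, d_model) if use_positional_embedding else None
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, x):
        # x: (batch, seq_len, input_dim)
        tokens = self.tokenizer(x)
        if self.pos_emb is not None:
            positions = torch.arange(x.shape[1], device=x.device)
            tokens = tokens + self.pos_emb(positions)
        # Output is still a sequence: (batch, seq_len, d_model).
        return self.encoder(tokens)
```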
Describe the solution you'd like
To do so, the following steps have to be completed:

- Currently, all flows need a statically sized input, so the transformer's output sequence has to be "pooled" into a single vector of fixed dimension. There are multiple ways to do this, and it needs some testing / literature research to decide on a default (multiple methods can be implemented; see the sketch below).
- Add tests.
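As a rough sketch of pooling options that could be compared as defaults (the class name and the `method` switch are hypothetical, not existing sbi code), mean pooling and a learned-query attention pooling (in the spirit of a [CLS] token) are common choices:

```python
import torch
import torch.nn as nn


class SequencePooling(nn.Module):
    """Sketch of pooling heads mapping (batch, seq_len, d_model) -> (batch, d_model)."""

    def __init__(self, d_model, method="mean"):
        super().__init__()
        self.method = method
        if method == "attention":
            # Learned query that cross-attends over the sequence ("attention pooling").
            self.query = nn.Parameter(torch.randn(1, 1, d_model))
            self.attn = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)

    def forward(self, h):
        # h: (batch, seq_len, d_model)
        if self.method == "mean":
            # Simple average over the sequence axis.
            return h.mean(dim=1)
        query = self.query.expand(h.shape[0], -1, -1)  # (batch, 1, d_model)
        pooled, _ = self.attn(query, h, h)             # attend the query to the sequence
        return pooled.squeeze(1)                       # (batch, d_model)
```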
📌 Additional Context
Currently, other "sequence" models, such as the permutation-invariant networks, support learning on sequences of different sizes in parallel via NaN padding. This support could be added here as well (if not, please open a separate issue).
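For context, here is a small sketch of how NaN padding could be turned into a key-padding mask for the transformer so that attention ignores the padded positions (the helper name is hypothetical):

```python
import torch


def nan_padding_to_mask(x):
    """Derive a key-padding mask from a NaN-padded batch and clean the NaNs.

    x: (batch, seq_len, input_dim), where fully-NaN rows mark padded positions.
    Returns the cleaned batch and a boolean mask (True = padded / ignore).
    """
    padding_mask = torch.isnan(x).any(dim=-1)  # (batch, seq_len)
    x_clean = torch.nan_to_num(x, nan=0.0)     # zeros at padded slots
    return x_clean, padding_mask


# Usage with a transformer encoder (assumed to take the standard PyTorch mask):
#   h = encoder(tokens, src_key_padding_mask=padding_mask)
# Padded positions should then also be excluded when pooling,
# e.g. via a masked mean instead of a plain mean.
```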
The issues #1324 / #218 currently soft-block variable sequence lengths, but they should not affect this feature request.