# RoPE - TensorFlow

Rotary Position Embedding (RoPE) implemented in TensorFlow, as used in the paper *Transformer Quality in Linear Time*. Based on the original paper *RoFormer: Enhanced Transformer with Rotary Position Embedding*.
> **Warning**
> This repository is under development, but please feel free to explore and provide any feedback or suggestions you may have. 🚧
## Install

```shell
pip install git+https://github.com/brandnewchoppa/rope-tensorflow.git
```
## Usage

```python
import tensorflow as tf
from rope_tensorflow import RoPE

rotary_emb = RoPE(dim = 32)

q = tf.random.normal([1, 128, 64])
k = tf.random.normal([1, 128, 64])

q = rotary_emb.rotate(q)
k = rotary_emb.rotate(k)
```
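For intuition, RoPE rotates each consecutive pair of feature channels by an angle proportional to the token position, so attention dot products come to depend on relative position. The sketch below (plain NumPy, with a hypothetical `rope_rotate` helper; it is not the `rope_tensorflow` implementation) illustrates the mechanics:

```python
import numpy as np

def rope_rotate(x, base=10000.0):
    """Apply rotary position embedding to x of shape (seq_len, dim).

    Rotates each (even, odd) channel pair by a position-dependent angle.
    Illustrative sketch only, not the rope_tensorflow implementation.
    """
    seq_len, dim = x.shape
    # One inverse frequency per channel pair
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    angles = np.outer(np.arange(seq_len), inv_freq)   # (seq_len, dim // 2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

q = np.random.randn(128, 64)
q_rot = rope_rotate(q)

# Rotations preserve the norm at every position,
# and position 0 (angle 0) is left unchanged
assert np.allclose(np.linalg.norm(q, axis=-1), np.linalg.norm(q_rot, axis=-1))
assert np.allclose(q_rot[0], q[0])
```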
Introduced in the paper *A Length-Extrapolatable Transformer*. Queries and keys can also be rotated in a single call:
```python
import tensorflow as tf
from rope_tensorflow import RoPE

rotary_emb = RoPE(dim = 32)

q = tf.random.normal([1, 128, 64])
k = tf.random.normal([1, 128, 64])

q, k = rotary_emb.rotate([q, k])
```
Introduced in the *Position Interpolation* paper, this option aims to extend the context size of pretrained models.
```python
import tensorflow as tf
from rope_tensorflow import RoPE

rotary_emb = RoPE(dim = 32, interpolate_factor = 2.0)
```
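The idea behind interpolation is simple: instead of extrapolating to unseen positions, positions are divided by `interpolate_factor` so that a longer sequence is squeezed into the position range the model saw during pretraining. A minimal NumPy sketch (the `rope_angles` helper is hypothetical, for illustration only):

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, interpolate_factor=1.0):
    """Rotation angles per position for a rotary embedding of width dim.

    Illustrative sketch: dividing positions by interpolate_factor maps a
    longer context into the position range seen during pretraining.
    """
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(np.asarray(positions) / interpolate_factor, inv_freq)

# With interpolate_factor = 2.0, position 256 receives the same angles
# that position 128 produced during training
a = rope_angles([256], dim=64, interpolate_factor=2.0)
b = rope_angles([128], dim=64)
assert np.allclose(a, b)
```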
## Citations

```bibtex
@article{Hua2022TransformerQI,
    title   = {Transformer Quality in Linear Time},
    author  = {Weizhe Hua and Zihang Dai and Hanxiao Liu and Quoc V. Le},
    journal = {ArXiv},
    year    = {2022},
    volume  = {abs/2202.10447}
}
```

```bibtex
@article{Su2021RoFormer,
    title   = {RoFormer: Enhanced Transformer with Rotary Position Embedding},
    author  = {Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu},
    journal = {ArXiv},
    year    = {2021},
    volume  = {abs/2104.09864}
}
```

```bibtex
@misc{rope-eleutherai,
    title        = {Rotary Embeddings: A Relative Revolution},
    author       = {Biderman, Stella and Black, Sid and Foster, Charles and Gao, Leo and Hallahan, Eric and He, Horace and Wang, Ben and Wang, Phil},
    howpublished = {\url{blog.eleuther.ai/}},
    note         = {[Online; accessed ]},
    year         = {2021}
}
```