
Allow Self-Attention in TransformerBlock #8

Open · wants to merge 3 commits into base: master
Conversation

@neonbjb commented Feb 26, 2019

Adds a use_self_attention parameter to the TransformerBlock
constructor, which allows this block to be used in self-attention
mode. This is useful for building decoders in machine translation
tasks, for example.

Feel free to revert the changes I made to support Python 3.
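For illustration, here is a minimal, framework-free sketch of the idea behind such a flag (this is not the library's actual API; `attention_block` and its arguments are hypothetical names): in self-attention mode the keys and values come from the same sequence as the queries, while in cross-attention mode they come from a separate sequence, such as an encoder's output.

```python
import math

def matmul(a, b):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def attention_block(query_input, kv_input=None, use_self_attention=True):
    """Scaled dot-product attention.

    With use_self_attention=True, keys and values are taken from the
    query sequence itself; otherwise they are taken from kv_input
    (e.g. an encoder's output, as in a translation decoder).
    """
    kv = query_input if use_self_attention else kv_input
    d = len(query_input[0])
    # scores = Q . K^T, scaled by sqrt(d)
    k_t = [list(col) for col in zip(*kv)]
    scores = matmul(query_input, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    # Output is an attention-weighted sum of the value rows.
    return matmul(weights, kv)
```

With identical inputs, the two modes coincide: `attention_block(x)` equals `attention_block(x, kv_input=x, use_self_attention=False)`, which is why a single block can serve both roles behind one flag.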

@kpot (Owner) commented Feb 26, 2019

Hi!
Could you please post an example that utilizes these changes? Perhaps a function that builds a model, similar to vanilla_transformer_gpt_model.

@neonbjb (Author) commented Feb 26, 2019

Absolutely! I have a sample NMT model that works pretty well with your library. I'll need a day or two to clean it up before submitting it, though.

@neonbjb (Author) commented Feb 27, 2019

Done. I've tested run_nmt on the TensorFlow backend. I'm not sure which backend you're using, but I don't expect problems with the others, since I never use tf directly.

@kpot added the enhancement (New feature or request) label on Mar 3, 2019