
Additive or multiplicative attention? #16

Open
Ataxias opened this issue Jun 30, 2017 · 0 comments
Ataxias commented Jun 30, 2017

In the "Attentional Interfaces" section, there is a reference to "Bahdanau, et al. 2014: Neural machine translation by jointly learning to align and translate" (figure). In that paper, the attention vector is calculated through a feed-forward network, using the hidden states of the encoder and decoder as input (this is called "additive attention"). However, the schematic diagram of this section shows that the attention vector is calculated by using the dot product between the hidden states of the encoder and decoder (which is known as multiplicative attention). I believe that a short mention / clarification would be of benefit here.
