Thank you ematvey for this paper.

I wonder whether uw and us are two global weight vectors, or whether there is a different uw for each sentence and a different us for each document. From the code I think they are global vectors; am I right? Please help me confirm this.

As model_components.py says:

Performs task-specific attention reduction, using learned
attention context vector (constant within task of interest).

Both uw and us are defined in the function task_specific_attention(), and both refer to attention_context_vector:

attention_context_vector = tf.get_variable(name='attention_context_vector', shape=[output_size], initializer=initializer, dtype=tf.float32)

Since they come from the same definition, are they nevertheless different vectors in the computational graph? It would be helpful if you could explain a little about this part.

Thank you.

I believe you are right. uw and us are the global context vectors that store information about which words or sentences are most informative, respectively. They are learned during the training process.
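For illustration, here is a minimal sketch (not the repo's exact code) of how such a task-specific attention layer can be written so that the context vector is one learned, globally shared variable. It assumes TF1-style TensorFlow; the function and variable names mirror the snippet quoted above, while the projection layer and the scope names word_attention / sentence_attention are illustrative assumptions. The key point is that tf.get_variable creates one variable per variable scope, so calling the function once under a word-level scope and once under a sentence-level scope yields two distinct vectors in the graph, which is how uw and us end up as separate (but each globally shared) parameters.

import tensorflow as tf  # TF1-style graph API, matching the get_variable call above

def task_specific_attention(inputs, output_size,
                            initializer=tf.contrib.layers.xavier_initializer(),
                            scope=None):
    # inputs: [batch, time, input_size] encoder outputs.
    # Returns: [batch, output_size] attention-weighted reduction over time.
    with tf.variable_scope(scope or 'attention'):
        # One context vector per variable scope -- shared across ALL words
        # in all sentences (word level) or ALL sentences in all documents
        # (sentence level); it is NOT created per sentence or per document.
        attention_context_vector = tf.get_variable(
            name='attention_context_vector',
            shape=[output_size],
            initializer=initializer,
            dtype=tf.float32)

        # Project the encoder outputs into the attention space.
        input_projection = tf.contrib.layers.fully_connected(
            inputs, output_size, activation_fn=tf.tanh)

        # Similarity of every timestep to the shared context vector,
        # normalized over time into attention weights.
        vector_attn = tf.reduce_sum(
            input_projection * attention_context_vector, axis=2, keepdims=True)
        attention_weights = tf.nn.softmax(vector_attn, axis=1)

        # Attention-weighted sum over timesteps.
        return tf.reduce_sum(input_projection * attention_weights, axis=1)

# Two scopes => two distinct variables in the graph:
#   word_attention/attention_context_vector      (plays the role of uw)
#   sentence_attention/attention_context_vector  (plays the role of us)
# word_vecs = task_specific_attention(word_enc, 100, scope='word_attention')
# sent_vecs = task_specific_attention(sent_enc, 100, scope='sentence_attention')

Under these assumptions, both vectors are trained by backpropagation along with the rest of the model, exactly as the answer above describes.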