- "*A: In Graph Attention Networks (GAT), a variation of the attention mechanism is used that doesn't explicitly employ separate key, query, and value vectors as in traditional self-attention mechanisms such as those in the Transformer. Instead, GAT computes attention coefficients (attention scores) directly from the node embeddings. Each node's features are first transformed by a shared learnable weight matrix **$W$**; the attention mechanism itself is a single-layer feedforward network, parameterized by a learnable weight vector **$a$**, that scores each edge by applying a LeakyReLU nonlinearity to $a^\top [Wh_i \, \| \, Wh_j]$, i.e., to the concatenation of the transformed embeddings of a node and one of its neighbors (the node attends to itself as well, hence \"self-attention\"). These scores are normalized with a softmax over each node's neighborhood, and the resulting attention coefficients are used to compute a weighted sum of the neighbors' transformed embeddings, which becomes the node's updated representation.\n",
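The mechanism described above can be sketched in a few lines of NumPy. This is a minimal single-head illustration, not the original paper's code; the names (`gat_layer`, `leaky_relu`) and the toy graph are assumptions made for the example. It uses the standard decomposition $a^\top[Wh_i \| Wh_j] = a_l^\top Wh_i + a_r^\top Wh_j$ to score all edges at once.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(h, adj, W, a):
    """Single-head GAT layer sketch.

    h:   (N, F)  input node features
    adj: (N, N)  adjacency matrix with self-loops (1 = edge)
    W:   (F, F') shared linear transform
    a:   (2*F',) attention weight vector, split into [a_l; a_r]
    """
    z = h @ W                                # transformed embeddings, (N, F')
    f = z.shape[1]
    # e_ij = LeakyReLU(a^T [z_i || z_j]) = LeakyReLU(a_l^T z_i + a_r^T z_j)
    src = z @ a[:f]                          # per-node "source" score, (N,)
    dst = z @ a[f:]                          # per-node "target" score, (N,)
    e = leaky_relu(src[:, None] + dst[None, :])
    # mask non-edges, then softmax over each node's neighborhood
    e = np.where(adj > 0, e, -np.inf)
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    # updated representation: attention-weighted sum of neighbors' embeddings
    return alpha @ z

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 3))                  # 4 nodes, 3 input features
adj = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]], dtype=float)  # path graph + self-loops
W = rng.normal(size=(3, 2))
a = rng.normal(size=(4,))
out = gat_layer(h, adj, W, a)
print(out.shape)                             # (4, 2)
```

Because every node keeps a self-loop, each row of the masked score matrix has at least one finite entry, so the softmax is always well defined.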