Commit cfba4de

[Bugfix] Fix logit soft cap in flash-attn backend (vllm-project#7425)
1 parent d2bc451 commit cfba4de

File tree

1 file changed: +1 −0 lines changed

vllm/attention/backends/flash_attn.py

Lines changed: 1 addition & 0 deletions
@@ -563,6 +563,7 @@ def forward(
                     softmax_scale=self.scale,
                     causal=True,
                     alibi_slopes=self.alibi_slopes,
+                    softcap=self.logits_soft_cap,
                 ).squeeze(1)
 
         # Reshape the output tensor.
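
For context, logit soft capping bounds the pre-softmax attention scores with a tanh so they cannot grow without limit; the one-line fix above forwards the model's configured cap to the flash-attn kernel instead of silently dropping it. Below is a minimal, hypothetical sketch of the transform, assuming the usual cap * tanh(scores / cap) form associated with flash-attn's softcap argument (the helper name soft_cap_logits is illustrative and not part of vLLM, and the exact interaction with the softmax scale inside the kernel is not shown here).

    import torch

    def soft_cap_logits(scores: torch.Tensor, cap: float | None) -> torch.Tensor:
        # Illustrative soft cap: squashes raw attention scores into (-cap, cap)
        # using a tanh, so very large logits saturate near the cap value.
        if cap is None or cap == 0.0:
            return scores  # no capping requested
        return cap * torch.tanh(scores / cap)

    # Toy example: a large raw score is bounded near the cap.
    scores = torch.tensor([1.0, 10.0, 100.0])
    print(soft_cap_logits(scores, cap=30.0))  # roughly [1.0, 9.6, 29.9]
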
