@petercad petercad commented Nov 5, 2025

Small reorder cleanups:

  • Use reorder in FlashAttention for tSrS->tArP, now that the CUTE_ARCH_REORDER_XE_ENABLED macro bug is fixed.
  • Fix one more use of the CUTE_ARCH_REORDER_XE_ENABLED macro.
  • Remove the f32->bf16 UU inline asm reorder sequence, since the compiler already does a good job with Universal_Reorder_UU.
