In order to develop applications with this great model locally, it needs to run on Apple Silicon (M1/M2/M3) chips. The main missing requirement is flash_attn.
This can be done by swapping out inner_mha_cls in the AttentionBlock forward call with any other generic PyTorch implementation of attention.
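A minimal sketch of that swap, assuming the block stores its attention module as inner_mha_cls and calls it on a (batch, seq_len, hidden_size) tensor inside AttentionBlock.forward; the constructor arguments and call signature below are assumptions for illustration, not the repository's actual interface. The replacement uses PyTorch's built-in scaled_dot_product_attention, which runs on MPS or CPU without flash_attn.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GenericSelfAttention(nn.Module):
    """Self-attention backed by torch.nn.functional.scaled_dot_product_attention
    instead of flash_attn, so it works on Apple Silicon (MPS) and CPU."""

    def __init__(self, hidden_size: int, num_heads: int, causal: bool = True):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.causal = causal
        self.Wqkv = nn.Linear(hidden_size, 3 * hidden_size)
        self.out_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # u: (batch, seq_len, hidden_size)
        b, s, d = u.shape
        qkv = self.Wqkv(u).view(b, s, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.unbind(dim=2)                       # each (b, s, heads, head_dim)
        q, k, v = (t.transpose(1, 2) for t in (q, k, v))  # (b, heads, s, head_dim)
        y = F.scaled_dot_product_attention(q, k, v, is_causal=self.causal)
        y = y.transpose(1, 2).reshape(b, s, d)            # back to (b, s, hidden_size)
        return self.out_proj(y)


# Hypothetical swap on an existing block; the sizes are placeholders:
# block.inner_mha_cls = GenericSelfAttention(hidden_size=4096, num_heads=32)
```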
@Zymrael Can you elaborate? It seems like I need to switch it in more than one place.
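If flash_attn is imported in several modules, one hedged option is a fallback-import guard at each import site, falling back to a portable module such as the GenericSelfAttention sketch above. The import path shown is what the flash_attn package typically exposes; adjust it (and the fallback's constructor) to match the repository's actual code.

```python
# Guard the CUDA-only fast path and fall back to a portable implementation.
# GenericSelfAttention here refers to the sketch above; treat this as illustrative.
try:
    from flash_attn.modules.mha import MHA as InnerMHA  # requires CUDA + flash_attn
except ImportError:
    InnerMHA = GenericSelfAttention  # pure-PyTorch fallback for MPS/CPU
```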