In order to develop applications with this great model locally, it needs to run on Apple Silicon (M1/M2/M3) chips. The main missing requirement is flash_attn.
This can be done by swapping out inner_mha_cls in the AttentionBlock forward call with any other generic PyTorch implementation of attention.
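A minimal sketch of that swap, assuming the block stores its attention module as inner_mha_cls and calls it on a (batch, seq_len, hidden_size) tensor inside AttentionBlock.forward; the constructor arguments and call signature below are assumptions for illustration, not the repository's actual interface. The replacement uses PyTorch's built-in scaled_dot_product_attention, which runs on MPS or CPU without flash_attn.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GenericSelfAttention(nn.Module):
    """Self-attention backed by torch.nn.functional.scaled_dot_product_attention
    instead of flash_attn, so it works on Apple Silicon (MPS) and CPU."""

    def __init__(self, hidden_size: int, num_heads: int, causal: bool = True):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.causal = causal
        self.Wqkv = nn.Linear(hidden_size, 3 * hidden_size)
        self.out_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # u: (batch, seq_len, hidden_size)
        b, s, d = u.shape
        qkv = self.Wqkv(u).view(b, s, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.unbind(dim=2)                       # each (b, s, heads, head_dim)
        q, k, v = (t.transpose(1, 2) for t in (q, k, v))  # (b, heads, s, head_dim)
        y = F.scaled_dot_product_attention(q, k, v, is_causal=self.causal)
        y = y.transpose(1, 2).reshape(b, s, d)            # back to (b, s, hidden_size)
        return self.out_proj(y)


# Hypothetical swap on an existing block; the sizes are placeholders:
# block.inner_mha_cls = GenericSelfAttention(hidden_size=4096, num_heads=32)
```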
@Zymrael Can you elaborate? It seems like I need to switch it in more than one place.
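If flash_attn is imported in several modules, one hedged option is a fallback-import guard at each import site, falling back to a portable module such as the GenericSelfAttention sketch above. The import path shown is what the flash_attn package typically exposes; adjust it (and the fallback's constructor) to match the repository's actual code.

```python
# Guard the CUDA-only fast path and fall back to a portable implementation.
# GenericSelfAttention here refers to the sketch above; treat this as illustrative.
try:
    from flash_attn.modules.mha import MHA as InnerMHA  # requires CUDA + flash_attn
except ImportError:
    InnerMHA = GenericSelfAttention  # pure-PyTorch fallback for MPS/CPU
```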