Also interesting: I downloaded Qwen-2.5-3B and it worked great, then downloaded some more models, and all of a sudden all models are extremely slow, producing only 0.5 tokens per second. After removing and installing again, it is back to 6 t/s. Pixel 8 / Android 15
Description
Will we see GPU inference to speed up generation?
Use Case
In all use cases we want more speed.