0.1.6
- Fix dynamic generator fallback mode (was broken for prompts longer than max_input_len)
- Fix inference on ROCm wave64 devices
- Made model conversion script part of the exllamav2 package
- CPU optimizations
Full Changelog: v0.1.5...v0.1.6