Releases: oobabooga/text-generation-webui

v1.9.1

05 Jul 10:38
e813b32

Bug fixes

  • UI: Fix some broken chat histories not showing in the "Past chats" menu.
  • Prevent llama.cpp from being monkey-patched more than once, avoiding an infinite recursion error.

v1.9

05 Jul 03:24
3315d00

Backend updates

  • 4-bit and 8-bit KV cache options have been added to llama.cpp and llamacpp_HF. They reuse the existing --cache_8bit and --cache_4bit flags. Thanks @GodEmperor785 for figuring out what values to pass to llama-cpp-python (see the sketch after this list).
  • Transformers:
    • Add eager attention option to make Gemma-2 work correctly (#6188). Thanks @GralchemOz.
    • Automatically detect whether to load models in bfloat16 or float16 when using 16-bit precision.
    • Automatically apply eager attention to models with Gemma2ForCausalLM architecture.
    • Gemma-2 support: Automatically detect and apply the optimal settings for this model with the two changes above. No need to set --bf16 --use_eager_attention manually.
  • Automatically obtain the EOT token from Jinja2 templates and add it to the stopping strings, fixing Llama-3-Instruct not stopping. No need to add <eot> to the custom stopping strings anymore.
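
As background on the KV cache item above, here is a minimal sketch of how the two flags could map onto llama-cpp-python's cache-type parameters. The `type_k`/`type_v` keyword arguments and the GGML type IDs (q8_0 = 8, q4_0 = 2) are assumptions based on llama-cpp-python 0.2.8x, not an excerpt from the webui source:

```python
# Hedged sketch: quantized KV cache with llama-cpp-python.
# type_k / type_v take GGML type IDs; the values below are taken from
# ggml's type enum (verify against your installed version).
from llama_cpp import Llama

GGML_TYPE_Q4_0 = 2  # 4-bit cache
GGML_TYPE_Q8_0 = 8  # 8-bit cache

def load_model(path: str, cache_8bit: bool = False, cache_4bit: bool = False) -> Llama:
    kwargs = {}
    if cache_4bit:
        kwargs["type_k"] = kwargs["type_v"] = GGML_TYPE_Q4_0
    elif cache_8bit:
        kwargs["type_k"] = kwargs["type_v"] = GGML_TYPE_Q8_0
    # llama.cpp generally requires flash attention for a quantized V cache
    return Llama(model_path=path, n_ctx=4096, flash_attn=True, **kwargs)
```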

UI updates

  • Whisper STT overhaul: this extension has been rewritten, replacing the Gradio microphone component with a custom microphone element that is much more reliable (#6194). Thanks @RandomInternetPreson, @TimStrauven, and @mamei16.
  • Make the character dropdown menu available in both the "Chat" tab and the "Parameters > Character" tab, after some people pointed out that moving it entirely to the Chat tab made it harder to edit characters.
  • Colors in the light theme have been improved, making it a bit more visually pleasing.
  • Increase the chat area on mobile devices.

Bug fixes

  • Fix the API request to AUTOMATIC1111 in the sd-api-pictures extension (see the request sketch after this list).
  • Fix a glitch when switching tabs with "Show controls" unchecked in the chat tab and extensions loaded.
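
For reference on the AUTOMATIC1111 fix, a minimal sketch of a txt2img request against that API; the base URL and payload values are hypothetical and only a few common fields are shown:

```python
import requests

base_url = "http://127.0.0.1:7860"  # hypothetical local AUTOMATIC1111 instance
payload = {
    "prompt": "a watercolor painting of a lighthouse",
    "negative_prompt": "blurry",
    "steps": 20,
    "width": 512,
    "height": 512,
}

# txt2img responds with a JSON body containing a list of base64-encoded images
response = requests.post(f"{base_url}/sdapi/v1/txt2img", json=payload, timeout=120)
response.raise_for_status()
images_base64 = response.json()["images"]
```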

Library updates

  • llama-cpp-python: bump to 0.2.81 (adds Gemma-2 support).
  • Transformers: bump to 4.42 (adds Gemma-2 support).

v1.8

27 Jun 02:38
6915c50

Releases with version numbers are back! The last one was v1.7 on October 8th, 2023, so I am calling this one v1.8.

From this release on, it will be possible to install past releases by downloading the .zip source and running the start_ script in it. The installation script no longer updates to the latest version automatically. This doesn't apply to snapshots/releases before this one.

New backend

UI updates

  • Improved "past chats" menu: this menu is now a vertical list of text items instead of a dropdown menu, making it a lot easier to switch between past conversations. Only one click is required instead of two.
  • Store the chat history in the browser: if you restart the server and do not refresh the browser, your conversation will not be accidentally erased anymore.
  • Avoid some unnecessary calls to the backend, making the UI faster and more responsive.
  • Move the "Character" dropdown menu to the main Chat tab, to make it faster to switch between different characters.
  • Change limits of RoPE scaling sliders in UI (#6142). Thanks @GodEmperor785.
  • Do not expose "alpha_value" for llama.cpp and "rope_freq_base" for transformers, to keep things simple and avoid conversions between the two formats (see the sketch after this list).
  • Remove an obsolete info message intended for GPTQ-for-LLaMa.
  • Remove the "Tab" shortcut for switching between the generation tabs and the "Parameters" tab, as it was awkward.
  • Improved streaming of lists, which would sometimes flicker and temporarily display horizontal lines.
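
On the alpha_value item above, the conversion being avoided is the NTK-style mapping between alpha_value and rope_freq_base. Below is a minimal sketch of the commonly used formula; the 64/63 exponent (from 128 rotary dimensions) is an assumption for illustration, not a quote from the webui code:

```python
def alpha_to_rope_freq_base(alpha_value: float, original_base: float = 10000.0) -> float:
    # NTK-aware scaling: base' = base * alpha ** (d / (d - 2));
    # with d = 128 rotary dimensions this gives the familiar 64/63 exponent.
    return original_base * alpha_value ** (64 / 63)

print(round(alpha_to_rope_freq_base(2.0)))  # ≈ 20221
```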

Bug fixes

  • Revert the reentrant generation lock to a simple lock, fixing an issue caused by the change (see the illustration after this list).
  • Fix GGUFs with no BOS token present, mainly Qwen2 models (#6119). Thanks @Ph0rk0z.
  • Fix "500 error" issue caused by block_requests.py (#5976). Thanks @nero-dv.
  • Set a default alpha_value and fix loading of some newer DeepSeekCoder GGUFs (#6111). Thanks @mefich.
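
Regarding the generation lock item, the distinction is between Python's non-reentrant and reentrant locks; a generic illustration (not the webui's actual code):

```python
import threading

generation_lock = threading.Lock()     # simple (non-reentrant) lock: a thread that
                                       # already holds it will deadlock if it tries
                                       # to acquire it again
# generation_lock = threading.RLock()  # reentrant lock: the same thread may acquire
                                       # it repeatedly without blocking itself

def generate(prompt: str) -> str:
    # Only one generation runs at a time across all request threads.
    with generation_lock:
        return prompt.upper()  # placeholder for the actual generation call
```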

Library updates

  • llama-cpp-python: bump to 0.2.79 (after a month of wrestling with GitHub Actions).
  • ExLlamaV2: bump to 0.1.6.
  • flash-attention: bump to 2.5.9.post1.
  • PyTorch: bump to 2.2.2. That's the last 2.2 patch version.
  • HQQ: bump to 0.1.7.post3. Makes HQQ functional again.

Other updates

  • Do not "git pull" during installation, allowing previous releases (from this one on) to be installed.
  • Make logs more readable, no more \u7f16\u7801 escape sequences (#6127; see the illustration after this list). Thanks @Touch-Night.
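
On the log readability item, the unreadable sequences are what non-ASCII text looks like when it gets escaped during serialization; json.dumps is used below purely to illustrate that behavior, not as the exact call site of the fix:

```python
import json

entry = {"prompt": "编码"}  # "编码" means "encoding"

print(json.dumps(entry))                      # {"prompt": "\u7f16\u7801"}
print(json.dumps(entry, ensure_ascii=False))  # {"prompt": "编码"}
```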

snapshot-2024-04-28

28 Apr 20:20
ad12236

What's Changed

Full Changelog: snapshot-2024-04-21...snapshot-2024-04-28

snapshot-2024-04-21

21 Apr 20:19
a4b732c

What's Changed

Full Changelog: snapshot-2024-04-14...snapshot-2024-04-21

snapshot-2024-04-14

14 Apr 22:22
26d822f

What's Changed

Full Changelog: snapshot-2024-04-07...snapshot-2024-04-14

snapshot-2024-04-07

07 Apr 20:19
91a7370

What's Changed

Full Changelog: snapshot-2024-03-31...snapshot-2024-04-07

snapshot-2024-03-31

31 Mar 20:20
1a7c027

What's Changed

Full Changelog: snapshot-2024-03-24...snapshot-2024-03-31

snapshot-2024-03-24

24 Mar 20:19
7cf1402

snapshot-2024-03-17

17 Mar 20:19
7cf1402

What's Changed

Full Changelog: snapshot-2024-03-10...snapshot-2024-03-17