Commit 6915c50

Merge pull request #6166 from oobabooga/dev

Merge dev branch

oobabooga committed Jun 27, 2024
2 parents 4820ae9 + 8ec8bc0
Showing 42 changed files with 524 additions and 245 deletions.
.github/workflows/stale.yml (2 additions, 2 deletions)

@@ -13,8 +13,8 @@ jobs:
      - uses: actions/stale@v5
        with:
          stale-issue-message: ""
-         close-issue-message: "This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment."
-         days-before-issue-stale: 60
+         close-issue-message: "This issue has been closed due to inactivity for 6 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment."
+         days-before-issue-stale: 180
          days-before-issue-close: 0
          stale-issue-label: "stale"
          days-before-pr-stale: -1
README.md (8 additions, 8 deletions)

@@ -11,7 +11,7 @@ Its goal is to become the [AUTOMATIC1111/stable-diffusion-webui](https://github.
## Features

* 3 interface modes: default (two columns), notebook, and chat.
-* Multiple model backends: [Transformers](https://github.com/huggingface/transformers), [llama.cpp](https://github.com/ggerganov/llama.cpp) (through [llama-cpp-python](https://github.com/abetlen/llama-cpp-python)), [ExLlamaV2](https://github.com/turboderp/exllamav2), [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ), [AutoAWQ](https://github.com/casper-hansen/AutoAWQ).
+* Multiple model backends: [Transformers](https://github.com/huggingface/transformers), [llama.cpp](https://github.com/ggerganov/llama.cpp) (through [llama-cpp-python](https://github.com/abetlen/llama-cpp-python)), [ExLlamaV2](https://github.com/turboderp/exllamav2), [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ), [AutoAWQ](https://github.com/casper-hansen/AutoAWQ), [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM).
* Dropdown menu for quickly switching between different models.
* Large number of extensions (built-in and user-contributed), including Coqui TTS for realistic voice outputs, Whisper STT for voice inputs, translation, [multimodal pipelines](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/multimodal), vector databases, Stable Diffusion integration, and a lot more. See [the wiki](https://github.com/oobabooga/text-generation-webui/wiki/07-%E2%80%90-Extensions) and [the extensions directory](https://github.com/oobabooga/text-generation-webui-extensions) for details.
* [Chat with custom characters](https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab#character).
@@ -76,12 +76,12 @@ conda activate textgen

| System | GPU | Command |
|--------|---------|---------|
-| Linux/WSL | NVIDIA | `pip3 install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121` |
-| Linux/WSL | CPU only | `pip3 install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu` |
-| Linux | AMD | `pip3 install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/rocm5.6` |
-| MacOS + MPS | Any | `pip3 install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1` |
-| Windows | NVIDIA | `pip3 install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121` |
-| Windows | CPU only | `pip3 install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1` |
+| Linux/WSL | NVIDIA | `pip3 install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu121` |
+| Linux/WSL | CPU only | `pip3 install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cpu` |
+| Linux | AMD | `pip3 install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/rocm5.6` |
+| MacOS + MPS | Any | `pip3 install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2` |
+| Windows | NVIDIA | `pip3 install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu121` |
+| Windows | CPU only | `pip3 install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2` |

The up-to-date commands can be found here: https://pytorch.org/get-started/locally/.

@@ -146,7 +146,7 @@ Then browse to
1) For Kepler GPUs and older, you will need to install CUDA 11.8 instead of 12:

```
-pip3 install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
+pip3 install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu118
conda install -y -c "nvidia/label/cuda-11.8.0" cuda-runtime
```

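After updating PyTorch with one of the commands above, you can confirm that the expected build was installed (a minimal sanity check, not part of the README; the second value prints `False` on CPU-only and macOS builds):

```
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```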
css/html_instruct_style.css (2 additions, 2 deletions)

@@ -62,8 +62,8 @@

.gradio-container .chat .user-message {
padding: 20px;
-padding-left: 0px;
-padding-right: 0px;
+padding-left: 0;
+padding-right: 0;
background-color: transparent;
border-radius: 8px;
border-bottom-right-radius: 0;
css/main.css (53 additions, 4 deletions)

@@ -96,7 +96,7 @@ gradio-app > :first-child {

.header_bar {
background-color: #f7f7f7;
-box-shadow: 0 0px 3px rgba(22 22 22 / 35%);
+box-shadow: 0 0 3px rgba(22 22 22 / 35%);
margin-bottom: 0;
overflow-x: scroll;
margin-left: calc(-1 * var(--size-4));
@@ -221,6 +221,7 @@ button {

.pretty_scrollbar::-webkit-scrollbar {
width: 7px;
+height: 7px;
}

.pretty_scrollbar::-webkit-scrollbar-track {
@@ -245,6 +246,10 @@ button {
background: #374151;
}

+.pretty_scrollbar::-webkit-scrollbar-corner {
+background: transparent;
+}

audio {
max-width: 100%;
}
@@ -433,12 +438,12 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* {
.message-body code {
white-space: pre-wrap !important;
word-wrap: break-word !important;
-border: 1px solid #666666;
+border: 1px solid #666;
border-radius: 5px;
font-size: 82%;
padding: 1px 3px;
background: #0d1117 !important;
-color: rgb(201, 209, 217);
+color: rgb(201 209 217);
}

.message-body pre > code {
@@ -695,7 +700,7 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* {
@media screen and (width >= 1327px) {
#past-chats-row {
position: absolute;
-top: 16px;
+top: 36px;
left: 0;
width: calc(0.5*(var(--document-width) - 880px - 120px - 16px*2));
max-width: 300px;
@@ -743,3 +748,47 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* {
display: none;
}
}

+#past-chats {
+  max-height: calc(100vh - 195px);
+  overflow-y: scroll !important;
+  border-radius: 0;
+  scrollbar-width: none; /* Hide scrollbar in Firefox by default */
+}
+
+#past-chats label {
+  width: 100%;
+  background-color: transparent !important;
+  background: none;
+  border: 0;
+  border-radius: 0;
+  padding-top: 8px;
+  padding-bottom: 8px;
+}
+
+#past-chats > :nth-child(2) {
+  display: none;
+}
+
+#past-chats > :nth-child(3) {
+  gap: 0;
+}
+
+#past-chats::-webkit-scrollbar {
+  display: none;
+}
+
+#past-chats:hover {
+  scrollbar-width: auto;
+}
+
+#past-chats:hover::-webkit-scrollbar {
+  display: block;
+}
+
+@media screen and (width < 1327px) {
+  #past-chats {
+    max-height: 300px;
+  }
+}

docker/TensorRT-LLM/Dockerfile (new file, 27 additions)

@@ -0,0 +1,27 @@
+FROM pytorch/pytorch:2.2.1-cuda12.1-cudnn8-runtime
+
+# Install Git
+RUN apt update && apt install -y git
+
+# System-wide TensorRT-LLM requirements
+RUN apt install -y openmpi-bin libopenmpi-dev
+
+# Set the working directory
+WORKDIR /app
+
+# Install text-generation-webui
+RUN git clone https://github.com/oobabooga/text-generation-webui
+WORKDIR /app/text-generation-webui
+RUN pip install -r requirements.txt
+
+# This is needed to avoid an error about "Failed to build mpi4py" in the next command
+ENV LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
+
+# Install TensorRT-LLM
+RUN pip3 install tensorrt_llm==0.10.0 -U --pre --extra-index-url https://pypi.nvidia.com
+
+# Expose the necessary port for the Python server
+EXPOSE 7860 5000
+
+# Run the Python server.py script with the specified command
+CMD ["python", "server.py", "--api", "--listen"]
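A hypothetical build-and-run sequence for this image, run from the repository root (the `tgw-trtllm` tag is illustrative, not part of the repository, and `--gpus all` assumes the NVIDIA Container Toolkit is installed):

```
# Build the image; the Dockerfile clones text-generation-webui itself, so no local checkout is copied in
docker build -t tgw-trtllm -f docker/TensorRT-LLM/Dockerfile .

# Run with GPU access, publishing the web UI (7860) and API (5000) ports exposed above
docker run --gpus all -p 7860:7860 -p 5000:5000 tgw-trtllm
```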
docs/02 - Default and Notebook Tabs.md (2 additions, 2 deletions)

@@ -18,13 +18,13 @@ In the **Prompt** menu, you can select from some predefined prompts defined unde

### Output

-Four tabs can be found:
+Five tabs can be found:

* **Raw**: where the raw text generated by the model appears.
* **Markdown**: it contains a "Render" button. You can click on it at any time to render the current output as markdown. This is particularly useful for models that generate LaTeX equations like GALACTICA.
* **HTML**: displays the output in an HTML style that is meant to be easier to read. Its style is defined under `text-generation-webui/css/html_readable_style.css`.
* **Logits**: when you click on "Get next token probabilities", this tab displays the 50 most likely next tokens and their probabilities based on your current input. If "Use samplers" is checked, the probabilities will be the ones after the sampling parameters in the "Parameters" > "Generation" tab are applied. Otherwise, they will be the raw probabilities generated by the model.
-* **Tokens**: allows you to tokenize your prompt and see the ID numbers for the individuals tokens.
+* **Tokens**: allows you to tokenize your prompt and see the ID numbers for the individual tokens.

## Notebook tab

docs/12 - OpenAI API.md (1 addition, 1 deletion)

@@ -219,7 +219,7 @@ print()

### Environment variables

-The following environment variables can be used (they take precendence over everything else):
+The following environment variables can be used (they take precedence over everything else):

| Variable Name | Description | Example Value |
|------------------------|------------------------------------|----------------------------|
docs/README.md (1 addition, 1 deletion)

@@ -1,4 +1,4 @@
-These files is a mirror of the documentation at:
+These files are a mirror of the documentation at:

# https://github.com/oobabooga/text-generation-webui/wiki

js/main.js (5 additions, 14 deletions)

@@ -98,20 +98,6 @@ document.addEventListener("keydown", function(event) {
document.getElementById("Impersonate").click();
}

-// Switch between tabs on Tab
-else if (!event.ctrlKey && !event.shiftKey && !event.altKey && !event.metaKey && event.key === "Tab") {
-event.preventDefault();
-var parametersButton = document.getElementById("parameters-button");
-var parentContainer = parametersButton.parentNode;
-var selectedChild = parentContainer.querySelector(".selected");
-
-if (selectedChild.id == "parameters-button") {
-document.getElementById(previousTabId).click();
-} else {
-previousTabId = selectedChild.id;
-parametersButton.click();
-}
-}
});

//------------------------------------------------
@@ -548,3 +534,8 @@ document.querySelectorAll(".focus-on-chat-input").forEach(element => {
document.querySelector("#chat-input textarea").focus();
});
});

+//------------------------------------------------
+// Fix a border around the "past chats" menu
+//------------------------------------------------
+document.getElementById("past-chats").parentNode.style.borderRadius = "0px";
modules/LoRA.py (1 addition, 1 deletion)

@@ -73,7 +73,7 @@ def add_lora_autogptq(lora_names):
if len(lora_names) > 1:
logger.warning('AutoGPTQ can only work with 1 LoRA at the moment. Only the first one in the list will be loaded.')
if not shared.args.no_inject_fused_attention:
-logger.warning('Fused Atttention + AutoGPTQ may break Lora loading. Disable it.')
+logger.warning('Fused Attention + AutoGPTQ may break Lora loading. Disable it.')

peft_config = GPTQLoraConfig(
inference_mode=True,
modules/RoPE.py (deleted, 18 deletions)

This file was deleted.
modules/block_requests.py (18 additions, 10 deletions)

@@ -43,19 +43,27 @@ def my_open(*args, **kwargs):
        with original_open(*args, **kwargs) as f:
            file_contents = f.read()

-       file_contents = file_contents.replace(b'\t\t<script\n\t\t\tsrc="https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/4.3.9/iframeResizer.contentWindow.min.js"\n\t\t\tasync\n\t\t></script>', b'')
-       file_contents = file_contents.replace(b'cdnjs.cloudflare.com', b'127.0.0.1')
+       if len(args) > 1 and args[1] == 'rb':
+           file_contents = file_contents.decode('utf-8')
+
+       file_contents = file_contents.replace('\t\t<script\n\t\t\tsrc="https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/4.3.9/iframeResizer.contentWindow.min.js"\n\t\t\tasync\n\t\t></script>', '')
+       file_contents = file_contents.replace('cdnjs.cloudflare.com', '127.0.0.1')
        file_contents = file_contents.replace(
-           b'</head>',
-           b'\n <script src="file/js/katex/katex.min.js"></script>'
-           b'\n <script src="file/js/katex/auto-render.min.js"></script>'
-           b'\n <script src="file/js/highlightjs/highlight.min.js"></script>'
-           b'\n <script src="file/js/highlightjs/highlightjs-copy.min.js"></script>'
-           b'\n <script>hljs.addPlugin(new CopyButtonPlugin());</script>'
-           b'\n </head>'
+           '</head>',
+           '\n <script src="file/js/katex/katex.min.js"></script>'
+           '\n <script src="file/js/katex/auto-render.min.js"></script>'
+           '\n <script src="file/js/highlightjs/highlight.min.js"></script>'
+           '\n <script src="file/js/highlightjs/highlightjs-copy.min.js"></script>'
+           '\n <script>hljs.addPlugin(new CopyButtonPlugin());</script>'
+           '\n </head>'
        )

-       return io.BytesIO(file_contents)
+       if len(args) > 1 and args[1] == 'rb':
+           file_contents = file_contents.encode('utf-8')
+           return io.BytesIO(file_contents)
+       else:
+           return io.StringIO(file_contents)

    else:
        return original_open(*args, **kwargs)
