Issues: huggingface/text-generation-inference
"sharded is not supported for AutoModel" Error When Deploying SageMaker Endpoint For Qwen 2.5 7B Trained via SageMaker
#2783 opened Nov 26, 2024 by jjbuck
2 of 4 tasks
"RuntimeError: weight lm_head.weight does not exist" quantizing Llama-3.2-11B-Vision-Instruct
#2775 opened Nov 22, 2024 by akowalsk
2 of 4 tasks
The same model, but different loading methods will result in very different inference speeds?
#2757 opened Nov 19, 2024 by hjs2027864933
2 of 4 tasks
Regression in 2.4.0: Input validation errors return code 200 and do not return the error message
#2749 opened Nov 15, 2024 by leonarddls
2 of 4 tasks
On-The-Fly Quantization for Inference appears not to be working as per documentation.
#2748 opened Nov 15, 2024 by colin-byrneireland1
1 of 4 tasks
Different inference results and speed between /generate and OpenAI endpoint
#2747 opened Nov 14, 2024 by jegork
2 of 4 tasks
In dev mode, server is stuck at Server started at unix:///tmp/text-generation-server-0
#2735 opened Nov 10, 2024 by mokeddembillel
2 of 4 tasks
launch TGI with the argument --max-input-tokens smaller than sliding_window=4096 (got here max_input_tokens=16384)
#2730 opened Nov 7, 2024 by ashwincv0112
1 of 4 tasks
device-side assert triggered when trying to use LLaMA 3.2 Vision with grammar
#2729 opened Nov 6, 2024 by SokolAnn
2 of 4 tasks
Python client: Pydantic protected namespace "model_"
#2722 opened Nov 4, 2024 by Simon-Stone
4 tasks
FlashLlamaForCausalLM's using name dense for its mlp submodule causes error when using LoRA adapter
#2715 opened Nov 2, 2024 by sadra-barikbin
CUDA Error: No kernel image is available for execution on the device
#2703 opened Oct 28, 2024 by shubhamgajbhiye1994
2 of 4 tasks