Issues: huggingface/text-generation-inference
Issues list
TGI metrics don't have model_name label to indicate which model the metrics belong to
#3026 opened Feb 17, 2025 by yashaswipiplani
NIX: text_generation_launcher::gpu: Cannot determine GPU compute capability: ImportError: libffi.so.8
#3025 opened Feb 14, 2025 by celsowm
1 of 4 tasks
WARN text_generation_launcher: Unkown compute for card nvidia-geforce-rtx-3090
#3014 opened Feb 11, 2025 by bmilesp
Resource underutilization, thread thrashing: CPU affinity ignores allowed CPUs and cannot be switched off
#3011 opened Feb 11, 2025 by askervin
3 of 4 tasks
Nonsense responses with n-gram speculative decoding
#2997 opened Feb 6, 2025 by olliestanley
1 of 4 tasks
Request failed during generation: Server error: Value out of range: -29146814772
#2994 opened Feb 5, 2025 by AlperYildirim1
2 of 4 tasks
Mistral Small 3 : chat template with python functions causes error
#2987 opened Feb 3, 2025 by v3ss0n
2 tasks done
Error: "new batch size should not exceed padded batch size" when running latest Docker container and sending multiple requests simultaneously
#2985 opened Feb 2, 2025 by BradyBonnette
2 of 4 tasks
no prefill when decoder_input_details=True from InferenceClient
#2973 opened Jan 30, 2025 by lifeng-jin
2 of 4 tasks
Incorrect Tokenization Likely Because of Diacritics in OpenChat and LLaMA 3.2 (TGI v3.0.2 and v2.2.0)
#2969 opened Jan 30, 2025 by biba10
2 of 4 tasks
Structured output doesn't work with open ai endpoint
#2959 opened Jan 27, 2025 by Stealthwriter
2 of 4 tasks
Running Qwen2-VL-2B-Instruct on TGI is giving an error
#2955 opened Jan 27, 2025 by ashwani-bhat
2 of 4 tasks
CUDA Out of memory when using the benchmarking tool with batch size greater than 1
#2952 opened Jan 24, 2025 by mborisov-bi
3 of 4 tasks
Serverless Inference API OpenAI /v1/chat/completions route broken
#2946 opened Jan 23, 2025 by pelikhan
1 of 4 tasks
RuntimeError: Cannot load 'awq' weight when running Qwen2-VL-72B-Instruct-AWQ model
#2944 opened Jan 23, 2025 by edesalve
2 of 4 tasks