Issues: vllm-project/vllm
- [Bug]: Error When VLLM_USE_TRITON_FLASH_ATTN=True on 2*8H100 (bug) #13607, opened Feb 20, 2025 by phoenixsqf
- [Bug]: When using cpu inference, is the kv cache's physical memory space pre-allocated? (bug) #13603, opened Feb 20, 2025 by 905799575
- [Usage] [V1] Refactor speculative decoding configuration #13601, opened Feb 20, 2025 by LiuXiaoxuanPKU
- [Installation]: how to use benchmarks in docker? (installation) #13598, opened Feb 20, 2025 by kkoren
- [Bug]: RuntimeError: No CUDA GPUs are available in transformers v4.48.0 or above when running Ray RLHF example (bug) #13597, opened Feb 20, 2025 by ArthurinRUC
- [Installation]: AttributeError: '_OpNamespace' '_C' object has no attribute 'silu_and_mul' on the CPU instance (installation) #13593, opened Feb 20, 2025 by ganapativs
- [Bug]: Marlin kernel doesn't work for multi-gpus (bug) #13590, opened Feb 20, 2025 by meqiangxu
- [Bug]: arm64 No module named 'xformers' (bug) #13585, opened Feb 20, 2025 by jiayi-1994
- [Bug]: Using Qwen 2.5-VL with TP=2, the memory of one GPU card is cleared to zero during the request (bug) #13581, opened Feb 20, 2025 by coderchem
- [Bug]: ValueError: The checkpoint you are trying to load has model type qwen2_5_vl but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date. (bug) #13579, opened Feb 20, 2025 by jieguolove
- [Feature]: Add moe_wna16 kernel as a backend for CompressedTensorsWNA16MoEMethod (feature request) #13575, opened Feb 20, 2025 by mgoin
- [Feature]: Support for Running Classification Task in Online Server (feature request, good first issue, help wanted) #13567, opened Feb 19, 2025 by sam-h-bean
- [Bug]: structured output with xgrammar using vllm serve with llama-8b fails with OSError: (...)/.cache/torch_extensions/py312_cu124/xgrammar/xgrammar.so: cannot open shared object file: No such file or directory (bug) #13563, opened Feb 19, 2025 by ExplodedViewMelon
- [Bug]: Index Out of Range Bug in Pooler when Using returned_token_ids with hidden_states (bug) #13559, opened Feb 19, 2025 by QiaoZiqing
- [Bug]: Ray fails to register worker when running DeepSeek R1 model with vLLM and tensor parallelism (bug) #13557, opened Feb 19, 2025 by yangchou19
- [Bug]: Increasing root volume with guided decoding (bug) #13556, opened Feb 19, 2025 by abpani
- [Usage]: How to use logits processors with max_num_seqs > 1? (usage) #13553, opened Feb 19, 2025 by alejopaullier96
- [Bug]: there are some nccl errors when tp_size > 8 in offline inference (bug) #13552, opened Feb 19, 2025 by yingtongxiong
- [Bug]: Make https://wheels.vllm.ai/nightly inspectable (bug) #13545, opened Feb 19, 2025 by fxmarty-amd
- [Feature]: support image_embeds in openai api as well (feature request) #13540, opened Feb 19, 2025 by gyin94
- [Performance]: enforce_eager=False degrades the performance metrics for long context input (performance) #13536, opened Feb 19, 2025 by chenfengshijie
- [Bug]: Ray+vllm run, then crash (bug) #13535, opened Feb 19, 2025 by fantasy-mark
- [New Model]: facebook/contriever support request (help wanted, new model) #13525, opened Feb 19, 2025 by yichuan520030910320
- [Bug]: Can't serve on ray cluster although passing VLLM_HOST_IP (bug) #13521, opened Feb 19, 2025 by hahmad2008
- [Usage]: Does vllm support mixed deployment on GPU+CPU? (usage) #13517, opened Feb 19, 2025 by zengqingfu1442