Issues: vllm-project/vllm
- [Bug]: Error When VLLM_USE_TRITON_FLASH_ATTN=True on 2*8H100 (bug) #13607, opened Feb 20, 2025 by phoenixsqf
- [Bug]: When using cpu inference, is the kv cache's physical memory space pre-allocated? (bug) #13603, opened Feb 20, 2025 by 905799575
- [Usage] [V1] Refactor speculative decoding configuration #13601, opened Feb 20, 2025 by LiuXiaoxuanPKU
- [Installation]: how to use benchmarks in docker? (installation) #13598, opened Feb 20, 2025 by kkoren
- [Bug]: RuntimeError: No CUDA GPUs are available in transformers v4.48.0 or above when running Ray RLHF example (bug) #13597, opened Feb 20, 2025 by ArthurinRUC
- [Installation]: AttributeError: '_OpNamespace' '_C' object has no attribute 'silu_and_mul' on the CPU instance (installation) #13593, opened Feb 20, 2025 by ganapativs
- [Bug]: Marlin kernel doesn't work for multi-gpus (bug) #13590, opened Feb 20, 2025 by meqiangxu
- [Bug]: arm64 No module named 'xformers' (bug) #13585, opened Feb 20, 2025 by jiayi-1994
- [Bug]: Using Qwen 2.5-VL with TP=2, the memory of one GPU card is cleared to zero during the request (bug) #13581, opened Feb 20, 2025 by coderchem
- [Bug]: ValueError: The checkpoint you are trying to load has model type qwen2_5_vl but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date. (bug) #13579, opened Feb 20, 2025 by jieguolove
- [Feature]: Add moe_wna16 kernel as a backend for CompressedTensorsWNA16MoEMethod (feature request) #13575, opened Feb 20, 2025 by mgoin
- [Feature]: Support for Running Classification Task in Online Server (feature request, good first issue, help wanted) #13567, opened Feb 19, 2025 by sam-h-bean
- [Bug]: structured output with xgrammar using vllm serve with llama-8b fails with OSError: (...)/.cache/torch_extensions/py312_cu124/xgrammar/xgrammar.so: cannot open shared object file: No such file or directory (bug) #13563, opened Feb 19, 2025 by ExplodedViewMelon
- [Bug]: Index Out of Range Bug in Pooler when Using returned_token_ids with hidden_states (bug) #13559, opened Feb 19, 2025 by QiaoZiqing
- [Bug]: Ray fails to register worker when running DeepSeek R1 model with vLLM and tensor parallelism (bug) #13557, opened Feb 19, 2025 by yangchou19
- [Bug]: Increasing root volume with guided decoding (bug) #13556, opened Feb 19, 2025 by abpani
- [Usage]: How to use logits processors with max_num_seqs > 1? (usage) #13553, opened Feb 19, 2025 by alejopaullier96
- [Bug]: there are some nccl errors when tp_size > 8 in offline inference (bug) #13552, opened Feb 19, 2025 by yingtongxiong
- [Bug]: Make https://wheels.vllm.ai/nightly inspectable (bug) #13545, opened Feb 19, 2025 by fxmarty-amd
- [Feature]: support image_embeds in openai api as well (feature request) #13540, opened Feb 19, 2025 by gyin94
- [Performance]: enforce_eager=False degrades the performance metrics for long context input (performance) #13536, opened Feb 19, 2025 by chenfengshijie
- [Bug]: Ray+vllm run, then crash (bug) #13535, opened Feb 19, 2025 by fantasy-mark
- [New Model]: facebook/contriever support request (help wanted, new model) #13525, opened Feb 19, 2025 by yichuan520030910320
- [Bug]: Can't serve on ray cluster although passing VLLM_HOST_IP (bug) #13521, opened Feb 19, 2025 by hahmad2008
- [Usage]: Does vllm support mixed deployment on GPU+CPU? (usage) #13517, opened Feb 19, 2025 by zengqingfu1442