
[vllm] - AttributeError: '_OpNamespace' '_vllm_fa2_C' object has no attribute 'varlen_fwd' #800

Open
lvhuaizi opened this issue Jan 26, 2025 · 7 comments
Labels
question Further information is requested

Comments

@lvhuaizi

lvhuaizi commented Jan 26, 2025

Command:
vllm serve /DATA/disk0/ld/ld_model_pretrain/MiniCPM-o-2_6 --dtype auto --max-model-len 2048 --api-key token-abc123 --gpu_memory_utilization 1 --trust-remote-code

The error output:
ERROR 01-26 11:32:37 engine.py:380] File "/autodl-fs/data/github/vllm/vllm/vllm_flash_attn/flash_attn_interface.py", line 154, in flash_attn_varlen_func
ERROR 01-26 11:32:37 engine.py:380] out, softmax_lse = torch.ops._vllm_fa2_C.varlen_fwd(
AttributeError: '_OpNamespace' '_vllm_fa2_C' object has no attribute 'varlen_fwd'

Even the official tutorial throws an error!

@lvhuaizi lvhuaizi added the question Further information is requested label Jan 26, 2025
@lvhuaizi lvhuaizi changed the title [vllm] - <AttributeError: '_OpNamespace' '_vllm_fa2_C' object has no attribute 'varlen_fwd'> [vllm] - AttributeError: '_OpNamespace' '_vllm_fa2_C' object has no attribute 'varlen_fwd' Jan 26, 2025
@HwwwwwwwH
Contributor

This error doesn't look related to our frontend adaptation of MiniCPM-o-2_6... you may want to check your CUDA version.
Also, our MiniCPM-o-2_6 support has now been merged into the official vLLM repository. You can pull that and try again; if it still errors, I can help file an issue in the official vLLM repo.
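The version check suggested here can be scripted. Below is a hedged diagnostic sketch (not from this thread, and `report_versions` is a hypothetical helper): the `varlen_fwd` error usually indicates a mismatch between the prebuilt flash-attention extension and the installed torch/CUDA build, so the first step is simply reporting what is installed. It reads package metadata only, so it works even where `vllm` itself fails to import.

```python
# Hedged diagnostic sketch: report installed versions of the packages
# relevant to the _vllm_fa2_C / varlen_fwd mismatch, without importing them.
from importlib.metadata import version, PackageNotFoundError

def report_versions(packages=("vllm", "torch")):
    """Return one 'name: version' line per package, noting any absence."""
    lines = []
    for pkg in packages:
        try:
            lines.append(f"{pkg}: {version(pkg)}")
        except PackageNotFoundError:
            lines.append(f"{pkg}: not installed")
    return lines

for line in report_versions():
    print(line)
```

Comparing these versions against the wheel you installed (e.g. a `VLLM_USE_PRECOMPILED=1` build) is a quick way to spot a stale or mismatched extension before rebuilding.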

@biraj-outspeed

biraj-outspeed commented Feb 3, 2025

@HwwwwwwwH I'm also facing the same issue with your fork.

Edit 2:
✅ ✅ ✅ I updated official vLLM to version 0.7.1 and the above issue was resolved.

I first followed these steps:

git clone https://github.com/OpenBMB/vllm.git
cd vllm
git checkout minicpmo
python3 -m venv myenv
source myenv/bin/activate
VLLM_USE_PRECOMPILED=1 pip install --editable . --no-cache-dir
export HF_TOKEN=<my-hf-token>
vllm serve openbmb/MiniCPM-o-2_6 \
    --trust-remote-code \
    --max-model-len 2048 \
    --max-num-seq 128

Then I got this error:

  File "/home/ubuntu/biraj/vllm/myenv/lib/python3.12/site-packages/torch/_ops.py", line 1225, in __getattr__
    raise AttributeError(
AttributeError: '_OpNamespace' '_vllm_fa2_C' object has no attribute 'varlen_fwd'

Here is the full error trace.

I also tried the official vLLM, since I noticed that openbmb/MiniCPM-o-2_6 is listed in vLLM's supported models. However, running vLLM's Docker image raised an error saying the model is not supported.

Command:

docker run \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=<my-hf-token>" \
    -p 8000:8000 \
    --ipc=host \
    -d \
    vllm/vllm-openai:latest \
    --model openbmb/MiniCPM-o-2_6 \
    --trust-remote-code \
    --max-model-len 2048 \
    --max-num-seq 128

Error:

ValueError: Model architectures ['MiniCPMO'] are not supported for now.

Edit 1:
I get the same error for vLLM 0.7.0 without Docker.

~/biraj/minicpmo$ vllm --version
INFO 02-03 04:52:09 __init__.py:183] Automatically detected platform cuda.
0.7.0

Command: the same vllm serve command that I mentioned at the beginning of this comment.

Error:

ValueError: Model architectures ['MiniCPMO'] are not supported for now.

Edit 2:
I updated vLLM to version 0.7.1 and the above issue was resolved.
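The resolution above can be expressed as a simple version gate. This is a minimal, hypothetical sketch (`supports_minicpmo` is not a vLLM API), assuming only what this thread establishes: 0.7.1 is the first release whose model registry includes the MiniCPMO architecture.

```python
# Hypothetical helper, not part of vLLM: gate on the minimum version
# at which the MiniCPMO architecture is registered (per this thread).
MIN_SUPPORTED = (0, 7, 1)

def supports_minicpmo(version_str: str) -> bool:
    """True if the given vLLM version string is at least 0.7.1."""
    # Strip any local build suffix like "0.7.1+cu121" before comparing.
    parts = tuple(int(p) for p in version_str.split("+")[0].split(".")[:3])
    return parts >= MIN_SUPPORTED

print(supports_minicpmo("0.7.0"))  # False: 0.7.0 raises "MiniCPMO ... not supported"
print(supports_minicpmo("0.7.1"))  # True
```

This matches the behavior reported above: both the `vllm/vllm-openai:latest` image in use at the time and a bare 0.7.0 install rejected the architecture, while 0.7.1 accepted it.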

@YiJunShenS

> This error doesn't look related to our frontend adaptation of MiniCPM-o-2_6... you may want to check your CUDA version. Also, our MiniCPM-o-2_6 support has now been merged into the official vLLM repository. You can pull that and try again; if it still errors, I can help file an issue in the official vLLM repo.

Hi, I'm using the latest vllm==0.7.1, but at runtime I hit a different error: AttributeError: 'MiniCPMOProcessor' object has no attribute 'get_audio_placeholder'

@HwwwwwwwH
Contributor

> Hi, I'm using the latest vllm==0.7.1, but at runtime I hit a different error: AttributeError: 'MiniCPMOProcessor' object has no attribute 'get_audio_placeholder'

Do you have more of the error traceback?

@YiJunShenS

> Do you have more of the error traceback?

Hi, here is the full traceback:
[rank0]: Traceback (most recent call last):
[rank0]: File "/U03/syj/test.py", line 10, in <module>
[rank0]: llm = LLM(
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/utils.py", line 1039, in inner
[rank0]: return fn(*args, **kwargs)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 240, in __init__
[rank0]: self.llm_engine = self.engine_class.from_engine_args(
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 482, in from_engine_args
[rank0]: engine = cls(
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 274, in __init__
[rank0]: self._initialize_kv_caches()
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 414, in _initialize_kv_caches
[rank0]: self.model_executor.determine_num_available_blocks())
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/executor/executor_base.py", line 99, in determine_num_available_blocks
[rank0]: results = self.collective_rpc("determine_num_available_blocks")
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/executor/uniproc_executor.py", line 49, in collective_rpc
[rank0]: answer = run_method(self.driver_worker, method, args, kwargs)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/utils.py", line 2208, in run_method
[rank0]: return func(*args, **kwargs)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/worker/worker.py", line 228, in determine_num_available_blocks
[rank0]: self.model_runner.profile_run()
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1236, in profile_run
[rank0]: self._dummy_run(max_num_batched_tokens, max_num_seqs)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1301, in _dummy_run
[rank0]: .dummy_data_for_profiling(self.model_config,
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/inputs/registry.py", line 333, in dummy_data_for_profiling
[rank0]: dummy_data = profiler.get_dummy_data(seq_len)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/multimodal/profiling.py", line 161, in get_dummy_data
[rank0]: mm_inputs = self._get_dummy_mm_inputs(seq_len, mm_counts)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/multimodal/profiling.py", line 139, in _get_dummy_mm_inputs
[rank0]: return self.processor.apply(
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/model_executor/models/minicpmv.py", line 803, in apply
[rank0]: result = super().apply(prompt, mm_data, hf_processor_mm_kwargs)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/multimodal/processing.py", line 1230, in apply
[rank0]: hf_mm_placeholders = self._find_mm_placeholders(
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/multimodal/processing.py", line 793, in _find_mm_placeholders
[rank0]: return find_mm_placeholders(mm_prompt_repls, new_token_ids,
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/multimodal/processing.py", line 579, in find_mm_placeholders
[rank0]: return dict(full_groupby_modality(it))
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/multimodal/processing.py", line 184, in full_groupby_modality
[rank0]: return full_groupby(values, key=lambda x: x.modality)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/utils.py", line 873, in full_groupby
[rank0]: for value in values:
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/multimodal/processing.py", line 534, in _iter_placeholders
[rank0]: replacement = repl_info.get_replacement(item_idx)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/multimodal/processing.py", line 270, in get_replacement
[rank0]: replacement = replacement(item_idx)
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/model_executor/models/minicpmo.py", line 355, in get_replacement_minicpmv
[rank0]: return self.get_audio_prompt_texts(
[rank0]: File "/root/miniconda3/envs/minicpm-o/lib/python3.10/site-packages/vllm/model_executor/models/minicpmo.py", line 232, in get_audio_prompt_texts
[rank0]: return self.info.get_hf_processor().get_audio_placeholder(
[rank0]: AttributeError: 'MiniCPMOProcessor' object has no attribute 'get_audio_placeholder'

@HwwwwwwwH
Contributor

You can update to the latest code in the HF repo.

@YiJunShenS

> You can update to the latest code in the HF repo.

Hi, that solved it, thanks!


4 participants