[Bug] libcudart.so.12: cannot open shared object file: No such file or directory #2584
Comments
Hi @githust66, could you help verify #2590?
OK. Does that mean pulling the latest code and building from source?
Nope. You only need to change the Python code.
OK, I'll give it a try.
Fixed with #2601.
Describe the bug
The ROCm environment contains no CUDA libraries. SGLang 0.4.1 fails with the error below when loading a model, whereas 0.4.0 and earlier versions do not.
Error output:
ImportError: [address=0.0.0.0:39501, pid=13418] libcudart.so.12: cannot open shared object file: No such file or directory
2024-12-26 10:14:09,664 xinference.api.restful_api 4247 ERROR [address=0.0.0.0:39501, pid=13418] libcudart.so.12: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xinference/api/restful_api.py", line 1002, in launch_model
model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/context.py", line 231, in send
return self._process_result_message(result)
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
raise message.as_instanceof_cause()
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/pool.py", line 667, in send
result = await self._run_coro(message.message_id, coro)
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
return await coro
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/api.py", line 384, in on_receive
return await super().on_receive(message) # type: ignore
File "xoscar/core.pyx", line 558, in on_receive
raise ex
File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
async with self._lock:
File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
with debug_async_timeout('actor_lock_timeout',
File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
result = await result
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xinference/core/supervisor.py", line 1041, in launch_builtin_model
await _launch_model()
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xinference/core/supervisor.py", line 1005, in _launch_model
await _launch_one_model(rep_model_uid)
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xinference/core/supervisor.py", line 984, in _launch_one_model
await worker_ref.launch_builtin_model(
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/context.py", line 231, in send
return self._process_result_message(result)
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
raise message.as_instanceof_cause()
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/pool.py", line 667, in send
result = await self._run_coro(message.message_id, coro)
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
return await coro
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/api.py", line 384, in on_receive
return await super().on_receive(message) # type: ignore
File "xoscar/core.pyx", line 558, in on_receive
raise ex
File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
async with self._lock:
File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
with debug_async_timeout('actor_lock_timeout',
File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
result = await result
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xinference/core/utils.py", line 90, in wrapped
ret = await func(*args, **kwargs)
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xinference/core/worker.py", line 897, in launch_builtin_model
await model_ref.load()
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/context.py", line 231, in send
return self._process_result_message(result)
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
raise message.as_instanceof_cause()
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/pool.py", line 667, in send
result = await self._run_coro(message.message_id, coro)
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
return await coro
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xoscar/api.py", line 384, in on_receive
return await super().on_receive(message) # type: ignore
File "xoscar/core.pyx", line 558, in on_receive
raise ex
File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
async with self._lock:
File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
with debug_async_timeout('actor_lock_timeout',
File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
result = await result
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xinference/core/model.py", line 414, in load
self._model.load()
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/xinference/model/llm/sglang/core.py", line 135, in load
self._engine = sgl.Runtime(
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/api.py", line 39, in Runtime
from sglang.srt.server import Runtime
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/srt/server.py", line 47, in <module>
from sglang.srt.managers.data_parallel_controller import (
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/srt/managers/data_parallel_controller.py", line 25, in <module>
from sglang.srt.managers.io_struct import (
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/srt/managers/io_struct.py", line 24, in <module>
from sglang.srt.managers.schedule_batch import BaseFinishReason
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/srt/managers/schedule_batch.py", line 40, in <module>
from sglang.srt.configs.model_config import ModelConfig
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/srt/configs/model_config.py", line 24, in <module>
from sglang.srt.layers.quantization import QUANTIZATION_METHODS
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/srt/layers/quantization/__init__.py", line 25, in <module>
from sglang.srt.layers.quantization.fp8 import Fp8Config
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/srt/layers/quantization/fp8.py", line 31, in <module>
from sglang.srt.layers.moe.fused_moe_triton.fused_moe import padding_size
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/srt/layers/moe/fused_moe_triton/__init__.py", line 4, in <module>
import sglang.srt.layers.moe.fused_moe_triton.fused_moe # noqa
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sglang/srt/layers/moe/fused_moe_triton/fused_moe.py", line 14, in <module>
from sgl_kernel import moe_align_block_size as sgl_moe_align_block_size
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sgl_kernel/__init__.py", line 1, in <module>
from .ops import (
File "/root/miniconda3/envs/xinf/lib/python3.10/site-packages/sgl_kernel/ops/__init__.py", line 1, in <module>
from .custom_reduce_cuda import all_reduce as _all_reduce
ImportError: [address=0.0.0.0:39501, pid=13418] libcudart.so.12: cannot open shared object file: No such file or directory
2024-12-26 10:14:09,665 uvicorn.access 4247 INFO 127.0.0.1:47452 - "POST /v1/models HTTP/1.1" 500
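The traceback ends at a module-level import of `sgl_kernel`, whose compiled extension links against `libcudart.so.12`. One general way to keep such an import from breaking non-CUDA platforms is to guard it, as in the minimal sketch below. This is an illustration only, not necessarily the change made in #2590 or #2601; the helper `optional_import` is a hypothetical name, and only the module and attribute names come from the traceback.

```python
import importlib


def optional_import(module_name, attr_name):
    """Return (attr, None) on success, or (None, error) if the import fails.

    Hypothetical helper for illustration; not part of SGLang.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr_name), None
    except (ImportError, AttributeError) as exc:
        # ImportError also covers a compiled extension whose shared-library
        # dependency (for example libcudart.so.12) cannot be opened.
        return None, exc


# Module and attribute names are taken from the traceback above.
moe_align_block_size, import_error = optional_import(
    "sgl_kernel", "moe_align_block_size"
)
if moe_align_block_size is None:
    print(f"sgl_kernel unavailable, a fallback path is needed: {import_error}")
```

With a guard like this, the failure is deferred to the point where the CUDA-only kernel is actually called, so platforms that never use it can still import the package.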
Reproduction
qwen2.5-instruct-7B
Environment
(xinf) root@DESKTOP-ESRGKIB:~# python -m sglang.check_env
2024-12-26 10:26:55.241465: E external/local_xla/xla/stream_executor/plugin_registry.cc:91] Invalid plugin kind specified: FFT
2024-12-26 10:26:56.971386: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-26 10:26:57.864187: E external/local_xla/xla/stream_executor/plugin_registry.cc:91] Invalid plugin kind specified: DNN
WARNING 12-26 10:27:06 rocm.py:31] `fork` method is not supported by ROCm. VLLM_WORKER_MULTIPROC_METHOD is overridden to `spawn` instead.
Python: 3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0]
ROCM available: True
GPU 0: AMD Radeon RX 7900 XT
GPU 0 Compute Capability: 11.0
ROCM_HOME: /opt/rocm
HIPCC: HIP version: 6.3.42131-fa1d09cbd
ROCM Driver Version:
PyTorch: 2.4.0+rocm6.3.0
sglang: 0.4.1
flashinfer: Module Not Found
triton: 3.0.0+rocm6.3.0_75cc27c26a
transformers: 4.46.2
torchao: 0.6.1
numpy: 1.26.4
aiohttp: 3.10.10
fastapi: 0.115.4
hf_transfer: 0.1.8
huggingface_hub: 0.26.5
interegular: 0.3.3
modelscope: 1.19.2
orjson: 3.10.11
packaging: 24.1
psutil: 6.1.0
pydantic: 2.9.2
multipart: 0.0.12
zmq: 26.2.0
uvicorn: 0.32.0
uvloop: 0.21.0
vllm: 0.6.6.dev44+gc2d1b075.d20241221
openai: 1.54.1
anthropic: 0.39.0
decord: 0.6.0
AMD Topology:
Hypervisor vendor: Microsoft
ulimit soft: 1024
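To confirm from inside the same Python environment whether the dynamic loader can find the CUDA runtime, a quick check with the standard-library `ctypes` module is sketched below. The helper name `can_load_shared_library` is hypothetical, and `libamdhip64.so` is assumed here to be the HIP runtime library name on this ROCm install; whether it resolves depends on the loader search path.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the library, or one of its dependencies, cannot be opened.
        return False


# libcudart.so.12 is the library named in the error; libamdhip64.so is an
# assumed name for the ROCm HIP runtime, included only for comparison.
for lib in ("libcudart.so.12", "libamdhip64.so"):
    print(lib, "loadable:", can_load_shared_library(lib))
```

On a ROCm-only machine the first check is expected to report that the library is not loadable, which matches the `ImportError` in the report.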