5 files changed: +16 −43 lines

@@ -120,22 +120,22 @@ Supported models list:
 First, download the image we provide:
 ```bash
 # A2 x86
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+docker pull xllm/xllm-ai:xllm-dev-hb-rc2-x86
 # A2 arm
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-arm
+docker pull xllm/xllm-ai:xllm-dev-hb-rc2-arm
 # A3 arm
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hc-rc2-arm
+docker pull xllm/xllm-ai:xllm-dev-hc-rc2-arm
 # or
 # A2 x86
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-x86
 # A2 arm
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-arm
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-arm
 # A3 arm
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hc-rc2-arm
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hc-rc2-arm
 ```
 Then create the corresponding container:
 ```bash
-sudo docker run -it --ipc=host -u 0 --privileged --name mydocker --network=host --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -v /var/queue_schedule:/var/queue_schedule -v /usr/local/Ascend/driver:/usr/local/Ascend/driver -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi -v /usr/local/sbin/:/usr/local/sbin/ -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf -v /var/log/npu/slog/:/var/log/npu/slog -v /export/home:/export/home -w /export/home -v ~/.ssh:/root/.ssh -v /var/log/npu/profiling/:/var/log/npu/profiling -v /var/log/npu/dump/:/var/log/npu/dump -v /home/:/home/ -v /runtime/:/runtime/ -v /etc/hccn.conf:/etc/hccn.conf xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+sudo docker run -it --ipc=host -u 0 --privileged --name mydocker --network=host --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -v /var/queue_schedule:/var/queue_schedule -v /usr/local/Ascend/driver:/usr/local/Ascend/driver -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi -v /usr/local/sbin/:/usr/local/sbin/ -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf -v /var/log/npu/slog/:/var/log/npu/slog -v /export/home:/export/home -w /export/home -v ~/.ssh:/root/.ssh -v /var/log/npu/profiling/:/var/log/npu/profiling -v /var/log/npu/dump/:/var/log/npu/dump -v /home/:/home/ -v /runtime/:/runtime/ -v /etc/hccn.conf:/etc/hccn.conf xllm/xllm-ai:xllm-dev-hb-rc2-x86
 ```
 
 Install official repo and submodules:
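Since the image tags above differ only in the `-hb-`/`-hc-` infix (A2 vs. A3 hardware) and the `-x86`/`-arm` suffix, the right tag for a host can be picked mechanically. This is an illustrative sketch using the new version-less tags from this diff, not a script shipped with the repo; the `image_for_arch` helper name is hypothetical, and A3 hosts would need the `-hc-` variant instead:

```shell
#!/bin/sh
# Map the host architecture to the matching A2 image tag from the README.
# A3 (arm) hosts use the -hc- variant instead of -hb-; adjust if needed.
image_for_arch() {
  case "$1" in
    x86_64)  echo "xllm/xllm-ai:xllm-dev-hb-rc2-x86" ;;
    aarch64) echo "xllm/xllm-ai:xllm-dev-hb-rc2-arm" ;;
    *)       echo "unsupported arch: $1" >&2; return 1 ;;
  esac
}

# Usage (hypothetical):
#   docker pull "$(image_for_arch "$(uname -m)")"
```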

@@ -116,22 +116,22 @@ xLLM provides powerful intelligent computing capability, through hardware-level compute optimization and
 First, download the image we provide:
 ```bash
 # A2 x86
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-x86
 # A2 arm
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-arm
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-arm
 # A3 arm
-docker pull quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hc-rc2-arm
+docker pull quay.io/jd_xllm/xllm-ai:xllm-dev-hc-rc2-arm
 # or
 # A2 x86
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+docker pull xllm/xllm-ai:xllm-dev-hb-rc2-x86
 # A2 arm
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-arm
+docker pull xllm/xllm-ai:xllm-dev-hb-rc2-arm
 # A3 arm
-docker pull xllm/xllm-ai:xllm-0.7.1-dev-hc-rc2-arm
+docker pull xllm/xllm-ai:xllm-dev-hc-rc2-arm
 ```
 Then create the corresponding container:
 ```bash
-sudo docker run -it --ipc=host -u 0 --privileged --name mydocker --network=host --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -v /var/queue_schedule:/var/queue_schedule -v /usr/local/Ascend/driver:/usr/local/Ascend/driver -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi -v /usr/local/sbin/:/usr/local/sbin/ -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf -v /var/log/npu/slog/:/var/log/npu/slog -v /export/home:/export/home -w /export/home -v ~/.ssh:/root/.ssh -v /var/log/npu/profiling/:/var/log/npu/profiling -v /var/log/npu/dump/:/var/log/npu/dump -v /home/:/home/ -v /runtime/:/runtime/ -v /etc/hccn.conf:/etc/hccn.conf quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86
+sudo docker run -it --ipc=host -u 0 --privileged --name mydocker --network=host --device=/dev/davinci0 --device=/dev/davinci_manager --device=/dev/devmm_svm --device=/dev/hisi_hdc -v /var/queue_schedule:/var/queue_schedule -v /usr/local/Ascend/driver:/usr/local/Ascend/driver -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi -v /usr/local/sbin/:/usr/local/sbin/ -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf -v /var/log/npu/slog/:/var/log/npu/slog -v /export/home:/export/home -w /export/home -v ~/.ssh:/root/.ssh -v /var/log/npu/profiling/:/var/log/npu/profiling -v /var/log/npu/dump/:/var/log/npu/dump -v /home/:/home/ -v /runtime/:/runtime/ -v /etc/hccn.conf:/etc/hccn.conf xllm/xllm-ai:xllm-dev-hb-rc2-x86
 ```
 
 Download the official repository and module dependencies:

@@ -1,30 +1,3 @@
-# Release xllm 0.7.1
-
-## **Major Features and Improvements**
-
-### Model Support
-
-- Support GLM-4.5-Air.
-- Support Qwen3-VL-Moe.
-
-### Feature
-
-- Support scheduler overlap when chunked prefill and MTP are enabled.
-- Enable multi-process mode when running VLM models.
-- Support AclGraph for GLM-4.5.
-
-### Bugfix
-
-- Resolve core dump of Qwen embedding 0.6B.
-- Resolve duplicate content in multi-turn tool call conversations.
-- Support sampler parameters for MTP.
-- Enable MTP and schedule overlap to work simultaneously.
-- Resolve google.protobuf.Struct parsing failures which broke tool_call and think toggle functionality.
-- Fix the precision issue in the Qwen2 model caused by model_type not being assigned.
-- Fix core dump of GLM 4.5 when MTP is enabled.
-- Temporarily use heap allocation for the VLM backend.
-- Resolve core dump of streaming chat completion requests for VLM.
 # Release xllm 0.7.0
 
 ## **Major Features and Improvements**

@@ -6,7 +6,7 @@ function error() {
   exit 1
 }
 
-IMAGE="quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86"
+IMAGE="quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-x86"
 
 RUN_OPTS=(
   --rm

@@ -1 +1 @@
-0.7.1
+0.7.0
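Across all five files the change is the same mechanical rename: the `0.7.1-` version segment is dropped from the dev image tags (and VERSION is rolled back to 0.7.0). The rename can be expressed as a one-line substitution; this is an illustrative sketch, not part of the PR:

```shell
#!/bin/sh
# Drop the "0.7.1-" version segment from an image reference, as this PR does.
old="quay.io/jd_xllm/xllm-ai:xllm-0.7.1-dev-hb-rc2-x86"
new=$(printf '%s\n' "$old" | sed 's/:xllm-0\.7\.1-/:xllm-/')
echo "$new"   # quay.io/jd_xllm/xllm-ai:xllm-dev-hb-rc2-x86
```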