[PD Disaggregation] Support Qwen3-MoE with PD + EP inference. #4691
base: develop
Conversation
Thanks for your contribution!
fastdeploy/worker/worker_process.py (Outdated)
```python
    create=False,
)
step_shm_value.value[0] = -1
if not envs.ENABLE_V1_KVCACHE_SCHEDULER:
```
This currently conflicts with the latest code; just take the latest code later on (it solves the same problem).
```python
    shard_id=shard_id,
    shard_dim=SHARD_ID_TO_SHARDED_DIM[shard_id],
)
if expert_id - self.expert_id_offset >= 0 and expert_id - self.expert_id_offset < self.num_local_experts:
```
The outer code already uses expert_id directly, yet inside there is still an `if expert_id is None` check?
Done
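For context, a minimal sketch of the expert-parallel ownership check under discussion. Only `expert_id_offset` and `num_local_experts` come from the diff; the class and method names are hypothetical, not the actual FastDeploy API.

```python
# Illustrative sketch only; names other than expert_id_offset and
# num_local_experts are hypothetical stand-ins.
class MoEWeightLoader:
    def __init__(self, expert_id_offset: int, num_local_experts: int):
        # Under EP, each rank owns the contiguous expert range
        # [expert_id_offset, expert_id_offset + num_local_experts).
        self.expert_id_offset = expert_id_offset
        self.num_local_experts = num_local_experts

    def maybe_load(self, expert_id: int, weight) -> bool:
        # Skip experts owned by other EP ranks. When the caller
        # guarantees expert_id is set, no inner None check is needed.
        local_id = expert_id - self.expert_id_offset
        if 0 <= local_id < self.num_local_experts:
            # ... copy `weight` into local expert slot `local_id` ...
            return True
        return False


# Example: rank 1 of 4 with 16 experts total owns experts 4..7.
loader = MoEWeightLoader(expert_id_offset=4, num_local_experts=4)
assert loader.maybe_load(5, weight=None) is True
assert loader.maybe_load(12, weight=None) is False
```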
K11OntheBoat seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. Have you already signed the CLA but the status is still pending? Let us recheck it.
```python
quant_method = getattr(model_sublayer, "quant_method", None)
if not hasattr(quant_method, "process_weights_after_loading"):
    return
if param is not None and hasattr(param, "tensor_track") and param.tensor_track is None:
```
What does this mean? I don't follow.
In binhan's code, the tensor_track handling logic can run more than once; this line was added to avoid that.
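A minimal sketch of that guard, assuming `tensor_track` behaves as a one-shot marker that is cleared once a parameter has been processed (an assumption; the actual semantics live in the FastDeploy weight-loading code):

```python
# Names mirror the diff; the surrounding function is illustrative only.
def process_weights_after_loading_once(model_sublayer, param) -> None:
    quant_method = getattr(model_sublayer, "quant_method", None)
    if not hasattr(quant_method, "process_weights_after_loading"):
        return
    # Guard: a param whose tensor_track is already None was handled in
    # an earlier pass, so skip it rather than process it a second time.
    if param is not None and hasattr(param, "tensor_track") and param.tensor_track is None:
        return
    quant_method.process_weights_after_loading(model_sublayer)
```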
```python
    num_experts = model_config.moe_num_experts[0]
else:
    num_experts = model_config.moe_num_experts
```
This change is still a workaround; please don't bake this kind of legacy into the file.
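One way to address that concern is to normalize the field once in a helper rather than branching at each call site. A sketch, assuming `moe_num_experts` may arrive as either a per-group list or a plain int (only the field name comes from the diff; the helper is illustrative):

```python
def resolve_num_experts(model_config) -> int:
    # Some configs store a per-group list, others a single int;
    # normalize here so call sites never see the difference.
    n = model_config.moe_num_experts
    return n[0] if isinstance(n, (list, tuple)) else n
```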
```python
if not hasattr(self, "mla_use_absorb"):
    self.mla_use_absorb = False

if hasattr(self, "num_experts") and getattr(self, "moe_num_experts") is None:
```
A change like this is best kept together with the original self.num_experts assignment; standing alone here it looks out of place.
I asked risheng about this earlier: this is deliberately an override function whose whole purpose is compatibility with different naming conventions. Moving it back to its original place would make it the same as before the refactor, back to the old style.
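A rough sketch of that kind of naming-compatibility override, under the assumption that it simply aliases the legacy field onto the refactored one (all names besides `num_experts` and `moe_num_experts` are illustrative):

```python
class ConfigSketch:
    def __init__(self, **kwargs):
        self.moe_num_experts = None
        for key, value in kwargs.items():
            setattr(self, key, value)
        self.override_name_from_config()

    def override_name_from_config(self) -> None:
        # Compatibility shim: map the legacy `num_experts` onto the
        # refactored `moe_num_experts` when the latter is unset, so
        # checkpoints using either naming convention load correctly.
        if hasattr(self, "num_experts") and getattr(self, "moe_num_experts") is None:
            self.moe_num_experts = self.num_experts


cfg = ConfigSketch(num_experts=128)
assert cfg.moe_num_experts == 128
```

Keeping the shim in a single override function also means the aliasing logic has one home instead of being repeated at every assignment site.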
LGTM
Motivation
Enable Qwen3-235B to be deployed in FastDeploy (FD) with PD + EP.