@yeahdongcn commented Jan 4, 2026

This PR adds support for the Moore Threads (MUSA) GPU platform, expanding ComfyUI's hardware compatibility.
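As a rough illustration of how an alternative backend like MUSA is typically wired in (this is a hypothetical sketch, not the PR's actual code — ComfyUI's real device selection lives in `comfy/model_management.py`), a fallback chain can probe for the `torch_musa` package before committing to a device string:

```python
# Hypothetical sketch of backend-preference logic, assuming the
# torch_musa package registers a `torch.musa` namespace analogous
# to `torch.cuda`. Names here are illustrative only.
import importlib.util

def pick_device() -> str:
    """Return a device string, preferring MUSA when torch_musa is importable."""
    if importlib.util.find_spec("torch_musa") is not None:
        # A real implementation would `import torch_musa` and then
        # verify availability (e.g. torch.musa.is_available()) first.
        return "musa"
    # Fall back; real code would also probe torch.cuda and other backends.
    return "cpu"

print(pick_device())
```

On a machine without `torch_musa` installed this prints `cpu`; on the MUSA box in the logs below it would take the `musa` branch, matching the `Device: musa` line.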

Testing Done

  1. Ran the flux1_krea_dev template to generate an image from the given prompt:
     [flux screenshot]
  2. Followed https://comfyanonymous.github.io/ComfyUI_examples/wan/ to run a Text to Video workflow (umt5_xxl_fp8_e4m3fn_scaled.safetensors + wan_2.1_vae.safetensors + wan2.1_t2v_1.3B_fp16.safetensors):
     [workflow screenshot]

Server logs:

```
root@worker3218:/ws# python main.py --enable-manager
[START] Security scan
[ComfyUI-Manager] Using uv as Python module for pip operations.
Using Python 3.10.12 environment at: /usr
[DONE] Security scan
** ComfyUI startup time: 2026-01-04 17:49:31.633
** Platform: Linux
** Python version: 3.10.12 (main, Aug 15 2025, 14:32:43) [GCC 11.4.0]
** Python executable: /usr/bin/python
** ComfyUI Path: /ws
** ComfyUI Base Folder Path: /ws
** User directory: /ws/user
** ComfyUI-Manager config path: /ws/user/__manager/config.ini
** Log path: /ws/user/comfyui.log
[PRE] ComfyUI-Manager
Checkpoint files will always be loaded safely.
Total VRAM 81838 MB, total RAM 2063756 MB
pytorch version: 2.7.1
Set vram state to: NORMAL_VRAM
Device: musa
Using async weight offloading with 2 streams
Enabled pinned memory 1960568.0
Using pytorch attention
Python version: 3.10.12 (main, Aug 15 2025, 14:32:43) [GCC 11.4.0]
ComfyUI version: 0.7.0
ComfyUI frontend version: 1.35.9
[Prompt Server] web root: /usr/local/lib/python3.10/dist-packages/comfyui_frontend_package/static
[START] ComfyUI-Manager
[ComfyUI-Manager] network_mode: public
Total VRAM 81838 MB, total RAM 2063756 MB
pytorch version: 2.7.1
Set vram state to: NORMAL_VRAM
Device: musa
Using async weight offloading with 2 streams
Enabled pinned memory 1960568.0

Import times for custom nodes:
   0.0 seconds: /ws/custom_nodes/websocket_image_save.py

Context impl SQLiteImpl.
Will assume non-transactional DDL.
No target revision found.
Starting server

To see the GUI go to: http://127.0.0.1:8188
got prompt
Using pytorch attention in VAE
Using pytorch attention in VAE
VAE load device: musa:0, offload device: cpu, dtype: torch.bfloat16
Found quantization metadata version 1
Using MixedPrecisionOps for text encoder
CLIP/text encoder model load device: musa:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load WanTEModel
loaded completely; 80024.92 MB usable, 6419.49 MB loaded, full load: True
/ws/comfy/ops.py:34: UserWarning: Expected query, key and value to all be of dtype: {Half, BFloat16}. Got Query dtype: float, Key dtype: float, and Value dtype: float instead. (Triggered internally at /home/torch_musa/build/generated_cuda_compatible/include/ATen/native/transformers/sdp_utils_cpp.h:90.)
  return torch.nn.functional.scaled_dot_product_attention(q, k, v, *args, **kwargs)
model weight dtype torch.float16, manual cast: None
model_type FLOW
Requested to load WAN21
loaded completely; 71799.64 MB usable, 2706.18 MB loaded, full load: True
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:28<00:00,  1.07it/s]
Requested to load WanVAE
loaded completely; 64455.93 MB usable, 242.03 MB loaded, full load: True
/ws/comfy/ops.py:34: UserWarning: Unsupported qk_head_dim: 384 v_head_dim: 384 for FlashAttention in MUSA backend (Triggered internally at /home/torch_musa/torch_musa/csrc/aten/ops/attention/mudnn/SDPUtils.h:129.)
  return torch.nn.functional.scaled_dot_product_attention(q, k, v, *args, **kwargs)
Prompt executed in 55.36 seconds
FETCH ComfyRegistry Data [DONE]
[ComfyUI-Manager] default cache updated: https://api.comfy.org/nodes
FETCH DATA from: /ws/user/__manager/cache/1514988643_custom-node-list.json [DONE]
[ComfyUI-Manager] All startup tasks have been completed.
got prompt
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:27<00:00,  1.10it/s]
Prompt executed in 31.36 seconds
```
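The `UserWarning` in the log about float32 query/key/value notes that the fused SDPA kernel only accepts half-precision inputs, so PyTorch silently falls back to the math path. A common workaround (a hypothetical sketch, not code from this PR — `sdpa_safe` is an illustrative name) is to cast to bf16 before the call and cast the result back:

```python
# Sketch of a dtype-cast wrapper around scaled_dot_product_attention,
# for backends whose fused kernel only supports fp16/bf16 inputs.
import torch
import torch.nn.functional as F

def sdpa_safe(q, k, v):
    orig_dtype = q.dtype
    if orig_dtype == torch.float32:
        # Cast to bf16 so a half-only fused kernel can be used.
        q, k, v = (t.to(torch.bfloat16) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v)
    return out.to(orig_dtype)

# (batch, heads, seq_len, head_dim), float32 on purpose
q = k = v = torch.randn(1, 4, 8, 16)
out = sdpa_safe(q, k, v)
print(out.shape, out.dtype)
```

Whether the cast is acceptable depends on the model's precision requirements; the warning in the log is harmless since the math fallback still produces correct float32 output, just without the fused kernel's speed.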

Signed-off-by: Xiaodong Ye <[email protected]>

yeahdongcn commented Jan 5, 2026

@comfyanonymous Could you please take a look whenever you get a chance? Thanks.
