Local deployment on Windows fails at runtime with RuntimeError: Tensors must have same number of dimensions: got 2 and 1 #13

@lmdown

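The web page starts, and when I speak into the microphone, a recording of my speech is written to the output/input directory. The subsequent steps fail, however, and no reply audio is generated.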

Detailed log:
[INFO] Received audio data: 116 bytes
[INFO] Accumulating PCM: 960 samples (total: 476160, 19.84s)
[INFO] Received audio data: 152 bytes
[INFO] Accumulating PCM: 960 samples (total: 477120, 19.88s)
[INFO] Received audio data: 154 bytes
[INFO] Accumulating PCM: 960 samples (total: 478080, 19.92s)
[INFO] Received PAUSE signal
[INFO] Saved audio to ./output\input\Client-2944189797568_turn0_input.wav, length: 19.92s
[INFO] [Turn 0] Preparing model input: 2 messages, 1 audio items
[INFO] [Turn 0] Message history: ['system', 'user']
[INFO] [Turn 0] Queue status: audio_tokens=0, opus_bytes=0
[INFO] start inference of (19.92s audio)...
[INFO] [Turn 0] Generation thread started, input_ids shape: torch.Size([1, 219])
[INFO] Start monitoring the generation results loop (turn 0)
[INFO] Monitor loop #2200: thread_alive=True, steps=1, last_step=1, stuck_time=120.8s
[INFO] Skipping first audio batch (prompt): shape=torch.Size([1, 219]), tokens=219
[INFO] Monitor loop #100: thread_alive=True, steps=1, last_step=1, stuck_time=4.7s
[INFO] Monitor loop #2300: thread_alive=True, steps=1, last_step=1, stuck_time=125.6s
[INFO] Monitor loop #200: thread_alive=True, steps=1, last_step=1, stuck_time=9.4s
[INFO] Monitor loop #2400: thread_alive=True, steps=1, last_step=1, stuck_time=130.3s
[INFO] Monitor loop #300: thread_alive=True, steps=1, last_step=1, stuck_time=14.1s
[INFO] Monitor loop #2500: thread_alive=True, steps=1, last_step=1, stuck_time=135.1s
[INFO] Monitor loop #400: thread_alive=True, steps=1, last_step=1, stuck_time=18.9s
[INFO] Monitor loop #2600: thread_alive=True, steps=1, last_step=1, stuck_time=139.9s
[INFO] Monitor loop #500: thread_alive=True, steps=1, last_step=1, stuck_time=23.6s
[INFO] Monitor loop #2700: thread_alive=True, steps=1, last_step=1, stuck_time=144.6s
[INFO] Monitor loop #600: thread_alive=True, steps=1, last_step=1, stuck_time=28.4s
[INFO] Monitor loop #2800: thread_alive=True, steps=1, last_step=1, stuck_time=149.3s
[WARNING] [Turn 0] Generation appears stuck: no new steps for 10s (steps=1, loop=635)
[INFO] connection closed
[INFO] Send coroutine stopped, total frames sent: 0
[INFO] Text send coroutine stopped, total texts sent: 0
[INFO] Encode thread stopped
[INFO] TTS receiver thread stopped
[INFO] TTS sender thread stopped
[INFO] Monitor loop #700: thread_alive=True, steps=1, last_step=1, stuck_time=34.1s
[INFO] Monitor loop #2900: thread_alive=True, steps=1, last_step=1, stuck_time=155.1s
[INFO] Monitor loop #800: thread_alive=True, steps=1, last_step=1, stuck_time=40.6s
[INFO] Monitor loop #3000: thread_alive=True, steps=1, last_step=1, stuck_time=161.5s
[INFO] Monitor loop #900: thread_alive=True, steps=1, last_step=1, stuck_time=47.0s
[INFO] Monitor loop #3100: thread_alive=True, steps=1, last_step=1, stuck_time=168.0s
[INFO] Monitor loop #1000: thread_alive=True, steps=1, last_step=1, stuck_time=53.4s
[INFO] Monitor loop #3200: thread_alive=True, steps=1, last_step=1, stuck_time=174.4s
[INFO] Monitor loop #1100: thread_alive=True, steps=1, last_step=1, stuck_time=59.8s
[INFO] Monitor loop #3300: thread_alive=True, steps=1, last_step=1, stuck_time=180.8s
[INFO] Monitor loop #1200: thread_alive=True, steps=1, last_step=1, stuck_time=66.2s
[INFO] Monitor loop #3400: thread_alive=True, steps=1, last_step=1, stuck_time=187.2s
[ERROR] [Turn 0] Generation failed after 71.79s: Tensors must have same number of dimensions: got 2 and 1
Traceback (most recent call last):
File "D:\apps\Fun-Audio-Chat\web_demo\server\server.py", line 447, in run_generation
self.model_manager.model.generate(**inputs, **gen_kwargs_with_streamer)
File "D:\apps\Fun-Audio-Chat\venv\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\apps\Fun-Audio-Chat\venv\Lib\site-packages\transformers\generation\utils.py", line 2597, in generate
result = self._sample(
^^^^^^^^^^^^^
File "D:\apps\Fun-Audio-Chat\funaudiochat\modeling_funaudiochat.py", line 1329, in _sample
outputs = self(**model_inputs, return_dict=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\apps\Fun-Audio-Chat\venv\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\apps\Fun-Audio-Chat\venv\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\apps\Fun-Audio-Chat\funaudiochat\modeling_funaudiochat.py", line 1133, in forward
speech_output = self.audio_invert_tower(
^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\apps\Fun-Audio-Chat\venv\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\apps\Fun-Audio-Chat\venv\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\apps\Fun-Audio-Chat\funaudiochat\modeling_funaudiochat.py", line 663, in crq_generate_forward
crq_audio_tokens, logits = self.sampling_step(logits)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\apps\Fun-Audio-Chat\funaudiochat\modeling_funaudiochat.py", line 603, in sampling_step
next_token_scores = self.crq_logits_processor(torch.cat([self.crq_speech_ids, *self.crq_generate_tokens], dim=-1), next_token_logits)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Tensors must have same number of dimensions: got 2 and 1
[INFO] Monitor loop ended: total_loops=1286, final_steps=1, interrupted=False
[INFO] TTS generation marked as complete
[INFO] All audio encoded and sent after 0.0s
[INFO] Generation completed:
[INFO] Processing completed
[INFO] Waiting for workers to stop...
[INFO] Cleared TTS cache for session 6586ef80-a484-4ae9-ab79-807923f2d875
[INFO] All worker threads stopped
[INFO] done with connection
2025-12-25 11:57:42,482 INFO 192.168.6.3 [25/Dec/2025:11:56:08 +0800] "GET /api/chat?text_temperature=0.7&text_topk=25&audio_temperature=0.8&audio_topk=250&pad_mult=0&text_seed=261668&audio_seed=665770&repetition_penalty_context=64&repetition_penalty=1 HTTP/1.1" 101 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:146.0) Gecko/20100101 Firefox/146.0"
[INFO] [TTS Process] Cleared cache for 6586ef80-a484-4ae9-ab79-807923f2d875
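
The traceback ends at the `torch.cat` call in `sampling_step`, which suggests that `self.crq_speech_ids` and at least one entry of `self.crq_generate_tokens` differ in rank. A minimal sketch of that failure mode follows, assuming the first tensor is 2-D and a later one is 1-D; the actual shapes inside the model are not shown in the log. `cat_token_history` is a hypothetical helper for illustration, not part of Fun-Audio-Chat, and whether adding a leading batch dimension is the right fix for the model is also an assumption.

```python
import torch


def cat_token_history(speech_ids, generated):
    """Hypothetical helper: give 1-D tensors a leading batch dimension
    so every part is 2-D before concatenating along the last axis."""
    parts = [t.unsqueeze(0) if t.dim() == 1 else t
             for t in (speech_ids, *generated)]
    return torch.cat(parts, dim=-1)


# Assumed shapes, chosen only to reproduce the error message.
speech_ids = torch.zeros((1, 4), dtype=torch.long)  # 2-D: (batch, seq)
step_token = torch.tensor([7])                      # 1-D: no batch dimension

error_message = None
try:
    # Mixing a 2-D and a non-empty 1-D tensor raises the RuntimeError from the log.
    torch.cat([speech_ids, step_token], dim=-1)
except RuntimeError as err:
    error_message = str(err)

# With ranks aligned, the concatenation succeeds.
fixed = cat_token_history(speech_ids, [step_token])
```

Logging `t.shape` for each tensor passed to `torch.cat` at `modeling_funaudiochat.py` line 603 would show which input is actually 1-D on this setup.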
