Piper First Word Getting Cut Off in Raw Mode for Multicast #623

ivonpint · 2024-10-07T16:44:05Z

Hi,

I’m currently using Piper to generate speech from text in raw mode, which is streamed over multicast. The setup works well except for one issue: the first word consistently gets cut off when I start a multicast session or when there is a pause before a list is read.

As a workaround, I have tried first generating the audio, then using FFmpeg to add a small pause before multicasting, which fixes the issue. However, this isn't ideal for my application, as I’d prefer a more seamless solution.
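One way the pause could be folded into the streaming step itself, rather than a separate pre-processing pass, is FFmpeg's `adelay` filter, which pads the start of the audio with silence. The sketch below only builds the argument list; the other flags are copied from the FFmpeg command in this post, while the function name `build_ffmpeg_args`, the `lead_in_ms` parameter, and the example address are hypothetical. Whether a lead-in of this kind is enough to stop the first word being clipped in this setup is an assumption, not something tested here.

```python
def build_ffmpeg_args(multicast_address, lead_in_ms=500):
    """Build an FFmpeg argument list that pads the start of the stream.

    adelay=<ms> delays the (mono) audio by that many milliseconds,
    which inserts silence before the first sample.
    """
    filters = f"adelay={lead_in_ms},aresample=16000,asetnsamples=n=160"
    return [
        "ffmpeg", "-re",
        "-f", "s16le", "-ar", "22050", "-ac", "1", "-i", "-",
        "-filter_complex", filters,
        "-acodec", "g722", "-ac", "1",
        "-f", "rtp", multicast_address,
    ]


# Hypothetical multicast address, for illustration only.
args = build_ffmpeg_args("rtp://239.0.0.1:5004", lead_in_ms=300)
print(" ".join(args))
```

Passing the list straight to `subprocess.Popen(args, stdin=...)` would also avoid the need for `shell=True` on the FFmpeg side.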

Here is a brief overview of what I’m trying to achieve:

- Use Piper to generate TTS in raw mode.
- Stream the audio over multicast without cutting off the first word.

I believe adding a buffer at the start of the raw output stream before sending it over multicast might solve this, but I’m unsure how to implement this effectively. I’ve attached my Python code for reference below.

Any insights or suggestions on how I can address this issue would be greatly appreciated.

Thanks in advance for your help!

code:
import subprocess
import shlex

def stream_speech_to_multicast(text, multicast_address):
    # Escape the text input to avoid issues with special characters
    escaped_text = shlex.quote(text)

    # Piper TTS command to generate raw audio from the text
    piper_cmd = f"echo {escaped_text} | /var/www/html/piper/piper --model /var/www/html/piper/models/en_US-hfc_female-medium.onnx --output-raw"

    # FFmpeg command to process raw Piper output and stream to multicast
    ffmpeg_cmd = f"ffmpeg -re -f s16le -ar 22050 -ac 1 -i - -filter_complex 'aresample=16000,asetnsamples=n=160' -acodec g722 -ac 1 -f rtp {multicast_address}"

    try:
        # Pipe Piper output to FFmpeg for multicasting
        piper_process = subprocess.Popen(piper_cmd, shell=True, stdout=subprocess.PIPE)
        ffmpeg_process = subprocess.Popen(ffmpeg_cmd, shell=True, stdin=piper_process.stdout)

        # Close the parent's copy of Piper's stdout so FFmpeg holds the only
        # read end and Piper receives SIGPIPE if FFmpeg exits early
        piper_process.stdout.close()
        ffmpeg_process.wait()  # Wait for FFmpeg to finish
        piper_process.wait()  # Reap Piper so its exit code is available

        # Report success only if both commands exited cleanly
        return piper_process.returncode == 0 and ffmpeg_process.returncode == 0
    except OSError as e:
        # Popen raises OSError (not CalledProcessError) if a process cannot be started
        print(f"Error: {e}")
        return False
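
The buffer idea described above could be sketched as follows: write a short run of zero-valued samples (digital silence) into FFmpeg's input before forwarding Piper's raw output. This is a minimal sketch, not a tested fix. The sample format (16-bit mono at 22050 Hz) is taken from the `-f s16le -ar 22050 -ac 1` flags in the code above, while the helper names `silence_bytes` and `copy_with_leading_silence` and the 500 ms default are hypothetical choices for illustration. In-memory streams stand in for the two process pipes.

```python
import io


def silence_bytes(duration_ms, sample_rate=22050, channels=1, sample_width=2):
    """Return duration_ms of silence as raw PCM (all-zero samples)."""
    frames = sample_rate * duration_ms // 1000
    return b"\x00" * (frames * channels * sample_width)


def copy_with_leading_silence(src, dst, duration_ms=500, chunk_size=4096):
    """Write leading silence to dst, then copy src to dst until EOF."""
    dst.write(silence_bytes(duration_ms))
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)


# Stand-ins for Piper's stdout (source) and FFmpeg's stdin (sink).
source = io.BytesIO(b"\x01\x02" * 4)
sink = io.BytesIO()
copy_with_leading_silence(source, sink, duration_ms=100)
print(len(sink.getvalue()))
```

To use this with the function above, FFmpeg would be started with `stdin=subprocess.PIPE` instead of `stdin=piper_process.stdout`, `piper_process.stdout` and `ffmpeg_process.stdin` would be passed as `src` and `dst`, and `ffmpeg_process.stdin` would be closed afterwards so FFmpeg sees end of input.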

ivonpint · 2024-10-09T11:47:28Z

I figured it out: I was able to put a delay of 500 ms on the audio device, which synchronized the audio, and all is well!
