How to stop the Text-To-Speech Azure SDK #2647

Roopesh-Bharatwaj-K-R · 2024-10-31T14:51:36Z

Hi Azure,

I have created a React app that uses the Azure Speech (TTS) service, and it's working fine.

I used JavaScript and followed the JS speech synthesis documentation from Azure:

[https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-speech-synthesis?pivots=programming-language-javascript&tabs=browserjs%2Cterminal#synthesize-speech-to-a-file]

The issue appeared when we tried to stop the speech: we could not stop the audio coming from Azure. When I checked the code, there was no method that stops playback directly; instead I could only see Close() and Cancel().

Those did not work, and I checked a couple of issues related to the same problem. [SpeechSynthesizer.StopSpeakingAsync()] was also not working.

I tried other approaches, such as refreshing the audio and setting the synthesizer to null, but they did not work either.

Kindly suggest the best way to stop the audio from Azure TTS. Please share some notebooks and code examples that stop the playback effectively.

Thanks in advance for your kind suggestions.

Best,
Roopesh


aman-vohra-007 · 2024-11-06T05:47:26Z

Hey @Roopesh-Bharatwaj-K-R, hope this helps.

I used microsoft-cognitiveservices-speech-sdk for visemes, so I keep the synthesizer in a ref in ReactJS.

import * as sdk from "microsoft-cognitiveservices-speech-sdk"

const synthesizeSpeech = text => {
  return new Promise((resolve, reject) => {
    if (!speechSynthesizerRef.current) {
      const speechConfig = sdk.SpeechConfig.fromSubscription(
        import.meta.env.VITE_SPEECH_KEY,
        import.meta.env.VITE_SPEECH_REGION
      )
      speechSynthesizerRef.current = new sdk.SpeechSynthesizer(speechConfig)
      let speechStarted = false
      .....
    }

And to stop the speech, I did this
const stopSpeech = () => {
  try {
    setImageIndex(0)
    setIsAudioPlaying(false)
    if (speechSynthesizerRef.current) {
      const audio =
        speechSynthesizerRef.current.privAdapter?.privSessionAudioDestination
          ?.privDestination?.privAudio
      if (audio) {
        audio.pause()
        audio.currentTime = 0
        speechSynthesizerRef.current.close()
        speechSynthesizerRef.current = null
      }
    }
  } catch (e) {
    console.error("Error in stopSpeech:", e)
  }
}

This stops the speech and also resets the synthesizer, so if you play again, the audio starts normally.

Roopesh-Bharatwaj-K-R · 2024-11-06T10:29:51Z

Hi @aman-vohra-007, thanks a lot for taking the time to respond on this issue. I will check this code and try it. Could you also please share the documentation for it? It would be useful for me to read through, and as a record for other readers facing the same issue.

aman-vohra-007 · 2024-11-06T10:49:04Z

Hey @Roopesh-Bharatwaj-K-R,
As I said, I used the Microsoft SDK, and I checked their GitHub and documentation for this speech-stopping functionality. It turns out that nothing like it exists yet.

Their Github: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/quickstart/javascript/browser/translate-speech-to-text/index.html

Their Doc: https://learn.microsoft.com/en-gb/azure/ai-services/speech-service/speech-synthesis-markup-voice

When I was working on my code, I knew that the speech was generated from SSML in my case. There was no such function as StopSpeakingAsync(), and even if it worked, the speech kept going until the end of the sentence.

So I logged my synthesizer ref to the console and found the audio element at this path:
const audio = speechSynthesizerRef.current.privAdapter?.privSessionAudioDestination?.privDestination?.privAudio

I used it to stop the audio instantly, reset it, and close the speechSynthesizerRef.
By doing this, I was able to stop the audio and also reset the synthesizer for the next input.

I didn't find any documentation for this and was stuck for days, so when I solved it, I thought I would share the solution for others who are stuck on the same thing.

Hope this helps.

Thank you,
Aman Vohra

Roopesh-Bharatwaj-K-R · 2024-11-06T11:16:12Z

Hi @aman-vohra-007

Thanks a lot for sharing your code and docs. I also used a similar approach, accessing the private audio object.

`
// Function to stop audio playback
const stopAudioPlayback = (synthesizer) => {
  const audio = synthesizer.privAdapter?.privSessionAudioDestination?.privDestination?.privAudio;
  if (audio) {
    audio.pause();
    audio.currentTime = 0;
    console.log("Audio playback stopped.");
  } else {
    console.warn("Audio element not found. Playback may not be stopped.");
  }
};
`

One disadvantage of both suggested approaches is that they rely on private properties, so they may stop working whenever the SDK is updated.

Best,
Roopesh
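
One way to reduce the dependence on private fields discussed above might be to create the audio destination yourself and keep a reference to it. The sketch below assumes the SDK's public `SpeakerAudioDestination` class, `AudioConfig.fromSpeakerOutput()`, and the player's `pause()` method behave as documented for browser playback. The names `createStoppableSynthesizer` and `stop` are hypothetical, and passing the `sdk` module in as a parameter is only a choice made here to keep the helper testable; it has not been verified against this thread's exact setup.

```javascript
// Sketch: stop playback through a SpeakerAudioDestination that we own,
// instead of reading privAdapter/privDestination/privAudio.
// `sdk` is the imported "microsoft-cognitiveservices-speech-sdk" module.
function createStoppableSynthesizer(sdk, speechConfig) {
  // Keeping our own reference to the player is what makes stopping possible.
  const player = new sdk.SpeakerAudioDestination();
  const audioConfig = sdk.AudioConfig.fromSpeakerOutput(player);
  const synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);
  let stopped = false;

  return {
    synthesizer,
    player,
    // Hypothetical helper: pause the audio, then release the synthesizer.
    // Guarded so that close() is never called twice on the same instance.
    stop() {
      if (stopped) return false;
      stopped = true;
      player.pause();
      synthesizer.close();
      return true;
    },
  };
}
```

In a React component, the returned object could be stored in a ref in place of the bare synthesizer, with `stop()` called from the stop button and a fresh instance created for the next utterance.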

varuntayur · 2024-11-07T10:54:52Z

Thanks for sharing the code. The pause doesn't work for me: it never pauses the audio, and playback goes on until it is complete. When you start speaking while playback is happening, it starts playing out the new speech.

When the code detects the user speaking, I try to stop the playback:

recognizer.recognizing = (s, e) => {
    console.log(`RECOGNIZING: Text=${e.result.text}`);

    stopAudioPlayback(synthesizer);
};

...

recognizer.recognized = (s, e) => {
    synthesizer.speakTextAsync(e.result.text,
        function (result) {
            if (result.reason === SpeechSDK.ResultReason.SynthesizingAudioCompleted) {
                console.log("synthesis finished: " + result.audioData.byteLength + " bytes");
                player.onAudioEnd = () => {
                    console.log("Finished speaking");
                };
            }
        },
        function (err) {
            console.trace("err - " + err);
            synthesizer.close();
        });
};
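
For the interrupt-while-speaking scenario above, a possible arrangement is one player and one synthesizer per utterance, with the current player paused as soon as the recognizer reports speech. This is only a sketch and may not address the cause of the behaviour reported here: it assumes the public `SpeakerAudioDestination` (`pause()`, `onAudioEnd`) and `AudioConfig.fromSpeakerOutput()` members of the SDK, and `createBargeInController`, `speak`, and `interrupt` are hypothetical names. The `sdk` module is passed in as a parameter purely so the logic can be exercised without a browser.

```javascript
// Sketch of a barge-in controller: at most one utterance plays at a time,
// and it can be cut off as soon as the user starts talking.
function createBargeInController(sdk, speechConfig) {
  let current = null; // { player, synthesizer, closed } for the active utterance

  // Close a synthesizer at most once; the flag prevents a second close() call.
  const closeOnce = (u) => {
    if (!u.closed) {
      u.closed = true;
      u.synthesizer.close();
    }
  };

  // Pause the active utterance, if any. Returns true when something was stopped.
  function interrupt() {
    if (!current) return false;
    const utterance = current;
    current = null;
    utterance.player.pause();
    closeOnce(utterance);
    return true;
  }

  function speak(text) {
    interrupt(); // never let two utterances overlap
    const player = new sdk.SpeakerAudioDestination();
    const synthesizer = new sdk.SpeechSynthesizer(
      speechConfig,
      sdk.AudioConfig.fromSpeakerOutput(player)
    );
    const utterance = { player, synthesizer, closed: false };
    current = utterance;
    // Forget the utterance once its audio has finished playing.
    player.onAudioEnd = () => {
      if (current === utterance) current = null;
    };
    synthesizer.speakTextAsync(
      text,
      () => closeOnce(utterance), // synthesis done; playback continues via the player
      (err) => {
        console.error("Synthesis failed:", err);
        closeOnce(utterance);
      }
    );
  }

  return { speak, interrupt };
}
```

With this in place, the handlers could reduce to `recognizer.recognizing = () => controller.interrupt()` and `recognizer.recognized = (s, e) => controller.speak(e.result.text)`, where `controller` is the object returned above.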
